How to Read An Excel File in Pandas – With Examples

Pandas allow you to read excel files with a single line of code.

You can read an excel file in Pandas using the pd.read_excel() method.

Basic Example

  • Use the pd.read_excel() method to read an excel file in Pandas
  • The first sheet in the excel file will be read if no sheet name is specified
import pandas as pd

df = pd.read_excel(
     "testExcel.xlsx"
)

df
  • The excel file is read, and a dataframe is created.

In this tutorial, you’ll learn how to read an excel file and handle header information in the excel file.

Using read_excel()

The read_excel() method in Pandas reads an excel file.

  • It accepts the file path as a parameter
  • It reads the excel and creates a dataframe out of the data in excel

Code

The following code reads the excel file textExcel.xlsx and assigns it to the dataframe df.

import pandas as pd

df = pd.read_excel(
     "testExcel.xlsx"
)

df

Dataframe Will Look Like

Column 1Column 2
0C1 Row1C2 Row1
1C1 Row2C2 Row2
2C1 Row3C2 Row3
3C1 Row4C2 Row 4
4C1 Row5C2 Row 5

This is how the read_excel() method is used to read the excel file using Pandas.

You can select rows from the dataframe to check its data.

Reading Specific Sheet in Excel Using Read_excel()

To read a specific sheet in an excel file

  • Specify the sheet name using the sheet_Name parameter
  • Or pass the sheet index if your sheets are ordered, or pass the sheet name directly.

To read multiple sheets from excel in one line, read How To Read Excel With Multiple Sheets In Pandas?

Code

import pandas as pd

df = pd.read_excel(
     "testExcel.xlsx", sheet_name = 'MY_Sheet_1'
)

df

Read Excel File With Header

The header parameter in the read_excel() method allows you to read an excel file with a Header in Pandas.

  • This is an optional parameter; if this parameter is not specified, it’ll default to 0.
  • The first row in the excel file will be considered Headers.

Code

import pandas as pd

df = pd.read_excel(
     "testExcel.xlsx", header =0)

df

The excel file will be read, and a dataframe will be created with the header information.

Dataframe Will Look like

Column 1Column 2
0C1 Row1C2 Row1
1C1 Row2C2 Row2
2C1 Row3C2 Row3
3C1 Row4C2 Row 4
4C1 Row5C2 Row 5

This is how you can handle the header information using the header parameter in the read_excel() method.

Once the files are read, you can Rename Column in pandas if you want to use custom column headers.

Read Excel File – Skip Rows

This section will teach you how to skip rows while reading an excel file using Pandas.

The skiprows parameter skips the desired number of rows.

  • This is an optional parameter, and it defaults to 0 when not specified.
  • No rows will be skipped when this parameter is not used.

Code

import pandas as pd

df = pd.read_excel(
     "testExcel.xlsx", skiprows =2)

df

The below dataframe shows that the first two rows are skipped.

Dataframe Will Look Like

C1 Row2C2 Row2
0C1 Row3C2 Row3
1C1 Row4C2 Row 4
2C1 Row5C2 Row 5

This is how you can skip rows while reading excel files in Pandas.

Read Excel File with OpenPyxl

In this section, you’ll learn how to use the OpenPyxl engine to read the excel file in pandas.

  • The Pandas version greater than 1.2.0 uses the openpyxl library by default.
  • When using the earlier version of Pandas (Less than 1.2.0), it uses the xlrd engine to read the excel files. This xlrd engine supports only reading the .xls files.

When you read a .xlsx file using the read_excel() method, it’ll throw xlrd.biffh.XLRDError: Excel xlsx file; not supported Error. In this case, you need to specify the engine openpyxl to read the .xlsx files.

Code

Install the library openpyxl engine using the following statement.

%pip install openpyxl

Prefixing the % in PIP lets you update the packages directly from Jupyter.

Output

    Requirement already satisfied: openpyxl in /opt/anaconda3/lib/python3.9/site-packages (3.0.9)
    Requirement already satisfied: et-xmlfile in /opt/anaconda3/lib/python3.9/site-packages (from openpyxl) (1.1.0)
    Note: you may need to restart the kernel to use updated packages.

The package will be installed successfully.

You can use this to read an xlsx file with an older pandas version.

The following code demonstrates how to use the openpyxl engine to read the testExcel.xlsx files.

import pandas as pd

df = pd.read_excel(
     "testExcel.xlsx",
     engine='openpyxl')

The excel file will be read successfully, and the dataframe df will be created.

Additional Resources

Leave a Comment