Pandas provide methods to read different file formats using a single line of code. When reading an xlsx file, rd.biffh.XLRDError: Excel xlsx file; not supported error might occur.
You can solve xlrd.biffh.XLRDError: Excel xlsx file; not supported Error by updating the Pandas library version to the latest version.
Now let us see the reason for this error and how to solve it.
Reason For the Error
The Pandas cannot open an Excel (.xlsx) file when you use the read_excel()
method available in the Pandas library version earlier than V1.2.0.
Because the versions older than 1.2.0 internally uses the xlrd library to read the excel files. The xlrd
library supports ONLY reading the .xls
files. It doesn’t support reading the .xlsx
or .xlsm
files any longer.
Hence, you need to use a different approach to solve this problem.
Table of Contents
Solution 1 — Upgrading Pandas To the Latest Version
You can solve the xlrd.biffh.XLRDError: Excel xlsx file; not supported
error by upgrading the Pandas version to the latest version.
By default, the latest version uses the openpyxl library. This library supports reading the .xlsx
file and .xlsm
files.
To know the current Pandas library version, use the below snippet.
Code
pd.__version__
Output
'1.1.2'
You can upgrade the Pandas library to the latest version using the below statement.
Code
%pip install --upgrade pandas
Prefixing the %
in PIP lets you update the packages directly from Jupyter.
Output
Successfully installed pandas-1.4.2
Note: you may need to restart the kernel to use updated packages.
The Pandas library is upgraded to the latest version, and also the dependent libraries are updated.
With the latest library, you can use the read_excel() method directly to read the excel files.
Use the code below to read the xlsx file or xlsm file.
Code
import pandas as pd
df = pd.read_excel(
"sample.xlsx”
)
The file will be read, and the dataframe will be populated. You can print the dataframe to see the values in the excel file.
Code
df
Dataframe Will Look Like
Column 0 | Column 1 | Column 2 | |
---|---|---|---|
0 | NaN | NaN | NaN |
1 | NaN | NaN | NaN |
2 | NaN | NaN | NaN |
3 | NaN | NaN | NaN |
This is how you need to solve the xlrd.biffh.XLRDError: Excel xlsx file; not supported Error.
Solution 2 — Using Openpyxl Engine
You can use the openpyxl engine to read the xlsx file.
This is an alternate way to solve the 8xlrd.biffh.XLRDError: Excel xlsx file; not supported* Error.
If you do not want to upgrade the Pandas library to the latest version, you shall use this solution. However, it is always recommended to use the latest library versions to avoid security threats to your application.
Install the openpyxl
library using the below snippet.
Code
%pip install openpyxl
Prefixing the %
in PIP lets you install the packages directly from Jupyter.
Output
Successfully installed openpyxl v3.0.9
Note: you may need to restart the kernel to use updated packages.
Now, specify the openpyxl
engine in the pandas read_excel()
method to read the xlsx file or an xlsm file.
The code below demonstrates how to specify the openpyxl
engine in the read_excel()
method.
Code
df = pd.read_excel(
"sample.xlsm",
engine='openpyxl')
The file will be read successfully, and you can print the dataframe to see the values.
df
Conclusion
Pandas cannot open an Excel (.xlsx) file using the read_excel()
method when you’re using the Pandas version earlier than V1.2.0.
You’ve learned how to solve the xlrd.biffh.XLRDError: Excel xlsx file; not supported error by updating the pandas library or using the alternate engine openpyxl
.
It is also recommended to update the libraries to the latest versions to avoid any security threats to your application.