When cleaning data for machine learning, you need to check whether the dataset contains any NaN values.
You can check if any value is NaN in a pandas dataframe using the df.isna().values.any() statement.
In this tutorial, you’ll learn how to check if any value is NaN in a pandas dataframe.
If You’re in a Hurry…
You can use the statement below to check if any value is NaN in the entire pandas dataframe.
Code
df.isna().values.any()
Output
True
If You Want to Understand the Details, Read on…
A dataset may contain missing values. In pandas, missing values are denoted using NaN, pd.NaT, or None.
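To see which markers pandas treats as missing, here is a minimal sketch. It assumes NumPy is installed alongside pandas, and the variable names are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Markers that pandas treats as missing (illustrative list)
markers = [None, np.nan, pd.NaT]
results = [bool(pd.isna(marker)) for marker in markers]
print(results)  # [True, True, True]

# An empty string is an ordinary value, not a missing one
empty_is_missing = bool(pd.isna(""))
print(empty_is_missing)  # False
```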
You can check if any value is NaN in the pandas dataframe using the isna() method or the isnull() method. There is no difference between the isna() and isnull() methods. Both do the same job. Let us see how to use these methods in different use cases.
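If you want to confirm that both methods produce the same mask, the following sketch compares them on a small, made-up dataframe (small_df is a hypothetical name, not the tutorial's sample dataframe).

```python
import pandas as pd

# Small illustrative dataframe with one None and one pd.NaT
small_df = pd.DataFrame({
    "Unit_Price": [500.0, None],
    "product_name": ["Keyboard", pd.NaT],
})

# equals() returns True when both boolean masks match cell by cell
same_mask = small_df.isna().equals(small_df.isnull())
print(same_mask)  # True
```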
If you would like to count the NaN values in the pandas dataframe, read How To Count Nan Values In Pandas Dataframe.
Sample Dataframe
This is the sample dataframe used throughout the tutorial.
It contains,
- Rows with values for all columns
- Rows with Empty or Missing Data for each column
- Rows with Empty or Missing data for all columns
- One Duplicate row
- One column in the sample dataframe is of the float type.
Code
import pandas as pd
data = {"product_name":["Keyboard","Mouse", "Monitor", "CPU","CPU", "Speakers",pd.NaT],
"Unit_Price":[500,200, 5000.235, 10000.550, 10000.550, 250.50,None],
"No_Of_Units":[5,5, 10, 20, 20, 8,pd.NaT],
"Available_Quantity":[5,6,10,"Not Available","Not Available", pd.NaT,pd.NaT],
"Available_Since_Date":['11/5/2021', '4/23/2021', '08/21/2021','09/18/2021','09/18/2021','01/05/2021',pd.NaT]
}
df = pd.DataFrame(data)
df = df.astype({"Unit_Price": float})
df
Dataframe Will Look Like
index | product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date |
---|---|---|---|---|---|
0 | Keyboard | 500.000 | 5 | 5 | 11/5/2021 |
1 | Mouse | 200.000 | 5 | 6 | 4/23/2021 |
2 | Monitor | 5000.235 | 10 | 10 | 08/21/2021 |
3 | CPU | 10000.550 | 20 | Not Available | 09/18/2021 |
4 | CPU | 10000.550 | 20 | Not Available | 09/18/2021 |
5 | Speakers | 250.500 | 8 | NaT | 01/05/2021 |
6 | NaT | NaN | NaT | NaT | NaT |
You’ll use this dataframe to check if any value is missing.
Using isna()
You can use the isna() method to check if any value in the dataframe is missing.
It returns a mask of True or False for each cell of the dataframe, depending on whether the value is missing.
- True denotes missing values
- False denotes the available values
Code
df.isna()
Each cell will have a value of True or False.
Dataframe Will Look Like
index | product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date |
---|---|---|---|---|---|
0 | False | False | False | False | False |
1 | False | False | False | False | False |
2 | False | False | False | False | False |
3 | False | False | False | False | False |
4 | False | False | False | False | False |
5 | False | False | False | True | False |
6 | True | True | True | True | True |
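Note that placeholder text such as "Not Available" is an ordinary string, so isna() reports it as False. The sketch below illustrates this with a small, made-up Series (quantity is a hypothetical name used only here).

```python
import pandas as pd

# Illustrative column mixing real values, placeholder text, and missing markers
quantity = pd.Series([5, "Not Available", "", None, pd.NaT])

# Only None and pd.NaT are flagged; strings count as available values
mask = quantity.isna()
print(mask.tolist())  # [False, False, False, True, True]
```

If such placeholders should count as missing, they need to be converted to a real missing marker before calling isna().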
Using isnull()
You can use the isnull() method to check if any value in the dataframe is missing. isnull() is an alias of the isna() method.
It also returns a mask of True or False for each cell of the dataframe, depending on whether the value is missing.
- True denotes missing values
- False denotes the available values
Code
df.isnull()
Each cell will have a value of True or False.
Dataframe Will Look Like
index | product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date |
---|---|---|---|---|---|
0 | False | False | False | False | False |
1 | False | False | False | False | False |
2 | False | False | False | False | False |
3 | False | False | False | False | False |
4 | False | False | False | False | False |
5 | False | False | False | True | False |
6 | True | True | True | True | True |
The following sections show how to use the isna() method or the isnull() method in different use cases.
Check if Any Value is NaN in Single Column
You can use the isnull() method with the any() method to check if any value in a specific column is null.
If ANY of the values is missing, it returns a single True.
Code
The code below demonstrates how to check if there are any missing values in the column Unit_Price.
df['Unit_Price'].isnull().values.any()
Since the Unit_Price column contains missing values, you’ll see the output True.
Output
True
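For a single column, calling any() directly on the boolean Series should give the same answer as going through .values. The sketch below compares both on a small, made-up price column (prices is a hypothetical name).

```python
import pandas as pd

# Illustrative float column; None is stored as NaN
prices = pd.Series([500, 200, None], name="Unit_Price", dtype=float)

# Route 1: reduce the underlying NumPy array
via_values = bool(prices.isnull().values.any())

# Route 2: reduce the boolean Series directly
via_series = bool(prices.isnull().any())

print(via_values, via_series)  # True True
```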
Check if Any Value is NaN in Multiple Columns
You can use the isna() method with the any() method to check if any value in multiple columns is null.
You need to pass the columns as a list, which selects the subset of those specific columns. Then the isna() method checks if any value is missing in those particular columns.
If ANY of the values is missing, it returns a single True.
Code
df[['Unit_Price','product_name']].isna().values.any()
Since the columns Unit_Price and product_name contain missing values, you’ll see the output True.
Output
True
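The single True does not tell you which of the selected columns has the gap. As a sketch, calling any() on the mask without .values gives one result per column. The subset dataframe here is made up for illustration.

```python
import pandas as pd

# Illustrative data: only Unit_Price has a missing value
subset = pd.DataFrame({
    "Unit_Price": [500.0, 200.0, None],
    "product_name": ["Keyboard", "Mouse", "Monitor"],
})

# any() on the mask reduces each column separately
per_column = subset[["Unit_Price", "product_name"]].isna().any()
flags = {column: bool(flag) for column, flag in per_column.items()}
print(flags)  # {'Unit_Price': True, 'product_name': False}

# Reducing once more collapses the result to a single answer
any_missing = bool(per_column.any())
print(any_missing)  # True
```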
Check if Any value is NaN in Entire Dataframe
You can apply the isna() and the any() methods directly to the dataframe df to check if any value is NaN in the entire dataframe.
Code
The code below demonstrates how to check if any value is missing in the entire dataframe using the isna() and any() methods.
df.isna().values.any()
Since the dataframe has some missing values, you’ll see the output True.
Output
True
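To see the check return False as well, the sketch below runs it on a small, made-up dataframe before and after dropping the incomplete rows. The names df_with_gaps and df_complete are hypothetical.

```python
import pandas as pd

# Illustrative dataframe with one missing price
df_with_gaps = pd.DataFrame({
    "Unit_Price": [500.0, None],
    "No_Of_Units": [5, 8],
})

# dropna() removes every row that has at least one missing value
df_complete = df_with_gaps.dropna()

has_gaps = bool(df_with_gaps.isna().values.any())
complete_has_gaps = bool(df_complete.isna().values.any())
print(has_gaps, complete_has_gaps)  # True False
```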
Find Rows with NaN in a column
In this section, you’ll learn how to select rows with missing values in a specific column.
You can select the specific column and apply the isna() method. This returns a mask that marks the rows with missing values. The mask is then used to retrieve those rows.
Code
The code below demonstrates how to select rows with missing values in the column Available_Quantity.
df[df['Available_Quantity'].isna()]
There are two rows where the Available_Quantity column has missing values. Those two rows will be selected and displayed.
Dataframe Will Look Like
index | product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date |
---|---|---|---|---|---|
5 | Speakers | 250.5 | 8 | NaT | 01/05/2021 |
6 | NaT | NaN | NaT | NaT | NaT |
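The two steps, building the mask and then filtering with it, can be sketched separately on a small, made-up dataframe. The names stock and missing_rows are hypothetical.

```python
import pandas as pd

# Illustrative data: the last two rows have no Available_Quantity
stock = pd.DataFrame({
    "product_name": ["Keyboard", "CPU", "Speakers", pd.NaT],
    "Available_Quantity": [5, "Not Available", pd.NaT, pd.NaT],
})

# Step 1: boolean mask, True where the column value is missing
mask = stock["Available_Quantity"].isna()

# Step 2: keep only the rows where the mask is True
missing_rows = stock[mask]
print(missing_rows.index.tolist())  # [2, 3]
```

The row holding "Not Available" is not selected, because that value is a string rather than a missing marker.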
Conclusion
You’ve learned how to check if any value is NaN in a Pandas dataframe. You’ve also learned how to check if any specific column has a missing value or if the entire dataframe contains a missing value.