A pandas dataframe stores values in rows and columns, and some values in the dataset may be missing.
You can count NaN values in a pandas dataframe using the df.isna() method together with sum().
NaN values are also known as missing values. pandas also treats None as a missing value.
If You’re in a Hurry…
The code below demonstrates how to count the NaN values in Column 1 of the dataframe df.
Code
df['Column 1'].isna().sum()
Output
3
If You Want to Understand Details, Read on…
While cleaning up the data, you often need to count the NaN values in each column to decide whether the column should be dropped. A column that consists mostly of NaN values carries little information and is unlikely to have a meaningful impact during ML model creation.
You’ll create a sample dataframe and use the isna() method to count NaN values, or missing values, in the pandas dataframe.
There is also another method called isnull(), which is an alias of isna(). Read isna() vs isnull() in detail.
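As a quick check of how the two methods relate, the sketch below runs both on a small, made-up Series. The variable name sample and its values are illustrative and are not part of this tutorial's dataset. The pandas documentation describes isnull() as an alias of isna(), so the two masks are expected to match.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one NaN and one None among valid numbers.
sample = pd.Series([1.0, np.nan, 3.0, None])

mask_isna = sample.isna()      # boolean Series, True where a value is missing
mask_isnull = sample.isnull()  # alias of isna(), so the same mask is expected

same_result = mask_isna.equals(mask_isnull)
missing_count = int(mask_isna.sum())

print(same_result)    # True
print(missing_count)  # 2
```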
If you only want to check whether the dataframe contains any missing value, read How to check if any value is NaN in a Pandas DataFrame.
Sample Dataframe
To demonstrate counting NaN values, first create a dataframe that contains NaN values.
There are three columns, and each column contains a few NaN values.
import pandas as pd
import numpy as np
data = {'Column 1': [1,2,np.nan,4,5,np.nan,None],
'Column 2': [1,2,np.nan,4,np.nan,np.nan,None],
'Column 3': [1,2,None,4,5,None,None]
}
df = pd.DataFrame(data,columns=['Column 1','Column 2','Column 3'])
df
Dataframe Will Look Like
|   | Column 1 | Column 2 | Column 3 |
|---|---|---|---|
| 0 | 1.0 | 1.0 | 1.0 |
| 1 | 2.0 | 2.0 | 2.0 |
| 2 | NaN | NaN | NaN |
| 3 | 4.0 | 4.0 | 4.0 |
| 4 | 5.0 | NaN | 5.0 |
| 5 | NaN | NaN | NaN |
| 6 | NaN | NaN | NaN |
Now, you’ll use this dataframe to count the NaN values.
Count NaN Values in a Column
In this section, you’ll count the NaN values in a single column using the isna() method.
The isna() method returns a boolean object of the same size, indicating whether each item is a missing value.
Then, you can sum that object using the sum() function to get the total number of missing values, because each True counts as 1 and each False as 0.
The code below demonstrates how to count the NaN values in Column 1 of the dataframe df.
Code
df['Column 1'].isna().sum()
Output
3
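To make the two steps explicit, the following sketch keeps the intermediate boolean mask in a variable before summing it. It also cross-checks the result against count(), which returns the number of non-missing values in a column. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as the tutorial's dataframe.
df = pd.DataFrame({
    'Column 1': [1, 2, np.nan, 4, 5, np.nan, None],
    'Column 2': [1, 2, np.nan, 4, np.nan, np.nan, None],
    'Column 3': [1, 2, None, 4, 5, None, None],
})

column = df['Column 1']

# Step 1: boolean mask, True where the value is missing.
mask = column.isna()

# Step 2: summing the mask counts the True entries.
nan_count = mask.sum()

# Cross-check: total length minus the number of non-missing values.
nan_count_from_count = len(column) - column.count()

print(mask.tolist())         # [False, False, True, False, False, True, True]
print(nan_count)             # 3
print(nan_count_from_count)  # 3
```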
Count NaN Values in Multiple Columns
In this section, you’ll count the NaN values in multiple columns using the isna() method.
Select the columns by passing a list of column names to the dataframe’s indexing operator, and then call the isna() method on the result. It returns a boolean dataframe of the same size, indicating whether each item is a missing value.
Then, you can sum that object using the sum() function to get the number of missing values in each selected column.
The code below demonstrates how to count the NaN values in Column 1 and Column 2 of the dataframe df.
Code
df[['Column 1', 'Column 2']].isna().sum()
Output
Column 1 3
Column 2 4
dtype: int64
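The Series returned above is indexed by column name, so an individual count can be looked up by label, and a further sum() gives one total for the selected columns. A minimal sketch, with illustrative variable names:

```python
import numpy as np
import pandas as pd

# Same sample data as the tutorial's dataframe.
df = pd.DataFrame({
    'Column 1': [1, 2, np.nan, 4, 5, np.nan, None],
    'Column 2': [1, 2, np.nan, 4, np.nan, np.nan, None],
    'Column 3': [1, 2, None, 4, 5, None, None],
})

selected_columns = ['Column 1', 'Column 2']

# Selecting with a list of names returns a dataframe with only those columns.
counts = df[selected_columns].isna().sum()

# Look up one column's count by its label.
column_2_count = counts['Column 2']

# Add the per-column counts to get a single total for the selection.
selected_total = counts.sum()

print(column_2_count)  # 4
print(selected_total)  # 7
```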
Count NaN Values in Every Column of Dataframe
In this section, you’ll count the NaN values in each column using the isna() method.
You can call the isna() method directly on the dataframe object. It returns a boolean dataframe of the same size, indicating whether each item is a missing value.
Then, you can sum that object using the sum() function, which adds up each column separately, to get the number of missing values per column.
The code below demonstrates how to count the NaN values in each column of the dataframe df.
Code
df.isna().sum()
You’ll see the below output.
The number of missing values in each column is displayed.
Output
Column 1 3
Column 2 4
Column 3 3
dtype: int64
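The per-column counts can feed the decision about which columns to drop. The sketch below is one possible approach, not a recommended rule: it computes the fraction of missing values per column with mean() and keeps only the columns at or below a cutoff. The name max_missing_fraction and its value 0.5 are assumptions chosen purely for illustration.

```python
import numpy as np
import pandas as pd

# Same sample data as the tutorial's dataframe.
df = pd.DataFrame({
    'Column 1': [1, 2, np.nan, 4, 5, np.nan, None],
    'Column 2': [1, 2, np.nan, 4, np.nan, np.nan, None],
    'Column 3': [1, 2, None, 4, 5, None, None],
})

# Hypothetical cutoff: drop columns where more than half the values are missing.
max_missing_fraction = 0.5

# The mean of a boolean mask is the fraction of True values per column.
missing_fraction = df.isna().mean()

# Keep only the columns whose missing fraction is within the cutoff.
columns_to_keep = missing_fraction[missing_fraction <= max_missing_fraction].index
reduced_df = df[columns_to_keep]

print(missing_fraction.round(2).to_dict())
print(list(reduced_df.columns))  # ['Column 1', 'Column 3']
```

With this sample data, Column 2 has 4 of 7 values missing, so it is the only column above the assumed cutoff.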
Count NaN Values in Entire Dataframe
In this section, you’ll count the NaN values in the entire dataframe using the isna() method.
You can call the isna() method directly on the dataframe object. It returns a boolean dataframe of the same size, indicating whether each item is a missing value.
Then, you can sum that object to get the number of missing values in each column, and invoke the sum() function again to add up those per-column counts.
The code below demonstrates how to count the NaN values in each column of the dataframe df and then sum the result to obtain the total number of missing values in the entire dataframe.
Code
df.isna().sum().sum()
Output
10
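The same total can be reached in other ways, which is useful as a sanity check. The sketch below shows two alternatives under the same sample data: summing the boolean mask as a NumPy array, and subtracting the number of non-missing values from the total number of cells. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as the tutorial's dataframe.
df = pd.DataFrame({
    'Column 1': [1, 2, np.nan, 4, 5, np.nan, None],
    'Column 2': [1, 2, np.nan, 4, np.nan, np.nan, None],
    'Column 3': [1, 2, None, 4, 5, None, None],
})

# Approach from the tutorial: sum per column, then sum the column totals.
total_double_sum = df.isna().sum().sum()

# Alternative 1: convert the mask to a NumPy array and sum every element at once.
total_numpy = df.isna().to_numpy().sum()

# Alternative 2: total cells (df.size) minus non-missing cells (count per column).
total_from_count = df.size - df.count().sum()

print(total_double_sum, total_numpy, total_from_count)  # 10 10 10
```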
Count NaN Values in a Specific Row
In this section, you’ll learn how to count the NaN values in a specific row of the dataframe.
Select the desired row of the dataframe using the loc attribute. Passing the row label inside a list, as in df.loc[[4]], returns a one-row dataframe. Calling isna() and sum() on it returns the number of missing values in each column of that row.
Invoke the sum() function again to calculate the total number of NaN values in the complete row.
The code below demonstrates how to count the NaN values in the row with index label 4.
Code
df.loc[[4]].isna().sum().sum()
Output
1
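If the count is needed for every row rather than one, sum(axis=1) adds up the mask across the columns of each row. The sketch below shows that, along with a single-bracket loc lookup, which returns the row as a Series so one sum() is enough. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as the tutorial's dataframe.
df = pd.DataFrame({
    'Column 1': [1, 2, np.nan, 4, 5, np.nan, None],
    'Column 2': [1, 2, np.nan, 4, np.nan, np.nan, None],
    'Column 3': [1, 2, None, 4, 5, None, None],
})

# Count of missing values in every row (one number per row label).
nan_per_row = df.isna().sum(axis=1)

# df.loc[4] (single brackets) returns the row as a Series,
# so a single sum() gives the count for that row.
nan_in_row_4 = df.loc[4].isna().sum()

print(nan_per_row.tolist())  # [0, 0, 3, 0, 1, 3, 3]
print(nan_in_row_4)          # 1
```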
Count Rows with NaN Values
In this section, you’ll learn how to count the number of rows with NaN values.
You can use the isna() method to check whether each value is missing, and then use the any(axis=1) method to check whether any value in each row is missing. With axis=1, the check runs across the columns, so it produces one True or False per row.
Then you can use the sum() function to calculate the total number of rows with NaN values.
The code below demonstrates how to count the number of rows with NaN values in the dataframe.
Code
df.isna().any(axis=1).sum()
You’ll see the output 4, as four rows in the dataframe contain missing values.
Output
4
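The same row mask can also be used to look at the affected rows, and swapping any() for all() counts the rows in which every value is missing. A minimal sketch with illustrative variable names:

```python
import numpy as np
import pandas as pd

# Same sample data as the tutorial's dataframe.
df = pd.DataFrame({
    'Column 1': [1, 2, np.nan, 4, 5, np.nan, None],
    'Column 2': [1, 2, np.nan, 4, np.nan, np.nan, None],
    'Column 3': [1, 2, None, 4, 5, None, None],
})

# True for each row that has at least one missing value.
row_has_nan = df.isna().any(axis=1)

# Boolean indexing keeps only the rows flagged True.
rows_with_nan = df[row_has_nan]

# True for each row in which every value is missing.
rows_all_nan_count = df.isna().all(axis=1).sum()

print(rows_with_nan.index.tolist())  # [2, 4, 5, 6]
print(rows_all_nan_count)            # 3
```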
Conclusion
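To summarise, you’ve learned how to count the NaN values in a single column, in multiple columns, and in every column of the pandas dataframe.
You’ve also learned how to count the missing values in the entire dataframe and in a specific row, and how to count the rows that contain missing values.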