Pandas Dataframe allows you to store data in rows and columns format.
You can drop rows from Pandas dataframe based on column values using df.drop(df.index[df[‘Column name’] == ‘value’], inplace=True) statement.
Basic Example
df.drop(df.index[df['Column_name'] == 'Column_value'], inplace=True)
df
It deletes the rows in which the column Column_name has the value of Column_value.
This tutorial teaches you the different methods to drop rows from pandas dataframe based on column values.
Sample Dataframe
You’ll create a pandas dataframe using the following code.
It contains rows with different types of values. You can use it to demonstrate the dropping of the columns.
import pandas as pd
data = {"product_name":["Keyboard","Mouse", "Monitor", "CPU","CPU", "Speakers",pd.NaT],
"Unit_Price":[500,200, 5000.235, 10000.550, 10000.550, 250.50,None],
"No_Of_Units":[5,6, 10, 20, 20, 8,0],
"Available_Quantity":[5,6,10,10,"Not Available", pd.NaT,pd.NaT],
"Available_Since_Date":['11/5/2021', '4/23/2021', '08/21/2021','09/18/2021','09/18/2021','01/05/2021',pd.NaT]
}
df = pd.DataFrame(data)
df
Dataframe Will Look Like
product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date | |
---|---|---|---|---|---|
0 | Keyboard | 500.000 | 5 | 5 | 11/5/2021 |
1 | Mouse | 200.000 | 6 | 6 | 4/23/2021 |
2 | Monitor | 5000.235 | 10 | 10 | 08/21/2021 |
3 | CPU | 10000.550 | 20 | 10 | 09/18/2021 |
4 | CPU | 10000.550 | 20 | Not Available | 09/18/2021 |
5 | Speakers | 250.500 | 8 | NaT | 01/05/2021 |
6 | NaT | NaN | 0 | NaT | NaT |
Drop Rows Based On a Column Value
The drop() method is used to delete rows from the dataframe.
To drop rows based on a column value,
- Fetch the index of the rows with column values using the
index
attribute - Pass the
inplace=True
to perform a drop in the same dataframe object.
Code
The following code demonstrates how to select the indexes of the rows containing Available_quantity = Not Available.
df.drop(df.index[df['Available_Quantity'] == 'Not Available'], inplace=True)
df
Dataframe Will Look Like
All the rows with Not Available values are dropped from the dataframe.
product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date | |
---|---|---|---|---|---|
0 | Keyboard | 500.000 | 5 | 5 | 11/5/2021 |
1 | Mouse | 200.000 | 6 | 6 | 4/23/2021 |
2 | Monitor | 5000.235 | 10 | 10 | 08/21/2021 |
3 | CPU | 10000.550 | 20 | 10 | 09/18/2021 |
5 | Speakers | 250.500 | 8 | NaT | 01/05/2021 |
6 | NaT | NaN | 0 | NaT | NaT |
Drop Rows Where Column Value is in List
To drop rows where the column value is in a list, use the drop()
method and index
attribute.
- Use the isin() method to fetch the rows with values available in the list
- Pass the rows from the
isin()
method to the index attribute, and it’ll return the indexes of the rows - Pass those indexes to the
drop()
method, and those rows will be dropped from the dataframe.
Code
df.drop(df.index[df['No_Of_Units'].isin([8,10])], inplace = True)
df
Dataframe Will Look Like
The rows with column values 8
and 10
for the column Number of units are deleted.
product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date | |
---|---|---|---|---|---|
0 | Keyboard | 500.00 | 5 | 5 | 11/5/2021 |
1 | Mouse | 200.00 | 6 | 6 | 4/23/2021 |
3 | CPU | 10000.55 | 20 | 10 | 09/18/2021 |
4 | CPU | 10000.55 | 20 | Not Available | 09/18/2021 |
6 | NaT | NaN | 0 | NaT | NaT |
Drop Rows Where Column Value Not In List
This section teaches you how to drop rows where the column values are NOT available in the list.
- Use the tilde
~
operator and isin() method to fetch the rows with values that are NOT available in the list - Pass the rows from the
~
andisin()
method to the index attribute, and it’ll return the indexes of the rows - Pass those indexes to the
drop()
method, and those rows will be dropped from the dataframe.
Code
df.drop(df.index[~df['No_Of_Units'].isin([8,10])], inplace = True) df
Dataframe Will Look Like
The rows with column values other than 8
and 10
for the column Number of units are deleted.
product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date | |
---|---|---|---|---|---|
2 | Monitor | 5000.235 | 10 | 10 | 08/21/2021 |
5 | Speakers | 250.500 | 8 | NaT | 01/05/2021 |
Drop Rows Where Column Value is NaN
NaN
denotes the missing values.
To drop the rows with NaN
values,
- Use the notnull() method to select the rows with a column that doesn’t have any missing value.
- Pass the
notnull()
result to the dataframe again to select the subset of the rows that don’t contain missing values.
Code
The following code demonstrates how to delete the rows from the dataframe with NaN
values in the column Available_Quantity.
df = df[df.Available_Quantity.notnull()]
df
Dataframe Will Look Like
product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date | |
---|---|---|---|---|---|
0 | Keyboard | 500.000 | 5 | 5 | 11/5/2021 |
1 | Mouse | 200.000 | 6 | 6 | 4/23/2021 |
2 | Monitor | 5000.235 | 10 | 10 | 08/21/2021 |
3 | CPU | 10000.550 | 20 | 10 | 09/18/2021 |
4 | CPU | 10000.550 | 20 | Not Available | 09/18/2021 |
Drop Rows Where Column Value is Zero
To delete rows where the column value is zero,
- Select the rows where the no_of_units is Zero
- Using the
index
attribute, select the index of those rows - Using the
drop()
method, drop those indexes
Code
df.drop(df.index[df['No_Of_Units']==0], inplace=True)
df
Dataframe Will Look Like
The rows with Column values Zero are deleted.
product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date | |
---|---|---|---|---|---|
0 | Keyboard | 500.000 | 5 | 5 | 11/5/2021 |
1 | Mouse | 200.000 | 6 | 6 | 4/23/2021 |
2 | Monitor | 5000.235 | 10 | 10 | 08/21/2021 |
3 | CPU | 10000.550 | 20 | 10 | 09/18/2021 |
4 | CPU | 10000.550 | 20 | Not Available | 09/18/2021 |
5 | Speakers | 250.500 | 8 | NaT | 01/05/2021 |
Drop Rows Based On Multiple Column Values
This section teaches you to drop rows based on multiple column values.
- Combine the multiple conditions using the
&
operator - Select the index of the rows matching the specific conditions.
- Drop using the
drop()
method
Code
The following code demonstrates how to drop rows based on multiple conditions and column values.
df.drop(df[(df['Unit_Price'] >400) & (df['Unit_Price'] < 600)].index, inplace=True)
df
Dataframe Will Look Like
product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date | |
---|---|---|---|---|---|
1 | Mouse | 200.000 | 6 | 6 | 4/23/2021 |
2 | Monitor | 5000.235 | 10 | 10 | 08/21/2021 |
3 | CPU | 10000.550 | 20 | 10 | 09/18/2021 |
4 | CPU | 10000.550 | 20 | Not Available | 09/18/2021 |
5 | Speakers | 250.500 | 8 | NaT | 01/05/2021 |