How to Select Rows From Pandas Dataframe Based on Column Values? – Definitive Guide

Pandas dataframe allows you to store data in rows and column format.

You can select rows from Pandas Dataframe based on Column values using df.loc[df[‘No_Of_Units’] == 5] statement.

In this tutorial, you’ll learn how to select rows from Pandas dataframe based on column values. Using single condition, multiple conditions or selecting rows with column values starting with a specific string.

If You’re in Hurry…

You can use the loc attribute of the dataframe to select rows based on the column value.

Code

The code below demonstrates how to select rows with the value 5 for the column No_Of_Units.

df.loc[df['No_Of_Units'] == 5]

If You Want to Understand Details, Read on…

You’ll see how to select rows based on column value in detail.

Sample Dataframe

This is the sample dataframe used throughout the tutorial.

It contains,

  • Rows with values for all columns
  • Rows with Empty or Missing Data for each column
  • Rows with Empty or Missing data for all columns
  • One Duplicate row
  • One column in the sample dataframe is of the float type.
import pandas as pd

data = {"product_name":["Keyboard","Mouse", "Monitor", "CPU","CPU", "Speakers",pd.NaT],

        "Unit_Price":[500,200, 5000.235, 10000.550, 10000.550, 250.50,None],

        "No_Of_Units":[5,5, 10, 20, 20, 8,pd.NaT],

        "Available_Quantity":[5,6,10,"Not Available","Not Available", pd.NaT,pd.NaT],

        "Available_Since_Date":['11/5/2021', '4/23/2021', '08/21/2021','09/18/2021','09/18/2021','01/05/2021',pd.NaT]
       }

df = pd.DataFrame(data)

df = df.astype({"Unit_Price": float})

df

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsAvailable_QuantityAvailable_Since_Date
0Keyboard500.0005511/5/2021
1Mouse200.000564/23/2021
2Monitor5000.235101008/21/2021
3CPU10000.55020Not Available09/18/2021
4CPU10000.55020Not Available09/18/2021
5Speakers250.5008NaT01/05/2021
6NaTNaNNaTNaTNaT

Select Rows Where Column value is equal to Specific Value

In this section, you’ll learn how to select rows where a column value is equal to a specific value using the loc attribute.

The loc attribute allows you to select a subset of rows based on the row or column labels.

The condition df['No_Of_Units'] == 5 creates a Mask for each row with True and False values if the condition pass. Based on this mask, the loc attribute will select the rows from the dataframe where the Mask is True.

Code

The code below demonstrates how to select rows from the dataframe with the value 5 for the column No_Of_Units.

df.loc[df['No_Of_Units'] == 5]

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsAvailable_QuantityAvailable_Since_Date
0Keyboard500.05511/5/2021
1Mouse200.0564/23/2021

This is how you can select rows from pandas dataframe based on a single condition.

Select Rows based on Column value in a List

In this section, you’ll learn how to select rows where a column value is in a list of values using the isin() method and the loc attribute.

The condition df['No_Of_Units'].isin([5,10])] creates a Mask for each row with True where the column is 5 or 10 and False for other values. Based on this mask, the loc attribute will select the rows from the dataframe where the Mask is True.

Code

The code below demonstrates how to select rows from the dataframe where the column values are in the list [5,10].

df.loc[df['No_Of_Units'].isin([5,10])]

Dataframe Will Look like

product_nameUnit_PriceNo_Of_UnitsAvailable_QuantityAvailable_Since_Date
0Keyboard500.0005511/5/2021
1Mouse200.000564/23/2021
2Monitor5000.235101008/21/2021

This is how you can select rows from Pandas dataframe based on values in the list of values.

Select Rows based on Multiple Conditions With OR

In this section, you’ll learn how to select rows based on Multiple conditions with OR conditions.

The OR operator is denoted by the single pipe | symbol.

The rows that pass any one of the multiple conditions are masked to True and selected by the loc attribute.

Code

The code below demonstrates how to select rows from Pandas dataframe based on multiple conditions and the OR operator.

df.loc[(df['No_Of_Units'] ==8) | (df['No_Of_Units'] ==10)]

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsAvailable_QuantityAvailable_Since_Date
2Monitor5000.235101008/21/2021
5Speakers250.5008NaT01/05/2021

This is how you can select rows from pandas dataframe with multiple conditions and an OR operator.

Select Rows based on Multiple Conditions With AND

In this section, you’ll learn how to select rows based on Multiple conditions with AND conditions.

The AND operator is denoted by the single & symbol.

The rows that pass all the conditions specified are masked to True and selected by the loc attribute.

Code

The code below demonstrates how to select rows from Pandas dataframe based on multiple conditions and the AND operator.

df.loc[(df['No_Of_Units'] >= 8) & (df['No_Of_Units'] <= 15)]

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsAvailable_QuantityAvailable_Since_Date
2Monitor5000.235101008/21/2021
5Speakers250.5008NaT01/05/2021

This is how you can select rows from pandas dataframe with multiple conditions and an OR operator.

Select Rows where column Value Starts with

In this section, you’ll learn how to select rows from the pandas dataframe where column values start with a specific string.

You can use the pandas series startswith() method to select rows where column values start with a specific value.

It also accepts another mandatory parameter, na. You can assign a boolean value that needs to be used for elements that are not a proper String. For example, NaN or None type missing Values.

Code

The code below demonstrates selecting the rows with the product name starting with the capital M.

df[df.product_name.str.startswith('M', na=False)]
  • na=False denotes that elements that are not proper String must be ignored.

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsAvailable_QuantityAvailable_Since_Date
1Mouse200.000564/23/2021
2Monitor5000.235101008/21/2021

Conclusion

You’ve learned how to select rows from Pandas dataframe based on column values.

Also, you’ve learned how to use single conditions and multiple conditions and select rows with values based on a list of values.

You May Also Like

Leave a Comment