How to Select Rows From Pandas Dataframe Based on Column Values? – Definitive Guide

Pandas dataframe allows you to store data in rows and column format.

You can select rows from Pandas Dataframe based on Column values using df.loc[df[‘No_Of_Units’] == 5] statement.

Basic Example

The code below demonstrates how to select rows with the value 5 for the column No_Of_Units.

df.loc[df['No_Of_Units'] == 5]

In this tutorial, you’ll learn how to select rows from the Pandas dataframe based on column values. Using single condition, multiple conditions or selecting rows with column values starting with a specific string.

Sample Dataframe

This is the sample dataframe used throughout the tutorial.

It contains,

  • Rows with values for all columns
  • Rows with Empty or Missing Data for each column
  • Rows with Empty or Missing data for all columns
  • One Duplicate row
  • One column in the sample dataframe is of the float type.
import pandas as pd

data = {"product_name":["Keyboard","Mouse", "Monitor", "CPU","CPU", "Speakers",pd.NaT],

        "Unit_Price":[500,200, 5000.235, 10000.550, 10000.550, 250.50,None],

        "No_Of_Units":[5,5, 10, 20, 20, 8,pd.NaT],

        "Available_Quantity":[5,6,10,"Not Available","Not Available", pd.NaT,pd.NaT],

        "Available_Since_Date":['11/5/2021', '4/23/2021', '08/21/2021','09/18/2021','09/18/2021','01/05/2021',pd.NaT]
       }

df = pd.DataFrame(data)

df = df.astype({"Unit_Price": float})

df

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsAvailable_QuantityAvailable_Since_Date
0Keyboard500.0005511/5/2021
1Mouse200.000564/23/2021
2Monitor5000.235101008/21/2021
3CPU10000.55020Not Available09/18/2021
4CPU10000.55020Not Available09/18/2021
5Speakers250.5008NaT01/05/2021
6NaTNaNNaTNaTNaT

Select Rows Where Column value is equal to the Specific Value

The loc attribute allows you to select a subset of rows based on the row or column labels.

  • Condition df['No_Of_Units'] == 5 creates a Mask for each row with True and False values if the condition pass
  • The loc attribute will select the rows from the dataframe where the Mask is True.

Code

The code below demonstrates how to select rows from the dataframe with the value 5 for column No_Of_Units.

df.loc[df['No_Of_Units'] == 5]

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsAvailable_QuantityAvailable_Since_Date
0Keyboard500.05511/5/2021
1Mouse200.0564/23/2021

This is how you can select rows from pandas dataframe based on a single condition.

Select Rows based on Column value in a List

In this section, you’ll learn how to select rows where a column value is in a list of values using the isin() method and the loc attribute.

  • Condition df['No_Of_Units'].isin([5,10])] creates a Mask for each row with True where the column is 5 or 10 and False for other values.
  • The loc attribute will select the rows from the dataframe where the Mask is True.

Code

The code below demonstrates how to select rows from the dataframe where the column values are in the list [5,10].

df.loc[df['No_Of_Units'].isin([5,10])]

Dataframe Will Look like

product_nameUnit_PriceNo_Of_UnitsAvailable_QuantityAvailable_Since_Date
0Keyboard500.0005511/5/2021
1Mouse200.000564/23/2021
2Monitor5000.235101008/21/2021

This is how you can select rows from Pandas dataframe based on values in the list of values.

Select Rows based on Multiple Conditions With OR

To select rows based on Multiple conditions,

  • Combine conditions using OR operator
  • The OR operator is denoted by the single pipe | symbol.

The rows that pass any one of the multiple conditions are masked to True and selected by the loc attribute.

Code

The code below demonstrates how to select rows from Pandas dataframe based on multiple conditions and the OR operator.

df.loc[(df['No_Of_Units'] ==8) | (df['No_Of_Units'] ==10)]

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsAvailable_QuantityAvailable_Since_Date
2Monitor5000.235101008/21/2021
5Speakers250.5008NaT01/05/2021

This is how you can select rows from pandas dataframe with multiple conditions and an OR operator.

Select Rows based on Multiple Conditions With AND

To select rows based on Multiple conditions

  • Combine multiple conditions using AND operator
  • The AND operator is denoted by the single & symbol.

The rows that pass all the conditions specified are masked to True and selected by the loc attribute.

Code

The code below demonstrates how to select rows from Pandas dataframe based on multiple conditions and the AND operator.

df.loc[(df['No_Of_Units'] >= 8) & (df['No_Of_Units'] <= 15)]

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsAvailable_QuantityAvailable_Since_Date
2Monitor5000.235101008/21/2021
5Speakers250.5008NaT01/05/2021

This is how you can select rows from pandas dataframe with multiple conditions and an OR operator.

Select Rows where column Value Starts with

In this section, you’ll learn how to select rows from the pandas dataframe where column values start with a specific string.

  • Use the pandas series startswith() method to select rows where column values start with a specific value.

It also accepts another mandatory parameter, na.

  • Assign a boolean value to be used for elements that are not proper strings
  • For example, NaN or None type missing Values.

Code

The code below demonstrates selecting the rows with the product name starting with the capital M.

df[df.product_name.str.startswith('M', na=False)]
  • na=False denotes that elements that are not proper String must be ignored.

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsAvailable_QuantityAvailable_Since_Date
1Mouse200.000564/23/2021
2Monitor5000.235101008/21/2021

Additional Resources

Leave a Comment