How To Filter Rows By Column Values In a List in Pandas Dataframe – With Examples

During data wrangling, you may need to filter rows by column values in a list.

Answer: df[df[‘Ccolumn_name‘].isin([‘V1’,’V2’])]

This tutorial teaches you the different methods to filter rows by column values in a list in Pandas Dataframe and their appropriate use cases.

Sample Dataframe

Create the following dataframe with four columns, namely Reg_no, First Name, Last Name and Country_Code.

import pandas as pd 

# List of Tuples
users = [ (1,'Shivam', 'Pandey', 'India', 1),
             (2, 'Kumar', 'Ram' , 'India', 1 ),
         (3, 'Felix','John' , 'Germany', 2 ),
         (4, 'Michael','Arndtz' , 'Germany', 2 ),
         (5, 'Praveen','Dinakar' , 'USA', 3 )
              ]

#Create a DataFrame object
df = pd.DataFrame(users, 
                    columns = ['RegNo.','First Name' , 'Last Name', 'Country', 'Country_Code']
                    ) 

df

DataFrame Will Look Like

RegNo.First NameLast NameCountryCountry_Code
01ShivamPandeyIndia1
12KumarRamIndia1
23FelixJohnGermany2
34MichaelArndtzGermany2
45PraveenDinakarUSA3

Using isin

The isin() function checks each element in the dataframe is available in the list of values and returns a dataframe with booleans.

To filter rows by column values in a list,

  • Use the column name and isin() function along with the list of values.
  • Pass the resultant dataframe again to the df object. The rows that contain a True boolean value will be filtered and displayed.

Use this method when you want to select rows using a list of values.

Code

The following code filters the rows with country_code values 1 and 3.

df[df['Country_Code'].isin([1,3])]

DataFrame Will Look Like

RegNo.First NameLast NameCountryCountry_Code
01ShivamPandeyIndia1
12KumarRamIndia1
45PraveenDinakarUSA3

Using Query

The query() function queries the dataframe using a boolean expression. You can use the query() to filter rows from the dataframe using an SQL-like query.

To filter rows by column values in a list, you can pass the list of values using the in clause in the query method. For example, column_name in (list_of_values).

Use this method when you want to select rows using the SQL like expression.

Code

The following code demonstrates how to select the rows with country_code in 1 and 3 and the query method.

df.query('Country_Code in (1,3)')

DataFrame Will Look Like

RegNo.First NameLast NameCountryCountry_Code
01ShivamPandeyIndia1
12KumarRamIndia1
45PraveenDinakarUSA3

Using Query and @List object

In some cases, you might have the list object with values in it.

In this case, you can specify the list name in the query method using the @ annotation instead of explicitly specifying the list elements.

Use this method when you want to use the already available list object in the query method.

Code

The following code demonstrates how to use the list_name directly in the query() method.

lst = [1,3]

df.query('Country_Code in @lst')

DataFrame Will Look Like

RegNo.First NameLast NameCountryCountry_Code
01ShivamPandeyIndia1
12KumarRamIndia1
45PraveenDinakarUSA3

Using LOC

The pandas dataframes are indexed by default. It allows you to identify/select rows easily and get and set subsets of data in the dataframe.

If you want to filter rows from the dataframe multiple times based on a single column, then it is useful and efficient to set that specific column as an index of the dataframe.

  • You can set the index of the dataframe using the set_index() method.
  • Once the index is set, you can select/filter rows from the dataframe using the loc[] attribute and list of values. It selects the rows with that specific indexes.

Use this method when you want to select rows from the dataframe multiple times using the same column name.

Code

The following codes set the Country_code as the index and select the rows with country_code 1 and 2 as an index.

df.set_index('Country_Code', inplace=True)

df.loc[[1,2]]

DataFrame Will Look Like

RegNo.First NameLast NameCountry
Country_Code
11ShivamPandeyIndia
12KumarRamIndia
23FelixJohnGermany
24MichaelArndtzGermany

Filter Rows by Column Values Not in A list

To filter rows by column value that is not in the list, you can use the isin() method and the ~ tilde symbol that will inverse the result by filtering the values that are not in the list.

Code

The following code demonstrates how to use the ~ tilde symbol to filter rows by column value that is not in the list.

df[~df['Country_Code'].isin([1,3])]

DataFrame Will Look Like

RegNo.First NameLast NameCountryCountry_Code
23FelixJohnGermany2
34MichaelArndtzGermany2

Additional Resources

Leave a Comment