Pandas allow you to select a subset of rows based on column or row values.
You can use a list of values to select rows from the pandas dataframe using the df[df[‘column_name’].isin([list_of_values])] statement.
Basic Example
import pandas as pd
# List of Tuples
users = [ ('Shivam', 'Pandey', 'India', 1),
('Kumar', 'Ram' , 'US', 2 ),
('Felix','John' , 'Germany', 3 ),
('Michael','John' , 'India', 1 ),
]
#Create a DataFrame object
df = pd.DataFrame( users,
columns = ['First Name' , 'Last Name', 'Country', 'Country_Code']
)
df[df['Country_Code'].isin([1, 2])]
The rows with country_code 1 and 2 are selected and displayed.
Dataframe Will Look Like
First Name | Last Name | Country | Country_Code | |
---|---|---|---|---|
0 | Shivam | Pandey | India | 1 |
1 | Kumar | Ram | US | 2 |
3 | Michael | John | India | 1 |
This tutorial teaches you how to use a list of values to select rows from the Pandas dataframe.
Sample Dataframe
First, let us create a sample dataframe used for demonstration purposes.
import pandas as pd
# List of Tuples
users = [ ('Shivam', 'Pandey', 'India', 1),
('Kumar', 'Ram' , 'US', 2 ),
('Felix','John' , 'Germany', 3 ),
('Michael','John' , 'India', 1 ),
]
#Create a DataFrame object
df = pd.DataFrame( users,
columns = ['First Name' , 'Last Name', 'Country', 'Country_Code']
)
df
Dataframe Will Look Like
First Name | Last Name | Country | Country_Code | |
---|---|---|---|---|
0 | Shivam | Pandey | India | 1 |
1 | Kumar | Ram | US | 2 |
2 | Felix | John | Germany | 3 |
3 | Michael | John | India | 1 |
Using isin
You can use the isin() method to use a list of values to select rows from the dataframe.
The condition df['Country_Code'].isin([1, 2])]
creates a Mask for each row with
True
where the Country_Code is1
or2
- False for other country_codes.
Based on these masks, the df[ ]
will return the rows where the mask is True
.
Code
The following code will return the rows from the dataframe where the country_code is in 1
or 2
.
df[df['Country_Code'].isin([1, 2])]
Dataframe Will Look Like
First Name | Last Name | Country | Country_Code | |
---|---|---|---|---|
0 | Shivam | Pandey | India | 1 |
1 | Kumar | Ram | US | 2 |
3 | Michael | John | India | 1 |
Using Query
The query()
method queries the coloums of the dataframe with a boolean expression.
- It returns the rows where the expression evaluates to
True
. - Similar to using SQL query statements.
This method will throw an error when the column name contains space.
Code
df.query('Country_Code in [1, 2]')
Dataframe Will Look Like
The rows with country codes 1 and 2 are selected and displayed.
First Name | Last Name | Country | Country_Code | |
---|---|---|---|---|
0 | Shivam | Pandey | India | 1 |
1 | Kumar | Ram | US | 2 |
3 | Michael | John | India | 1 |
Select Rows With Values Not In the list
To select rows with values that are not present in the list,
- use the
isin()
method and the~
operator to specifyNOT IN
.
It’ll return the rows that don’t match the values specified in the list.
In other words, it is also known as excluding a list of rows.
Code
The following code selects rows with country codes, not in 1
and 2
.
df[~df['Country_Code'].isin([1, 2])]
Dataframe Will look Like
First Name | Last Name | Country | Country_Code | |
---|---|---|---|---|
2 | Felix | John | Germany | 3 |
Select Rows By Index In List Of Values
To select rows using a list of its indexes,
- Use the
iloc
statement. - Pass the list of indexes to the
iloc
, and it’ll return those rows.
An error will be raised if the specified indexes are out of range.
Code
The following code demonstrates how to select rows by the index in the values list.
df.iloc[[1,3]]
Dataframe Will Look Like
The rows with indexes 1
and 3
are selected and displayed.
First Name | Last Name | Country | Country_Code | |
---|---|---|---|---|
1 | Kumar | Ram | US | 2 |
3 | Michael | John | India | 1 |