How to Select Rows From Pandas Dataframe – Complete Guide

A Pandas Dataframe stores data in a two-dimensional, tabular format. You may need to select rows from a Dataframe for various data analysis purposes.

You can select all rows from a Pandas Dataframe using the df.loc[0:] statement, which slices from the index label 0 to the end. This assumes the dataframe uses the default integer index.

In this tutorial, you’ll learn how to select rows from a pandas dataframe using the loc and iloc properties and the head() and tail() methods.

If You’re in a Hurry…

Use the code below to select all rows from a Pandas Dataframe.

df.loc[0:]

If You Want to Understand Details, Read on…

The sections below explain how to select rows using the loc[] property, the iloc[] property, and the head() and tail() methods.

Sample Dataframe

This is the sample dataframe used throughout the tutorial.

It contains:

  • Rows with values for all columns
  • A row with missing data in one column
  • A row with missing data in all columns
  • One duplicate row
  • One column (Unit_Price) cast to the float type

import pandas as pd

data = {"product_name":["Keyboard","Mouse", "Monitor", "CPU","CPU", "Speakers",pd.NaT],
        "Unit_Price":[500,200, 5000.235, 10000.550, 10000.550, 250.50,None],
        "No_Of_Units":[5,5, 10, 20, 20, 8,pd.NaT],
        "Available_Quantity":[5,6,10,"Not Available","Not Available", pd.NaT,pd.NaT],
        "Available_Since_Date":['11/5/2021', '4/23/2021', '08/21/2021','09/18/2021','09/18/2021','01/05/2021',pd.NaT]
       }

df = pd.DataFrame(data)

df = df.astype({"Unit_Price": float})

df

Dataframe Looks Like

   product_name  Unit_Price  No_Of_Units  Available_Quantity  Available_Since_Date
0      Keyboard     500.000            5                   5             11/5/2021
1         Mouse     200.000            5                   6             4/23/2021
2       Monitor    5000.235           10                  10            08/21/2021
3           CPU   10000.550           20       Not Available            09/18/2021
4           CPU   10000.550           20       Not Available            09/18/2021
5      Speakers     250.500            8                 NaT            01/05/2021
6           NaT         NaN          NaT                 NaT                   NaT

Now, let’s discuss how to select these different types of rows in different situations.
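The row types listed above (rows with missing values, duplicate rows) can be picked out with boolean masks. Here is a minimal sketch, assuming a small hypothetical frame named sample rather than the tutorial’s df, and using the pandas methods isna() and duplicated():

```python
import pandas as pd

# Hypothetical frame (not the tutorial's df) with one duplicate row,
# one partially missing row, and one fully missing row.
sample = pd.DataFrame({
    "product_name": ["Keyboard", "CPU", "CPU", "Speakers", None],
    "Unit_Price": [500.0, 10000.55, 10000.55, None, None],
})

# Rows where at least one column is missing.
has_missing = sample[sample.isna().any(axis=1)]

# Rows where every column is missing.
all_missing = sample[sample.isna().all(axis=1)]

# Rows that repeat an earlier row (the first occurrence is not flagged).
repeats = sample[sample.duplicated()]

print(has_missing)
print(all_missing)
print(repeats)
```

By default, duplicated() flags only the later copies of a row, so the first CPU row is not part of repeats.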

Using loc Attribute

You’ll select all the rows from the dataframe in this section.

You’ll use the loc property of the dataframe. It allows you to access a group of rows and columns from the dataframe.

It is primarily label-based. This means it accesses rows by their index labels rather than by their positions.

To select all rows, you can use 0:, which slices from the label 0 to the end of the dataframe. This works here because the sample dataframe uses the default integer index, which starts at 0.

Snippet

#select all rows
df.loc[0:]

The snippet returns all the rows from the dataframe with all the columns.

Dataframe Looks Like

   product_name  Unit_Price  No_Of_Units  Available_Quantity  Available_Since_Date
0      Keyboard     500.000            5                   5             11/5/2021
1         Mouse     200.000            5                   6             4/23/2021
2       Monitor    5000.235           10                  10            08/21/2021
3           CPU   10000.550           20       Not Available            09/18/2021
4           CPU   10000.550           20       Not Available            09/18/2021
5      Speakers     250.500            8                 NaT            01/05/2021
6           NaT         NaN          NaT                 NaT                   NaT

This is how you can access rows from the dataframe without a condition.
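Because loc[] works on labels, a slice behaves differently from a normal Python slice: the end label is included. The sketch below illustrates this, assuming a small hypothetical frame named prices with product names as index labels (not the tutorial’s df):

```python
import pandas as pd

# Hypothetical frame with string labels instead of the default 0..n-1 index.
prices = pd.DataFrame(
    {"Unit_Price": [500.0, 200.0, 5000.235]},
    index=["Keyboard", "Mouse", "Monitor"],
)

# loc slices by label and includes BOTH endpoints.
first_two = prices.loc["Keyboard":"Mouse"]

# A bare colon selects every row, whatever the index looks like.
all_rows = prices.loc[:]

# With the default integer index, labels happen to equal positions,
# but the end label is still included.
default_index = prices.reset_index(drop=True)
rows_0_to_1 = default_index.loc[0:1]

print(first_two)
print(all_rows)
print(rows_0_to_1)
```

If the index does not start at 0 or is not numeric, df.loc[:] is the safer way to select every row.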

Select Rows based on Condition using loc

You can select rows from a pandas dataframe based on a condition using the loc[] attribute.

A condition such as df['Column_name'] == 5 produces a boolean Series with one True or False value per row. When you pass it to loc[], only the rows marked True are returned.

Use the below snippet to select the rows where the column No_Of_Units has the value 5.

Snippet

df.loc[df['No_Of_Units'] == 5]

The dataframe has two rows where the value of the column No_Of_Units is 5, so the snippet returns those two rows.

Dataframe Looks Like

   product_name  Unit_Price  No_Of_Units  Available_Quantity  Available_Since_Date
0      Keyboard       500.0            5                   5             11/5/2021
1         Mouse       200.0            5                   6             4/23/2021
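Conditions can also be combined, or paired with a column selection. The following is a minimal sketch, assuming a trimmed-down hypothetical frame named df_small with only three of the tutorial’s columns:

```python
import pandas as pd

# Hypothetical, trimmed-down version of the sample data.
df_small = pd.DataFrame({
    "product_name": ["Keyboard", "Mouse", "Monitor", "CPU"],
    "Unit_Price": [500.0, 200.0, 5000.235, 10000.55],
    "No_Of_Units": [5, 5, 10, 20],
})

# The condition on its own is a boolean Series, one value per row.
mask = df_small["No_Of_Units"] == 5
five_units = df_small.loc[mask]

# Combine conditions with & (and) or | (or); each condition needs parentheses.
cheap_five = df_small.loc[
    (df_small["No_Of_Units"] == 5) & (df_small["Unit_Price"] < 300)
]

# A second argument to loc[] restricts the columns that are returned.
expensive_names = df_small.loc[df_small["Unit_Price"] > 1000, "product_name"]

print(five_units)
print(cheap_five)
print(expensive_names)
```

The Python keywords and/or do not work on a Series, which is why the element-wise operators & and | are used here.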

Using iloc Attribute

In this section, you’ll select rows from the dataframe based on row position. You can do this by using the iloc[] attribute of the dataframe.

iloc[] selects a subset of rows based on their integer positions, regardless of the index labels.

You can pass the positions of the rows as a list. Rows at those positions are returned.

iloc[] raises an IndexError if a requested position is out of bounds. Slice indexers are the exception: they allow out-of-bounds positions.

Positions are 0-based.

Use the below snippet to select the second and fourth rows of the dataframe.

Snippet

df.iloc[[1,3]]

Since positions are 0-based, positions 1 and 3 select the second and fourth rows of the dataframe.

Dataframe Looks Like

   product_name  Unit_Price  No_Of_Units  Available_Quantity  Available_Since_Date
1         Mouse      200.00            5                   6             4/23/2021
3           CPU    10000.55           20       Not Available            09/18/2021
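Besides a list of positions, iloc[] also accepts slices and negative positions. The sketch below shows these forms along with the out-of-bounds behaviour, assuming a small hypothetical frame named df_small:

```python
import pandas as pd

# Hypothetical four-row frame for illustration.
df_small = pd.DataFrame({"product_name": ["Keyboard", "Mouse", "Monitor", "CPU"]})

# A list of positions: second and fourth rows.
second_and_fourth = df_small.iloc[[1, 3]]

# A position slice excludes the end, like a normal Python slice.
first_two = df_small.iloc[0:2]

# A single position returns that row as a Series; -1 is the last row.
last_row = df_small.iloc[-1]

# An out-of-bounds position in a list raises IndexError...
try:
    df_small.iloc[[10]]
    out_of_bounds_raised = False
except IndexError:
    out_of_bounds_raised = True

# ...but an out-of-bounds slice is simply cut off at the last row.
clipped = df_small.iloc[2:10]

print(second_and_fourth)
print(first_two)
print(last_row)
print(out_of_bounds_raised)
print(clipped)
```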

Using head() Method

Using the head() method, you can get the rows from the beginning of the dataframe.

df.head()

Using tail() Method

You can get the rows from the end of the dataframe using the tail() method.

df.tail()

By default, the head() and tail() methods return 5 rows.

However, you can get more or fewer rows by passing the number of rows you want, for example df.head(10).
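A short sketch of head() and tail() with and without an argument, assuming a hypothetical ten-row frame named numbers:

```python
import pandas as pd

# Hypothetical ten-row frame with the values 0..9.
numbers = pd.DataFrame({"n": range(10)})

first_five = numbers.head()    # default: first 5 rows
first_three = numbers.head(3)  # first 3 rows
last_two = numbers.tail(2)     # last 2 rows

# A negative n returns everything except the last (or first) |n| rows.
all_but_last_two = numbers.head(-2)

print(first_five)
print(first_three)
print(last_two)
print(all_but_last_two)
```

Both methods return a new dataframe, so the result can be stored or passed on, not just displayed.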

Conclusion

To summarise, you’ve learned how to select rows from a Pandas Dataframe using loc[], iloc[], head(), and tail(), and how to select rows based on a condition.

If you have any questions, comment below.
