Pandas dataframe allows you to store data in rows and column format.
You can sort pandas dataframe based on column value using the df.sort_values(by=[‘column value‘], inplace=True) statement.
Basic Example
df.sort_values(by=['Column Value'], inplace=True)
df
In this tutorial, you’ll learn how to sort pandas dataframe based on column values in detail.
Table of Contents
Sample Dataframe
This is the sample dataframe used throughout the tutorial.
It contains,
- Rows with values for all columns
- Rows with Empty or Missing Data for each column
Code
import pandas as pd
import numpy as np
data = {'Product': ['CPU','Speaker','Keyboard','Mouse', 'Monitor'],
'Price': [22000,2500,3000,1500, np.NaN]
}
df = pd.DataFrame(data, columns=['Product','Price'])
print (df)
Dataframe will Look like
Product Price
0 CPU 22000.0
1 Speaker 2500.0
2 Keyboard 3000.0
3 Mouse 1500.0
4 Monitor NaN
Using the sort_values Method
The sort_values()
method allows you to sort the dataframe based on the column passed in the parameter.
Parameters
by
– List of column names to sort byinplace
–True
denotes to sort in the same dataframe.False
denotes to sort and return a new dataframe.False
is the default value.
Code
The below code explains how to sort the dataframe based on the Product column and sort it in the source dataframe itself.
df.sort_values(by=['Product'], inplace=True)
df
You can see that the dataframe is sorted based on the Product name.
Dataframe Will Look Like
Product | Price | |
---|---|---|
0 | CPU | 22000.0 |
2 | Keyboard | 3000.0 |
4 | Monitor | NaN |
3 | Mouse | 1500.0 |
1 | Speaker | 2500.0 |
Sort Pandas Dataframe Based on Column Value in Ascending order
The sort_values()
method sorts in ascending order by default. Hence there is no additional parameter necessary.
Alternatively, you can also pass the ascending=True
parameter.
Code
The code below demonstrates how to sort the dataframe based on the Product column in ascending order.
df.sort_values(by=['Product'], inplace=True, ascending=True)
df
Dataframe Will Look Like
Product | Price | |
---|---|---|
0 | CPU | 22000.0 |
2 | Keyboard | 3000.0 |
4 | Monitor | NaN |
3 | Mouse | 1500.0 |
1 | Speaker | 2500.0 |
Sort Pandas Dataframe in Descending Order
To sort pandas dataframe in descending order
- Pass the parameter
ascending=False
Code
The code below demonstrates how to sort the dataframe based on the Product column in descending order.
df.sort_values(by=['Product'], inplace=True, ascending=False)
df
Dataframe Will Look Like
Product | Price | |
---|---|---|
1 | Speaker | 2500.0 |
3 | Mouse | 1500.0 |
4 | Monitor | NaN |
2 | Keyboard | 3000.0 |
0 | CPU | 22000.0 |
Sort Pandas Dataframe Based on Multiple Columns
To sort pandas dataframe based on multiple columns
- Pass the multiple columns as a list to the
by
parameter.
Code
The code below demonstrates how to sort the dataframe based on the Price, Product column in ascending order.
df.sort_values(by=['Price','Product'], inplace=True)
df
Dataframe Will Look Like
Product | Price | |
---|---|---|
3 | Mouse | 1500.0 |
1 | Speaker | 2500.0 |
2 | Keyboard | 3000.0 |
0 | CPU | 22000.0 |
4 | Monitor | NaN |
This is how you can sort pandas dataframe based on multiple columns.
Sort NaN Values First
In this section, you’ll learn how to handle the missing values while sorting the pandas dataframe.
If the dataframe contains a missing value, you can handle missing values using the na_position
parameter.
It accepts two values.
first
– Brings Missing values to the toplast
– Pushes missing values to the end. This is the default.
Code
The code below demonstrates how to sort the dataframe and bring the NaN values to the top while sorting.
df.sort_values(by=['Price'], na_position='first', inplace=True)
df
The sample dataframe contains one NaN
value. This is brought to the top of the dataframe while sorting.
Dataframe Will Look Like
Product | Price | |
---|---|---|
4 | Monitor | NaN |
3 | Mouse | 1500.0 |
1 | Speaker | 2500.0 |
2 | Keyboard | 3000.0 |
0 | CPU | 22000.0 |
This is how you can handle the missing values while sorting the pandas dataframe.
Reset Index While Sorting
The row index is not changed when you sort the pandas dataframe using the sort_values()
method. The actual row index will be retained.
You can use the optional ignore_index
parameter to handle the indexes while sorting.
It accepts two values.
ignore_index = False
leaves the indexes unchanged. This is the default behaviour.ignore_index = True
updates the index of the rows while sorting
Code
The below code demonstrates how to leave the indexes of the rows unchanged using the ignore_index=False
parameter.
df2 = df.sort_values(by=['Price'], na_position='first', ignore_index=False)
df2
You can see that the row indexes remain unchanged.
Dataframe Will Look Like
Product | Price | |
---|---|---|
4 | Monitor | NaN |
3 | Mouse | 1500.0 |
1 | Speaker | 2500.0 |
2 | Keyboard | 3000.0 |
0 | CPU | 22000.0 |
Code
When you use the parameter ignore_index=True
, the row index will be reset.
df2 = df.sort_values(by=['Price'], na_position='first', ignore_index=True)
df2
The row indexes are updated while sorting.
Dataframe Will Look Like
Product | Price | |
---|---|---|
0 | Monitor | NaN |
1 | Mouse | 1500.0 |
2 | Speaker | 2500.0 |
3 | Keyboard | 3000.0 |
4 | CPU | 22000.0 |
Reset Index Using reset_index Method
If you sort the dataframe and have the row indexes in an unordered way, you can reset the index using the reset_index() method.
The inplace=True
parameter updates the index in the same dataframe instead of creating a new dataframe.
Code
df2.reset_index(inplace=True)
df2
Dataframe Will Look Like
index | Product | Price | |
---|---|---|---|
0 | 4 | Monitor | NaN |
1 | 3 | Mouse | 1500.0 |
2 | 1 | Speaker | 2500.0 |
3 | 2 | Keyboard | 3000.0 |
4 | 0 | CPU | 22000.0 |
Additional Resources
How To Select Rows From Pandas Dataframe Based On Column Values