How to Sort Pandas dataframe Based on Column Value – Definitive Guide

Pandas dataframe allows you to store data in rows and column format.

You can sort pandas dataframe based on column value using the df.sort_values(by=[‘column value‘], inplace=True) statement.

In this tutorial, you’ll learn how to sort pandas dataframe based on column values in detail.

If You’re in Hurry…

df.sort_values(by=['Column Value'], inplace=True)

df

If You Want to Understand Details, Read on…

You can use the sort_values() method to sort the Pandas dataframe based on one column value or multiple columns. You can also sort it in ascending order or descending order.

Sample Dataframe

This is the sample dataframe used throughout the tutorial.

It contains,

  • Rows with values for all columns
  • Rows with Empty or Missing Data for each column

Code

import pandas as pd
import numpy as np

data = {'Product': ['CPU','Speaker','Keyboard','Mouse', 'Monitor'],
        'Price': [22000,2500,3000,1500, np.NaN]
        }

df = pd.DataFrame(data, columns=['Product','Price'])

print (df)

Dataframe will Look like

        Product    Price
    0       CPU  22000.0
    1   Speaker   2500.0
    2  Keyboard   3000.0
    3     Mouse   1500.0
    4   Monitor      NaN

Using the sort_values Method

sort_values() method allows you to sort the dataframe based on the column passed in the parameter by.

Parameters

  • by – List of column names to sort by
  • inplaceTrue denotes to sort in the same dataframe. False denotes to sort and return a new dataframe. False is the default value.

Code

The below code explains how to sort the dataframe based on the Product column and sort it in the source dataframe itself.

df.sort_values(by=['Product'], inplace=True)

df

You can see that the dataframe is sorted based on the Product name.

Dataframe Will Look Like

ProductPrice
0CPU22000.0
2Keyboard3000.0
4MonitorNaN
3Mouse1500.0
1Speaker2500.0

Sort Pandas Dataframe Based on Column Value in Ascending order

In this section, you’ll learn how to sort pandas dataframe based on column values in ascending order.

The sort_values() method sorts in ascending order by default. Hence there is no additional parameter necessary.

Alternatively, you can also pass the ascending=True parameter.

Code

The code below demonstrates how to sort the dataframe based on the Product column in ascending order.

df.sort_values(by=['Product'], inplace=True, ascending=True)

df

Dataframe Will Look Like

ProductPrice
0CPU22000.0
2Keyboard3000.0
4MonitorNaN
3Mouse1500.0
1Speaker2500.0

Sort Pandas Dataframe in Descending Order

In this section, you’ll learn how to sort pandas dataframe based on column values in descending order.

You can pass the ascending=False parameter to sort pandas dataframe in descending order.

Code

The code below demonstrates how to sort the dataframe based on the Product column in descending order.

df.sort_values(by=['Product'], inplace=True, ascending=False)

df

Dataframe Will Look Like

ProductPrice
1Speaker2500.0
3Mouse1500.0
4MonitorNaN
2Keyboard3000.0
0CPU22000.0

Sort Pandas Dataframe Based on Multiple Columns

In this section, you’ll learn how to sort pandas dataframe based on multiple columns.

You can pass the multiple columns as a list to the by parameter.

Code

The code below demonstrates how to sort the dataframe based on the Price, Product column in ascending order.

df.sort_values(by=['Price','Product'], inplace=True)

df

You can see that the dataframe is sorted based on the Price and Product columns in ascending order.

Dataframe Will Look Like

ProductPrice
3Mouse1500.0
1Speaker2500.0
2Keyboard3000.0
0CPU22000.0
4MonitorNaN

This is how you can sort pandas dataframe based on multiple columns.

Sort NaN Values First

In this section, you’ll learn how to handle the missing values while sorting the pandas dataframe.

You can either bring the NaN values to the top of the dataframe while sorting or push them to the end.

If the dataframe contains a missing value, you can handle missing values using the na_position parameter.

It accepts two values.

  • first – Brings Missing values to the top
  • last – Pushes missing values to the end. This is the default.

Code

The code below demonstrates how to sort the dataframe and bring the NaN values to the top while sorting.

df.sort_values(by=['Price'], na_position='first', inplace=True)

df

The sample dataframe contains one NaN value. This is brought to the top of the dataframe while sorting.

Dataframe Will Look Like

ProductPrice
4MonitorNaN
3Mouse1500.0
1Speaker2500.0
2Keyboard3000.0
0CPU22000.0

This is how you can handle the missing values while sorting the pandas dataframe.

Reset Index While Sorting

This section will teach you how to handle the row index while sorting the Pandas dataframe.

The row index is not changed when you sort the pandas dataframe using the sort_values() method. The actual row index will be retained.

You can use the optional ignore_index parameter to handle the indexes while sorting.

It accepts two values.

  • ignore_index = False leaves the indexes unchanged. This is the default behaviour.
  • ignore_index = True updates the index of the rows while sorting

Code

The below code demonstrates how to leave the indexes of the rows unchanged using the ignore_index=False parameter.

df2 = df.sort_values(by=['Price'], na_position='first', ignore_index=False)

df2

You can see that the row indexes remain unchanged.

Dataframe Will Look Like

ProductPrice
4MonitorNaN
3Mouse1500.0
1Speaker2500.0
2Keyboard3000.0
0CPU22000.0

Code

When you use the parameter ignore_index=True, the row index will be reset.

df2 = df.sort_values(by=['Price'], na_position='first', ignore_index=True)

df2

The row indexes are updated while sorting.

Dataframe Will Look Like

ProductPrice
0MonitorNaN
1Mouse1500.0
2Speaker2500.0
3Keyboard3000.0
4CPU22000.0

Using reset_index Method

If you sort the dataframe and have the row indexes in an unordered way, you can reset the index using the reset_index() method.

The inplace=True parameter updates the index in the same dataframe instead of creating a new dataframe.

Code

df2.reset_index(inplace=True)

df2

Dataframe Will Look Like

indexProductPrice
04MonitorNaN
13Mouse1500.0
21Speaker2500.0
32Keyboard3000.0
40CPU22000.0

Conclusion

You’ve learned how to sort pandas dataframe based on column values, based on multiple column values. Also learnt how to sort in ascending order and descending order.

Additionally, you’ve learned how to handle missing values and the row index while sorting.

You May Also Like

Leave a Comment