How to Sort Pandas dataframe Based on Column Value – Definitive Guide

Pandas dataframe allows you to store data in rows and column format.

You can sort pandas dataframe based on column value using the df.sort_values(by=[‘column value‘], inplace=True) statement.

Basic Example

df.sort_values(by=['Column Value'], inplace=True)

df

In this tutorial, you’ll learn how to sort pandas dataframe based on column values in detail.

Sample Dataframe

This is the sample dataframe used throughout the tutorial.

It contains,

  • Rows with values for all columns
  • Rows with Empty or Missing Data for each column

Code

import pandas as pd
import numpy as np

data = {'Product': ['CPU','Speaker','Keyboard','Mouse', 'Monitor'],
        'Price': [22000,2500,3000,1500, np.NaN]
        }

df = pd.DataFrame(data, columns=['Product','Price'])

print (df)

Dataframe will Look like

        Product    Price
    0       CPU  22000.0
    1   Speaker   2500.0
    2  Keyboard   3000.0
    3     Mouse   1500.0
    4   Monitor      NaN

Using the sort_values Method

The sort_values() method allows you to sort the dataframe based on the column passed in the parameter.

Parameters

  • by – List of column names to sort by
  • inplaceTrue denotes to sort in the same dataframe. False denotes to sort and return a new dataframe. False is the default value.

Code

The below code explains how to sort the dataframe based on the Product column and sort it in the source dataframe itself.

df.sort_values(by=['Product'], inplace=True)

df

You can see that the dataframe is sorted based on the Product name.

Dataframe Will Look Like

ProductPrice
0CPU22000.0
2Keyboard3000.0
4MonitorNaN
3Mouse1500.0
1Speaker2500.0

Sort Pandas Dataframe Based on Column Value in Ascending order

The sort_values() method sorts in ascending order by default. Hence there is no additional parameter necessary.

Alternatively, you can also pass the ascending=True parameter.

Code

The code below demonstrates how to sort the dataframe based on the Product column in ascending order.

df.sort_values(by=['Product'], inplace=True, ascending=True)

df

Dataframe Will Look Like

ProductPrice
0CPU22000.0
2Keyboard3000.0
4MonitorNaN
3Mouse1500.0
1Speaker2500.0

Sort Pandas Dataframe in Descending Order

To sort pandas dataframe in descending order

  • Pass the parameter ascending=False

Code

The code below demonstrates how to sort the dataframe based on the Product column in descending order.

df.sort_values(by=['Product'], inplace=True, ascending=False)

df

Dataframe Will Look Like

ProductPrice
1Speaker2500.0
3Mouse1500.0
4MonitorNaN
2Keyboard3000.0
0CPU22000.0

Sort Pandas Dataframe Based on Multiple Columns

To sort pandas dataframe based on multiple columns

  • Pass the multiple columns as a list to the by parameter.

Code

The code below demonstrates how to sort the dataframe based on the Price, Product column in ascending order.

df.sort_values(by=['Price','Product'], inplace=True)

df

Dataframe Will Look Like

ProductPrice
3Mouse1500.0
1Speaker2500.0
2Keyboard3000.0
0CPU22000.0
4MonitorNaN

This is how you can sort pandas dataframe based on multiple columns.

Sort NaN Values First

In this section, you’ll learn how to handle the missing values while sorting the pandas dataframe.

If the dataframe contains a missing value, you can handle missing values using the na_position parameter.

It accepts two values.

  • first – Brings Missing values to the top
  • last – Pushes missing values to the end. This is the default.

Code

The code below demonstrates how to sort the dataframe and bring the NaN values to the top while sorting.

df.sort_values(by=['Price'], na_position='first', inplace=True)

df

The sample dataframe contains one NaN value. This is brought to the top of the dataframe while sorting.

Dataframe Will Look Like

ProductPrice
4MonitorNaN
3Mouse1500.0
1Speaker2500.0
2Keyboard3000.0
0CPU22000.0

This is how you can handle the missing values while sorting the pandas dataframe.

Reset Index While Sorting

The row index is not changed when you sort the pandas dataframe using the sort_values() method. The actual row index will be retained.

You can use the optional ignore_index parameter to handle the indexes while sorting.

It accepts two values.

  • ignore_index = False leaves the indexes unchanged. This is the default behaviour.
  • ignore_index = True updates the index of the rows while sorting

Code

The below code demonstrates how to leave the indexes of the rows unchanged using the ignore_index=False parameter.

df2 = df.sort_values(by=['Price'], na_position='first', ignore_index=False)

df2

You can see that the row indexes remain unchanged.

Dataframe Will Look Like

ProductPrice
4MonitorNaN
3Mouse1500.0
1Speaker2500.0
2Keyboard3000.0
0CPU22000.0

Code

When you use the parameter ignore_index=True, the row index will be reset.

df2 = df.sort_values(by=['Price'], na_position='first', ignore_index=True)

df2

The row indexes are updated while sorting.

Dataframe Will Look Like

ProductPrice
0MonitorNaN
1Mouse1500.0
2Speaker2500.0
3Keyboard3000.0
4CPU22000.0

Reset Index Using reset_index Method

If you sort the dataframe and have the row indexes in an unordered way, you can reset the index using the reset_index() method.

The inplace=True parameter updates the index in the same dataframe instead of creating a new dataframe.

Code

df2.reset_index(inplace=True)

df2

Dataframe Will Look Like

indexProductPrice
04MonitorNaN
13Mouse1500.0
21Speaker2500.0
32Keyboard3000.0
40CPU22000.0

Additional Resources

How To Select Rows From Pandas Dataframe Based On Column Values

Leave a Comment