How To Create A New Column Based on Values from Other Columns in Pandas – Definitive Guide

Pandas allow you to store values as rows and columns.

You can create a new column based on values from other columns in Pandas using the other columns using df[‘New Column‘] = df[‘Old column1‘] * df[‘Old column 2‘].

Basic Example

The following code demonstrates how to create a total_price column using the multiplication of Unit_price and the no_of_units columns.

df['Total Price'] = df['Unit_Price'] * df['No_Of_Units'] 

df

The new column Total Price is added to the dataframe based on the values from the other two columns.

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsTotal Price
0Keyboard500.0005.02500.00
1Mouse200.0005.01000.00
2Monitor5000.23510.050002.35
3CPU10000.55020.0200011.00
4CPU10000.55020.0200011.00
5Speakers250.5008.02004.00
6TrackpadNaNNaNNaN

This tutorial teaches you the different methods to create a new column based on values from other columns in Pandas and when it is appropriate to use them.

To know the column names in the dataframe, read: How to Get Column Name in Pandas

You’ll learn how to create

  • Total_price column by multiplying the unit_price and the No_of_units column
  • Discounted_price column to add discounted price based on the number of units purchased

Sample Dataframe

You’ll use the following dataframe to create a new column based on values from the Unit_price column and the no_of_units column.

Code

import pandas as pd

import numpy as np

data = {"product_name":["Keyboard","Mouse", "Monitor", "CPU","CPU", "Speakers","Trackpad"],

        "Unit_Price":[500,200, 5000.235, 10000.550, 10000.550, 250.50,None],

        "No_Of_Units":[5,5, 10, 20, 20, 8,np.nan]
}

df = pd.DataFrame(data)

df

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_Units
0Keyboard500.0005.0
1Mouse200.0005.0
2Monitor5000.23510.0
3CPU10000.55020.0
4CPU10000.55020.0
5Speakers250.5008.0
6TrackpadNaNNaN

Using Columns Directly

This is the simplest method to create a new column based on values from other columns.

  • Perform the desired operation on any two or more columns in the dataframe
  • Assign it to the new column of the dataframe.

Code

The following code demonstrates how to multiply two columns and assign the result to a new column.

df['Total Price'] = df['Unit_Price'] * df['No_Of_Units'] 

df

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsTotal Price
0Keyboard500.0005.02500.00
1Mouse200.0005.01000.00
2Monitor5000.23510.050002.35
3CPU10000.55020.0200011.00
4CPU10000.55020.0200011.00
5Speakers250.5008.02004.00
6TrackpadNaNNaNNaN

Using Apply

This section teaches you how to apply a function of multiple columns, row-wise and create a new column based on the result of the apply function.

  • Apply the function to the column axis using the parameter axis=1.

Use this method when you want to apply a complex function or specify multiple conditions for the calculations.

Code

The following code demonstrates how to calculate the Total Price using the Unit price and the No of units columns using the lambda function.

df['Total Price'] = df.apply(lambda row: (row.Unit_Price * row.No_Of_Units ), axis=1)

df

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsTotal Price
0Keyboard500.0005.02500.00
1Mouse200.0005.01000.00
2Monitor5000.23510.050002.35
3CPU10000.55020.0200011.00
4CPU10000.55020.0200011.00
5Speakers250.5008.02004.00
6TrackpadNaNNaNNaN

Using LOC

The loc attribute allows you to access a group of rows and columns.

To create a new column based on other columns,

  • Use this loc attribute, and assign value to a new column.

Use this method when you want to specify simple conditions while performing the calculations—for example, performing the operation only on the Non-NaN columns.

Code

The following code calculates the total price for the rows with the non-empty no of units columns.

df.loc[df['No_Of_Units']!=np.nan,'Total_Price'] = (df['Unit_Price'] * df['No_Of_Units'])

df

The Total price is calculated only for the columns where the no of units is not empty.

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsTotal PriceTotal_Price
0Keyboard500.0005.02500.002500.00
1Mouse200.0005.01000.001000.00
2Monitor5000.23510.050002.3550002.35
3CPU10000.55020.0200011.00200011.00
4CPU10000.55020.0200011.00200011.00
5Speakers250.5008.02004.002004.00
6TrackpadNaNNaNNaNNaN

Using If Else While Creating New Column Based on Other Column

To use an If-else statement while creating a new column based on other columns

  • Define a function with the if-else statements.
  • Apply this function using the apply method and lambda expression.

Code

The following code demonstrates using the if and else while creating new columns.

  • If the No of units is less than 10, then apply the discount of 5%
  • If the No of units is equal to or greater than 10, then apply the discount of 10%
def calculate_discount(row):
  
    if row['No_Of_Units'] < 10:
        return (row['Unit_Price'] * row['No_Of_Units']) - ((5/100)*(row['Unit_Price'] * row['No_Of_Units']))

    elif row['No_Of_Units'] >= 10:
        return (row['Unit_Price'] * row['No_Of_Units']) - ((10/100)*(row['Unit_Price'] * row['No_Of_Units']))

    return (row['Unit_Price'] * row['No_Of_Units'])

df['Discounted Price'] = df.apply(lambda row: calculate_discount(row), axis=1)

df

Dataframe Will Look Like

product_nameUnit_PriceNo_Of_UnitsTotal PriceTotal_PriceDiscounted_Price
0Keyboard500.0005.02500.002500.002375.000
1Mouse200.0005.01000.001000.00950.000
2Monitor5000.23510.050002.3550002.3545002.115
3CPU10000.55020.0200011.00200011.00180009.900
4CPU10000.55020.0200011.00200011.00180009.900
5Speakers250.5008.02004.002004.001903.800
6TrackpadNaNNaNNaNNaNNaN

You can use the same code to specify multiple conditions while creating a new column based on values from other columns.

Sum All Columns and Create New Column

To sum all columns of the dataframe and create a new column, use the df.sum() method.

  • To apply the sum() function only to the numeric column, use the parameter numeric_only=True.

Code

df['C'] = df.sum(axis=1, numeric_only=True)

Additional Resources

Leave a Comment