How to Add Column to Pandas Dataframe – Definitive Guide

A pandas DataFrame is a two-dimensional data structure that stores data in rows and columns.

You can add a column to a pandas dataframe using the df.insert(col_index_position, "Col_Name", Col_Values_As_List, True) statement.

In this tutorial, you’ll see different methods available to add columns to pandas dataframe.

If You’re in a Hurry…

You can use the below code snippet to add a new column to the pandas dataframe.

To add a column with empty values

df["new_Column"] = pd.NaT

df

where

  • df["new_Column"] – New column in the dataframe
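  • pd.NaT – Sets NaT for every row of this column. NaT ("Not a Time") is the pandas missing-value marker for datetime-like data, so the new column gets a datetime dtype. You can use it when you don’t know the values upfront. For numeric or text columns, np.nan or pd.NA are the more common placeholders.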

To add a column with values

new_column_values = ['val1','val2','val3','val4','val5']

df["new_Column"] = new_column_values 

df

where

  • new_column_values = ['val1','val2','val3','val4','val5'] – List that holds the values for the cells in the new column. The length of this list must be equal to the number of rows in the dataframe. Otherwise, a ValueError will be raised.
  • df["new_Column"] = new_column_values – Creates a new column in the dataframe and assigns the list of values to it.

This is how you can add a new column to the pandas dataframe.
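
As a minimal sketch of the length rule above, the snippet below assigns a two-item list to a three-row dataframe and catches the resulting error. The three-row frame is a hypothetical stand-in for the tutorial’s sample data.

```python
import pandas as pd

# Hypothetical three-row frame, used only for this illustration.
df = pd.DataFrame({"product_name": ["Keyboard", "Mouse", "Monitor"]})

try:
    # Two values for three rows: pandas rejects the assignment.
    df["new_Column"] = ["val1", "val2"]
    error_raised = False
except ValueError as exc:
    error_raised = True
    print(f"ValueError: {exc}")

# The rejected assignment does not add the column.
print(list(df.columns))
```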

If You Want to Understand Details, Read on…

In this tutorial, you’ll learn the different methods available to add columns to the pandas dataframe. You can add columns using

  • Assignment operator or the subscript notation – Use the assignment operator = to create a column in the dataframe and assign a list of values to it.
  • dataframe.insert() method – Use the insert() method when you want to insert a column at a specific index position of the dataframe.
  • dataframe.assign() method – Use the assign() method when you want to create a new dataframe with the additional column rather than inserting the column into the same dataframe.

Let’s look at each of these methods in detail.

Sample Dataframe

This is the sample dataframe used throughout the tutorial.

import pandas as pd

data = {"product_name":["Keyboard","Mouse", "Monitor", "CPU", "Speakers"],
        "Unit_Price":[500,200, 5000, 10000, 250],
        "No_Of_Units":[5,5, 10, 20, 8],
       }

df = pd.DataFrame(data)

df

Dataframe Looks Like

  product_name  Unit_Price  No_Of_Units
0     Keyboard         500            5
1        Mouse         200            5
2      Monitor        5000           10
3          CPU       10000           20
4     Speakers         250            8

Let’s see the different ways to add a column to a pandas dataframe.

Using Subscript Notation or Assignment Operator

You can add a column by using the = operator with a list of values. The length of the list must be equal to the number of rows in the dataframe. Otherwise, a ValueError will be raised.

# Avoid naming the variable "list", which would shadow the Python built-in
values = ['val1','val2','val3','val4','val5']

df["new_column"] = values

where,

  • values = ['val1','val2','val3','val4','val5'] – creating a list with values
  • df["new_column"] = values – assigning the list to the dataframe column called "new_column".

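When you execute the below code snippet, a new column called Tax_new % is added to the dataframe with the values from the list called tax. The output also shows a Total_Price column holding NaT values; it comes from the Add Empty Column to Dataframe section.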

tax = [10,15,12,10,11]

df['Tax_new %'] = tax

df

Dataframe Looks Like

  product_name  Unit_Price  No_Of_Units  Total_Price  Tax_new %
0     Keyboard         500            5          NaT         10
1        Mouse         200            5          NaT         15
2      Monitor        5000           10          NaT         12
3          CPU       10000           20          NaT         10
4     Speakers         250            8          NaT         11

Using Insert() method

You can add a column to pandas dataframe using the insert() method available in the pandas dataframe.

Usage

  • When you want to insert a column at a specific position.
  • When you want to control whether a column with an already existing name may be inserted, using the allow_duplicates flag.

Below is the code snippet to add column using the insert() method.

# Using DataFrame.insert() to add a column
df.insert(3, "Tax%", [5,10,10,5,10], True)

df

where,

  • 3 – Position where the new column needs to be inserted
  • Tax% – Name of the new column
  • [5,10,10,5,10] – List of values to be assigned to the new column
  • True – The allow_duplicates argument. True allows duplicate column names. If False, a ValueError is raised when a column named Tax% already exists.

Dataframe Looks Like

  product_name  Unit_Price  No_Of_Units  Tax%  Total_Price  Tax_new %
0     Keyboard         500            5     5          NaT         10
1        Mouse         200            5    10          NaT         15
2      Monitor        5000           10    10          NaT         12
3          CPU       10000           20     5          NaT         10
4     Speakers         250            8    10          NaT         11
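
To make the allow_duplicates behaviour concrete, here is a small sketch on a hypothetical two-row frame (not the tutorial’s sample data). It inserts Tax% once, shows that a second insert with the same name is rejected, and then repeats the call with duplicates allowed.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration.
df = pd.DataFrame({"product_name": ["Keyboard", "Mouse"],
                   "Unit_Price": [500, 200]})

# First insert succeeds: no column named "Tax%" exists yet.
df.insert(1, "Tax%", [5, 10], allow_duplicates=False)

# A second insert with the same name is rejected with a ValueError.
try:
    df.insert(1, "Tax%", [7, 8], allow_duplicates=False)
    duplicate_rejected = False
except ValueError as exc:
    duplicate_rejected = True
    print(f"ValueError: {exc}")

# With allow_duplicates=True the same call goes through.
df.insert(1, "Tax%", [7, 8], allow_duplicates=True)
print(list(df.columns))
```

Duplicate column names make later selections such as df["Tax%"] return more than one column, so allowing them is usually worth a second thought.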

Using Assign() method

You can add a column to pandas dataframe using the assign() method available in the pandas dataframe.

Usage

  • When you want to create a new dataframe from the existing dataframe with additional columns added.
  • When you want to avoid modifying the original dataframe.

Below is the code snippet to add column using the assign() method.

df2 = df.assign(Remarks = pd.NaT)

df2

Where,

  • Remarks = pd.NaT – Remarks is the name of the column to be added. pd.NaT is the value assigned to every cell of the new column. Note that the column name is passed as a keyword argument, so it is not enclosed in single or double quotes.

Dataframe Looks Like

  product_name  Unit_Price  No_Of_Units  Tax%  Total_Price  Tax_new %  Remarks
0     Keyboard         500            5     5          NaT         10      NaT
1        Mouse         200            5    10          NaT         15      NaT
2      Monitor        5000           10    10          NaT         12      NaT
3          CPU       10000           20     5          NaT         10      NaT
4     Speakers         250            8    10          NaT         11      NaT
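
The sketch below illustrates the point that assign() returns a new dataframe and leaves the original untouched. It uses a hypothetical two-row frame, and the computed Total_Price (unit price times units) is an assumption made for this example, not something the tutorial calculates.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration.
df = pd.DataFrame({"Unit_Price": [500, 200], "No_Of_Units": [5, 5]})

# assign() accepts a callable; it receives the dataframe and returns the new column.
df2 = df.assign(Total_Price=lambda d: d["Unit_Price"] * d["No_Of_Units"])

print(list(df.columns))   # the original dataframe keeps its two columns
print(list(df2.columns))  # the returned dataframe has the extra column
```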

This is how you can add columns with values using the three methods available in pandas.

Next, you’ll add a column at a specific index.

Add column At Specific Index

In this section, you’ll add a column at a specific position.

You can add a column at a specific index by using the df.insert() method.

Use the below snippet to add a column at a specific index.

# Using DataFrame.insert() to add a column
df.insert(3, "State Tax", [5,10,10,5,10], True)

df

where,

  • 3 – Position where the new column needs to be inserted
  • State Tax – Name of the new column
  • [5,10,10,5,10] – List of values to be assigned to the new column
  • True – The allow_duplicates argument. True allows duplicate column names. If False, a ValueError is raised when a column named State Tax already exists.

The position is zero-based. Hence you’ll see the new column State Tax added in the fourth position of the dataframe.

Dataframe Looks Like

  product_name  Unit_Price  No_Of_Units  State Tax  Tax%  Total_Price  Tax_new %
0     Keyboard         500            5          5     5          NaT         10
1        Mouse         200            5         10    10          NaT         15
2      Monitor        5000           10         10    10          NaT         12
3          CPU       10000           20          5     5          NaT         10
4     Speakers         250            8         10    10          NaT         11
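
When you would rather place the new column next to a named column than count positions by hand, one possible approach is to look the position up with df.columns.get_loc(). The sketch below assumes a hypothetical two-row frame and inserts State Tax right after Unit_Price.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration.
df = pd.DataFrame({"product_name": ["Keyboard", "Mouse"],
                   "Unit_Price": [500, 200],
                   "No_Of_Units": [5, 5]})

# get_loc() returns the zero-based position of a uniquely named column.
position = df.columns.get_loc("Unit_Price") + 1

# Insert the new column immediately after Unit_Price.
df.insert(position, "State Tax", [5, 10])

print(list(df.columns))
```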

The State Tax and Tax_new % columns were added to df for demonstration. (The Remarks column exists only in df2, because assign() returned a new dataframe.)

Let’s delete these columns. Refer to how to drop column in pandas dataframe to know about deleting columns in pandas dataframe.

Now, use the below snippet to delete the columns at positions 3 and 6.

# Drop the columns added for demonstration (positions 3 and 6)
df.drop(df.columns[[3,6]], axis=1, inplace=True)

df

Where,

  • df.columns[[3,6]] – Specifies the positions of the columns to be deleted. Note that the positions are enclosed in double square brackets, [[ ]]. This is necessary when you want to delete more than one column at once.
  • axis=1 – Specifies that the drop is performed on the column axis. axis=0 performs the drop on the row axis, which means rows are deleted.
  • inplace=True – Specifies that the drop is performed on the same dataframe rather than returning a modified copy.

The columns at positions 3 and 6 are deleted.

Dataframe Looks Like

  product_name  Unit_Price  No_Of_Units  Tax%  Total_Price
0     Keyboard         500            5     5          NaT
1        Mouse         200            5    10          NaT
2      Monitor        5000           10    10          NaT
3          CPU       10000           20     5          NaT
4     Speakers         250            8    10          NaT
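
Positional drops depend on the current column order, so an alternative worth considering is to drop by name. This is a sketch on a hypothetical two-row frame; it uses the columns= keyword of drop() and reassigns the result instead of using inplace=True.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration.
df = pd.DataFrame({"product_name": ["Keyboard", "Mouse"],
                   "Unit_Price": [500, 200],
                   "State Tax": [5, 10],
                   "Tax_new %": [10, 15]})

# Drop by column name; the result is a new dataframe that replaces df.
df = df.drop(columns=["State Tax", "Tax_new %"])

print(list(df.columns))
```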

You’ve learned how to add a column at a specific index.

Next, you’ll learn how to add columns with a constant value.

Add Column to Dataframe With Constant Value

In this section, you’ll learn how to add a column to a dataframe with a constant value. This means all the cells in the newly added column will have the same value.

You can do this by assigning a single value using the assignment operator as shown below.

df["Price_Increase_Col"] = 200

df

Where,

  • df["Price_Increase_Col"] – specifying the new column in the dataframe.
  • 200 – Constant value to be added to all the cells in the new column.

Now, a new column called Price_Increase_Col will be added to the dataframe with the value 200 in all the cells.

Dataframe Looks Like

  product_name  Unit_Price  No_Of_Units  Tax%  Total_Price  Price_Increase_Col
0     Keyboard         500            5     5          NaT                 200
1        Mouse         200            5    10          NaT                 200
2      Monitor        5000           10    10          NaT                 200
3          CPU       10000           20     5          NaT                 200
4     Speakers         250            8    10          NaT                 200

You’ve learned how to add columns to the dataframe in various cases.

Next, you’ll learn how to add multiple columns to the dataframe at once.

Add Multiple Columns to Dataframe

In this section, you’ll learn how to add multiple columns to the dataframe in pandas.

You can add multiple columns to the dataframe by using the assignment operator.

Syntax

df['new_column_1'], df['new_column_2'] = [constant_value_for_Col_1, constant_value_for_Col_2]

df

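This syntax relies on Python sequence unpacking: each value on the right is assigned to the column in the same position on the left, so it behaves like two separate column assignments. Every cell in a new column gets that column’s constant value.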

Example

You’re adding two columns Product_Category and Available_Units to the dataframe df.

df['Product_Category'], df['Available_Units'] = [pd.NaT, 3]

df

Where,

  • df['Product_Category'], df['Available_Units'] – List of new columns to be added separated by comma.
  • [pd.NaT, 3] – List of constant values to be added as a default value for the newly added column respectively.

Now, two new columns are added to the dataframe.

Dataframe Looks Like

  product_name  Unit_Price  No_Of_Units  Tax%  Total_Price  Price_Increase_Col  Product_Category  Available_Units
0     Keyboard         500            5     5          NaT                 200               NaT                3
1        Mouse         200            5    10          NaT                 200               NaT                3
2      Monitor        5000           10    10          NaT                 200               NaT                3
3          CPU       10000           20     5          NaT                 200               NaT                3
4     Speakers         250            8    10          NaT                 200               NaT                3
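
If you prefer not to modify the dataframe in place, a possible alternative is to pass several keyword arguments to assign(). The sketch below uses a hypothetical two-row frame and the same two column names as the example above.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration.
df = pd.DataFrame({"product_name": ["Keyboard", "Mouse"]})

# Each keyword argument becomes one new column filled with the given constant.
df2 = df.assign(Product_Category=pd.NaT, Available_Units=3)

print(list(df2.columns))
```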

You’ve learned how to append multiple columns to the dataframe at once.

Next, drop the added columns to clean up the dataframe so that it can be reused for the upcoming use cases.

The four added columns are Total_Price, Price_Increase_Col, Product_Category, and Available_Units, at positions 4, 5, 6, and 7 respectively.

Use the below snippet to drop these columns.

# Drop the columns added for demonstration (positions 4 to 7)
df.drop(df.columns[[4,5,6,7]], axis=1, inplace=True)

df

Dataframe Looks Like

  product_name  Unit_Price  No_Of_Units  Tax%
0     Keyboard         500            5     5
1        Mouse         200            5    10
2      Monitor        5000           10    10
3          CPU       10000           20     5
4     Speakers         250            8    10

This is how you can add multiple columns at once to the existing dataframe.

Add Empty Column to Dataframe

In this section, you’ll learn how to add an empty column to a pandas dataframe.

You can add a column by using the = operator with value pd.NaT.

Snippet

df["new_column"] = pd.NaT

pd.NaT ("Not a Time") is the pandas missing-value marker for datetime-like data. When you assign this value to a new column, the column is added to the dataframe with NaT in every row, which is treated as a null value. For numeric or text columns, np.nan or pd.NA are the more common placeholders.

When you execute the below line, a new column called Total_Price will be added to the dataframe with NaT values.

df["Total_Price"] = pd.NaT

df

Dataframe Looks Like

  product_name  Unit_Price  No_Of_Units  Total_Price
0     Keyboard         500            5          NaT
1        Mouse         200            5          NaT
2      Monitor        5000           10          NaT
3          CPU       10000           20          NaT
4     Speakers         250            8          NaT
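
The choice of placeholder affects the dtype of the new column. The following sketch, on a hypothetical two-row frame with made-up column names as_nat and as_nan, compares a column filled with pd.NaT against one filled with np.nan.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame, used only for this illustration.
df = pd.DataFrame({"product_name": ["Keyboard", "Mouse"]})

df["as_nat"] = pd.NaT   # datetime-like missing values
df["as_nan"] = np.nan   # floating-point missing values

# Both columns count as missing, but their dtypes differ.
print(df.dtypes)
print(df.isna().all())
```

If the column will later hold numbers, starting from np.nan avoids having to convert a datetime column afterwards.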

You’ve learned how to add a column to pandas dataframe with empty values.

The Total_Price column added here is the one that appears in the outputs of the sections on adding columns with values.

Conclusion

To summarize, you’ve learned how to add columns to pandas dataframe. You’ve learned different methods available in the pandas Dataframe to add a new column in the existing dataframe along with the different use-cases to add new columns.

If you have any questions or feedback, feel free to comment below.
