A pandas dataframe is a two-dimensional data structure that stores values in rows and columns. Both rows and columns can have labels, which you can use to access them. Row labels are called the index, and column labels are known as headers.
You can add a header to a pandas dataframe by assigning a list of names to its columns attribute: df.columns = ['Column_Name_1', 'Column_Name_2'].
If You’re in a Hurry…
You can use the code snippet below to set the column headers of the dataframe.
Snippet
df.columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
If You Want to Understand Details, Read on…
In this tutorial, you’ll learn the different methods available to add column names to the dataframe.
If your dataframe already has column names, consider renaming the columns instead.
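As a brief illustration of the renaming case mentioned above, here is a minimal sketch. The frame `df_named` and its short labels `sl` and `sw` are hypothetical and are not part of the tutorial's iris example.

```python
import pandas as pd

# Hypothetical two-column frame that already has headers.
df_named = pd.DataFrame({"sl": [5.1, 4.9], "sw": [3.5, 3.0]})

# rename() maps old labels to new ones and returns a new dataframe;
# the original frame keeps its labels.
renamed = df_named.rename(columns={"sl": "sepal_length", "sw": "sepal_width"})

print(list(renamed.columns))
```

Only the labels listed in the mapping change, so this approach suits cases where a few headers need fixing rather than all of them.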
Sample Dataframe (Dataframe Without Header)
This is the sample dataframe used throughout the tutorial.
You’ll load the iris dataset from the sklearn datasets library and create a pandas dataframe from it. Because only the data values are passed, the dataframe is created without headers.
Snippet
import pandas as pd
from sklearn import datasets
iris = datasets.load_iris()
df = pd.DataFrame(data=iris.data)
df.head()
You can print the dataframe using df.head() and you’ll see the first 5 rows of the dataframe.
Since it doesn’t have any headers, you’ll see the default integer column labels 0, 1, 2, 3.
Dataframe Will Look Like
| | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |
Now, let’s see the different ways to add the header to the pandas dataframe.
Adding Header To Existing Pandas Dataframe
In this section, you’ll learn how to add column names to an existing pandas dataframe using the columns attribute or the set_axis() method.
Using Columns Attribute
You can use the columns attribute available in the dataframe to set the header.
It is the attribute that stores the column labels of the dataframe.
To add the headers, you can assign the column names as a list to this attribute as shown below.
Snippet
df.columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
df.head()
Dataframe Will Look Like
| | sepal_length | sepal_width | petal_length | petal_width |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |
This is how you can use the columns attribute to insert headers to the dataframe.
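One detail worth illustrating: the list assigned to the columns attribute must contain exactly one name per column. The sketch below uses a small stand-in frame called `sample` (a hypothetical name, with values copied from the first two rows above) rather than the full iris data.

```python
import pandas as pd

# Small header-less frame standing in for the iris data.
sample = pd.DataFrame([[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2]])

mismatch_error = None
try:
    # Three names for four columns: pandas rejects the assignment.
    sample.columns = ["sepal_length", "sepal_width", "petal_length"]
except ValueError as err:
    mismatch_error = str(err)
    print(mismatch_error)

# With one name per column, the assignment succeeds.
sample.columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
print(list(sample.columns))
```

If you see a length mismatch error when setting headers, compare the number of names in your list with the number of columns in the dataframe.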
Using set_axis() Method
In this section, you’ll learn how the set_axis() method sets the column headers of the dataframe. As per the documentation, it assigns the desired labels to the given axis.
In this context, you’ll use it to set the labels of the column axis.
It accepts the following parameters.
Columns_names_list – List of column names to be assigned to the dataframe
axis=1 – To specify that the labels need to be set for the column axis
set_axis() returns a new dataframe with the new labels, so assign the result back to df. Older pandas versions also accepted inplace=True to change the same dataframe rather than creating a new one, but that parameter was deprecated in pandas 1.5 and removed in pandas 2.0.
Use the snippet below to add the header to the existing dataframe.
Snippet
df = df.set_axis(["sepal_length(cm)", "sepal_width(cm)", "petal_length(cm)", "petal_width(cm)"], axis=1)
df.head()
When you print the dataframe using the df.head() method, you can see the first five rows printed along with the new column names.
Dataframe Will Look Like
| | sepal_length(cm) | sepal_width(cm) | petal_length(cm) | petal_width(cm) |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |
This is how you can add a title to the columns in the pandas dataframe.
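To make the behavior of set_axis() concrete, the following minimal sketch shows that calling it without any in-place option returns a relabeled copy and leaves the original frame untouched. The names `raw` and `labeled` are illustrative, and the two rows are copied from the table above.

```python
import pandas as pd

# Header-less stand-in for the iris dataframe.
raw = pd.DataFrame([[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2]])

# set_axis() returns a new dataframe carrying the given column labels.
labeled = raw.set_axis(
    ["sepal_length(cm)", "sepal_width(cm)", "petal_length(cm)", "petal_width(cm)"],
    axis=1,
)

print(list(raw.columns))      # still the default integer labels
print(list(labeled.columns))  # the new headers
```

Because the result is a new object, set_axis() fits naturally into method chains, whereas assigning to the columns attribute changes the dataframe you already have.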
Add Header While Reading from CSV File
In this section, you’ll learn how to add the header to the pandas dataframe while reading the data from the CSV file.
The read_csv() method accepts the parameter names. You can pass the column names as a list so that they are assigned to the dataframe created by reading the CSV file.
Use the snippet below to read the CSV file with your desired column names.
When using this method, ensure that the CSV file doesn’t already contain a header row. Otherwise, the header row from the CSV file will be added as a data row in your dataframe.
Snippet
import pandas as pd
df = pd.read_csv("iris.csv", names=["sepal_length(cm)", "sepal_width(cm)", "petal_length(cm)", "petal_width(cm)"])
df.head()
Dataframe Will Look Like
| | sepal_length(cm) | sepal_width(cm) | petal_length(cm) | petal_width(cm) | |
|---|---|---|---|---|---|
| 1 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 3 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 4 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 5 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |
This is how you can add column names while reading the CSV file.
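The caution about files that already contain a header row can be seen in a small sketch. It reads from an in-memory string instead of iris.csv, and the placeholder header `a,b,c,d` is an assumption made for illustration. Passing header=0 together with names tells read_csv() to treat the first row as a header and replace it.

```python
import io

import pandas as pd

# Hypothetical CSV content whose first row is an unwanted header.
csv_with_header = "a,b,c,d\n5.1,3.5,1.4,0.2\n4.9,3.0,1.4,0.2\n"
names = ["sepal_length(cm)", "sepal_width(cm)", "petal_length(cm)", "petal_width(cm)"]

# names alone: the file's header row is read as a data row.
kept = pd.read_csv(io.StringIO(csv_with_header), names=names)

# names with header=0: the file's header row is replaced by names.
replaced = pd.read_csv(io.StringIO(csv_with_header), names=names, header=0)

print(len(kept), len(replaced))
```

The first dataframe has one extra row holding the old header text, which also forces its columns to a non-numeric type, so header=0 is usually the safer choice when you are unsure whether the file has a header.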
Add Multilevel Column Header
A pandas dataframe can have multiple levels of headers for columns or rows. In this section, you’ll learn how to add a multilevel column header.
The dataframe created in the above sections contains headers already. Now, you’ll add the second-level column header.
You can use the set_index() method with the parameter append=True to add a new column as an additional index level, and then call unstack() to move that level into the column headers. This adds the new names at the next level rather than replacing the existing column names.
Use the snippet below to add a multilevel column header to the existing dataframe.
Snippet
df['Flower Type'] = 'Iris'
df = df.set_index('Flower Type', append=True).unstack('Flower Type')
df.head()
Dataframe Will Look Like
| | sepal_length(cm) | sepal_width(cm) | petal_length(cm) | petal_width(cm) | |
|---|---|---|---|---|---|
| Flower Type | Iris | Iris | Iris | Iris | |
| 1 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 3 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 4 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 5 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |
This is how you can add a multilevel column header to the existing pandas dataframe.
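As an alternative sketch (not the approach used in the snippet above), a second header level can also be built directly with pd.MultiIndex.from_product() and assigned to the columns attribute. The frame `flat` is a hypothetical two-column stand-in for the iris dataframe.

```python
import pandas as pd

# Two-column stand-in that already has single-level headers.
flat = pd.DataFrame(
    [[5.1, 3.5], [4.9, 3.0]],
    columns=["sepal_length(cm)", "sepal_width(cm)"],
)

# Pair every existing column name with the second-level label "Iris".
flat.columns = pd.MultiIndex.from_product(
    [flat.columns, ["Iris"]],
    names=[None, "Flower Type"],
)

print(flat.columns.nlevels)
print(flat[("sepal_length(cm)", "Iris")].tolist())
```

With a multilevel header, each column is addressed by a tuple with one entry per level, as the last line shows.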
Conclusion
To summarize, you’ve learned how to add a header to an existing pandas dataframe using the df.columns attribute and the df.set_axis() method. You’ve also learned how to set column names while reading a CSV file to create a pandas dataframe.
Also, you’ve set multilevel column names for the dataframe using the set_index() and unstack() methods.
If you have any questions, comment below.