How To Replace Header With First Row in Pandas Dataframe?

Pandas Dataframe is a two-dimensional data structure that stores data in rows and columns. Naming the columns makes the data easier to identify and access. Sometimes, the column header information sits in the first row of the dataframe instead of in the header.

You can replace the header with the first row of the dataframe by using df.columns = df.iloc[0] and then removing that row with df = df[1:].

If You’re in a Hurry…

You can use the code snippet below to replace the header with the first row of the pandas dataframe.

Snippet

df.columns = df.iloc[0] 

df = df[1:]

df.head()

While Reading Data From a CSV File

Snippet

import pandas as pd

df = pd.read_csv('iris.csv', header=[0])

df.head()

If You Want to Understand the Details, Read On…

In this tutorial, you’ll learn the different methods available to replace the header with the first row and to set the first two rows as multiple headers in pandas.

If you want to add a new header that doesn’t exist in the dataframe, refer to How to Add Header To Pandas Dataframe.

Sample Dataframe

This is the sample dataframe used throughout the tutorial.

You’ll first create a dataframe using the iris data. iris is a list of tuples where each tuple holds sepal_length, sepal_width, petal_length, petal_width and flower_type, which denotes the category of the flower based on the sepal and petal measurements.

Here, the column headers are part of the list itself, so the pd.DataFrame() method treats them as just another row and creates a dataframe with integer positions as column headers, as shown below.

Snippet

import pandas as pd

iris = [
    ('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'flower_type'),
    ('spl_len(cm)', 'spl_wid(cm)', 'petal_len(cm)', 'petal_wid(cm)', 'flower_type'),
    (5.1, 3.5, 1.4, 0.2, 'Iris-setosa'),
    (4.9, 3, 1.4, 0.2, 'Iris-setosa'),
    (4.7, 3.2, 1.3, 0.2, 'Iris-setosa'),
    (4.6, 3.1, 1.5, 0.2, 'Iris-setosa'),
    (5, 3.6, 1.4, 0.2, 'Iris-setosa'),
]

# Create a DataFrame object from the iris data
df = pd.DataFrame(iris)

df.head(5)

When you print the dataframe, you can see that the column headers are numbers and the column names appear as ordinary rows.

Dataframe Looks Like

   0             1            2              3              4
0  sepal_length  sepal_width  petal_length   petal_width    flower_type
1  spl_len(cm)   spl_wid(cm)  petal_len(cm)  petal_wid(cm)  flower_type
2  5.1           3.5          1.4            0.2            Iris-setosa
3  4.9           3            1.4            0.2            Iris-setosa
4  4.7           3.2          1.3            0.2            Iris-setosa

Now, you’ll see how to replace the header of the pandas dataframe with the first row.

Pandas Replace Header With First Row

When the column headers are available in the first row of the dataframe, you can use that row as the column header and remove it from the dataframe rows.

There are two methods available for this.

  • Using the slicing operator
  • Using iloc with reset_index()

Let’s see these methods in detail.

Using Slicing Operator to Replace Header With First Row

The slicing operator selects a range of rows from a dataframe, starting at a specific position.

For example, if you want to slice the rows beginning from index 1, you can use the df[1:] statement.

where,

  • 1 denotes the beginning index of the rows to be sliced
  • : denotes the range. If you want to slice up to a specific row, put that index after the : (the row at that index is not included). If you leave it out, all the rows until the end are sliced.
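As a quick illustration of the slicing rules above, here is a minimal sketch on a small made-up dataframe. The name demo and its values are assumptions used only for this example.

```python
import pandas as pd

# Hypothetical four-row dataframe, used only to illustrate slicing.
demo = pd.DataFrame({"value": [10, 20, 30, 40]})

from_second = demo[1:]    # rows at positions 1, 2 and 3
middle_rows = demo[1:3]   # rows at positions 1 and 2; position 3 is excluded

print(from_second.index.tolist())   # [1, 2, 3]
print(middle_rows.index.tolist())   # [1, 2]
```

Note that the sliced rows keep their original index labels.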

In the snippet below, the following operations happen.

  • The first row of the dataframe is assigned to df.columns using the df.iloc[0] statement
  • Next, the dataframe is sliced from the second row using its index 1 and assigned back to df. This removes the first row, which has index 0, from the dataframe
  • With these steps, the header of the dataframe is replaced with the first row of the dataframe.

This method does not reset the index of the rows. The remaining rows keep their original labels, so the first row has index 1, the second row has index 2, and so on. The 0 shown at the start of the header is the label of the row that became the header; pandas keeps it as the name of the column index.

Snippet

df.columns = df.iloc[0] 

df = df[1:]

df.head()

When you print the dataframe, you’ll see that the first row of the dataframe is now the header of the pandas dataframe.

Dataframe Looks Like

0  sepal_length  sepal_width  petal_length   petal_width    flower_type
1  spl_len(cm)   spl_wid(cm)  petal_len(cm)  petal_wid(cm)  flower_type
2  5.1           3.5          1.4            0.2            Iris-setosa
3  4.9           3            1.4            0.2            Iris-setosa
4  4.7           3.2          1.3            0.2            Iris-setosa
5  4.6           3.1          1.5            0.2            Iris-setosa
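To check what the slicing method leaves behind, the following sketch repeats the steps on a shortened version of the iris list. The names demo and cleaned and the trimmed data are assumptions made for brevity. The last step, which clears the leftover name on the column index with rename_axis(), is an optional extra and not part of the method described above.

```python
import pandas as pd

# Shortened, hypothetical version of the iris list: one header row, two data rows.
rows = [
    ("sepal_length", "sepal_width", "flower_type"),
    (5.1, 3.5, "Iris-setosa"),
    (4.9, 3.0, "Iris-setosa"),
]
demo = pd.DataFrame(rows)

demo.columns = demo.iloc[0]   # use the first row as the column labels
demo = demo[1:]               # remove that row; the other row labels are kept

print(list(demo.columns))     # ['sepal_length', 'sepal_width', 'flower_type']
print(demo.index.tolist())    # [1, 2]
print(demo.columns.name)      # 0

# Optional: remove the name that the column index inherited from row 0.
cleaned = demo.rename_axis(None, axis=1)
print(cleaned.columns.name)   # None
```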

Using df.reset_index() to Replace Header With First Row

In this section, you’ll learn how to replace the header with the first row of the dataframe and reset the row index at the same time.

Similar to the previous section, first assign the first row to the dataframe columns using df.columns = df.iloc[0].

Next, slice the dataframe from the second row using iloc[1:] and reset its row index using the reset_index() method.

Passing drop=True discards the old row labels instead of adding them to the dataframe as a new column. The header row itself is removed by the iloc[1:] slice.

This method resets the index of the rows. The first row gets index 0, the second row gets index 1, and so on.

Snippet

df.columns = df.iloc[0]

df = df.iloc[1:].reset_index(drop=True)

df.head()

Dataframe Looks Like

   sepal_length  sepal_width  petal_length   petal_width    flower_type
0  spl_len(cm)   spl_wid(cm)  petal_len(cm)  petal_wid(cm)  flower_type
1  5.1           3.5          1.4            0.2            Iris-setosa
2  4.9           3            1.4            0.2            Iris-setosa
3  4.7           3.2          1.3            0.2            Iris-setosa
4  4.6           3.1          1.5            0.2            Iris-setosa
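The effect of drop=True is easier to see next to the default behaviour of reset_index(). The sketch below uses a two-column, made-up subset of the iris data; the names demo, sliced, dropped and kept are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical two-column subset of the iris data.
rows = [
    ("sepal_length", "flower_type"),
    (5.1, "Iris-setosa"),
    (4.9, "Iris-setosa"),
]
demo = pd.DataFrame(rows)
demo.columns = demo.iloc[0]
sliced = demo.iloc[1:]                    # row labels are still 1 and 2

dropped = sliced.reset_index(drop=True)   # old labels are discarded
kept = sliced.reset_index()               # old labels move into a column named "index"

print(dropped.index.tolist())   # [0, 1]
print(list(dropped.columns))    # ['sepal_length', 'flower_type']
print(list(kept.columns))       # ['index', 'sepal_length', 'flower_type']
```

In most header-replacement cases the old labels carry no meaning, which is why drop=True is the usual choice here.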

Next, you’ll learn how to set the first two rows as headers.

Pandas Set First Two Rows as Header

Pandas dataframe supports multiple header rows for each column. In this section, you’ll learn how to set the first two rows as the header. When you use this method, the pandas dataframe will have two header rows.

Similar to setting the first row as the header, you can set the first two rows as the header by assigning both rows to the df.columns attribute using the statement df.columns = [df.iloc[0], df.iloc[1]].

After that, you can remove the first two rows from the dataframe by slicing the dataframe from the third row using df[2:].

If you want to reset the index, you can chain the reset_index() method after the slice, as in the single-header case.

Use the snippet below to set the first two rows of the dataframe as header rows.

Snippet

df.columns = [df.iloc[0], df.iloc[1]]

df = df[2:]

df.head()

When you print the dataframe using the df.head() method, you can see that the pandas dataframe has two header rows for each column.

Dataframe Looks Like

   sepal_length  sepal_width  petal_length   petal_width    flower_type
1  spl_len(cm)   spl_wid(cm)  petal_len(cm)  petal_wid(cm)  flower_type
2  5.1           3.5          1.4            0.2            Iris-setosa
3  4.9           3            1.4            0.2            Iris-setosa
4  4.7           3.2          1.3            0.2            Iris-setosa
5  4.6           3.1          1.5            0.2            Iris-setosa
6  5             3.6          1.4            0.2            Iris-setosa
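The variant with reset_index() mentioned earlier in this section can be sketched as follows, again on a made-up two-column subset (the name demo and the trimmed data are assumptions). It also shows one way to select a column once the dataframe has two header rows: by passing a tuple with one label per header row.

```python
import pandas as pd

# Hypothetical two-column subset with two header rows.
rows = [
    ("sepal_length", "flower_type"),
    ("spl_len(cm)", "flower_type"),
    (5.1, "Iris-setosa"),
    (4.9, "Iris-setosa"),
]
demo = pd.DataFrame(rows)

demo.columns = [demo.iloc[0], demo.iloc[1]]   # the two rows become a two-level header
demo = demo[2:].reset_index(drop=True)        # remove both rows and renumber from 0

print(demo.columns.nlevels)   # 2
print(demo.columns.tolist())  # [('sepal_length', 'spl_len(cm)'), ('flower_type', 'flower_type')]
print(demo.index.tolist())    # [0, 1]

# Select a column with a tuple that holds one label per header row.
print(demo[("sepal_length", "spl_len(cm)")].tolist())   # [5.1, 4.9]
```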

Pandas Replace Header With nth Row

If the header information sits in some other row of the dataframe, you can replace the header with that nth row.

Just use the index of that specific row in place of i in the df.iloc[i] statement, and then slice the dataframe so that it starts from the row after it.
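A minimal sketch of this, assuming the header text sits in the third row (position 2) and that the rows above it can be thrown away. The data, the variable n, and the choice to discard the earlier rows are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical dataframe whose header text sits in the third row (position 2).
rows = [
    ("note", "ignore this row"),
    ("note", "ignore this row too"),
    ("sepal_length", "flower_type"),
    (5.1, "Iris-setosa"),
    (4.9, "Iris-setosa"),
]
demo = pd.DataFrame(rows)

n = 2                          # position of the row that holds the header
demo.columns = demo.iloc[n]    # use row n as the column labels
demo = demo.iloc[n + 1:].reset_index(drop=True)   # keep only the rows below row n

print(list(demo.columns))      # ['sepal_length', 'flower_type']
print(len(demo))               # 2
```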

Pandas Set First Row as Header While Reading CSV

In this section, you’ll learn how to set the first row as the header while reading the data from a CSV file using the read_csv() method.

The read_csv() method accepts the parameter header. You can pass header=[0] to use the first row of the CSV file as the header of the dataframe. When you pass neither header nor names, read_csv() already behaves as if header=0 was given, so the first row is used as the header by default.

Use the snippet below to set the first row as the header while reading the CSV file to create the dataframe.

Snippet

import pandas as pd

df = pd.read_csv('iris.csv', header=[0])

df.head()

When you print the dataframe, you can see that the first row from the CSV file is set as the header of the dataframe.

Dataframe Looks Like

   no  sepal_length  sepal_width  petal_length   petal_width    flower_type
0  no  spl_len(cm)   spl_wid(cm)  petal_len(cm)  peral_wid(cm)  flower
1  1   5.1           3.5          1.4            0.2            Iris-setosa
2  2   4.9           3            1.4            0.2            Iris-setosa
3  3   4.7           3.2          1.3            0.2            Iris-setosa
4  4   4.6           3.1          1.5            0.2            Iris-setosa
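If you don’t have iris.csv at hand, you can try the header parameter on an in-memory CSV instead. The sketch below uses io.StringIO with a made-up two-column CSV (csv_text and the variable names are assumptions) and compares header=0, the default, and header=None.

```python
import io

import pandas as pd

# Hypothetical stand-in for iris.csv, held in memory so no file is needed.
csv_text = "sepal_length,flower_type\n5.1,Iris-setosa\n4.9,Iris-setosa\n"

explicit = pd.read_csv(io.StringIO(csv_text), header=0)       # first line is the header
default = pd.read_csv(io.StringIO(csv_text))                  # no header argument
no_header = pd.read_csv(io.StringIO(csv_text), header=None)   # first line is treated as data

print(list(explicit.columns))     # ['sepal_length', 'flower_type']
print(explicit.equals(default))   # True
print(list(no_header.columns))    # [0, 1]
print(len(no_header))             # 3
```

With header=None you end up in the same situation as the sample dataframe in this tutorial, where the column names are stored in the first row.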

Pandas Set Two Rows as Header While Reading CSV

In this section, you’ll learn how to set two rows as the header while reading the data from a CSV file.

The read_csv() method accepts the parameter header. You can pass header=[0, 1] to use the first two rows of the CSV file as the header of the dataframe. This way, you can create a dataframe with multiple header rows.

Use the snippet below to set the first two rows as the header while reading the CSV file to create the dataframe.

Snippet

import pandas as pd

df = pd.read_csv('iris.csv', header=[0, 1])

df.head()

When you print the dataframe, you can see that the first two rows of the CSV file are used as the header of the dataframe.

Dataframe Looks Like

   no  sepal_length  sepal_width  petal_length   petal_width    flower_type
   no  spl_len(cm)   spl_wid(cm)  petal_len(cm)  peral_wid(cm)  flower
0  1   5.1           3.5          1.4            0.2            Iris-setosa
1  2   4.9           3.0          1.4            0.2            Iris-setosa
2  3   4.7           3.2          1.3            0.2            Iris-setosa
3  4   4.6           3.1          1.5            0.2            Iris-setosa
4  5   5.0           3.6          1.4            0.2            Iris-setosa

This is how you can use the first two rows as the header of the dataframe while reading data from the CSV file.
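The same idea can be tried without a file on disk. The sketch below reads a made-up two-column CSV from memory with header=[0, 1]; csv_text and demo are assumed names used only for this example.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV with two header lines followed by two data lines.
csv_text = (
    "sepal_length,flower_type\n"
    "spl_len(cm),flower\n"
    "5.1,Iris-setosa\n"
    "4.9,Iris-setosa\n"
)
demo = pd.read_csv(io.StringIO(csv_text), header=[0, 1])

print(demo.columns.nlevels)   # 2
print(demo.columns.tolist())  # [('sepal_length', 'spl_len(cm)'), ('flower_type', 'flower')]
print(demo[("sepal_length", "spl_len(cm)")].tolist())   # [5.1, 4.9]
```

Because both header lines are consumed while reading, the measurement column is parsed as numbers, unlike in a dataframe where the header text is stored in the rows.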

Conclusion

To summarize, you’ve learned how to replace the header with the first row of the dataframe and how to set the first two rows as the header of the dataframe.

Additionally, you’ve learned how to set the first row, or the first two rows, as the header while reading data from a CSV file.

If you have any questions, comment below.
