A pandas dataframe is a two-dimensional data structure that stores data in rows and columns. Each column has a header (name).
You can get the column names of a pandas dataframe using the df.columns attribute.
In this tutorial, you’ll learn the different methods available to get column names from the pandas dataframe.
If You’re in a Hurry
You can use the below code snippet to get column names from pandas dataframe.
Snippet
df.columns
You’ll see all the column names of the dataframe returned as an Index object.
An Index is an immutable sequence that pandas uses for labeling and indexing.
Output
Index(['product_name', 'Unit_Price', 'No_Of_Units', 'Available_Quantity',
'Available_Since_Date'],
dtype='object')
If You Want to Understand Details, Read on…
In this tutorial, you’ll learn the different methods available to get the pandas dataframe column headers for various purposes.
Sample Dataframe
This is the sample dataframe used throughout the tutorial.
import pandas as pd
data = {"product_name":["Keyboard","Mouse", "Monitor", "CPU", "Speakers",pd.NaT],
"Unit_Price":[500,200, 5000, 10000, 250.50,350],
"No_Of_Units":[5,5, 10, 20, 8,pd.NaT],
"Available_Quantity":[5,6,10,"Not Available", pd.NaT,pd.NaT],
"Available_Since_Date":['11/5/2021', '4/23/2021', '08/21/2021','09/18/2021','01/05/2021',pd.NaT]
}
df = pd.DataFrame(data)
# Converting one column to float to demonstrate dtypes
df = df.astype({"Unit_Price": float})
df
Dataframe Looks Like
| | product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date |
---|---|---|---|---|---|
0 | Keyboard | 500.0 | 5 | 5 | 11/5/2021 |
1 | Mouse | 200.0 | 5 | 6 | 4/23/2021 |
2 | Monitor | 5000.0 | 10 | 10 | 08/21/2021 |
3 | CPU | 10000.0 | 20 | Not Available | 09/18/2021 |
4 | Speakers | 250.5 | 8 | NaT | 01/05/2021 |
5 | NaT | 350.0 | NaT | NaT | NaT |
Now, let’s see how to get the column headers.
Pandas Get Column Names
In this section, you’ll see how to get column names using different methods.
Using Columns
The columns attribute of the dataframe returns the column labels of the dataframe.
Snippet
df.columns
Output
Index(['product_name', 'Unit_Price', 'No_Of_Units', 'Available_Quantity',
'Available_Since_Date'],
dtype='object')
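Because the Index returned by df.columns is immutable, assigning to one of its positions fails. The sketch below illustrates this on a small, hypothetical two-column dataframe (the data and the variable names `mutated` and `renamed` are assumptions made for the example) and shows `rename()` as one way to change a header instead.

```python
import pandas as pd

# Small illustrative dataframe (not the tutorial's full sample)
df = pd.DataFrame({"product_name": ["Keyboard"], "Unit_Price": [500.0]})

# Trying to change a single label in place raises a TypeError,
# because an Index does not support item assignment.
try:
    df.columns[0] = "name"
    mutated = True
except TypeError:
    mutated = False

# rename() returns a new dataframe with the updated header;
# the original dataframe keeps its column names.
renamed = df.rename(columns={"product_name": "name"})

print(mutated)
print(list(renamed.columns))
```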
Get Column Names as Array
You can get the column names as a NumPy array using the .columns.values attribute of the dataframe.
Snippet
df.columns.values
You’ll see the column headers returned as an array.
Output
array(['product_name', 'Unit_Price', 'No_Of_Units', 'Available_Quantity',
'Available_Since_Date'], dtype=object)
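As an alternative to .values, the Index also provides a to_numpy() method, which likewise returns a NumPy array. A minimal sketch on a small, illustrative dataframe (the data and the variable name `names` are assumptions for this example):

```python
import numpy as np
import pandas as pd

# Small illustrative dataframe (not the tutorial's full sample)
df = pd.DataFrame({"product_name": ["Keyboard"], "Unit_Price": [500.0]})

# to_numpy() converts the column Index into a NumPy array
names = df.columns.to_numpy()

print(type(names))
print(names)
```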
Pandas Get List From Dataframe Columns Headers
You can get column names as a list by taking the .columns.values array and converting it to a list using the tolist() method, as shown below.
Code
df.columns.values.tolist()
You’ll see the column headers returned as a list.
Output
['product_name',
'Unit_Price',
'No_Of_Units',
'Available_Quantity',
'Available_Since_Date']
Another way to get column headers as a list is to use Python’s built-in list() function.
You can pass the dataframe object to list(). Iterating over a dataframe yields its column labels, so list() returns the column headers as a list.
Code
columns_list = list(df)
columns_list
You’ll see the column headers displayed as a list.
Output
['product_name',
'Unit_Price',
'No_Of_Units',
'Available_Quantity',
'Available_Since_Date']
This is how you can get pandas column names as a list.
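The approaches above, together with calling tolist() or list() directly on df.columns, should all produce the same list. The following sketch compares them on a small, illustrative dataframe (the data and the variable names are assumptions for this example).

```python
import pandas as pd

# Small illustrative dataframe (not the tutorial's full sample)
df = pd.DataFrame({
    "product_name": ["Keyboard", "Mouse"],
    "Unit_Price": [500.0, 200.0],
    "No_Of_Units": [5, 5],
})

via_values = df.columns.values.tolist()  # array first, then list
via_index = df.columns.tolist()          # directly from the Index
via_list_index = list(df.columns)        # built-in list() on the Index
via_list_df = list(df)                   # built-in list() on the dataframe

print(via_values == via_index == via_list_index == via_list_df)
```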
Pandas List Column Names and Types
In this section, you’ll learn how to list column names and types of each column of the dataframe.
You can do this by using the dtypes attribute. It returns a Series with the data type of each column in the dataframe.
Snippet
df.dtypes
Output
You’ll see the name and data type of each column printed as a Series.
product_name object
Unit_Price float64
No_Of_Units object
Available_Quantity object
Available_Since_Date object
dtype: object
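Since dtypes is a Series indexed by column name, you can loop over it to pair each name with its type, or use select_dtypes() to keep only the column names of a given kind. A minimal sketch on a small, illustrative dataframe (the data and the variable name `numeric_columns` are assumptions for this example):

```python
import pandas as pd

# Small illustrative dataframe (not the tutorial's full sample)
df = pd.DataFrame({
    "product_name": ["Keyboard", "Mouse"],
    "Unit_Price": [500.0, 200.0],
    "No_Of_Units": [5, 5],
})

# Walk over (column name, dtype) pairs
for name, dtype in df.dtypes.items():
    print(name, dtype)

# Names of the numeric columns only
numeric_columns = df.select_dtypes(include="number").columns.tolist()
print(numeric_columns)
```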
Pandas Get Column Names by Index
In this section, you’ll learn how to get a column name by its position.
- You can get the name at a specific position by indexing the columns attribute with that position.
- The index is 0-based. Hence, if you use 2, you’ll get the column in the third position.
Code
df.columns[2]
Output
You’ll see the header of the column in position 3 (index 2).
'No_Of_Units'
This is how you can get a single column header using the index.
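Positional indexing on df.columns also accepts slices and negative positions, and get_loc() does the reverse lookup from a name to its position. A minimal sketch on a small, illustrative dataframe (the data and the variable names are assumptions for this example):

```python
import pandas as pd

# Small illustrative dataframe (not the tutorial's full sample)
df = pd.DataFrame({
    "product_name": ["Keyboard"],
    "Unit_Price": [500.0],
    "No_Of_Units": [5],
    "Available_Quantity": [5],
})

middle_two = df.columns[1:3].tolist()           # slice of positions 1 and 2
last_name = df.columns[-1]                      # negative index counts from the end
position = df.columns.get_loc("No_Of_Units")    # name -> 0-based position

print(middle_two, last_name, position)
```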
Pandas Get Column Names Based on Condition
In this section, you’ll learn how to get column names based on conditions.
- This can be useful when you want to identify columns that contain specific values. It is also known as getting column names by value.
- For example, if you need to get column names which have the value 5 in any cell, then you can use the following example.
Snippet
df.columns[
    (df == 5)      # boolean mask: True where a cell equals 5
    .any(axis=0)   # one value per column: True if any cell matched
]
Output
In the sample dataframe, the columns No_Of_Units and Available_Quantity contain the value 5. Hence, you’ll see the two columns returned as an Index.
Index(['No_Of_Units', 'Available_Quantity'], dtype='object')
This is how you can get column names based on value.
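The same mask pattern works with other comparisons, and all() can replace any() when every cell in a column must satisfy the condition. The sketch below uses a small, numeric-only dataframe; the data, the thresholds, and the variable names `over_100` and `all_small` are assumptions chosen for illustration.

```python
import pandas as pd

# Small numeric-only dataframe (not the tutorial's full sample)
df = pd.DataFrame({
    "Unit_Price": [500.0, 200.0, 5000.0],
    "No_Of_Units": [5, 5, 10],
    "Available_Quantity": [5, 6, 10],
})

# Columns where at least one cell is greater than 100
over_100 = df.columns[(df > 100).any(axis=0)].tolist()

# Columns where every cell is at most 10
all_small = df.columns[(df <= 10).all(axis=0)].tolist()

print(over_100)
print(all_small)
```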