A pandas DataFrame is a two-dimensional data structure that stores data in a tabular format. It is similar to a spreadsheet or a database table.
You can iterate over the pandas dataframe using the df.itertuples() method.
If You’re in a Hurry…
You can use the code below to iterate over rows in a pandas dataframe.
This is one of the fastest ways to iterate over the rows of a pandas dataframe one at a time.
Example
for row in df.itertuples():
    print(row)
You’ll see the below output.
Each row in the dataframe will be iterated over and printed using the print statement.
Output
Pandas(Index=0, Lang='Java', Difficulty='Medium', Difficulty_Score=5, Type='Statically Typed')
Pandas(Index=1, Lang='Python', Difficulty='Easy', Difficulty_Score=2, Type='Dynamically Typed')
Pandas(Index=2, Lang='Cobol', Difficulty='Hard', Difficulty_Score=10, Type='NA')
Pandas(Index=3, Lang='Javascript', Difficulty='Medium', Difficulty_Score=8, Type='Dynamically typed')
If You Want to Understand Details, Read on…
In this tutorial, you’ll learn the various methods available to iterate over rows in the Pandas Dataframe.
Sample DataFrame
import pandas as pd
data = {
    "Lang": ["Java", "Python", "Cobol", "Javascript"],
    "Difficulty": ["Medium", "Easy", "Hard", "Medium"],
    "Difficulty_Score": [5, 2, 10, 8],
    "Type": ["Statically Typed", "Dynamically Typed", "NA", "Dynamically typed"],
}
df = pd.DataFrame(data)
print(df)
DataFrame Visualization
         Lang Difficulty  Difficulty_Score               Type
0        Java     Medium                 5   Statically Typed
1      Python       Easy                 2  Dynamically Typed
2       Cobol       Hard                10                 NA
3  Javascript     Medium                 8  Dynamically typed
Now let’s discuss the various methods available to iterate over the rows in the pandas dataframe.
Using Itertuples() Function
In this section, you’ll learn how to iterate over rows in a pandas dataframe using the itertuples() method.
The itertuples() method iterates over the dataframe rows and returns each row as a named tuple.
It accepts two parameters.
- index – If True, the index of the row is included as the first element of the tuple. If False, the index is left out of the tuple. The default is True.
- name – The name given to each named tuple. The default is "Pandas". If you pass None, regular tuples are returned instead.
This is generally the fastest of the row-by-row iteration methods covered in this tutorial.
Example
for row in df.itertuples():
    print(row)
You haven’t passed the index parameter or the name parameter. Hence, the default value is used for both parameters.
Each tuple displayed below is named Pandas and contains the row index as its first element.
Output
Pandas(Index=0, Lang='Java', Difficulty='Medium', Difficulty_Score=5, Type='Statically Typed')
Pandas(Index=1, Lang='Python', Difficulty='Easy', Difficulty_Score=2, Type='Dynamically Typed')
Pandas(Index=2, Lang='Cobol', Difficulty='Hard', Difficulty_Score=10, Type='NA')
Pandas(Index=3, Lang='Javascript', Difficulty='Medium', Difficulty_Score=8, Type='Dynamically typed')
This is how you can iterate over rows in a pandas DataFrame using itertuples().
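The two parameters described above can be sketched as follows. This is a minimal illustration on a cut-down copy of the sample data; the tuple name "Language" is an arbitrary choice made for this example, not something the tutorial prescribes.

```python
import pandas as pd

# Cut-down copy of the sample data, used only for this sketch.
df = pd.DataFrame({
    "Lang": ["Java", "Python"],
    "Difficulty_Score": [5, 2],
})

# index=False leaves the row index out of each tuple;
# name sets the type name of the named tuples ("Language" is an arbitrary choice).
rows = list(df.itertuples(index=False, name="Language"))
for row in rows:
    print(row)

# name=None returns regular tuples instead of named tuples.
plain = list(df.itertuples(index=False, name=None))
print(plain)
```

With index=False the tuples have no Index field, so the first element is the value of the first column.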
Next, you’ll see how to iterate over rows by column name.
Pandas Iterate Over Rows by Column Name
In this section, you’ll learn how to iterate over rows by column name. You can use itertuples() for this as well: while iterating over the tuples, you can access a column’s value in each tuple by using the column name as an attribute, as shown in the example.
Example
for row in df.itertuples():
    print(row.Lang)
In this iteration, you’ll iterate over the rows and access only the column Lang.
Output
Java
Python
Cobol
Javascript
This is how you can iterate over rows by column name.
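Attribute access only works when the column name is a valid Python identifier. As a hedged sketch, the hypothetical column "Difficulty Score" (with a space, unlike the sample dataframe) shows what happens otherwise: itertuples() replaces such names with positional names, so the value is easier to read by position.

```python
import pandas as pd

# Hypothetical frame: the second column name contains a space.
df = pd.DataFrame({
    "Lang": ["Java", "Python"],
    "Difficulty Score": [5, 2],
})

rows = list(df.itertuples())

# Invalid identifiers are renamed to positional names such as _2.
print(rows[0]._fields)

# Valid names can still be read as attributes.
langs = [row.Lang for row in rows]

# Position 0 is Index, 1 is Lang, 2 is "Difficulty Score".
scores = [row[2] for row in rows]
print(langs, scores)
```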
Next, you’ll learn how to use the apply() method of the dataframe.
Using Apply() Function
In this section, you’ll learn how to use the apply() method to iterate over rows in a pandas dataframe.
The apply() method applies a function to each row or each column of the dataframe. Pass axis=1 to apply the function to each row.
The function can be anything that accepts a row (or column). For example, you can define a function that performs a mathematical operation such as addition, subtraction, or averaging.
When to Use
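You can use this method if you wish to compute a new value from each row while iterating over the rows. apply() returns the results as a new object rather than changing the dataframe in place.
For example:
- Iterating and appending a new value to a string based on a condition.
- Iterating and computing a new value for a column based on a condition.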
Example 1
The below example shows how to iterate over the dataframe and concatenate two columns to display it together. This is achieved by defining a lambda function that will concatenate two columns of each row.
print(df.apply(lambda row: row["Lang"] + " " + str(row["Difficulty_Score"]), axis = 1))
Output
0          Java 5
1        Python 2
2        Cobol 10
3    Javascript 8
dtype: object
Example 2
The below example shows how to iterate over the dataframe, add 10 to the difficulty score of each language, and display the result. This is achieved by defining a lambda function that adds 10 to the difficulty score in each row.
print(df.apply(lambda row: row["Difficulty_Score"] + 10, axis = 1))
Output
0    15
1    12
2    20
3    18
dtype: int64
This is how you can use the apply() method to iterate over rows in a dataframe and apply a function to each row or column.
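The "compute a new value based on a condition" use case mentioned earlier could look like the sketch below. The function name label_row, the threshold of 8, and the new Label column are all assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Lang": ["Java", "Python", "Cobol", "Javascript"],
    "Difficulty_Score": [5, 2, 10, 8],
})

def label_row(row):
    # Hypothetical rule: a score of 8 or more counts as "Hard".
    return "Hard" if row["Difficulty_Score"] >= 8 else "Manageable"

# apply() returns a new Series; assigning it stores the result as a column.
df["Label"] = df.apply(label_row, axis=1)
print(df)
```

Because apply() does not modify the dataframe itself, the result only becomes part of df when it is assigned back to a column.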
Next, you’ll learn about the iterrows() function.
Using Iterrows() Function
In this section, you’ll learn how to use the iterrows() function to iterate through rows in a pandas dataframe.
The iterrows() method iterates over the dataframe as (index, series) pairs.
- index is the index label of the row.
- series holds the data of the row, with the column names as its labels.
Example
for index, row in df.iterrows():
    print(index)
    print("*****")
    print(row)
Output
0
*****
Lang Java
Difficulty Medium
Difficulty_Score 5
Type Statically Typed
Name: 0, dtype: object
1
*****
Lang Python
Difficulty Easy
Difficulty_Score 2
Type Dynamically Typed
Name: 1, dtype: object
2
*****
Lang Cobol
Difficulty Hard
Difficulty_Score 10
Type NA
Name: 2, dtype: object
3
*****
Lang Javascript
Difficulty Medium
Difficulty_Score 8
Type Dynamically typed
Name: 3, dtype: object
This is how you can use the iterrows() method to iterate through the pandas dataframe and access the index and the series of data for each row.
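One consequence of each row being returned as a series is that all values in the row share a single dtype. The sketch below uses a small hypothetical frame (the columns n and ratio are not part of the sample data) to show an integer being upcast to a float by iterrows(), while itertuples() keeps it as an integer.

```python
import pandas as pd

# Hypothetical frame with one integer column and one float column.
df = pd.DataFrame({"n": [1, 2], "ratio": [1.5, 2.5]})

# iterrows() packs each row into one Series, so the integer is upcast to float.
_, first_row = next(df.iterrows())
print(first_row.dtype)

# itertuples() reads each column separately, so the integer stays an integer.
first_tuple = next(df.itertuples(index=False))
print(first_tuple)
```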
Next, you’ll see the index attribute of the Dataframe.
Using Index Attribute
In this section, you’ll learn how to iterate over rows in the dataframe using the index attribute.
The index attribute is an immutable sequence used for indexing the rows of a dataframe.
You can use this attribute to iterate over rows by combining each index value with a column name.
A dataframe is two-dimensional. In the expression df["Lang"][i]:
- The first subscript selects the column by its name.
- The second subscript selects the row by its index label.
df["Lang"][i] yields the value of the Lang column in the row with the index label i.
Example
In the example, you iterate over the dataframe index. For each index value, you specify the column name in the first subscript and the index value in the second subscript to access that row’s value in the column.
for i in df.index:
    print(df["Lang"][i], df["Difficulty"][i])
Output
Java Medium
Python Easy
Cobol Hard
Javascript Medium
This is how you can iterate through the pandas dataframe using the index attribute.
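As a hedged alternative to the two-step lookup df["Lang"][i], the sketch below reads the same values in a single step with the at accessor, which takes a row label and a column name. The list name pairs is just a helper for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Lang": ["Java", "Python", "Cobol", "Javascript"],
    "Difficulty": ["Medium", "Easy", "Hard", "Medium"],
})

pairs = []
for i in df.index:
    # at[row_label, column_name] looks up a single value in one step.
    pairs.append((df.at[i, "Lang"], df.at[i, "Difficulty"]))

print(pairs)
```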
Next, you’ll learn about the loc[] attribute.
Using loc[] Attribute
In this section, you’ll learn how to use the loc[] attribute to iterate over the dataframe.
The loc[] attribute is primarily label-based: you access a value by specifying the index label of the row and the label (column name) of the column.
Example
In the below example, you’ll access the dataframe using the loc attribute and the column names Lang and Difficulty during each iteration to access the values of these columns.
for i in df.index:
    print(df.loc[i, "Lang"], df.loc[i, "Difficulty"])
Output
Java Medium
Python Easy
Cobol Hard
Javascript Medium
This is how the loc[] attribute is used to iterate through the dataframe.
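Because loc[] works with labels, the loop values must be actual index labels. The sketch below uses hypothetical labels 10 and 20 (the sample dataframe uses the default 0, 1, 2, 3) to show that iterating over df.index works, while a position such as 0 raises a KeyError.

```python
import pandas as pd

# Hypothetical index labels 10 and 20 instead of the default 0 and 1.
df = pd.DataFrame(
    {"Lang": ["Java", "Python"], "Difficulty": ["Medium", "Easy"]},
    index=[10, 20],
)

# Iterating over df.index always supplies valid labels for loc[].
values = [df.loc[label, "Lang"] for label in df.index]
print(values)

# 0 is a position, not a label in this frame, so loc[] cannot find it.
try:
    df.loc[0, "Lang"]
    label_found = True
except KeyError:
    label_found = False
print(label_found)
```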
Next, you’ll see how to use the iloc[] attribute.
Using iloc[] Attribute of DataFrame
In this section, you’ll learn how to use the iloc[] attribute to iterate over the dataframe.
The iloc[] attribute is integer-position-based: you access a value by specifying the integer position of the row and the integer position of the column.
Example
In the below example, you’ll access the dataframe using the iloc attribute.
During each row iteration,
- Use the column index 0 to access the first column.
- Use the column index 1 to access the second column.
for i in range(len(df)):
    print(df.iloc[i, 0], df.iloc[i, 1])
Output
Java Medium
Python Easy
Cobol Hard
Javascript Medium
This is how you can iterate over the dataframe using the iloc[] attribute.
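If you would rather not hard-code the column positions 0 and 1, one possible approach is to look them up from the column names with df.columns.get_loc(). The sketch below assumes that approach and uses hypothetical index labels "a" and "b" to show that iloc[] ignores labels and works purely by position.

```python
import pandas as pd

df = pd.DataFrame(
    {"Lang": ["Java", "Python"], "Difficulty": ["Medium", "Easy"]},
    index=["a", "b"],  # hypothetical labels; iloc[] does not use them
)

# Translate column names into integer positions once, before the loop.
lang_pos = df.columns.get_loc("Lang")
difficulty_pos = df.columns.get_loc("Difficulty")

pairs = [
    (df.iloc[i, lang_pos], df.iloc[i, difficulty_pos])
    for i in range(len(df))
]
print(pairs)
```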
Next, you’ll see the iteritems() function to iterate over the dataframe.
Using Iteritems() Function
In this section, you’ll use the iteritems() function to iterate over the dataframe.
The iteritems() function iterates over the dataframe columns (not the rows) and returns a tuple with the column name and the column content as a series.
iteritems() was deprecated in pandas 1.5 and removed in pandas 2.0. Use the items() method instead.
Example
# Works only on pandas versions before 2.0
for item in df.iteritems():
    print(item)
Output
('Lang', 0 Java
1 Python
2 Cobol
3 Javascript
Name: Lang, dtype: object)
('Difficulty', 0 Medium
1 Easy
2 Hard
3 Medium
Name: Difficulty, dtype: object)
('Difficulty_Score', 0 5
1 2
2 10
3 8
Name: Difficulty_Score, dtype: int64)
('Type', 0 Statically Typed
1 Dynamically Typed
2 NA
3 Dynamically typed
Name: Type, dtype: object)
This is how you can use the iteritems() method.
Using Items() Function
In this section, you’ll use the items() method of the dataframe to iterate over its columns.
The items() method iterates over the dataframe column by column and returns a tuple with the column name and the column content as a series.
Example
for item in df.items():
    print(item)
Output
('Lang', 0 Java
1 Python
2 Cobol
3 Javascript
Name: Lang, dtype: object)
('Difficulty', 0 Medium
1 Easy
2 Hard
3 Medium
Name: Difficulty, dtype: object)
('Difficulty_Score', 0 5
1 2
2 10
3 8
Name: Difficulty_Score, dtype: int64)
('Type', 0 Statically Typed
1 Dynamically Typed
2 NA
3 Dynamically typed
Name: Type, dtype: object)
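Since each item is a (column name, series) tuple, it can be unpacked directly in the loop. As a small sketch on a cut-down copy of the sample data, the dictionary name columns below is a helper introduced only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Lang": ["Java", "Python"],
    "Difficulty_Score": [5, 2],
})

# Unpack each (name, series) pair and collect the values per column.
columns = {name: series.tolist() for name, series in df.items()}
print(columns)
```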
pandas iterate over rows by column name
In this subsection, you’ll use items() to iterate over the dataframe and use the columnName and columnData variables to access the column data. iteritems() behaves the same way but is only available before pandas 2.0.
Example
for columnName, columnData in df.items():
    print('Column Name : ', columnName)
    print('Column Contents : ', columnData.values)
Output
Column Name : Lang
Column Contents : ['Java' 'Python' 'Cobol' 'Javascript']
Column Name : Difficulty
Column Contents : ['Medium' 'Easy' 'Hard' 'Medium']
Column Name : Difficulty_Score
Column Contents : [ 5 2 10 8]
Column Name : Type
Column Contents : ['Statically Typed' 'Dynamically Typed' 'NA' 'Dynamically typed']
This is also known as Pandas Iterate Over Columns.
pandas iterate over rows with condition
In this subsection, you’ll use items() to iterate over the dataframe and use an if condition to check whether the current column is a specific column. The column data is accessed only if the condition is true. Otherwise, the column is skipped.
Example
for columnName, columnData in df.items():
    if columnName == "Lang":
        print('Column Name : ', columnName)
        print('Column Contents : ', columnData.values)
Output
Column Name : Lang
Column Contents : ['Java' 'Python' 'Cobol' 'Javascript']
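The example above applies the condition to columns. If the condition should apply to rows instead, one possible sketch is shown below: it keeps only the rows whose Difficulty is "Medium", first with itertuples() and then with a boolean mask passed to loc[]. The variable names are helpers chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Lang": ["Java", "Python", "Cobol", "Javascript"],
    "Difficulty": ["Medium", "Easy", "Hard", "Medium"],
})

# Row-by-row: test the condition inside the loop.
medium_langs = [
    row.Lang for row in df.itertuples(index=False)
    if row.Difficulty == "Medium"
]

# Alternative: select the matching rows first with a boolean mask.
medium_langs_mask = df.loc[df["Difficulty"] == "Medium", "Lang"].tolist()

print(medium_langs)
print(medium_langs_mask)
```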
Pandas Iterate Over Columns
Example
for columnName, columnData in df.items():
    print('Column Name : ', columnName)
    print('Column Contents : ', columnData.values)
Output
Column Name : Lang
Column Contents : ['Java' 'Python' 'Cobol' 'Javascript']
Column Name : Difficulty
Column Contents : ['Medium' 'Easy' 'Hard' 'Medium']
Column Name : Difficulty_Score
Column Contents : [ 5 2 10 8]
Column Name : Type
Column Contents : ['Statically Typed' 'Dynamically Typed' 'NA' 'Dynamically typed']
Conclusion
To summarize, you’ve learned how to iterate over rows in Pandas dataframe using the different methods available in the Dataframe.
Among the row-by-row methods covered here, itertuples() is generally the fastest way to iterate over a pandas dataframe.
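To check the speed difference on your own machine, a rough timing sketch like the one below can help. The frame size of 1000 rows, the column names a and b, and the repeat count of 3 are arbitrary assumptions, and the measured times will vary by machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical test frame: two integer columns with 1000 rows.
df = pd.DataFrame({"a": range(1000), "b": range(1000)})

def sum_with_iterrows():
    # Each row arrives as a Series, read by column label.
    return sum(row["a"] + row["b"] for _, row in df.iterrows())

def sum_with_itertuples():
    # Each row arrives as a named tuple, read by attribute.
    return sum(row.a + row.b for row in df.itertuples(index=False))

t_iterrows = timeit.timeit(sum_with_iterrows, number=3)
t_itertuples = timeit.timeit(sum_with_itertuples, number=3)

print(f"iterrows:   {t_iterrows:.4f} s")
print(f"itertuples: {t_itertuples:.4f} s")
```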
If you have any questions, feel free to comment below.
Thanks Vikram. Really nice tutorial.
My only comment would be that iteritems() and items() seem to be the same thing.
Hello Nick,
I appreciate you taking the time to write the feedback and am glad that you found it helpful.
Yes. Both iteritems() and items are the same. iteritems() yields the result from the items() internally.
iteritems() would be removed in future versions. Hence, items() is the recommended method.
I have updated the tutorial with this information.
Regards,
Vikram