Iterating over rows and columns in Pandas DataFrame

Iteration means traversing the DataFrame, visiting its items one at a time, and performing the necessary task on each. In this article, we will look at different ways of iterating over rows and columns in Pandas. Iterating over a Pandas DataFrame works much like iterating over a Python dictionary: just as a dictionary holds key-value pairs, a DataFrame yields labels as keys and the data stored under each label as values.

Here we will load the following student_records data into a DataFrame to demonstrate our iterations over rows and columns:

import pandas as pd

student_records = [['John', 14, 82.5], ['Maria', 12, 90.0], ['Tom', 13, 77.0], ['Amy', 12, 71.0]]
df = pd.DataFrame(student_records, columns=['Name', 'Age', 'Marks'])
print(df)

Thus, the DataFrame is:

    Name  Age  Marks
0   John   14   82.5
1  Maria   12   90.0
2    Tom   13   77.0
3    Amy   12   71.0

Now let’s get started with various kinds of iterations in Pandas.

Iterate over Rows

There are several ways to iterate over rows in Pandas DataFrame. Let’s have a look at them.

iterrows()

This method yields each row as a pair: the row’s index value and a Series holding that row’s values, labeled by column name. Let’s take a look at how it works:

# using iterrows()
for i, j in df.iterrows():
    print(i, j)
    print()

Following is the output obtained:

0 Name     John
Age        14
Marks    82.5
Name: 0, dtype: object

1 Name     Maria
Age         12
Marks       90
Name: 1, dtype: object

2 Name     Tom
Age       13
Marks     77
Name: 2, dtype: object

3 Name     Amy
Age       12
Marks     71
Name: 3, dtype: object
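
In practice, the row Series is usually indexed by column label rather than printed whole. The following is a minimal sketch on the same student data; the PASS_MARK threshold and the passed list are illustrative assumptions, not part of the original example:

```python
import pandas as pd

student_records = [['John', 14, 82.5], ['Maria', 12, 90.0], ['Tom', 13, 77.0], ['Amy', 12, 71.0]]
df = pd.DataFrame(student_records, columns=['Name', 'Age', 'Marks'])

# Hypothetical threshold, chosen only for illustration
PASS_MARK = 75.0

passed = []
for index, row in df.iterrows():
    # row is a Series, so individual values are looked up by column label
    if row['Marks'] >= PASS_MARK:
        passed.append(row['Name'])

print(passed)  # ['John', 'Maria', 'Tom']
```

Because each row is packed into a single Series, columns of different types are coerced to a common dtype, which is why the output above reports dtype: object for every row.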

itertuples()

This method iterates over rows and returns a named tuple for each row in the DataFrame. Each tuple starts with the index value, followed by the row’s values as fields named after the columns. Take a look at the code:

# using itertuples()
for x in df.itertuples():
    print(x)

The returned output is:

Pandas(Index=0, Name='John', Age=14, Marks=82.5)
Pandas(Index=1, Name='Maria', Age=12, Marks=90.0)
Pandas(Index=2, Name='Tom', Age=13, Marks=77.0)
Pandas(Index=3, Name='Amy', Age=12, Marks=71.0)
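
Since each row is a named tuple, its fields can be read as attributes. The sketch below, on the same student data, also shows the index and name parameters of itertuples(); the variable name plain_rows is an illustrative choice:

```python
import pandas as pd

student_records = [['John', 14, 82.5], ['Maria', 12, 90.0], ['Tom', 13, 77.0], ['Amy', 12, 71.0]]
df = pd.DataFrame(student_records, columns=['Name', 'Age', 'Marks'])

# Fields of each named tuple are available as attributes
for row in df.itertuples():
    print(row.Index, row.Name, row.Marks)

# index=False leaves out the index value; name=None yields plain tuples
plain_rows = list(df.itertuples(index=False, name=None))
print(plain_rows)
```

Unlike iterrows(), this approach does not build a Series per row, so each value keeps the type of its column.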

Iterate over Columns

The various ways of iterating over columns in Pandas DataFrame are:

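items()

This method iterates over the columns, yielding each column label as the key and the column data (a Series) as the value of a key-value pair. Older tutorials use iteritems() for this; that method was deprecated in pandas 1.5 and removed in pandas 2.0, so items() is the one to use.

Let’s use this method on the above student_records data:

# using items()
for i, j in df.items():
    print(i, j)
    print()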

We get:

Name 0     John
1    Maria
2      Tom
3      Amy
Name: Name, dtype: object

Age 0    14
1    12
2    13
3    12
Name: Age, dtype: int64

Marks 0    82.5
1    90.0
2    77.0
3    71.0
Name: Marks, dtype: float64
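
Column-wise iteration is handy when the same operation applies to every column of a given kind. As a rough sketch on the same student data, the loop below records the maximum of each numeric column; the dictionary name highest is an assumption made for illustration:

```python
import pandas as pd

student_records = [['John', 14, 82.5], ['Maria', 12, 90.0], ['Tom', 13, 77.0], ['Amy', 12, 71.0]]
df = pd.DataFrame(student_records, columns=['Name', 'Age', 'Marks'])

highest = {}
for label, column in df.items():
    # Skip non-numeric columns such as Name
    if pd.api.types.is_numeric_dtype(column):
        highest[label] = column.max()

print(highest)
```

Here highest ends up with one entry for Age (14) and one for Marks (90.0), while Name is skipped.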

Column Names

We can also iterate over the columns by creating a list of column labels and then iterating over that list, as:

# creating a list of column labels
cols = list(df)

for i in cols:
    print(i, df[i].values, sep=" : ")

Output obtained after iterating over all columns in the list is:

Name : ['John' 'Maria' 'Tom' 'Amy']
Age : [14 12 13 12]
Marks : [82.5 90.  77.  71. ]

You can also choose only certain labels in your list and iterate over them, i.e., create a list with only those column names which you want to iterate over.
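
Restricting the loop to chosen columns could look like the sketch below, again on the same student data; the names selected and collected are illustrative:

```python
import pandas as pd

student_records = [['John', 14, 82.5], ['Maria', 12, 90.0], ['Tom', 13, 77.0], ['Amy', 12, 71.0]]
df = pd.DataFrame(student_records, columns=['Name', 'Age', 'Marks'])

# Only the labels listed here are visited; Age is left out on purpose
selected = ['Name', 'Marks']

collected = {}
for label in selected:
    print(label, df[label].values, sep=" : ")
    collected[label] = df[label].tolist()
```

A label that does not exist in the DataFrame raises a KeyError, so the list should contain only valid column names.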

Summary

In this article, we looked at various ways of iterating over rows and columns in a Pandas DataFrame. In the upcoming articles, we will look at other operations such as apply(), map(), reduce(), etc.
