Pandas map() and reduce() Operations

In this article, we will focus on map() and reduce() operations on a Pandas Series and how they are used for data manipulation.

map()

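The Pandas Series.map() method maps each value of a Series to a new value according to the given argument, which can be another Series, a dictionary, or a function. The method described here is defined on Series; a DataFrame has its own element-wise method (applymap(), renamed to DataFrame.map() in pandas 2.1), which accepts only a function.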

Syntax:

Series.map(arg, na_action=None)

The parameters are:

  • arg : (Series, dict, or function) mapping correspondence
  • na_action : (None, 'ignore') If 'ignore', propagate NaN values without passing them to the mapping correspondence (default: None)

Let us look at a few examples of the map() operation on the following Series:

import numpy as np
import pandas as pd

country = ['Germany', 'Canada', np.nan, 'Japan', 'Australia']

series = pd.Series(country)
print(series)

This gives the following Series:

0      Germany
1       Canada
2          NaN
3        Japan
4    Australia
dtype: object

Now let us apply map() to this Series using a dictionary as the argument. Values with no matching key in the dictionary (here, Germany) are mapped to NaN:

print(series.map({'Canada': 'Ottawa', 'Japan': 'Tokyo', 'Australia': 'Canberra'}))

Output:

0         NaN
1      Ottawa
2         NaN
3       Tokyo
4    Canberra
dtype: object
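
As noted earlier, the argument can also be another Series. Here is a minimal sketch, assuming a hypothetical lookup Series named capitals whose index holds the country names. Each value is looked up in the index of capitals, and values with no match become NaN, just as with a dictionary.

```python
import numpy as np
import pandas as pd

# Same data as the example above; dtype=object is set explicitly (an assumption)
# so the behavior does not depend on the default string dtype of the pandas version
series = pd.Series(['Germany', 'Canada', np.nan, 'Japan', 'Australia'], dtype=object)

# Hypothetical lookup Series: index = country, value = capital
capitals = pd.Series({'Canada': 'Ottawa', 'Japan': 'Tokyo', 'Australia': 'Canberra'})

# Each value of `series` is matched against the index of `capitals`
mapped = series.map(capitals)
print(mapped)
```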

You can also pass a function to map(), for example:

print(series.map('He is from {}'.format, na_action='ignore'))

Output:

0      He is from Germany
1       He is from Canada
2                     NaN
3        He is from Japan
4    He is from Australia
dtype: object

Without na_action='ignore', the NaN at index 2 would be passed to the function, and that row would read "He is from nan".
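
To see this difference directly, the sketch below runs the same mapping with and without na_action='ignore'. The variable names with_ignore and without_ignore are only illustrative, and dtype=object is set explicitly as an assumption so the result does not depend on the pandas version.

```python
import numpy as np
import pandas as pd

series = pd.Series(['Germany', 'Canada', np.nan, 'Japan', 'Australia'], dtype=object)

# NaN is skipped by map() and stays NaN in the result
with_ignore = series.map('He is from {}'.format, na_action='ignore')

# NaN is passed to the function and formatted as the string 'nan'
without_ignore = series.map('He is from {}'.format)

print(with_ignore[2])
print(without_ignore[2])
```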

reduce()

reduce() applies a two-argument function cumulatively to the elements of a Series (or any other iterable), reducing it to a single value. It is not a Pandas method; reduce() is defined in the functools module of Python.

The algorithm works as follows: first, the function is called with the first two elements of the Series. The function is then applied to that result and the next element in the Series. This repeats until no elements remain in the sequence, and the final accumulated result is returned.
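
The steps above can be sketched as a plain loop. This is only an illustration of the idea, not the actual functools implementation; manual_reduce is a hypothetical helper name, and it assumes the input contains at least one element.

```python
import functools

def manual_reduce(func, items):
    """Illustrative re-implementation of reduce() without an initial value."""
    iterator = iter(items)
    result = next(iterator)      # start with the first element
    for item in iterator:        # combine the running result with each next element
        result = func(result, item)
    return result

values = [11, 6, 7, 3, 28, 1]

# ((((11 + 6) + 7) + 3) + 28) + 1
total = manual_reduce(lambda x, y: x + y, values)
print(total)  # 56
```

The running result plays the role of x in the lambda, and the next element plays the role of y.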

For example, consider the following series:

import pandas as pd

data = [11,6,7,3,28,1]

series = pd.Series(data)
print(series)

The series is:

0    11
1     6
2     7
3     3
4    28
5     1
dtype: int64

Now, let's use reduce() to find the product of all elements in this Series:

# import functools module
import functools

# using reduce operation to apply function on the series
product = functools.reduce(lambda x, y: x * y, series)
print("Product: ", product, sep="")

Output:

Product: 38808

Here is another example, which uses reduce() to find the minimum element of the Series:

# import functools module
import functools

# using reduce operation to apply function on the series
minimum = functools.reduce(lambda x, y: x if x < y else y, series)
print("Minimum value: ", minimum, sep="")

Output:

Minimum value: 1
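
For common reductions such as these, Pandas already provides built-in Series methods, which are usually the more idiomatic choice. Here is a short sketch comparing Series.prod() and Series.min() with the reduce() versions above:

```python
import functools
import pandas as pd

series = pd.Series([11, 6, 7, 3, 28, 1])

# reduce() versions from the examples above
product = functools.reduce(lambda x, y: x * y, series)
minimum = functools.reduce(lambda x, y: x if x < y else y, series)

# Built-in Series methods give the same results
print(series.prod() == product)
print(series.min() == minimum)
```

reduce() remains useful when the combining function has no built-in counterpart.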

Summary

In this article, we looked at map() and reduce() functions. In the next one, we will look at ways to handle missing values in Pandas.

Pallavi Pandey
