In this article, we will focus on the map() and reduce() operations in Pandas and how they are used for Data Manipulation.
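The pandas map() operation substitutes each value of a Series according to a given input, which can be another Series, a dictionary, or a function. map() as described here is a Series method; element-wise mapping over a whole DataFrame uses a separate method (DataFrame.applymap(), renamed to DataFrame.map() in pandas 2.1).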
Syntax:
    Series.map(arg, na_action=None)
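The parameters are:
arg: the mapping to apply, given as a function, a dictionary, or a Series.
na_action: either None (the default) or 'ignore'. With 'ignore', missing values are propagated as NaN without being passed to the mapping.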
Let us look at a few examples of the map() operation on the following Series:

    import numpy as np
    import pandas as pd

    country = ['Germany', 'Canada', np.nan, 'Japan', 'Australia']
    series = pd.Series(country)
    print(series)
This gives the following Series:
    0      Germany
    1       Canada
    2          NaN
    3        Japan
    4    Australia
    dtype: object
Now apply map() to this Series, using a dictionary as the argument:

    series.map({'Canada': 'Ottawa', 'Japan': 'Tokyo', 'Australia': 'Canberra'})
Output:
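    0         NaN
    1      Ottawa
    2         NaN
    3       Tokyo
    4    Canberra
    dtype: object

Germany has no key in the dictionary, so it is mapped to NaN, just like the value at index 2 that was already missing.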
You can also pass a function to map(), for example:

    print(series.map('He is from {}'.format, na_action='ignore'))
Output:
    0      He is from Germany
    1       He is from Canada
    2                     NaN
    3        He is from Japan
    4    He is from Australia
    dtype: object
Without na_action='ignore', the missing value would also be passed to the function, and the entry at index 2 would become "He is from nan".
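The third kind of argument mentioned earlier, another Series, is not shown in the examples above. Here is a minimal sketch of it, assuming a hypothetical lookup Series named capitals whose index holds the country names. map() looks up each value in that index, and values without a match become NaN.

```python
import numpy as np
import pandas as pd

countries = pd.Series(["Germany", "Canada", np.nan, "Japan", "Australia"])

# Hypothetical lookup table (not from the article): index = country, value = capital
capitals = pd.Series(
    ["Ottawa", "Tokyo", "Canberra"],
    index=["Canada", "Japan", "Australia"],
)

# Each value in `countries` is looked up in the index of `capitals`
mapped = countries.map(capitals)
print(mapped)
```

The result should match the dictionary example: Germany and the missing value both come back as NaN because neither appears in the index of capitals.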
reduce() is not a pandas method; it is defined in Python's functools module. It applies a two-argument function cumulatively to the elements of an iterable, such as a Series, and reduces them to a single value.
It works as follows: the function is first called with the first two elements of the Series. It is then called again with that result and the next element. This repeats until no elements remain, and the final result is returned.
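The steps described above can be sketched as a plain loop. This is only an illustration of the idea, not the actual implementation in functools; the name manual_reduce is hypothetical, and the sketch covers only the case without an initial value.

```python
import functools


def manual_reduce(function, sequence):
    """Illustrative stand-in for functools.reduce() without an initial value."""
    iterator = iter(sequence)
    try:
        # Start with the first element as the running result
        result = next(iterator)
    except StopIteration:
        raise TypeError("manual_reduce() of empty sequence") from None
    # Combine the running result with each remaining element in turn
    for item in iterator:
        result = function(result, item)
    return result


values = [11, 6, 7, 3, 28, 1]
total = manual_reduce(lambda x, y: x + y, values)
print(total)  # 56

# The sketch agrees with functools.reduce() on the same input
assert total == functools.reduce(lambda x, y: x + y, values)
```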
For example, consider the following series:
    import pandas as pd

    data = [11, 6, 7, 3, 28, 1]
    series = pd.Series(data)
    print(series)
The series is:
    0    11
    1     6
    2     7
    3     3
    4    28
    5     1
    dtype: int64
Now, let's use reduce() to find the product of all elements in the Series:

    # import the functools module
    import functools

    # use reduce() to multiply the running result by each element in turn
    product = functools.reduce(lambda x, y: x * y, series)
    print("Product: ", product, sep="")
Output:
    Product: 38808
Here is another example, which uses reduce() to find the minimum element of the Series:

    # import the functools module
    import functools

    # use reduce() to keep the smaller of the running result and each element
    minimum = functools.reduce(lambda x, y: x if x < y else y, series)
    print("Minimum value: ", minimum, sep="")
Output:
    Minimum value: 1
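For these two particular reductions, pandas also provides built-in Series methods, prod() and min(), which should give the same results and are usually more convenient than writing a lambda. A short check on the same data:

```python
import functools

import pandas as pd

series = pd.Series([11, 6, 7, 3, 28, 1])

# The reduce()-based versions from the examples above
product = functools.reduce(lambda x, y: x * y, series)
minimum = functools.reduce(lambda x, y: x if x < y else y, series)

# The built-in Series reductions give the same values
print(series.prod() == product)  # True
print(series.min() == minimum)   # True
```

reduce() remains useful when the combining function has no built-in equivalent.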
In this article, we looked at the map() and reduce() functions. In the next one, we will look at ways to handle missing values in Pandas.