Categories: pandas, Python

apply() in Pandas

apply() in Pandas applies a function (for example, a lambda function) to a DataFrame or Series. It is useful in machine learning and data analysis projects where we need to transform data or separate it based on certain conditions.

DataFrame – apply()

Using Pandas apply(), we can apply a function along an axis of a DataFrame. With axis=0 (the default), the function is called once per column and receives that column as a Series; with axis=1, it is called once per row and receives that row as a Series.

The syntax is:

DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwds)

The parameters are:

func : the function to be applied to each column or row
axis : axis along which the function is applied; 0 or 'index' applies the function to each column; 1 or 'columns' applies it to each row (default: 0)
raw : (bool) if False, each row or column is passed to the function as a Series; if True, it is passed as a NumPy ndarray (default: False)
result_type : ('expand', 'reduce', 'broadcast', or None) only takes effect when axis=1; 'expand' turns list-like results into columns; 'reduce' returns a Series where possible instead of expanding list-like results; 'broadcast' broadcasts results to the original shape of the DataFrame, retaining the original index and columns (default: None)
args : (tuple) positional arguments to pass to the function in addition to the row or column
**kwds : additional keyword arguments to pass to the function
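As a minimal sketch of how args and the extra keyword arguments reach the function, the example below uses a hypothetical helper, scale_and_shift, which is not part of Pandas. Its parameter names factor and offset are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame([[5, 14], [8, 12], [2, 13]],
                  columns=['col1', 'col2'],
                  index=['row1', 'row2', 'row3'])

# Hypothetical helper (not a Pandas function): scale a column, then shift it.
def scale_and_shift(col, factor, offset=0):
    return col * factor + offset

# The column is always the first argument. args supplies the remaining
# positional arguments, and unrecognised keywords are forwarded to the function.
scaled = df.apply(scale_and_shift, args=(2,), offset=-3)
print(scaled)
```

Here every column is doubled and then reduced by 3, so the call behaves like computing 2 * df - 3 directly.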

Let's look at some examples of using apply() on a DataFrame.

Consider the following DataFrame:

import numpy as np
import pandas as pd

data = [[5,14],[8,12],[2,13]]

df = pd.DataFrame(data, columns=['col1', 'col2'], index=['row1', 'row2', 'row3'])
print(df)

Our DataFrame looks like:

      col1  col2
row1     5    14
row2     8    12
row3     2    13

Let us apply a custom function to the DataFrame values. The function is:

# user-defined function which takes an argument n
def myfunc(n):
    return (2 * n) - 3

Now apply this function to the DataFrame df:

df.apply(myfunc)

The function is called once per column. Because the arithmetic in myfunc is vectorized, the result is the same as transforming every value of the DataFrame:

      col1  col2
row1     7    25
row2    13    21
row3     1    23
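To see what apply() actually hands to the function, here is a small sketch that records the type of each argument and compares the result with evaluating the same expression on the whole DataFrame. The variable names received, via_apply and direct are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[5, 14], [8, 12], [2, 13]],
                  columns=['col1', 'col2'],
                  index=['row1', 'row2', 'row3'])

def myfunc(n):
    return (2 * n) - 3

# With the default axis=0, the function is called once per column and
# receives that column as a Series, not one scalar at a time.
received = df.apply(lambda col: type(col).__name__)
print(received)

# Since myfunc only uses vectorized arithmetic, applying it column by column
# gives the same DataFrame as evaluating the expression on df directly.
via_apply = df.apply(myfunc)
direct = (2 * df) - 3
print(via_apply.equals(direct))
```

For simple arithmetic like this, the direct expression is usually the more idiomatic choice; apply() is more useful when the function cannot be written as a single vectorized expression.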

Now let’s apply some built-in NumPy functions:

df.apply(np.sqrt)

Output:

          col1      col2
row1  2.236068  3.741657
row2  2.828427  3.464102
row3  1.414214  3.605551

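We can also use a function that reduces each column or each row to a single value. With axis='index' (equivalent to axis=0), the function is applied to each column:

df.apply(np.sum, axis='index')

The result of summing the entries of each column is:

col1    15
col2    39
dtype: int64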
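For comparison, a short sketch of the same reduction in both directions on the DataFrame defined earlier. The names col_totals and row_totals are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[5, 14], [8, 12], [2, 13]],
                  columns=['col1', 'col2'],
                  index=['row1', 'row2', 'row3'])

# axis=0 (or 'index'): np.sum receives each column, giving one total per column.
col_totals = df.apply(np.sum, axis=0)

# axis=1 (or 'columns'): np.sum receives each row, giving one total per row.
row_totals = df.apply(np.sum, axis=1)

print(col_totals)
print(row_totals)
```

The result of axis=0 is indexed by the column labels, while the result of axis=1 is indexed by the row labels.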

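The function can also return list-like values. With axis=1 and result_type='broadcast', the returned list is broadcast to the original shape of the DataFrame, keeping the original index and columns:

df.apply(lambda x: [3, 15], axis=1, result_type='broadcast')

This results in:

      col1  col2
row1     3    15
row2     3    15
row3     3    15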
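How a list-like return value is shaped depends on result_type. The sketch below compares the default behaviour with 'expand' and 'broadcast' for the same lambda; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[5, 14], [8, 12], [2, 13]],
                  columns=['col1', 'col2'],
                  index=['row1', 'row2', 'row3'])

# Default (result_type=None): each returned list is stored as one object,
# so the result is a Series of lists.
as_lists = df.apply(lambda row: [3, 15], axis=1)

# 'expand': each list element becomes its own column, labelled 0, 1, ...
expanded = df.apply(lambda row: [3, 15], axis=1, result_type='expand')

# 'broadcast': the result keeps the original index and column labels.
broadcast = df.apply(lambda row: [3, 15], axis=1, result_type='broadcast')

print(as_lists)
print(expanded)
print(broadcast)
```

Use 'expand' when the function produces new columns, and 'broadcast' when the result should line up with the existing columns.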

Series – apply()

We can also apply a function on the values of a Series using the apply() function. The syntax for apply() on series is:

Series.apply(func, convert_dtype=True, args=(), **kwds)

The parameters are:

func : the Python function to be applied
convert_dtype : (bool) if True, it tries to find a better dtype for elementwise function results (default: True; deprecated as of pandas 2.1)
args : (tuple) positional arguments to pass to the function.
**kwds : additional keyword arguments that can be passed to the function
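A minimal sketch of passing extra arguments through Series.apply(). The helper add_and_scale and its parameters addend and factor are hypothetical names chosen for illustration.

```python
import pandas as pd

series = pd.Series([5, 8, 12, 2], index=['s1', 's2', 's3', 's4'])

# Hypothetical helper (not a Pandas function), called once per value.
def add_and_scale(x, addend, factor=1):
    return (x + addend) * factor

# Each value is passed as x; args fills addend, and factor is forwarded
# as a keyword argument.
result = series.apply(add_and_scale, args=(1,), factor=10)
print(result)
```

Unlike DataFrame.apply(), the function here receives one value at a time rather than a whole column.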

Consider the following Series:

import numpy as np
import pandas as pd

data = [5,8,12,2]
series = pd.Series(data, index=['s1', 's2', 's3', 's4'])

This gives the following Series:

s1     5
s2     8
s3    12
s4     2
dtype: int64

Applying the NumPy square root function to this Series:

series.apply(np.sqrt)

This gives:

s1    2.236068
s2    2.828427
s3    3.464102
s4    1.414214
dtype: float64

Another example:

series.apply(lambda x: x ** 2)

Output:

s1     25
s2     64
s3    144
s4      4
dtype: int64

Similarly, we can apply various types of built-in and user-defined functions to the Series.
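As one illustration of a user-defined function, the sketch below labels each value by comparing it with a threshold and then uses the labels to separate the data. The function size_label and the threshold of 6 are assumptions made for this example.

```python
import pandas as pd

series = pd.Series([5, 8, 12, 2], index=['s1', 's2', 's3', 's4'])

# Hypothetical labelling rule: values above the threshold count as 'large'.
def size_label(x, threshold=6):
    return 'large' if x > threshold else 'small'

labels = series.apply(size_label)
print(labels)

# The labels can then be used as a boolean mask to select matching values.
large = series[labels == 'large']
print(large)
```

Because size_label uses an if/else on a single value, it cannot be applied to the whole Series at once, which makes it a natural fit for apply().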

Summary

In this article, we looked at the apply() function of Pandas. The next article will focus on map() and reduce() operations.

Pallavi Pandey
