Concatenating data in Pandas

Concatenation combines two or more DataFrames into one. The pandas concat() function combines DataFrames along rows or along columns. Consider the following DataFrames:

import pandas as pd

record1 = [['John', 14, 82.5], ['Maria', 12, 90.0], ['Tom', 13, 77.0]]
df1 = pd.DataFrame(record1, columns=['Name', 'Age', 'Marks'], index=[0, 1, 2])
print(df1)

record2 = [['Ben', 12, 65.5], ['Amy', 12, 71.0], ['Tina', 14, 63.5]]
df2 = pd.DataFrame(record2, columns=['Name', 'Age', 'Marks'], index=[3, 4, 5])
print(df2)

record3 = [['Adam', 15, 87.0], ['Carla', 14, 73.0]]
df3 = pd.DataFrame(record3, columns=['Name', 'Age', 'Marks'], index=[6, 7])
print(df3)

The three DataFrames are:

Name Age Marks
0 John 14 82.5
1 Maria 12 90.0
2 Tom 13 77.0

Name Age Marks
3 Ben 12 65.5
4 Amy 12 71.0
5 Tina 14 63.5

Name Age Marks
6 Adam 15 87.0
7 Carla 14 73.0

Now, to concatenate them into one, we use:

df = pd.concat([df1, df2, df3])
print(df)

The concatenated DataFrame is:

Name Age Marks
0 John 14 82.5
1 Maria 12 90.0
2 Tom 13 77.0
3 Ben 12 65.5
4 Amy 12 71.0
5 Tina 14 63.5
6 Adam 15 87.0
7 Carla 14 73.0
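
The index labels above run from 0 to 7 only because each DataFrame was created with its own distinct index. As a minimal sketch of what happens otherwise, the example below uses two small frames with hypothetical data (not the ones above) that both carry the default index, and shows the effect of the ignore_index parameter of concat():

```python
import pandas as pd

# Two small frames that both use the default index 0, 1 (hypothetical data).
left = pd.DataFrame({"Name": ["John", "Maria"], "Age": [14, 12]})
right = pd.DataFrame({"Name": ["Ben", "Amy"], "Age": [12, 12]})

# Default behaviour: the original index labels are kept, so 0 and 1 repeat.
kept = pd.concat([left, right])
print(kept.index.tolist())        # [0, 1, 0, 1]

# ignore_index=True discards the old labels and renumbers the rows from 0.
renumbered = pd.concat([left, right], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

Whether to keep or reset the labels depends on whether the original index carries meaning, such as an ID, or is just a row counter.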

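Using concat(), we can also concatenate along the columns by passing axis=1. In that case the rows are aligned on their index labels. The type of join can also be specified with the join parameter: 'outer' (the default) keeps the union of the index labels, while 'inner' keeps only the labels common to all the DataFrames. For example, with the default join: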

import pandas as pd

record1 = [['S1', 'John', 14], ['S2', 'Maria', 12], ['S3', 'Tom', 13], ['S4', 'Adam', 15]]
df1 = pd.DataFrame(record1, columns=['S_Id', 'Name', 'Age'])
print(df1)

record2 = [['S1', 14, 82.5], ['S2', 13, 90.0], ['S3', 14, 77.0], ['S4', 15, 87.0]]
df2 = pd.DataFrame(record2, columns=['S_Id', 'Age', 'Marks'])
print(df2)

df = pd.concat([df1, df2], axis=1)
print(df)

The two DataFrames are:

S_Id Name Age
0 S1 John 14
1 S2 Maria 12
2 S3 Tom 13
3 S4 Adam 15

S_Id Age Marks
0 S1 14 82.5
1 S2 13 90.0
2 S3 14 77.0
3 S4 15 87.0

And the concatenated one is:

S_Id Name Age S_Id Age Marks
0 S1 John 14 S1 14 82.5
1 S2 Maria 12 S2 13 90.0
2 S3 Tom 13 S3 14 77.0
3 S4 Adam 15 S4 15 87.0
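
In the example above both DataFrames share the same index (0 to 3), so the choice of join makes no visible difference. The sketch below, which uses hypothetical frames indexed by student ID rather than the ones above, is one way to see how join='outer' and join='inner' differ when the index labels only partly overlap:

```python
import pandas as pd

# Hypothetical frames whose index labels only partly overlap.
names = pd.DataFrame({"Name": ["John", "Maria", "Tom"]}, index=["S1", "S2", "S3"])
marks = pd.DataFrame({"Marks": [90.0, 77.0, 87.0]}, index=["S2", "S3", "S4"])

# join="outer" (the default) keeps every label; missing cells become NaN.
outer = pd.concat([names, marks], axis=1)
print(outer)

# join="inner" keeps only the labels present in both frames (S2 and S3).
inner = pd.concat([names, marks], axis=1, join="inner")
print(inner)
```

Note that concat() aligns on the index only. It does not match rows on a column such as S_Id, which is why the concatenated output above contains the S_Id and Age columns twice.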

Summary

This article focused on concatenating data in pandas with concat(), both along rows and along columns. The next article will focus on crosstab, pivot tables and melt() function operations in pandas.

Pallavi Pandey
