Concatenating data in Pandas
Concatenation combines two or more DataFrames into one. The pandas concat() function combines DataFrames along rows or columns. Consider the following DataFrames:
    import pandas as pd

    record1 = [['John', 14, 82.5], ['Maria', 12, 90.0], ['Tom', 13, 77.0]]
    df1 = pd.DataFrame(record1, columns=['Name', 'Age', 'Marks'], index=[0, 1, 2])
    print(df1)

    record2 = [['Ben', 12, 65.5], ['Amy', 12, 71.0], ['Tina', 14, 63.5]]
    df2 = pd.DataFrame(record2, columns=['Name', 'Age', 'Marks'], index=[3, 4, 5])
    print(df2)

    record3 = [['Adam', 15, 87.0], ['Carla', 14, 73.0]]
    df3 = pd.DataFrame(record3, columns=['Name', 'Age', 'Marks'], index=[6, 7])
    print(df3)
The three DataFrames are:
        Name  Age  Marks
    0   John   14   82.5
    1  Maria   12   90.0
    2    Tom   13   77.0

       Name  Age  Marks
    3   Ben   12   65.5
    4   Amy   12   71.0
    5  Tina   14   63.5

        Name  Age  Marks
    6   Adam   15   87.0
    7  Carla   14   73.0
Now, to concatenate them into one, we use:
    df = pd.concat([df1, df2, df3])
    print(df)
        Name  Age  Marks
    0   John   14   82.5
    1  Maria   12   90.0
    2    Tom   13   77.0
    3    Ben   12   65.5
    4    Amy   12   71.0
    5   Tina   14   63.5
    6   Adam   15   87.0
    7  Carla   14   73.0
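The three DataFrames above were given non-overlapping index labels (0 to 7) explicitly, so the result has a clean index. As a minimal sketch of what happens otherwise, assume two hypothetical frames, a and b, that both use the default index. concat() keeps each frame's own labels unless ignore_index=True is passed:

```python
import pandas as pd

# Hypothetical frames for illustration; both use the default index 0, 1
a = pd.DataFrame([["John", 14, 82.5], ["Maria", 12, 90.0]],
                 columns=["Name", "Age", "Marks"])
b = pd.DataFrame([["Ben", 12, 65.5], ["Amy", 12, 71.0]],
                 columns=["Name", "Age", "Marks"])

# By default concat() keeps each frame's own row labels, so labels repeat
kept = pd.concat([a, b])
print(list(kept.index))        # [0, 1, 0, 1]

# ignore_index=True discards the old labels and renumbers the rows from 0
renumbered = pd.concat([a, b], ignore_index=True)
print(list(renumbered.index))  # [0, 1, 2, 3]
```

Renumbering is usually the safer choice when the original row labels carry no meaning, since repeated labels can make label-based selection return more than one row.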
concat() can also concatenate along the columns: pass axis=1 instead of the default axis=0. The type of join can also be specified with the join parameter ('outer' by default, or 'inner'). For example, with axis=1:
    import pandas as pd

    record1 = [['S1', 'John', 14], ['S2', 'Maria', 12], ['S3', 'Tom', 13], ['S4', 'Adam', 15]]
    df1 = pd.DataFrame(record1, columns=['S_Id', 'Name', 'Age'])
    print(df1)

    record2 = [['S1', 14, 82.5], ['S2', 13, 90.0], ['S3', 14, 77.0], ['S4', 15, 87.0]]
    df2 = pd.DataFrame(record2, columns=['S_Id', 'Age', 'Marks'])
    print(df2)

    df = pd.concat([df1, df2], axis=1)
    print(df)
The two DataFrames are:
      S_Id   Name  Age
    0   S1   John   14
    1   S2  Maria   12
    2   S3    Tom   13
    3   S4   Adam   15

      S_Id  Age  Marks
    0   S1   14   82.5
    1   S2   13   90.0
    2   S3   14   77.0
    3   S4   15   87.0
And the concatenated one is:
      S_Id   Name  Age S_Id  Age  Marks
    0   S1   John   14   S1   14   82.5
    1   S2  Maria   12   S2   13   90.0
    2   S3    Tom   13   S3   14   77.0
    3   S4   Adam   15   S4   15   87.0
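In the example above both DataFrames share the same row labels (0 to 3), so the join type makes no difference. The following is a minimal sketch of how the join parameter behaves when the labels only partly overlap. The frames left and right are hypothetical and are not part of the example above:

```python
import pandas as pd

# Hypothetical frames whose row labels only partly overlap (1 and 2 are shared)
left = pd.DataFrame({"Name": ["John", "Maria", "Tom"]}, index=[0, 1, 2])
right = pd.DataFrame({"Marks": [90.0, 77.0, 87.0]}, index=[1, 2, 3])

# join="outer" (the default) keeps every row label and fills the gaps with NaN
outer = pd.concat([left, right], axis=1, join="outer")
print(outer)

# join="inner" keeps only the row labels present in both frames
inner = pd.concat([left, right], axis=1, join="inner")
print(inner)
```

Note that with axis=1, concat() aligns rows on the index labels, not on a column such as S_Id.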
Summary
This article covered concatenating data in pandas. The next article will cover crosstab, pivot tables and the melt() function in pandas.