How to Concatenate DataFrames In Pandas?


Concatenating DataFrames in Pandas can be done using the concat() function. It allows you to combine DataFrames either vertically (along the rows) or horizontally (along the columns).


To concatenate DataFrames vertically, set the axis parameter to 0 (the default). The rows of the second DataFrame are placed below those of the first, and columns are matched by name, so both DataFrames should have the same column names. Here's an example:

import pandas as pd

# Creating two DataFrames
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [7, 8, 9], 'B': [10, 11, 12]})

# Concatenating vertically
result = pd.concat([df1, df2], axis=0)
print(result)


Output:

   A   B
0  1   4
1  2   5
2  3   6
0  7  10
1  8  11
2  9  12


To concatenate DataFrames horizontally, set the axis parameter to 1. Rows are matched by index label, so the DataFrames should share the same index. Here's an example:

import pandas as pd

# Creating two DataFrames
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'C': [7, 8, 9], 'D': [10, 11, 12]})

# Concatenating horizontally
result = pd.concat([df1, df2], axis=1)
print(result)


Output:

   A  B  C   D
0  1  4  7  10
1  2  5  8  11
2  3  6  9  12


Note that when concatenating horizontally, concat() does not rename or merge overlapping column names. If both DataFrames have a column with the same name, the result keeps both columns, and that column label appears twice.
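As a minimal sketch of what duplicated column labels look like in practice, the example below concatenates two small frames that both contain a column named A. The names left, right, dup and labelled are illustrative and do not come from the examples above. Passing keys is one way, among others, to keep the two sources apart.

```python
import pandas as pd

left = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
right = pd.DataFrame({'A': [5, 6], 'C': [7, 8]})

# Both 'A' columns are kept, so the label 'A' appears twice
dup = pd.concat([left, right], axis=1)
print(list(dup.columns))                 # ['A', 'B', 'A', 'C']

# Selecting 'A' now returns a two-column DataFrame, not a Series
print(dup['A'].shape)                    # (2, 2)

# keys adds an outer column level that records the source frame
labelled = pd.concat([left, right], axis=1, keys=['left', 'right'])
print(labelled['right']['A'].tolist())   # [5, 6]
```

If the overlapping columns are meant to be matched on their values rather than placed side by side, merge() is likely the better fit.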


How to concatenate two DataFrames in Pandas?

To concatenate two DataFrames in Pandas, you can use the concat function.


Here is an example of concatenating two DataFrames vertically (i.e., stacking one DataFrame on top of another):

import pandas as pd

# Create two DataFrames
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [7, 8, 9], 'B': [10, 11, 12]})

# Concatenate the two DataFrames vertically
concatenated = pd.concat([df1, df2], axis=0)

print(concatenated)


Output:

   A   B
0  1   4
1  2   5
2  3   6
0  7  10
1  8  11
2  9  12


If you want to concatenate the DataFrames horizontally (i.e., side by side), set the axis parameter to 1:

# Concatenate the two DataFrames horizontally
concatenated = pd.concat([df1, df2], axis=1)

print(concatenated)


Output:

   A  B  A   B
0  1  4  7  10
1  2  5  8  11
2  3  6  9  12


Note that the indexes from the original DataFrames are preserved in the concatenated DataFrame. You can reset the index using the reset_index method if desired.
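A short sketch of the reset_index step mentioned above, using the same df1 and df2 stacked vertically. The variable names stacked and renumbered are chosen here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [7, 8, 9], 'B': [10, 11, 12]})

# Vertical concatenation keeps each frame's own row labels
stacked = pd.concat([df1, df2], axis=0)
print(stacked.index.tolist())      # [0, 1, 2, 0, 1, 2]

# drop=True discards the old labels instead of adding them as a column
renumbered = stacked.reset_index(drop=True)
print(renumbered.index.tolist())   # [0, 1, 2, 3, 4, 5]
```

Without drop=True, reset_index would keep the old labels in a new column named index.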


What is the impact of missing data on DataFrame concatenation in Pandas?

Missing data affects DataFrame concatenation in Pandas in several ways:

  1. Missing values are carried over: A missing value in one of the input DataFrames appears in the same row and column of the result. concat() does not drop rows or columns that contain missing values, so the size of the result is not reduced.
  2. Alignment by label: concat() matches columns by name when stacking rows (axis=0) and matches rows by index label when joining columns (axis=1). It does not match by position. Where a label exists in only one input, the result is filled with NaN for the other inputs.
  3. Handling of missing values: concat() has no option to fill or skip missing values. You can pass join='inner' to keep only the labels shared by all inputs, or call fillna() or dropna() on the result.
  4. Different column names: If the DataFrames being concatenated have different column names, the result contains the union of all columns, with NaN wherever an input lacked a column. An integer column that gains NaN this way is converted to float. Renaming columns before concatenating avoids this when the columns hold the same kind of data.


Overall, missing data in the DataFrames being concatenated can introduce inconsistencies and complications in the resulting concatenated DataFrame, which require appropriate handling and analysis.
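The following sketch shows where NaN values come from when the column names differ, and two common ways to deal with them. The frames and the fill value 0 are arbitrary choices for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'C': [7, 8]})

# Default join='outer': union of columns, gaps filled with NaN
outer = pd.concat([df1, df2], ignore_index=True)
print(list(outer.columns))              # ['A', 'B', 'C']
print(int(outer.isna().sum().sum()))    # 4

# join='inner': keep only the columns present in every frame
inner = pd.concat([df1, df2], join='inner', ignore_index=True)
print(list(inner.columns))              # ['A']

# Fill the gaps after concatenating; 0 is an arbitrary placeholder
filled = outer.fillna(0)
print(int(filled.isna().sum().sum()))   # 0
```

Whether to use join='inner', fillna() or dropna() depends on whether the missing columns carry information you need to keep.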


How to concatenate DataFrames while dropping the original index in Pandas?

To concatenate DataFrames while dropping the original index in Pandas, you can use the ignore_index parameter of the pd.concat() function. This parameter is set to False by default, which preserves the original index values. By setting it to True, the resulting concatenated DataFrame will have a new index that ignores the original index values.


Here's an example:

import pandas as pd

# Create two DataFrames
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [7, 8, 9], 'B': [10, 11, 12]})

# Concatenate DataFrames while dropping the original index
concatenated_df = pd.concat([df1, df2], ignore_index=True)

print(concatenated_df)


Output:

   A   B
0  1   4
1  2   5
2  3   6
3  7  10
4  8  11
5  9  12


As you can see, the resulting concatenated_df DataFrame has a new index that starts from 0 and ignores the original index values from df1 and df2.


How to concatenate DataFrames with different indexes in Pandas?

DataFrames with different indexes can be passed to concat() directly. If you do not need the original labels, set the ignore_index parameter to True: the original indexes are discarded and the rows of the result are numbered from 0.


Here is an example:

import pandas as pd

# Create two DataFrames with different indexes
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=[0, 1, 2])
df2 = pd.DataFrame({'A': [7, 8, 9], 'B': [10, 11, 12]}, index=[3, 4, 5])

# Concatenate the DataFrames
result = pd.concat([df1, df2], ignore_index=True)

print(result)


Output:

   A   B
0  1   4
1  2   5
2  3   6
3  7  10
4  8  11
5  9  12


In the resulting DataFrame, the indexes of the original DataFrames are ignored, and a new index is created.
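If the original labels are meaningful, one alternative is to leave ignore_index at its default and let concat() carry the labels over. The sketch below also uses the verify_integrity parameter of concat(), which raises an error when labels repeat. The names kept and overlap_detected are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({'A': [7, 8, 9]}, index=[3, 4, 5])

# Without ignore_index the original labels are carried over unchanged
kept = pd.concat([df1, df2])
print(kept.index.tolist())       # [0, 1, 2, 3, 4, 5]

# verify_integrity=True raises ValueError if any label is repeated;
# concatenating df1 with itself forces that case
try:
    pd.concat([df1, df1], verify_integrity=True)
    overlap_detected = False
except ValueError:
    overlap_detected = True
print(overlap_detected)          # True
```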


How to concatenate DataFrames while preserving the original index in Pandas?

The concat() function preserves the original index by default, because ignore_index defaults to False. You can also pass ignore_index=False explicitly to make the intent clear. Here is an example:

import pandas as pd

# Create two sample DataFrames
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [7, 8, 9], 'B': [10, 11, 12]})

# Concatenate DataFrames while preserving index
concatenated = pd.concat([df1, df2], ignore_index=False)

print(concatenated)


Output:

   A   B
0  1   4
1  2   5
2  3   6
0  7  10
1  8  11
2  9  12


Note that by default, the concat() function concatenates along axis 0 (rows). To concatenate along columns, pass axis=1.
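Preserving the index means that row labels can repeat, which changes how label-based lookups behave. The sketch below shows this with the same two frames, and then uses the keys parameter of concat() as one possible way to keep the original labels while still telling the sources apart. The key names first and second are arbitrary.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [7, 8, 9], 'B': [10, 11, 12]})

concatenated = pd.concat([df1, df2], ignore_index=False)

# Label 0 exists in both inputs, so .loc[0] matches two rows
print(concatenated.loc[0, 'A'].tolist())   # [1, 7]

# keys builds a two-level index: (source key, original row label)
keyed = pd.concat([df1, df2], keys=['first', 'second'])
print(keyed.loc[('second', 0), 'A'])       # 7
```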
