How to Combine Two Columns in Pandas – Definitive Guide

When working with data using Pandas, you may need to combine two columns in Pandas to create another column.

You can combine two columns in Pandas using df[“new column name“] = df[“column 1”] + df["column 2”] statement.

In this tutorial, you’ll learn how to combine or concatenate two or more columns in Pandas dataframe to create another column.

If You’re in Hurry…

You can use the + operator to concatenate two columns in the pandas dataframe.

This is the simplest method to join two columns in Pandas dataframe.

df[“New Column Name“] = df[“Column 1“] + " " + df[“Column 2“]
df

This will concatenate Column 1 and Column 2 and add the value to the new column.

If You Want to Understand Details, Read on…

The dataset might contain a text value in the dataframe, or some cells might have missing values.

Different methods are available to concatenate two or more columns in the dataframe in different scenarios.

Let us learn each one.

Sample Dataframe

For the demonstration, you’ll use the below dataframe. It has four columns.

First NameString Type column

Last NameString Type column

CountryString Type column

Country CodeNumber Type column

You’ll use these columns to concatenate and create a new column.

Code

import pandas as pd 

# List of Tuples
users = [ ('Shivam', 'Pandey', 'India', 1),
             ('Kumar', 'Ram' , 'US', 2 ),
         ('Felix','John' , 'Germany', 3 ),
              ]

#Create a DataFrame object
df = pd.DataFrame(  users, 
                    columns = ['First Name' , 'Last Name', 'Country', 'Country Code']
                    ) 

df

Dataframe Will Look Like

First NameLast NameCountryCountry Code
0ShivamPandeyIndia1
1KumarRamUS2
2FelixJohnGermany3

You’ll use the columns in the dataframe df to learn different concatenation methods.

Using + Operator

You can use the plus operator to concatenate two Strings.

Similarly, you can pass the two dataframe columns with the plus operator to combine two Strings and form a new column.

You can assign the result value to the same dataframe using the new column name. This will add a new column to the dataframe with the combined string value.

Using the plus(+) operator is the simplest method of concatenation.

The following code demonstrates how the First Name Column and the Second Name column are combined to form a column called Full Name.

Also, a space is added as a separator between the two-column values.

Code

df["Full Name"] = df["First Name"] + " " + df["Last Name"]

df

Dataframe Will Look Like

First NameLast NameCountryCountry CodeFull Name
0ShivamPandeyIndia1Shivam Pandey
1KumarRamUS2Kumar Ram
2FelixJohnGermany3Felix John

This is how you can use the plus (+) operator to combine two columns of the Pandas dataframe.

Using Map() Method

You can use the map() function to combine two columns of the Pandas dataframe.

When to Use

You can use this method when you want to handle the Null or missing values in the Pandas Dataframe.

There are two actions available to handle the NaN values.

  • None – Default option. It uses None when a missing value is available in the column.
  • ’ignore’ – Used to ignore the concatenation operation when any values are missing in the dataframe.

The following code uses the map() method to handle the Null values and uses the Plus (+) operator to concatenate the values.

Code

df["Full Name"] = df["First Name"].map(str, na_action=None) + df["Last Name"].map(str, na_action='ignore')

df

Dataframe Will Look Like

First NameLast NameCountryCountry CodeFull Name
0ShivamPandeyIndia1ShivamPandey
1KumarRamUS2KumarRam
2FelixJohnGermany3FelixJohn

Now, you’ll add another row to the dataframe with None values to learn how the map() method handles these values during concatenation.

Code

df.loc[3] = ['Borris', None, 'UK', 4, None]

df

Dataframe Will Look Like

First NameLast NameCountryCountry CodeFull Name
0ShivamPandeyIndia1ShivamPandey
1KumarRamUS2KumarRam
2FelixJohnGermany3FelixJohn
3BorrisNoneUK4None

Now, you’ll use the map() method and (+) operator to combine two string columns of the dataframe.

The na_action = None parameter in the map() method denotes that the Null or missing values must be ignored during concatenation.

Code

df["Full Name"] = df["First Name"].map(str, na_action=None) + df["Last Name"].map(str, na_action='ignore')

df

Dataframe Will Look Like

The output dataframe will show that the rows with the Null or missing values are not combined. Instead, the NaN value is added to the column.

First NameLast NameCountryCountry CodeFull Name
0ShivamPandeyIndia1ShivamPandey
1KumarRamUS2KumarRam
2FelixJohnGermany3FelixJohn
3BorrisNoneUK4NaN

Now, drop the rows that are added to demonstrate the missing values.

Code

df.drop([3], axis=0, inplace=True)

df

Dataframe Will Look Like

First NameLast NameCountryCountry CodeFull Name
0ShivamPandeyIndia1ShivamPandey
1KumarRamUS2KumarRam
2FelixJohnGermany3FelixJohn

This is how you can use the map() method to combine two String columns of the dataframe and handle null values or missing values differently.

Using Cat() method

You can use the cat method to concatenate two strings.

The cat() method concatenates two strings in the series with the given separator.

The default separator is ‘’. So if you do not specify any separator, the Strings will be concatenated without any separator.

When to Use

You can use the cat() method when you want to specify a representation character that needs to be used instead of Null or missing values.

map() vs cat()

The difference between the map() method and the cat() method is that the map method doesn’t allow specifying the replacement character for the missing values. It only allows specifying if the missing values MUST be ignored or used.

Code

df['Full Name'] = df['First Name'].str.cat(df['Last Name'],sep=“,”,  na_rep='----')

df

The missing value is represented with the ---- value and concatenated as shown in the fourth row.

Dataframe Will Look Like

First NameLast NameCountryCountry CodeFull Name
0ShivamPandeyIndia1Shivam,Pandey
1KumarRamUS2Kumar,Ram
2FelixJohnGermany3Felix,John
3BorrisNoneUK4Borris,—-

This is how you can use the cat() method to concatenate two String columns in Pandas Dataframe.

Using Apply() Method

You can use the apply method to combine two columns of the Pandas Dataframe.

The apply() method applies the defined function across the dataframe in the specified axis(Columns or Rows).

To concatenate two columns, you can select the two columns and apply the join() function.

Join method concatenates two strings with a separator. If you do not want to use any separator while combining two strings, you can use ’’.join().

The following code combines the First Name and Last Name columns and assigns the result to the Full Name column.

Code

df['Full Name'] = df[['First Name', 'Last Name']].apply(','.join , axis=1)

df

Dataframe Will Look Like

First NameLast NameCountryCountry CodeFull Name
0ShivamPandeyIndia1Shivam,Pandey
1KumarRamUS2Kumar,Ram
2FelixJohnGermany3Felix,John

This is how you can use the apply() method and concatenate two string columns of the Pandas Dataframe.

Using agg() Method

You can also use the agg() method to combine two strings columns of the Pandas Dataframe.

This method is used to aggregate the columns or rows. There are No specific advantages of using this method over the other methods to combine two string columns.

Code

#concat two columns with space

df['Full Name'] = df[['First Name', 'Last Name']].agg(' '.join, axis=1)

df

Dataframe Will Look Like

First NameLast NameCountryCountry CodeFull Name
0ShivamPandeyIndia1Shivam Pandey
1KumarRamUS2Kumar Ram
2FelixJohnGermany3Felix John

These are the different methods to combine two string columns of the Pandas dataframe and the appropriate use-cases.

Now let us see how these are applied in different scenarios.

Convert to Text and Combine Two Columns

In this section, you’ll learn how to convert a number column to a String before concatenating it with the String column.

The sample dataframe contains a number type column called Country Code. This column cannot be concatenated directly with another String column. Hence, you need to convert this number column to a string value before concatenating.

You can use the astype(str) to convert a number to a string.

The following code demonstrates how to convert a column to text and combine two columns. The country and the country code columns are combined with the separator ‘-‘.

Code

df['Country & Code'] = df['Country'] +'-' + df['Country Code'].astype(str)

df

Dataframe Will Look Like

First NameLast NameCountryCountry CodeFull NameCountry & Code
0ShivamPandeyIndia1Shivam PandeyIndia-1
1KumarRamUS2Kumar RamUS-2
2FelixJohnGermany3Felix JohnGermany-3

Concatenate Two Columns With a Separator

You can concatenate two columns of Pandas data frame with a separator using the cat() method below.

Code

df['Full Name'] = df['First Name'].str.cat(df['Last Name'],sep=",")

df

Dataframe Will Look Like

First NameLast NameCountryCountry CodeFull NameCountry & Code
0ShivamPandeyIndia1Shivam,PandeyIndia-1
1KumarRamUS2Kumar,RamUS-2
2FelixJohnGermany3Felix,JohnGermany-3
3BorrisNoneUK4NaNNone

Concatenate Two Columns If Not Null

This section will teach you how to concatenate two columns only if the columns are not null.

You can use the map() function because it supports defining the action for the Null or missing values using the na_action parameter.

The following code shows how to combine two columns of the Pandas dataframe only if the values are not null.

Code

df.loc[3] = ['Borris', None, 'UK', 4, None, None]

df["Full Name"] = df["First Name"].map(str, na_action= 'ignore') + df["Last Name"].map(str, na_action='ignore')

df

Since the fourth row of the dataframe has Null values, that specific row is not combined. It has the value of NaN as shown in the output.

Dataframe Will Look Like

First NameLast NameCountryCountry CodeFull NameCountry & Code
0ShivamPandeyIndia1ShivamPandeyIndia-1
1KumarRamUS2KumarRamUS-2
2FelixJohnGermany3FelixJohnGermany-3
3BorrisNoneUK4NaNNone

Now, drop the fourth row that contains the missing values.

df.drop([3], axis=0, inplace=True)

df

Dataframe Will Look Like

First NameLast NameCountryCountry CodeFull NameCountry & Code
0ShivamPandeyIndia1ShivamPandeyIndia-1
1KumarRamUS2KumarRamUS-2
2FelixJohnGermany3FelixJohnGermany-3

Combine More Than Two Columns

In this section, you’ll learn how to combine more than two columns of the Pandas Dataframe.

You can use the agg() method to combine more than two columns using the join() method.

Code

#concat two columns with space

df['Full Name'] = df[['First Name', 'Last Name', 'Country']].agg(' '.join, axis=1)

df

Dataframe Will Look Like

First NameLast NameCountryCountry CodeFull NameCountry & Code
0ShivamPandeyIndia1Shivam Pandey IndiaIndia-1
1KumarRamUS2Kumar Ram USUS-2
2FelixJohnGermany3Felix John GermanyGermany-3

Conclusion

To summarise, you’ve learned how to combine two columns in Pandas Dataframe.

Among the different methods available such as cat(), map() and the plus operator, you’ve learned when it’s appropriate to use each technique.

If you have any questions, please comment below.

You May Also Like

Leave a Comment