How to Combine Two Columns in Pandas – Definitive Guide

When working with data in Pandas, you may need to combine two columns to create a new column.

You can combine two columns in Pandas using the statement df["new column name"] = df["column 1"] + df["column 2"].

Basic Example

df["New Column Name"] = df["Column 1"] + " " + df["Column 2"]
df
  • This concatenates Column 1 and Column 2, separated by a space, and stores the result in the new column.

In this tutorial, you'll learn how to combine or concatenate two or more columns in a Pandas dataframe to create another column.

Sample Dataframe

For the demonstration, you'll use the dataframe below. It has four columns.

  • First Name: string type column
  • Last Name: string type column
  • Country: string type column
  • Country Code: number type column

You'll concatenate these columns to create new columns.

Code

import pandas as pd

# List of tuples, one per user
users = [
    ('Shivam', 'Pandey', 'India', 1),
    ('Kumar', 'Ram', 'US', 2),
    ('Felix', 'John', 'Germany', 3),
]

# Create a DataFrame object
df = pd.DataFrame(
    users,
    columns=['First Name', 'Last Name', 'Country', 'Country Code'],
)

df

Dataframe Will Look Like

   First Name  Last Name  Country  Country Code
0      Shivam     Pandey    India             1
1       Kumar        Ram       US             2
2       Felix       John  Germany             3

You’ll use the columns in the dataframe df to learn different concatenation methods.

Using + Operator

The plus operator concatenates two strings. This is the simplest method of concatenation.

Using the plus operator, you can:

  • Combine two columns to form a new column
  • Assign the result to the same dataframe using the new column name
  • Add a new column containing the combined string value to the dataframe

Code

The following code demonstrates how the First Name column and the Last Name column are combined to form a column called Full Name.

df["Full Name"] = df["First Name"] + " " + df["Last Name"]

df

Dataframe Will Look Like

   First Name  Last Name  Country  Country Code      Full Name
0      Shivam     Pandey    India             1  Shivam Pandey
1       Kumar        Ram       US             2      Kumar Ram
2       Felix       John  Germany             3     Felix John

This is how you can use the plus (+) operator to combine two columns of the Pandas dataframe.
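One thing to keep in mind with the plus operator is how it treats missing values. The sketch below uses two small, made-up Series rather than the tutorial dataframe, and it assumes that replacing a missing value with an empty string is acceptable for your data.

```python
import pandas as pd

# Hypothetical sample data; the second last name is missing
first = pd.Series(["Shivam", "Borris"], dtype=object)
last = pd.Series(["Pandey", None], dtype=object)

# A missing value on either side makes the combined value missing too
combined = first + " " + last

# One possible workaround: fill missing values with "" before adding,
# then strip the trailing space that is left behind
filled = (first + " " + last.fillna("")).str.strip()

print(combined.isna().tolist())
print(filled.tolist())
```

Whether an empty string is the right filler depends on the column. The later sections cover other ways to handle missing values.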

Using map() Method

The map() method maps each value of a Series through a given function.

  • Pass a function (here, str) to convert the values before combining the two columns

Use this method when you want to control how Null or missing values in the Pandas dataframe are handled.

The na_action parameter accepts two values.

  • None: Default option. Missing values are passed to the function like any other value, so str converts them to text such as 'None' or 'nan'.
  • 'ignore': Missing values are not passed to the function and stay missing, so the concatenated result for that row is NaN.

Code

The following code uses the map() method to handle the Null values and uses the Plus (+) operator to concatenate the values.

df["Full Name"] = df["First Name"].map(str, na_action=None) + df["Last Name"].map(str, na_action='ignore')

df

Dataframe Will Look Like

   First Name  Last Name  Country  Country Code     Full Name
0      Shivam     Pandey    India             1  ShivamPandey
1       Kumar        Ram       US             2      KumarRam
2       Felix       John  Germany             3     FelixJohn

Add another row to the dataframe with None values to learn how the map() method handles these values during concatenation.

Code

df.loc[3] = ['Borris', None, 'UK', 4, None]

df

Dataframe Will Look Like

   First Name  Last Name  Country  Country Code     Full Name
0      Shivam     Pandey    India             1  ShivamPandey
1       Kumar        Ram       US             2      KumarRam
2       Felix       John  Germany             3     FelixJohn
3      Borris       None       UK             4          None

Now, use the map() method and (+) operator to combine two string columns of the dataframe.

The na_action='ignore' parameter in the map() method denotes that Null or missing values must be skipped: they are not passed to the function and stay missing, so the concatenated value for that row is NaN.

Code

The following code demonstrates how to ignore missing values during concatenation.

df["Full Name"] = df["First Name"].map(str, na_action=None) + df["Last Name"].map(str, na_action='ignore')

df

Dataframe Will Look Like

The output dataframe will show that the rows with the Null or missing values are not combined. Instead, the NaN value is added to the column.

   First Name  Last Name  Country  Country Code     Full Name
0      Shivam     Pandey    India             1  ShivamPandey
1       Kumar        Ram       US             2      KumarRam
2       Felix       John  Germany             3     FelixJohn
3      Borris       None       UK             4           NaN

Now, drop the row that was added to demonstrate the missing values.

df.drop([3], axis=0, inplace=True)

df

Dataframe Will Look Like

   First Name  Last Name  Country  Country Code     Full Name
0      Shivam     Pandey    India             1  ShivamPandey
1       Kumar        Ram       US             2      KumarRam
2       Felix       John  Germany             3     FelixJohn

This is how you can use the map() method to combine two String columns of the dataframe and handle null values or missing values differently.
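To see the two na_action options side by side, here is a minimal sketch on a made-up column with one missing value. The variable names are only for illustration.

```python
import pandas as pd

# Hypothetical column with one missing value
last = pd.Series(["Pandey", None], dtype=object)

# na_action=None (the default): str() also receives the missing value
as_text = last.map(str, na_action=None)

# na_action='ignore': the missing value is skipped and stays missing
skipped = last.map(str, na_action="ignore")

print(as_text.tolist())
print(skipped.isna().tolist())
```

With the default, the missing value becomes ordinary text and ends up inside the concatenated string. With 'ignore', it stays missing, so the combined value is missing as well.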

Using cat() Method

You can use the str.cat() method to concatenate two string columns.

  • The cat() method concatenates the strings in a series with those of another series, using the given separator.
  • The default separator is an empty string (''). So if you do not specify a separator, the strings are concatenated without one.

Use the cat() method when you want to specify a replacement string to use in place of Null or missing values.

map() vs cat()

The difference between the map() method and the cat() method is that the map() method doesn't allow specifying a replacement string for the missing values. It only lets you choose whether missing values are passed to the function or skipped.

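Code

The following code assumes the row with the missing Last Name (index 3) is present in the dataframe. If you dropped it, add it again with df.loc[3] = ['Borris', None, 'UK', 4, None].

df['Full Name'] = df['First Name'].str.cat(df['Last Name'], sep=",", na_rep='----')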

df

Dataframe Will Look Like

The missing value is represented with the ---- value and concatenated as shown in the fourth row.

   First Name  Last Name  Country  Country Code      Full Name
0      Shivam     Pandey    India             1  Shivam,Pandey
1       Kumar        Ram       US             2      Kumar,Ram
2       Felix       John  Germany             3     Felix,John
3      Borris       None       UK             4    Borris,----

This is how you can use the cat() method to concatenate two String columns in Pandas Dataframe.
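The effect of na_rep is easiest to see by running cat() with and without it. The sketch below uses two small, made-up Series instead of the tutorial dataframe.

```python
import pandas as pd

# Hypothetical sample data; the second last name is missing
first = pd.Series(["Shivam", "Borris"], dtype=object)
last = pd.Series(["Pandey", None], dtype=object)

# Without na_rep, a missing value on either side gives a missing result
plain = first.str.cat(last, sep=",")

# With na_rep, the missing value is replaced before joining
replaced = first.str.cat(last, sep=",", na_rep="----")

print(plain.isna().tolist())
print(replaced.tolist())
```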

Using apply() Method

You can use the apply() method to combine two columns of the Pandas dataframe.

The apply() method applies the given function along the specified axis of the dataframe (columns or rows).

  • To concatenate two columns, you can select the two columns and apply the join() function with axis=1.
  • The join() method concatenates strings with a separator. If you do not want any separator, you can use ''.join().
  • The join() method accepts only strings, so it raises a TypeError when a row contains a missing value. Drop or fill such rows first.
Code

The following code combines the First Name and Last Name columns and assigns the result to the Full Name column.

df['Full Name'] = df[['First Name', 'Last Name']].apply(','.join , axis=1)

df

Dataframe Will Look Like

   First Name  Last Name  Country  Country Code      Full Name
0      Shivam     Pandey    India             1  Shivam,Pandey
1       Kumar        Ram       US             2      Kumar,Ram
2       Felix       John  Germany             3     Felix,John

This is how you can use the apply() method and concatenate two string columns of the Pandas Dataframe.
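Because apply() accepts any function, you can also pass your own function instead of join() directly. The sketch below is one possible approach, not the only one. It uses a hypothetical helper named join_present that skips missing values in each row, and a small made-up dataframe.

```python
import pandas as pd

# Hypothetical sample data; the second last name is missing
df = pd.DataFrame(
    {"First Name": ["Shivam", "Borris"], "Last Name": ["Pandey", None]}
)


def join_present(row, sep=" "):
    """Join only the non-missing values of a row (hypothetical helper)."""
    return sep.join(row.dropna().astype(str))


# axis=1 passes each row to the function as a Series
df["Full Name"] = df[["First Name", "Last Name"]].apply(join_present, axis=1)

print(df["Full Name"].tolist())
```

A row-wise function like this is usually slower than the vectorized + operator or cat() on large dataframes, so it fits best when you need custom per-row logic.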

Convert to Text and Combine Two Columns

In this section, you’ll learn how to convert a number column to a String before concatenating it with the String column.

The sample dataframe contains a number-type column called Country Code. This column cannot be concatenated directly with a string column, because adding a string to a number raises an error.

  • Convert the number column to string using astype(str)
  • Concatenate with the desired columns

Code

The following code demonstrates how to convert a column to text and combine two columns. The Country and Country Code columns are combined with the separator '-'.

df['Country & Code'] = df['Country'] + '-' + df['Country Code'].astype(str)

df

Dataframe Will Look Like

   First Name  Last Name  Country  Country Code      Full Name  Country & Code
0      Shivam     Pandey    India             1  Shivam Pandey         India-1
1       Kumar        Ram       US             2      Kumar Ram            US-2
2       Felix       John  Germany             3     Felix John       Germany-3
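Once a number column has been converted with astype(str), the usual string methods are available on it. The sketch below checks the column type first and then pads the code to a fixed width before combining. The width of 3 and the small dataframe are assumptions made only for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Country": ["India", "US", "Germany"], "Country Code": [1, 2, 3]}
)

# Check whether the column is numeric before concatenating
is_numeric = pd.api.types.is_numeric_dtype(df["Country Code"])

# Convert to text, then pad with zeros to a width of 3 (arbitrary choice)
code_text = df["Country Code"].astype(str).str.zfill(3)

df["Country & Code"] = df["Country"] + "-" + code_text

print(is_numeric)
print(df["Country & Code"].tolist())
```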

Concatenate Two Columns With a Separator

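You can concatenate two columns of a Pandas dataframe with a separator using the sep parameter of the cat() method. Without na_rep, a row with a missing value gets NaN, as the fourth row in the output shows. The output assumes that row was added with df.loc[3] = ['Borris', None, 'UK', 4, None, None].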

Code

df['Full Name'] = df['First Name'].str.cat(df['Last Name'],sep=",")

df

Dataframe Will Look Like

   First Name  Last Name  Country  Country Code      Full Name  Country & Code
0      Shivam     Pandey    India             1  Shivam,Pandey         India-1
1       Kumar        Ram       US             2      Kumar,Ram            US-2
2       Felix       John  Germany             3     Felix,John       Germany-3
3      Borris       None       UK             4            NaN            None

Concatenate Two Columns If Not Null

This section will teach you how to concatenate two columns only if the columns are not null.

  • Use the map() function because it supports defining the action for the Null or missing values using the na_action parameter.

Code

The following code shows how to combine two columns of the Pandas dataframe only if the values are not null.

df.loc[3] = ['Borris', None, 'UK', 4, None, None]

df["Full Name"] = df["First Name"].map(str, na_action='ignore') + df["Last Name"].map(str, na_action='ignore')

df

Since the fourth row of the dataframe has a Null value, that row is not combined. It gets the value NaN, as shown in the output.

Dataframe Will Look Like

   First Name  Last Name  Country  Country Code     Full Name  Country & Code
0      Shivam     Pandey    India             1  ShivamPandey         India-1
1       Kumar        Ram       US             2      KumarRam            US-2
2       Felix       John  Germany             3     FelixJohn       Germany-3
3      Borris       None       UK             4           NaN            None

Now, drop the fourth row that contains the missing values.

df.drop([3], axis=0, inplace=True)

df

Dataframe Will Look Like

   First Name  Last Name  Country  Country Code     Full Name  Country & Code
0      Shivam     Pandey    India             1  ShivamPandey         India-1
1       Kumar        Ram       US             2      KumarRam            US-2
2       Felix       John  Germany             3     FelixJohn       Germany-3
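If you would rather keep a partial value than NaN when one column is missing, you can combine only the complete rows and fall back to a single column elsewhere. The sketch below is one way to do that on a made-up dataframe. Falling back to the First Name is an assumption for this example, not a rule.

```python
import pandas as pd

# Hypothetical sample data; the second last name is missing
df = pd.DataFrame(
    {"First Name": ["Shivam", "Borris"], "Last Name": ["Pandey", None]}
)

# True for rows where both parts are present
both_present = df["First Name"].notna() & df["Last Name"].notna()

# Combine every row, keep the result only where both parts exist,
# and use the First Name alone for the remaining rows
combined = df["First Name"].str.cat(df["Last Name"], sep=" ")
df["Full Name"] = combined.where(both_present, df["First Name"])

print(df["Full Name"].tolist())
```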

Combine More Than Two Columns

In this section, you’ll learn how to combine more than two columns of the Pandas Dataframe.

You can use the agg() method with the join() method to combine more than two columns. Every selected column must contain strings.

Code

# Concatenate three columns with a space

df['Full Name'] = df[['First Name', 'Last Name', 'Country']].agg(' '.join, axis=1)

df

Dataframe Will Look Like

   First Name  Last Name  Country  Country Code            Full Name  Country & Code
0      Shivam     Pandey    India             1  Shivam Pandey India         India-1
1       Kumar        Ram       US             2         Kumar Ram US            US-2
2       Felix       John  Germany             3   Felix John Germany       Germany-3
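The same pattern can include a number column if you convert the selected columns to strings first. The sketch below assumes a small made-up dataframe and a hypothetical result column named Summary.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "First Name": ["Shivam", "Kumar"],
        "Country": ["India", "US"],
        "Country Code": [1, 2],
    }
)

cols = ["First Name", "Country", "Country Code"]

# join() accepts only strings, so convert every selected column first
df["Summary"] = df[cols].astype(str).agg("-".join, axis=1)

print(df["Summary"].tolist())
```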
