How to Change Column Type in Pandas Dataframe: Definitive Guide

The columns of a pandas dataframe can have different data types.

You can change the column type in pandas dataframe using the df.astype() method.

In this tutorial, you’ll learn how to change the column type of the pandas dataframe using

  • pandas astype()
  • pandas to_numeric()

If You’re in a Hurry

You can use the following code to change the column type of the pandas dataframe using the astype() method.

df = df.astype({"Column_name": str}, errors='raise') 

df.dtypes

Where,

  • df.astype() – Method that casts the dataframe's columns to the given datatype and returns a new dataframe.
  • {"Column_name": str} – Dictionary that maps each column to be cast to its target datatype. Column_name is the column to convert, and str is the target datatype. You can use any of the built-in datatypes of Python or the datatypes available in NumPy.
  • errors='raise' – Specifies how exceptions are handled during conversion. raise (the default) raises an error if any value cannot be converted, and ignore suppresses the error and returns the original, unconverted data.

This is how you can convert data types of columns in the dataframe.
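As a minimal runnable sketch of the call above, the following uses a small made-up dataframe (the column names Column_name and other are placeholders for illustration only). It shows that only the column named in the dictionary is cast and that astype() returns a new dataframe rather than changing the original.

```python
import pandas as pd

# Hypothetical two-column dataframe used only for this illustration.
df = pd.DataFrame({"Column_name": [1, 2, 3], "other": [0.5, 1.5, 2.5]})

# Cast only "Column_name"; columns not named in the dictionary keep their dtype.
converted = df.astype({"Column_name": str}, errors="raise")

print(converted.dtypes)
# Each value in the converted column is now a Python str.
print(type(converted["Column_name"].iloc[0]))
```

Assigning the result back to df, as in the one-liner above, is what makes the change stick.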

If You Want to Understand Details, Read on…

In this detailed tutorial, you’ll learn how to change the column type in a pandas dataframe using different methods provided by pandas itself.

Sample Dataframe

This is the sample dataframe used throughout the tutorial.

  • NumPy is imported because later examples use np.int64, which is a NumPy type and not a built-in Python type.

Code

import pandas as pd
import numpy as np

# Creating a Dictionary
data = {"product_name":["Keyboard","Mouse", "Monitor", "CPU", "Speakers"],
        "Unit_Price":[500,200, 5000, 10000, 250.50],
        "No_Of_Units":[5,5, 10, 20, 8],
        "Available_Quantity":[5,10,11,15, "Not Available"],
        "Available_Since_Date":['11/5/2021', '4/23/2021', '08/21/2021','09/18/2021','01/05/2021']
       }

# Creating a dataframe from the dictionary
df = pd.DataFrame(data)

# Printing the datatype of the columns
df.dtypes

You can check out the datatype of each column by using the code df.dtypes.

Datatypes of Columns

    product_name             object
    Unit_Price              float64
    No_Of_Units               int64
    Available_Quantity       object
    Available_Since_Date     object
    dtype: object

The dataframe consists of types object, float64 and int64.

Note: The String types are displayed as objects.
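If you only need the type of one column, or want to know which columns pandas treats as numeric, a sketch along these lines may help. It rebuilds three columns of the sample dataframe and uses is_numeric_dtype from pandas.api.types; the variable names are chosen for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({
    "Unit_Price": [500, 200, 5000, 10000, 250.50],
    "No_Of_Units": [5, 5, 10, 20, 8],
    "Available_Quantity": [5, 10, 11, 15, "Not Available"],
})

# dtype of a single column
price_dtype = df["Unit_Price"].dtype

# Names of the columns that pandas treats as numeric
numeric_columns = [col for col in df.columns if is_numeric_dtype(df[col])]

print(price_dtype)
print(numeric_columns)
```

Available_Quantity is left out of the numeric list because the single text value forces the whole column to the object type.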

Printing the dataframe

df

Dataframe Looks Like

       product_name  Unit_Price  No_Of_Units  Available_Quantity  Available_Since_Date
    0      Keyboard       500.0            5                   5             11/5/2021
    1         Mouse       200.0            5                  10             4/23/2021
    2       Monitor      5000.0           10                  11            08/21/2021
    3           CPU     10000.0           20                  15            09/18/2021
    4      Speakers       250.5            8       Not Available            01/05/2021

Pandas Change Column Type To String

In this section, you’ll learn how to change the column type to String.

  • Use the astype() method and mention str as the target datatype.

In the sample dataframe, the column Unit_Price is float64. The following code converts the Unit_Price to a String format.

Code

df = df.astype({"Unit_Price": str})

df.dtypes

Where,

  • df.astype – Method to convert to another datatype
  • {"Unit_Price": str} – Unit_Price is the column name and str is the target datatype.

The df.dtypes will print the types of the column.

Datatypes of Columns

    product_name            object
    Unit_Price              object
    No_Of_Units              int64
    Available_Quantity      object
    Available_Since_Date    object
    dtype: object

Change Column Type To Int Using to_numeric()

The to_numeric() function converts a column to int64 or float64 based on the values in the column.

  • If the column contains only whole numbers, to_numeric() converts it to int64.
  • If the column contains numbers with decimal points, to_numeric() converts it to float64.

Use to_numeric() when you want pandas to pick the numeric type for you. Whole numbers become int64 on every platform, whereas astype(int) gives the platform's default integer, which can be int32 on some systems (for example, Windows with NumPy versions before 2.0).
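The type inference described above can be sketched on two small series of numeric strings. The series names are made up for this illustration.

```python
import pandas as pd

# Strings that hold only whole numbers
whole = pd.Series(["5", "10", "20"])
# Strings where at least one value has a decimal part
decimal = pd.Series(["500", "250.50"])

whole_converted = pd.to_numeric(whole)
decimal_converted = pd.to_numeric(decimal)

print(whole_converted.dtype)
print(decimal_converted.dtype)
```

A single decimal value is enough to make the whole column float64, because a column holds one dtype for all of its values.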

Example

  • The Unit_Price column contains decimal numbers, and hence it is converted into float64.
  • The No_Of_Units column contains only whole numbers, and hence it is converted into int64.

# Convert the Unit_Price and No_Of_Units columns to numeric types

df["Unit_Price"] = pd.to_numeric(df["Unit_Price"])

df["No_Of_Units"] = pd.to_numeric(df["No_Of_Units"])

df.dtypes

Datatypes after converting it using the to_numeric() method.

Datatypes of Columns

    product_name             object
    Unit_Price              float64
    No_Of_Units               int64
    Available_Quantity       object
    Available_Since_Date     object
    dtype: object

Printing the dataframe

df

Dataframe Looks Like

       product_name  Unit_Price  No_Of_Units  Available_Quantity  Available_Since_Date
    0      Keyboard       500.0            5                   5             11/5/2021
    1         Mouse       200.0            5                  10             4/23/2021
    2       Monitor      5000.0           10                  11            08/21/2021
    3           CPU     10000.0           20                  15            09/18/2021
    4      Speakers       250.5            8       Not Available            01/05/2021

Change Column Type To Int Using astype()

The astype() method converts columns to the type specified in the method parameter.

Use astype() when you want to choose the target type yourself, for example int32 instead of int64.

Code

You can convert the column to int by specifying int in the parameter as shown below.

df = df.astype({"No_Of_Units": int})

df.dtypes

Where,

  • df.astype() – Method that casts the dataframe's columns to the given datatype.
  • {"No_Of_Units": int} – Dictionary that maps each column to be cast to its target datatype. No_Of_Units is the column to cast, and int is the target datatype. With int, the column is converted to the platform's default integer type: int32 in the output below, int64 on most current systems.

Datatypes of Columns

    product_name             object
    Unit_Price              float64
    No_Of_Units               int32
    Available_Quantity       object
    Available_Since_Date     object
    dtype: object

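astype() is useful, but note that the result of astype(int) depends on the platform, so the integer width is not guaranteed. Use np.int64 if you want a 64-bit integer, or np.int32 if you want a 32-bit integer, regardless of the platform.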
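A short sketch of requesting an explicit integer width, so the result does not depend on the platform default. The variable names as_int32 and as_int64 are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"No_Of_Units": [5, 5, 10, 20, 8]})

# Ask for the width explicitly instead of relying on plain int.
as_int32 = df.astype({"No_Of_Units": np.int32})
as_int64 = df.astype({"No_Of_Units": np.int64})

print(as_int32.dtypes)
print(as_int64.dtypes)
```

The values are the same in both results; only the storage width differs.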

Pandas Change Column Type From Object to Int64

In this section, you’ll learn how to change column type from object to int64.

  • Pass the desired column to the to_numeric() function.
  • It converts whole numbers to int64 by default and returns the converted values.

Snippet

df["No_Of_Units"] = pd.to_numeric(df["No_Of_Units"])

df.dtypes

The No_Of_Units column is converted to int64.

Datatypes of Columns

    product_name             object
    Unit_Price              float64
    No_Of_Units               int64
    Available_Quantity       object
    Available_Since_Date     object
    dtype: object

Now, let’s see the default behavior of the astype() method and how it can be used to convert objects to int64.

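If you just specify int in astype(), it converts the column to the platform's default integer type, which is int32 on the system used here and int64 on most current systems.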

Snippet

df = df.astype({"No_Of_Units": int})

df.dtypes

The No_Of_Units column is converted to int32.

Datatypes of Columns

    product_name             object
    Unit_Price              float64
    No_Of_Units               int32
    Available_Quantity       object
    Available_Since_Date     object
    dtype: object

Now, you’ll convert object to int64 using astype().

You can pass np.int64 as the type to convert the column to int64.

Snippet

df = df.astype({"No_Of_Units": np.int64})

df.dtypes

The No_Of_Units column is converted to int64.

Datatypes of Columns

    product_name             object
    Unit_Price              float64
    No_Of_Units               int64
    Available_Quantity       object
    Available_Since_Date     object
    dtype: object

This is how you can use to_numeric() and astype() to cast the column type from object to int64.

Pandas Change Column Type From Int To String

In this section, you’ll learn how to change column type from Int to String.

  • Use the astype() method to convert an int column to a String.

Code

df = df.astype({"No_Of_Units": str}, errors='raise')

df.dtypes

Where,

  • df.astype() – Method to invoke the astype function in the dataframe.
  • {"No_Of_Units": str} – Dictionary that maps each column to be cast to its target datatype. No_Of_Units is the column that needs to be cast, and str is the target datatype to which the column values should be converted.

Datatypes of Columns

You can see the No_Of_Units is converted to String, and it is displayed as an object type.

    product_name             object
    Unit_Price              float64
    No_Of_Units              object
    Available_Quantity       object
    Available_Since_Date     object
    dtype: object

Pandas Change Column Type To Float

In this section, you’ll learn how to change the column type to float.

  • Use the astype() method to convert a column to float.

In the sample dataframe, the column Unit_Price has numbers with decimal values, but the column type is in String format.

df = df.astype({"Unit_Price": float})

df.dtypes

Where,

  • df.astype() – Method to invoke the astype function in the dataframe.
  • {"Unit_Price": float} – Dictionary that maps each column to be cast to its target datatype. Unit_Price is the column that needs to be cast, and float is the target datatype to which the column values should be converted.

Datatypes of Columns

You can see that the Unit_Price column is converted into float64.

    product_name             object
    Unit_Price              float64
    No_Of_Units               int64
    Available_Quantity       object
    Available_Since_Date     object
    dtype: object

Now, let’s try to convert the column Available_Quantity to float. One of its cells contains the non-numeric value Not Available.

Note that the errors='coerce' parameter is used, which converts the values that can be parsed and sets the rest to NaN.

df["Available_Quantity"] = pd.to_numeric(df["Available_Quantity"], errors='coerce')

df.dtypes

Datatypes of Columns

    product_name             object
    Unit_Price              float64
    No_Of_Units               int64
    Available_Quantity      float64
    Available_Since_Date     object
    dtype: object

The column is converted to float64 without raising an error. The non-numeric value is converted to NaN, which means Not a Number.
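To see the difference between the default error handling and errors='coerce', here is a small sketch on a series that mirrors the Available_Quantity column. The names quantity, raised and coerced are illustrative.

```python
import pandas as pd

quantity = pd.Series([5, 10, 11, 15, "Not Available"])

# With the default errors="raise", the unparseable value stops the conversion.
try:
    pd.to_numeric(quantity)
    raised = False
except ValueError as exc:
    raised = True
    print(f"Conversion failed: {exc}")

# With errors="coerce", the unparseable value becomes NaN instead.
coerced = pd.to_numeric(quantity, errors="coerce")
print(coerced)

# Counting the NaN values shows how many cells could not be converted.
print(coerced.isna().sum())
```

Checking the NaN count after coercing is a simple way to notice when more values were dropped than expected.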

Printing the dataframe

df

Dataframe Looks Like

       product_name  Unit_Price  No_Of_Units  Available_Quantity  Available_Since_Date
    0      Keyboard       500.0            5                 5.0             11/5/2021
    1         Mouse       200.0            5                10.0             4/23/2021
    2       Monitor      5000.0           10                11.0            08/21/2021
    3           CPU     10000.0           20                15.0            09/18/2021
    4      Speakers       250.5            8                 NaN            01/05/2021

This is how you can cast column type to float.

Next, you’ll learn how to cast column type to Datetime.

Pandas Change Column Type To Datetime64

In this section, you’ll learn how to change the column type to Datetime64.

In the sample dataframe, the column Available_Since_Date has the date value as a String type.

Code

df['Available_Since_Date']= pd.to_datetime(df['Available_Since_Date'])

df.dtypes

Datatypes of Columns

    product_name                    object
    Unit_Price                     float64
    No_Of_Units                      int64
    Available_Quantity              object
    Available_Since_Date    datetime64[ns]
    dtype: object

You can see that the Available_Since_Date column is converted into datetime64[ns].

to_datetime() also supports error handling, where

  • errors='raise' raises an error if any cell contains an invalid date value. This is the default.
  • errors='ignore' silently ignores invalid date values and returns the column unchanged. This option is deprecated since pandas 2.2.
  • errors='coerce' converts the valid dates to datetime type and sets the other cells to NaT.
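The coerce behaviour can be sketched on a short series with one invalid entry. The explicit format string is an assumption about the sample data (month/day/year); it is not part of the original code, and it is passed here so the parsing does not depend on format inference.

```python
import pandas as pd

dates = pd.Series(["11/5/2021", "08/21/2021", "not a date"])

# Valid dates are parsed; the invalid entry becomes NaT instead of raising.
parsed = pd.to_datetime(dates, format="%m/%d/%Y", errors="coerce")

print(parsed)
print(parsed.isna().sum())  # number of cells that could not be parsed
```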

Pandas Convert Multiple Columns to Int

In this section, you’ll learn how to convert multiple columns to int using the astype() method.

  • Create a list with the multiple column names.
  • Select those columns from the dataframe with the list and apply the astype() method with the target datatype, e.g. int or str.

df[['column_1','column_2']] = df[['column_1','column_2']].astype(np.int64)

df.dtypes

The column_1 and column_2 columns will be converted to int64 using astype().

Using the to_numeric() method to convert multiple columns

  • Use the apply() method to apply the to_numeric() function to the specified columns, as shown below.

df[['column_1','column_2']] = df[['column_1','column_2']].apply(pd.to_numeric)

df.dtypes

This is how you can convert multiple column types to another format.
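Since column_1 and column_2 are placeholders, here is a runnable sketch on two columns of numeric strings modelled on the sample data. It also shows one alternative that is not in the snippets above: passing a dictionary to astype() so that each column gets its own target type in a single call.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Unit_Price": ["500", "200", "250.50"],
    "No_Of_Units": ["5", "5", "8"],
})
cols = ["Unit_Price", "No_Of_Units"]

# to_numeric is applied column by column, so each column gets its own dtype.
converted = df[cols].apply(pd.to_numeric)

# A dictionary lets astype() cast several columns to different types at once.
typed = df.astype({"Unit_Price": float, "No_Of_Units": np.int64})

print(converted.dtypes)
print(typed.dtypes)
```

The dictionary form is handy when the columns need different target types, while the list form suits the case where every selected column gets the same type.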

Pandas Convert All Columns

In this section, you’ll learn how to change the column type of all columns in a dataframe. For example, converting all columns to string.

To convert the column type of all columns,

  • Create a list of all columns called columns_list by using list(df)
  • Pass this list to the dataframe and invoke the astype() method
  • Pass the target datatype(str) as a parameter to the astype() method

Code

columns_list = list(df)

df[columns_list] = df[columns_list].astype(str)

df.dtypes

Datatypes of Columns

    product_name            object
    Unit_Price              object
    No_Of_Units             object
    Available_Quantity      object
    Available_Since_Date    object
    dtype: object

You can see that all the columns of the dataframe are converted to String, and it is displayed as an object.
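As a shorter variant of the steps above, astype() can also be called on the whole dataframe with a single type, which casts every column without building a column list first. A minimal sketch on a made-up two-column dataframe:

```python
import pandas as pd

df = pd.DataFrame({"Unit_Price": [500.0, 250.5], "No_Of_Units": [5, 8]})

# A single type passed to astype() is applied to every column.
as_text = df.astype(str)

# Confirm that every cell now holds a Python str.
all_strings = all(isinstance(value, str) for value in as_text.to_numpy().ravel())

print(as_text)
print(all_strings)
```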
