Columns in a pandas dataframe can have different data types.
You can change the column type in pandas dataframe using the df.astype() method.
In this tutorial, you’ll learn how to change the column type of a pandas dataframe using
- pandas astype()
- pandas to_numeric()
If You’re in a Hurry
You can use the following code to change the column type of the pandas dataframe using the astype() method.
df = df.astype({"Column_name": str}, errors='raise')
df.dtypes
Where,
- df.astype(): Method that casts the dataframe columns to the given types.
- {"Column_name": str}: Dictionary that maps each column name to its target datatype. Column_name is the column to be cast, and str is the target datatype. You can use any of the built-in Python datatypes or the datatypes available in NumPy.
- errors='raise': Specifies how exceptions are handled during the conversion. raise raises an error when a value cannot be converted, and ignore suppresses the error and returns the original data unchanged.
This is how you can convert data types of columns in the dataframe.
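As a minimal sketch of what errors='raise' means in practice, the snippet below tries to cast a column that contains a non-numeric string to int. The small dataframe and the column name qty are made up for illustration and are not part of the sample data used later.

```python
import pandas as pd

# Hypothetical data: the last cell cannot be parsed as an integer
demo = pd.DataFrame({"qty": ["5", "10", "Not Available"]})

try:
    demo = demo.astype({"qty": int}, errors="raise")
    conversion_failed = False
except ValueError as exc:
    # errors='raise' surfaces the problem instead of hiding it
    conversion_failed = True
    print(f"Conversion failed: {exc}")
```

Because the cast fails before the assignment happens, demo keeps its original values.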
If You Want to Understand Details, Read on…
In this detailed tutorial, you’ll learn how to change the column type in a pandas dataframe using the different methods provided by pandas itself.
Sample Dataframe
This is the sample dataframe used throughout the tutorial.
- NumPy is imported for the int64 datatype, as int64 is not available in Python by default.
Code
import pandas as pd
import numpy as np
# Creating a Dictionary
data = {"product_name":["Keyboard","Mouse", "Monitor", "CPU", "Speakers"],
"Unit_Price":[500,200, 5000, 10000, 250.50],
"No_Of_Units":[5,5, 10, 20, 8],
"Available_Quantity":[5,10,11,15, "Not Available"],
"Available_Since_Date":['11/5/2021', '4/23/2021', '08/21/2021','09/18/2021','01/05/2021']
}
# Creating a dataframe from the dictionary
df = pd.DataFrame(data)
# Printing the datatype of the columns
df.dtypes
You can check the datatype of each column by using df.dtypes.
Datatypes of Columns
product_name object
Unit_Price float64
No_Of_Units int64
Available_Quantity object
Available_Since_Date object
dtype: object
The dataframe consists of the types object, float64, and int64.
Note: String types are displayed as object.
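Besides reading df.dtypes, you can ask pandas whether a column is numeric. The sketch below uses is_numeric_dtype from pandas.api.types on a cut-down copy of the sample data. The helper and the variable names are additions for illustration, not part of the original tutorial code.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Cut-down version of the sample data
sample = pd.DataFrame({
    "Unit_Price": [500, 200, 250.50],
    "Available_Quantity": [5, 10, "Not Available"],
})

# Map each column name to True/False depending on whether it holds numbers
numeric_flags = {col: is_numeric_dtype(sample[col]) for col in sample.columns}
print(numeric_flags)
```

A column that mixes numbers and text, such as Available_Quantity, is stored as object and is therefore reported as non-numeric.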
Printing the dataframe
df
Dataframe Looks Like
 | product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date |
---|---|---|---|---|---|
0 | Keyboard | 500.0 | 5 | 5 | 11/5/2021 |
1 | Mouse | 200.0 | 5 | 10 | 4/23/2021 |
2 | Monitor | 5000.0 | 10 | 11 | 08/21/2021 |
3 | CPU | 10000.0 | 20 | 15 | 09/18/2021 |
4 | Speakers | 250.5 | 8 | Not Available | 01/05/2021 |
Pandas Change Column Type To String
In this section, you’ll learn how to change the column type to String.
- Use the astype() method and specify str as the target datatype.
In the sample dataframe, the column Unit_Price is float64. The following code converts Unit_Price to a String format.
Code
df = df.astype({"Unit_Price": str})
df.dtypes
Where,
- df.astype(): Method to convert to another datatype.
- {"Unit_Price": str}: Unit_Price is the column name and str is the target datatype.
df.dtypes prints the type of each column.
Datatypes of Columns
product_name object
Unit_Price object
No_Of_Units int64
Available_Quantity object
Available_Since_Date object
dtype: object
Change Column Type To Int Using to_numeric()
The to_numeric() method converts a column to int or float based on the values available in the column.
- If the column contains only numbers without decimals, to_numeric() will convert it to int64.
- If the column contains numbers with decimal points, to_numeric() will convert it to float64.
Use to_numeric() when you want parsed whole numbers to be int64 on every platform. With astype(int), the integer width follows the platform's default NumPy integer, which can be int32 (for example, on Windows with NumPy versions before 2.0).
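The rule above can be sketched with two small Series of strings. The values are borrowed from the sample data, while the variable names are made up for this example. The Series are created with dtype=object to match the object columns shown in this tutorial.

```python
import pandas as pd

# Whole numbers stored as text
whole = pd.to_numeric(pd.Series(["5", "10", "20"], dtype=object))

# At least one value with a decimal point
decimal = pd.to_numeric(pd.Series(["500", "250.50"], dtype=object))

print(whole.dtype, decimal.dtype)
```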
Example
- The Unit_Price column contains decimal numbers, and hence it is converted into float64.
- The No_Of_Units column contains only whole numbers, and hence it is converted into int64.
# convert column "Unit_Price" of a DataFrame
df["Unit_Price"] = pd.to_numeric(df["Unit_Price"])
df["No_Of_Units"] = pd.to_numeric(df["No_Of_Units"])
df.dtypes
Datatypes after the conversion using the to_numeric() method.
Datatypes of Columns
product_name object
Unit_Price float64
No_Of_Units int64
Available_Quantity object
Available_Since_Date object
dtype: object
Printing the dataframe
df
Dataframe Looks Like
 | product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date |
---|---|---|---|---|---|
0 | Keyboard | 500.0 | 5 | 5 | 11/5/2021 |
1 | Mouse | 200.0 | 5 | 10 | 4/23/2021 |
2 | Monitor | 5000.0 | 10 | 11 | 08/21/2021 |
3 | CPU | 10000.0 | 20 | 15 | 09/18/2021 |
4 | Speakers | 250.5 | 8 | Not Available | 01/05/2021 |
Change Column Type To Int Using astype()
The astype() method converts columns to the type specified in the method parameter.
Passing int to astype() produces the platform's default NumPy integer type. In the environment used for this tutorial, that is int32 rather than int64. On many Linux and macOS setups it is int64.
Code
You can convert the column to int by specifying int in the parameter, as shown below.
df = df.astype({"No_Of_Units": int})
df.dtypes
Where,
- df.astype(): Method that casts the dataframe columns to the given types.
- {"No_Of_Units": int}: Dictionary that maps each column name to its target datatype. No_Of_Units is the column to be cast, and int is the target datatype. In this environment, the column is converted to int32.
Datatypes of Columns
product_name object
Unit_Price float64
No_Of_Units int32
Available_Quantity object
Available_Since_Date object
dtype: object
astype() is useful, but you need to note a few points. If you want a 64-bit integer, you need to pass np.int64 explicitly.
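To see which integer width plain int resolves to on your machine, and how to request a width explicitly, a small sketch like the following can help. The Series and the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

units = pd.Series([5, 5, 10, 20, 8])

# The NumPy dtype that plain `int` maps to on this platform
platform_default = np.dtype(int)

as_default = units.astype(int)       # platform-dependent width
as_int64 = units.astype(np.int64)    # always 64-bit
as_int32 = units.astype("int32")     # always 32-bit

print(platform_default, as_default.dtype, as_int64.dtype, as_int32.dtype)
```

Requesting the width explicitly keeps the result the same across operating systems and NumPy versions.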
Pandas Change Column Type From Object to Int64
In this section, you’ll learn how to change column type from object to int64.
- Pass the desired column to the to_numeric() method.
- It converts whole numbers to int64 by default and returns the converted values.
Snippet
df["No_Of_Units"] = pd.to_numeric(df["No_Of_Units"])
df.dtypes
The No_Of_Units column is converted to int64.
Datatypes of Columns
product_name object
Unit_Price float64
No_Of_Units int64
Available_Quantity object
Available_Since_Date object
dtype: object
Now, let’s see the default behavior of the astype() method and how it can be used to convert objects to int64.
If you just specify int in astype(), the column is converted to the platform's default integer type, which is int32 in this environment.
Snippet
df = df.astype({"No_Of_Units": int})
df.dtypes
The No_Of_Units column is converted to int32.
Datatypes of Columns
product_name object
Unit_Price float64
No_Of_Units int32
Available_Quantity object
Available_Since_Date object
dtype: object
Now, you’ll convert object to int64 using astype().
You can pass np.int64 as the type to convert the column to int64.
Snippet
df = df.astype({"No_Of_Units": np.int64})
df.dtypes
The No_Of_Units column is converted to int64.
Datatypes of Columns
product_name object
Unit_Price float64
No_Of_Units int64
Available_Quantity object
Available_Since_Date object
dtype: object
This is how you can use to_numeric() and astype() to cast column type from object to int64.
Pandas Change Column Type From Int To String
In this section, you’ll learn how to change column type from Int to String.
- Use the astype() method to convert an int column to a String.
Code
df = df.astype({"No_Of_Units": str}, errors='raise')
df.dtypes
Where,
- df.astype(): Method that casts the dataframe columns to the given types.
- {"No_Of_Units": str}: Dictionary that maps each column name to its target datatype. No_Of_Units is the column to be cast, and str is the target datatype.
Datatypes of Columns
You can see that No_Of_Units is converted to String, and it is displayed as an object type.
product_name object
Unit_Price float64
No_Of_Units object
Available_Quantity object
Available_Since_Date object
dtype: object
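One practical consequence of casting numbers to strings is that + no longer adds values; it concatenates text. The short sketch below illustrates this with a made-up Series.

```python
import pandas as pd

units = pd.Series([5, 10])
units_as_text = units.astype(str)

# Integer column: arithmetic addition
numeric_sum = (units + units).tolist()

# String column: the same operator concatenates the text
text_sum = (units_as_text + units_as_text).tolist()

print(numeric_sum, text_sum)
```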
Pandas Change Column Type To Float
In this section, you’ll learn how to change the column type to float.
- Use the astype() method to convert a column to float.
In the sample dataframe, the column Unit_Price has numbers with decimal values, but the column type is in String format.
df = df.astype({"Unit_Price": float})
df.dtypes
Where,
- df.astype(): Method that casts the dataframe columns to the given types.
- {"Unit_Price": float}: Dictionary that maps each column name to its target datatype. Unit_Price is the column to be cast, and float is the target datatype.
Datatypes of Columns
You can see that the Unit_Price column is converted into float64.
product_name object
Unit_Price float64
No_Of_Units int64
Available_Quantity object
Available_Since_Date object
dtype: object
Now, let’s try to convert the column Available_Quantity to float. It has a non-numeric value, Not Available, in one of the cells.
Note that the errors='coerce' parameter is used, which converts the values that can be parsed and sets the rest to NaN.
df["Available_Quantity"] = pd.to_numeric(df["Available_Quantity"], errors='coerce')
df.dtypes
Datatypes of Columns
product_name object
Unit_Price float64
No_Of_Units int64
Available_Quantity float64
Available_Since_Date object
dtype: object
The column is converted to float64 without any problems. The non-numeric value is converted to NaN, which means Not a Number.
Printing the dataframe
df
Dataframe Looks Like
 | product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date |
---|---|---|---|---|---|
0 | Keyboard | 500.0 | 5 | 5.0 | 11/5/2021 |
1 | Mouse | 200.0 | 5 | 10.0 | 4/23/2021 |
2 | Monitor | 5000.0 | 10 | 11.0 | 08/21/2021 |
3 | CPU | 10000.0 | 20 | 15.0 | 09/18/2021 |
4 | Speakers | 250.5 | 8 | NaN | 01/05/2021 |
This is how you can cast column type to float.
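After coercing, it is often useful to count how many cells became NaN. If you would rather keep whole numbers than floats, pandas also offers the nullable Int64 dtype (capital I). That dtype goes beyond what this tutorial covers, so treat the sketch below as an optional extra. The variable names are made up.

```python
import pandas as pd

quantity = pd.Series([5, 10, 11, 15, "Not Available"])

# Unparseable cells become NaN, so the result is float64
as_float = pd.to_numeric(quantity, errors="coerce")
missing_count = int(as_float.isna().sum())

# Optional: nullable integer dtype keeps whole numbers and marks the gap as <NA>
as_nullable_int = as_float.astype("Int64")

print(missing_count, as_nullable_int.dtype)
```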
Next, you’ll learn how to cast column type to Datetime.
Pandas Change Column Type To Datetime64
In this section, you’ll learn how to change the column type to Datetime64.
- Use the to_datetime() method to convert a string to DateTime.
In the sample dataframe, the column Available_Since_Date has the date value as a String type.
Code
df['Available_Since_Date']= pd.to_datetime(df['Available_Since_Date'])
df.dtypes
Datatypes of Columns
product_name object
Unit_Price float64
No_Of_Units int64
Available_Quantity object
Available_Since_Date datetime64[ns]
dtype: object
You can see that the Available_Since_Date column is converted into datetime64[ns].
to_datetime() also supports error handling, where
- errors='raise' will raise an error if there are invalid date values in any of the cells.
- errors='ignore' will silently ignore errors if there are invalid date values in any of the cells and return the column intact. Note that this option is deprecated as of pandas 2.2.
- errors='coerce' will convert the valid dates to the datetime type and set the other cells to NaT.
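The errors='coerce' behavior can be sketched as follows. The explicit format string "%m/%d/%Y" is an assumption about how the sample dates are written (month/day/year), and the invalid entry is made up for the example.

```python
import pandas as pd

# Two dates in the sample's month/day/year style plus one invalid entry
dates = pd.Series(["11/5/2021", "4/23/2021", "not a date"])

# Valid strings become timestamps; the invalid one becomes NaT
parsed = pd.to_datetime(dates, format="%m/%d/%Y", errors="coerce")
invalid_rows = parsed.isna().tolist()

print(parsed)
print(invalid_rows)
```

Stating the format explicitly avoids relying on pandas to guess whether 11/5/2021 means November 5 or May 11.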
Pandas Convert Multiple Columns to Int
In this section, you’ll learn how to convert multiple columns to int using the astype() method.
- Create a list with the multiple column names.
- Pass the list to the dataframe and apply the astype() method with the target datatype, e.g. int or str.
df[['column_1','column_2']] = df[['column_1','column_2']].astype(np.int64)
df.dtypes
The column_1 and column_2 columns will be converted to int using astype().
Using the to_numeric() method to convert multiple columns
- Use the apply() method to apply the to_numeric() function to the specified columns, as shown below.
df[['column_1','column_2']] = df[['column_1','column_2']].apply(pd.to_numeric)
df.dtypes
This is how you can convert multiple column types to another format.
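The snippets above use the placeholder names column_1 and column_2. Here is a runnable sketch of both approaches on a small dataframe whose column names are borrowed from the sample data. The values are stored as text so there is something to convert, and the variable names are made up.

```python
import numpy as np
import pandas as pd

# Whole numbers stored as text in two object columns
df_multi = pd.DataFrame(
    {"No_Of_Units": ["5", "5", "10"], "Available_Quantity": ["5", "10", "11"]},
    dtype=object,
)
cols = ["No_Of_Units", "Available_Quantity"]

# Approach 1: astype() with a dictionary built from the column list
via_astype = df_multi.astype({col: np.int64 for col in cols})

# Approach 2: apply to_numeric() to each selected column
via_to_numeric = df_multi.copy()
via_to_numeric[cols] = via_to_numeric[cols].apply(pd.to_numeric)

print(via_astype.dtypes)
print(via_to_numeric.dtypes)
```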
Pandas Convert All Columns
In this section, you’ll learn how to change the column type of all columns in a dataframe. For example, converting all object columns to string.
To convert the column type of all columns,
- Create a list of all columns called columns_list by using list(df).
- Pass this list to the dataframe and invoke the astype() method.
- Pass the target datatype (str) as a parameter to the astype() method.
Code
columns_list = list(df)
df[columns_list] = df[columns_list].astype(str)
df.dtypes
Datatypes of Columns
product_name object
Unit_Price object
No_Of_Units object
Available_Quantity object
Available_Since_Date object
dtype: object
You can see that all the columns of the dataframe are converted to String, and they are displayed as object.
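As a shorter variant, calling astype(str) on the dataframe itself converts every column without building a column list first. The sketch below uses a made-up three-column dataframe and checks that every cell ends up as a Python str.

```python
import pandas as pd

mixed = pd.DataFrame({
    "name": ["Keyboard", "Mouse"],
    "price": [500.0, 200.0],
    "units": [5, 5],
})

# Cast every column in one call
all_text = mixed.astype(str)

# Confirm that each cell now holds a Python string
every_value_is_str = all(isinstance(v, str) for v in all_text.to_numpy().ravel())
print(every_value_is_str)
```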