NumPy arrays are used for array computing. They support a range of mathematical operations, such as algebraic, trigonometric, and statistical routines.
You can convert a NumPy array to a pandas dataframe by using the pd.DataFrame(array) constructor.
If You’re in a Hurry…
You can use the code snippet below to convert a NumPy array to a pandas dataframe.
Snippet
import numpy as np
import pandas as pd
array = np.random.rand(5, 5)
df = pd.DataFrame(array)
df
This is how you can create a pandas dataframe from the NumPy Array.
If You Want to Understand the Details, Read On…
In this tutorial, you’ll learn the different ways to create a pandas dataframe from a NumPy array.
Creating NumPy Array
First, you’ll create a NumPy array which will be converted to pandas Dataframe.
You can create a NumPy array by using the np.random.rand() function. Calling np.random.rand(5, 5) creates a two-dimensional array with 5 rows and 5 columns, filled with random values between 0 and 1.
Snippet
import numpy as np
import pandas as pd
array = np.random.rand(5, 5)
array
When you print the array, you’ll see the output of 5 rows and 5 columns with random values.
Output
array([[0.93083461, 0.49167774, 0.43159395, 0.4410153 , 0.80704423],
       [0.92919269, 0.58450733, 0.6947164 , 0.6369035 , 0.31362118],
       [0.53760608, 0.83053222, 0.3622226 , 0.57997871, 0.83459934],
       [0.70689251, 0.32799213, 0.01533952, 0.0212185 , 0.93386042],
       [0.13681433, 0.90448399, 0.67102222, 0.45538514, 0.15043999]])
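Because np.random.rand() draws fresh values on every run, your numbers will differ from the output above. If you want reproducible values, one option is NumPy's seeded Generator. This is a minimal sketch; the seed value 42 is an arbitrary choice and not part of the original example.

```python
import numpy as np

# Seeded generator: the seed 42 is an arbitrary value chosen for illustration
rng = np.random.default_rng(seed=42)

# 5 x 5 array of floats drawn uniformly from [0, 1)
array = rng.random((5, 5))

print(array.shape)  # (5, 5)
print(array.ndim)   # 2
```

Running this twice with the same seed should give the same array both times.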
Now, you’ll learn how this NumPy array will be converted to Pandas Dataframe.
Convert NumPy Array to Pandas Dataframe
In this section, you’ll learn how to convert a NumPy array to a pandas dataframe without using any additional options such as column names or indexes.
You can convert a NumPy array to a pandas dataframe using the dataframe constructor pd.DataFrame(array).
Use the snippet below to create a pandas dataframe from the NumPy array.
Snippet
df = pd.DataFrame(array)
df
When you print the dataframe using df, you’ll see that the array has been converted to a dataframe.
DataFrame will look Like
| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0.930835 | 0.491678 | 0.431594 | 0.441015 | 0.807044 |
| 1 | 0.929193 | 0.584507 | 0.694716 | 0.636904 | 0.313621 |
| 2 | 0.537606 | 0.830532 | 0.362223 | 0.579979 | 0.834599 |
| 3 | 0.706893 | 0.327992 | 0.015340 | 0.021219 | 0.933860 |
| 4 | 0.136814 | 0.904484 | 0.671022 | 0.455385 | 0.150440 |
This is how you can create a dataframe using the NumPy array without any additional options.
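To check what the plain conversion produces, you can inspect the shape and the default labels, and convert back to NumPy. This is a small sketch that rebuilds the same kind of random array so it runs on its own.

```python
import numpy as np
import pandas as pd

array = np.random.rand(5, 5)
df = pd.DataFrame(array)

print(df.shape)          # (5, 5), the same shape as the array
print(list(df.columns))  # [0, 1, 2, 3, 4]
print(list(df.index))    # [0, 1, 2, 3, 4]

# Converting back to NumPy should reproduce the original values
round_trip = df.to_numpy()
print(np.array_equal(round_trip, array))  # True
```

With no options passed, pandas labels both rows and columns with integers starting at 0.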
Convert NumPy Array to Pandas Dataframe with Column Names
In this section, you’ll learn how to convert a NumPy array to a pandas dataframe with column names.
NumPy arrays don’t have column names. Hence, when you convert one without specifying names, pandas labels the columns with default integers (0, 1, 2, …).
You can convert a NumPy array to a pandas dataframe with column names by using the columns parameter and passing the column names as a list.
Use the snippet below to convert the NumPy array to a pandas dataframe with column names.
The list must contain exactly as many names as the array has columns. If the array has 5 columns, you need to pass 5 names in the list.
Snippet
df = pd.DataFrame(array, columns = ['Col_one', 'Col_two', 'Col_Three', 'Col_Four', 'Col_Five'])
df
When you print the dataframe using df, you’ll see that the columns in the dataframe are named accordingly.
DataFrame will look Like
| | Col_one | Col_two | Col_Three | Col_Four | Col_Five |
|---|---|---|---|---|---|
| 0 | 0.930835 | 0.491678 | 0.431594 | 0.441015 | 0.807044 |
| 1 | 0.929193 | 0.584507 | 0.694716 | 0.636904 | 0.313621 |
| 2 | 0.537606 | 0.830532 | 0.362223 | 0.579979 | 0.834599 |
| 3 | 0.706893 | 0.327992 | 0.015340 | 0.021219 | 0.933860 |
| 4 | 0.136814 | 0.904484 | 0.671022 | 0.455385 | 0.150440 |
This is how you can create a pandas dataframe using the NumPy array with column values.
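The number of names has to match the number of array columns. The sketch below builds the names from the array's shape and then shows what happens when too few names are passed. The "Col_" prefix and the variable names are illustrative choices, not part of the original example.

```python
import numpy as np
import pandas as pd

array = np.random.rand(5, 5)

# Build one name per array column instead of typing them by hand
# (the "Col_" prefix is an arbitrary choice for this example)
names = [f"Col_{i + 1}" for i in range(array.shape[1])]
df = pd.DataFrame(array, columns=names)
print(list(df.columns))

# Passing only 4 names for 5 columns is rejected with a ValueError
mismatch_error = None
try:
    pd.DataFrame(array, columns=names[:4])
except ValueError as exc:
    mismatch_error = exc
    print("ValueError:", exc)
```

Deriving the names from array.shape[1] keeps the list in step with the array if its width changes.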
Convert NumPy Array to Pandas Dataframe with Index
In this section, you’ll learn how to convert a NumPy array to a pandas dataframe with an index.
NumPy arrays don’t have row labels. Hence, when you convert one without specifying an index, pandas assigns a default integer index (0, 1, 2, …) to the rows.
You can convert a NumPy array to a pandas dataframe with an index by using the index parameter and passing the index values as a list.
Use the snippet below to convert the NumPy array to a pandas dataframe with an index.
The list must contain exactly as many index values as the array has rows. If the array has 5 rows, you need to pass 5 values in the index list.
Snippet
df = pd.DataFrame(array, columns = ['Col_one', 'Col_two', 'Col_Three', 'Col_Four', 'Col_Five'], index = ['Row_1', 'Row_2','Row_3','Row_4','Row_5'])
df
When you print the dataframe using df, you’ll see that the rows in the dataframe are labeled with the index values you passed.
DataFrame will look Like
| | Col_one | Col_two | Col_Three | Col_Four | Col_Five |
|---|---|---|---|---|---|
| Row_1 | 0.930835 | 0.491678 | 0.431594 | 0.441015 | 0.807044 |
| Row_2 | 0.929193 | 0.584507 | 0.694716 | 0.636904 | 0.313621 |
| Row_3 | 0.537606 | 0.830532 | 0.362223 | 0.579979 | 0.834599 |
| Row_4 | 0.706893 | 0.327992 | 0.015340 | 0.021219 | 0.933860 |
| Row_5 | 0.136814 | 0.904484 | 0.671022 | 0.455385 | 0.150440 |
This is how you can create a pandas dataframe with a NumPy array with index values.
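Once rows and columns have labels, you can look values up by label as well as by position. The sketch below rebuilds the labeled dataframe and compares both lookups against the source array; the variable name value is only for illustration.

```python
import numpy as np
import pandas as pd

array = np.random.rand(5, 5)
df = pd.DataFrame(
    array,
    columns=['Col_one', 'Col_two', 'Col_Three', 'Col_Four', 'Col_Five'],
    index=['Row_1', 'Row_2', 'Row_3', 'Row_4', 'Row_5'],
)

# Label-based lookup: second row, second column
value = df.loc['Row_2', 'Col_two']
print(value == array[1, 1])  # True

# Position-based lookup still works alongside the labels
print(df.iloc[1, 1] == array[1, 1])  # True
```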
Convert Object Type NumPy array to Dataframe
Until now, you’ve learned how to convert a NumPy array that holds a single type of data to a pandas dataframe.
In this section, you’ll learn how to convert an object type NumPy array, which has a different type of data in each column, to a pandas dataframe.
First, create a numpy.ndarray with string values in one column and integer values in the other.
For example,
- The first column has country names, which are of string type.
- The second column has country codes, which are of int type.
Snippet
import numpy as np
arr = np.array([['India',1],['Germany',2],['US',3]], dtype=object)
print(arr)
print(type(arr))
print(arr.dtype)
Output
[['India' 1]
 ['Germany' 2]
 ['US' 3]]
<class 'numpy.ndarray'>
object
Now, you’ll convert this ndarray into a dataframe object.
You can use the DataFrame() constructor available in the pandas library to convert the NumPy ndarray to a dataframe.
You can also pass the names for the columns using the columns parameter as shown below.
Snippet
df = pd.DataFrame(arr, columns = ['Country', 'Code'])
df
When you print the dataframe, you’ll see a dataframe with two named columns.
DataFrame will look Like
| | Country | Code |
|---|---|---|
| 0 | India | 1 |
| 1 | Germany | 2 |
| 2 | US | 3 |
You can check the type of the dataframe columns using the below snippet.
Snippet
df.dtypes
You can see that both columns are created with the object dtype; the Code column is not stored as a number. If you want to convert the Code column to a number, read Change column type in Pandas.
Output
Country object
Code object
dtype: object
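If you only need the Code column as a number, one possible approach is to cast that column with astype(). This is a minimal sketch and not necessarily the method the linked article describes; it assumes every value in the column is a valid integer.

```python
import numpy as np
import pandas as pd

arr = np.array([['India', 1], ['Germany', 2], ['US', 3]], dtype=object)
df = pd.DataFrame(arr, columns=['Country', 'Code'])

# Cast only the Code column to a 64-bit integer; Country is left untouched
df['Code'] = df['Code'].astype('int64')

print(df.dtypes)
```

pd.to_numeric(df['Code']) is another option when you would rather let pandas pick the numeric type.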
Concatenate NumPy Array to Pandas Dataframe
In the previous sections, you’ve learned how to create a pandas dataframe from a NumPy array.
In this section, you’ll learn how to concatenate a NumPy array to an existing pandas dataframe as a new column. This is also known as adding a NumPy array to a pandas dataframe.
First, create a NumPy array with two columns, namely Country and Code. Then create a dataframe called df using the pd.DataFrame() constructor.
Next, create a second NumPy array with a single column of country names. A NumPy array carries no column name or index of its own, so in this approach you first wrap it in a separate dataframe and then add its column to the existing dataframe.
You can add the column of the second dataframe to the first dataframe using the assignment operator as shown below. The assignment aligns rows by index, so both dataframes should share the same index (here, the default 0 to 2).
Snippet
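import numpy as np
import pandas as pd
# Existing dataframe with two columns
arr = np.array([['India',1],['Germany',2],['US',3]], dtype=object)
df = pd.DataFrame(arr, columns = ['Country', 'Code'])
# Wrap the second array in its own dataframe
arr1 = np.array([['India'],['Germany'],['US']], dtype=object)
df2 = pd.DataFrame(arr1, columns = ['Country'])
# Assign its column to the existing dataframe (rows align on the index)
df['New_Column'] = df2['Country']
df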
When you print the dataframe df, you’ll see the second NumPy array added to the first dataframe as a new column.
DataFrame will look Like
| | Country | Code | New_Column |
|---|---|---|---|
| 0 | India | 1 | India |
| 1 | Germany | 2 | Germany |
| 2 | US | 3 | US |
This is how you can add a NumPy array to a pandas dataframe as a new column using column assignment.
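Two other ways to get the same result are sketched below, assuming the rows of the array are in the same order as the rows of the dataframe. The names combined and direct are illustrative only.

```python
import numpy as np
import pandas as pd

arr = np.array([['India', 1], ['Germany', 2], ['US', 3]], dtype=object)
df = pd.DataFrame(arr, columns=['Country', 'Code'])
arr1 = np.array([['India'], ['Germany'], ['US']], dtype=object)

# Option 1: wrap the array in a dataframe and join the two side by side
df2 = pd.DataFrame(arr1, columns=['New_Column'])
combined = pd.concat([df, df2], axis=1)

# Option 2: flatten the (3, 1) array to 1D and assign it as a column
direct = df.copy()
direct['New_Column'] = arr1.ravel()

print(list(combined.columns))  # ['Country', 'Code', 'New_Column']
print(combined['New_Column'].tolist() == direct['New_Column'].tolist())  # True
```

pd.concat() with axis=1 matches rows by index, while assigning a plain 1D array matches them by position, so the second option does not depend on the index labels.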
Conclusion
To summarize, you’ve learned how to convert a NumPy array to a pandas dataframe. This is also known as creating a pandas dataframe from a NumPy array.
Additionally, you’ve learned how to set column names and index values during the conversion. You’ve also learned how to convert an object type NumPy array with different column types to a dataframe, and how to add a NumPy array to an existing dataframe as a new column.
If you have any questions, comment below.