A pandas dataframe stores data in a two-dimensional, labeled structure.
You can convert select columns of a pandas dataframe to a NumPy array using the statement arr = df[["RegNo", "Country_Code"]].to_numpy().
This tutorial teaches you the different methods to convert select columns in a pandas dataframe into a NumPy array.
If You’re in a Hurry
Use the following code to convert select columns into a NumPy array.
arr = df[["RegNo", "Country_Code"]].to_numpy()
print(arr)
If You Want to Understand Details, Read on…
Let us create a sample dataframe and convert a few of its columns to a NumPy array using different methods.
Creating Dataframe
Create a sample dataframe with five columns in it.
import pandas as pd
# List of tuples, one tuple per row
users = [
    (101, 'Shivam', 'Pandey', 'India', 1),
    (102, 'Kumar', 'Ram', 'US', 2),
    (103, 'Felix', 'John', 'Germany', 3),
    (104, 'Michael', 'John', 'India', 1),
]
# Create a DataFrame object with named columns
df = pd.DataFrame(
    users,
    columns=['RegNo', 'First Name', 'Last Name', 'Country', 'Country_Code'],
)
df
DataFrame Will Look Like
| | RegNo | First Name | Last Name | Country | Country_Code |
---|---|---|---|---|---|
0 | 101 | Shivam | Pandey | India | 1 |
1 | 102 | Kumar | Ram | US | 2 |
2 | 103 | Felix | John | Germany | 3 |
3 | 104 | Michael | John | India | 1 |
Using to_numpy()
The to_numpy() method converts a pandas dataframe into a NumPy array.
To convert select columns using to_numpy():
- Select the subset of dataframe columns by passing a list of column names
- Invoke the to_numpy() method to convert those columns to a NumPy array
Use this method when you know the names of the columns you want to convert.
Code
The following code converts the two columns RegNo and Country_Code into a NumPy array.
arr = df[["RegNo", "Country_Code"]].to_numpy()
print(arr)
Output
[[101 1]
[102 2]
[103 3]
[104 1]]
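As a small illustration of what the selection returns, the sketch below rebuilds the same sample dataframe and inspects the shape and dtype of the result. The variable names numeric, mixed, and single are only for this example. It assumes nothing beyond pandas itself; the exact integer dtype may vary by platform, so the code only comments on the parts that are stable.

```python
import pandas as pd

# Same sample data as in the tutorial.
users = [
    (101, "Shivam", "Pandey", "India", 1),
    (102, "Kumar", "Ram", "US", 2),
    (103, "Felix", "John", "Germany", 3),
    (104, "Michael", "John", "India", 1),
]
df = pd.DataFrame(
    users,
    columns=["RegNo", "First Name", "Last Name", "Country", "Country_Code"],
)

# Two integer columns give a 2-D integer array with one row per record.
numeric = df[["RegNo", "Country_Code"]].to_numpy()
print(numeric.shape)  # (4, 2)

# Mixing an integer column with a text column falls back to object dtype.
mixed = df[["RegNo", "Country"]].to_numpy()
print(mixed.dtype)  # object

# Single brackets select a Series, so the result is 1-D instead of 2-D.
single = df["RegNo"].to_numpy()
print(single.shape)  # (4,)
```

Passing a list of names (double brackets) keeps the result two-dimensional even for one column, which is usually what downstream NumPy code expects.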
Using iloc And to_numpy()
The iloc attribute of the dataframe allows you to select a subset of the dataframe by integer position.
To convert select columns into a NumPy array using iloc:
- Select the subset of columns using their index positions.
- Invoke the to_numpy() method on the selection.
Use this method when you want to select columns in a specific positional range. You can also restrict the rows that are converted; for example, df.iloc[:10] selects only the first ten rows.
Code
The following code converts the columns from position 3 (the fourth column) through the last column.
arr = df.iloc[:, 3:].to_numpy()
print(arr)
Output
[['India' 1]
['US' 2]
['Germany' 3]
['India' 1]]
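The row filtering mentioned above can be sketched as follows. This is a minimal example on the same sample data; the variable names first_two and picked are only for illustration. It also shows that iloc accepts a list of positions when the columns you want are not adjacent.

```python
import pandas as pd

# Same sample data as in the tutorial.
users = [
    (101, "Shivam", "Pandey", "India", 1),
    (102, "Kumar", "Ram", "US", 2),
    (103, "Felix", "John", "Germany", 3),
    (104, "Michael", "John", "India", 1),
]
df = pd.DataFrame(
    users,
    columns=["RegNo", "First Name", "Last Name", "Country", "Country_Code"],
)

# Rows at positions 0 and 1 (the end of an iloc slice is excluded),
# columns from position 3 to the end.
first_two = df.iloc[:2, 3:].to_numpy()
print(first_two)

# Non-adjacent columns: pass a list of positions instead of a slice.
picked = df.iloc[:, [0, 4]].to_numpy()
print(picked)
```

Here first_two holds the Country and Country_Code values of the first two rows, and picked holds RegNo and Country_Code for every row.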
Using loc And to_numpy()
The loc attribute of the dataframe allows you to access specific rows and columns by label.
To convert select columns into a NumPy array using loc:
- Select the subset of columns using their labels.
- Invoke the to_numpy() method on the selection.
Use this method when you know the column names and also want to filter the rows that are converted into a NumPy array.
Code
The following code converts the rows with index labels 0 through 2 and the columns RegNo and Country_Code of those rows into a NumPy array. Unlike iloc, a loc slice includes its end label, so 0:2 selects three rows.
arr = df.loc[0:2, ["RegNo", "Country_Code"]].to_numpy()
print(arr)
Output
[[101 1]
[102 2]
[103 3]]
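Besides label slices, loc also accepts a boolean condition for the rows. The sketch below filters on the Country column of the sample data before converting. The variable names india, by_label, and by_position are only for illustration.

```python
import pandas as pd

# Same sample data as in the tutorial.
users = [
    (101, "Shivam", "Pandey", "India", 1),
    (102, "Kumar", "Ram", "US", 2),
    (103, "Felix", "John", "Germany", 3),
    (104, "Michael", "John", "India", 1),
]
df = pd.DataFrame(
    users,
    columns=["RegNo", "First Name", "Last Name", "Country", "Country_Code"],
)

# Keep only the rows where Country is "India", then convert two columns.
india = df.loc[df["Country"] == "India", ["RegNo", "Country_Code"]].to_numpy()
print(india.tolist())  # [[101, 1], [104, 1]]

# A loc slice includes its end label; an iloc slice excludes its end position.
by_label = df.loc[0:2, ["RegNo"]].to_numpy()
by_position = df.iloc[0:2, [0]].to_numpy()
print(by_label.shape)     # (3, 1)
print(by_position.shape)  # (2, 1)
```

The difference in the last two shapes is the most common source of off-by-one surprises when switching between loc and iloc.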
Using the values Attribute
The values attribute returns a NumPy representation of the pandas dataframe.
To convert specific columns of the dataframe,
- Select the desired columns using the column names
- Invoke the values attribute, and it’ll return the NumPy representation of the values
Code
arr = df[["RegNo", "Country_Code"]].values
print(arr)
Output
[[101 1]
[102 2]
[103 3]
[104 1]]
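To check that the values attribute and to_numpy() agree on this data, the following sketch compares the two results with np.array_equal. It also uses the dtype parameter of to_numpy(), which values does not offer. The variable names are only for illustration.

```python
import numpy as np
import pandas as pd

# Same sample data as in the tutorial.
users = [
    (101, "Shivam", "Pandey", "India", 1),
    (102, "Kumar", "Ram", "US", 2),
    (103, "Felix", "John", "Germany", 3),
    (104, "Michael", "John", "India", 1),
]
df = pd.DataFrame(
    users,
    columns=["RegNo", "First Name", "Last Name", "Country", "Country_Code"],
)

subset = df[["RegNo", "Country_Code"]]

# Both approaches return a NumPy array with the same contents.
from_values = subset.values
from_method = subset.to_numpy()
print(np.array_equal(from_values, from_method))  # True

# to_numpy() can additionally convert to a requested dtype.
as_float = subset.to_numpy(dtype="float64")
print(as_float.dtype)  # float64
```

If you need control over the resulting dtype, to_numpy() is the more flexible choice of the two.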