NumPy arrays are widely used for data manipulation.
You can read a CSV file into a record array in NumPy using the np.genfromtxt() function.
This tutorial teaches you how to read a CSV file into a record array in numpy.
If You’re in a Hurry
Use the following code to read a CSV file into a record array in NumPy.
import numpy as np
csv_arr = np.genfromtxt('User_details.csv', delimiter=",", dtype=None, encoding=None)
csv_arr
- By default, the data is read as the float dtype. Use dtype=None to infer the data types from the content of the file.
- Use encoding=None to use the system’s default encoding. Otherwise, NumPy may show a warning message.
If You Want to Understand the Details, Read On…
The sample CSV file contains only data and doesn’t have headers. Let us convert the CSV data into a NumPy array.
Using NumPy genfromtxt()
The genfromtxt() function loads the data from the CSV file.
It provides options to handle missing values, and you can also choose whether rows with an inconsistent number of columns raise an error or are skipped (the invalid_raise parameter).
- By default, the data is read as the float dtype. Use dtype=None to infer the data types from the content of the file.
- Use encoding=None to use the system’s default encoding. If this optional parameter is not specified, some NumPy versions raise the following warning: VisibleDeprecationWarning: Reading Unicode strings without specifying the encoding argument is deprecated
Use this method when you want to handle missing values in a special way.
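The missing-value handling mentioned above can be sketched as follows. This is only an illustration: the numeric data is hypothetical (it is not the sample User_details.csv), and it is read from an in-memory string so the snippet runs on its own. It relies on genfromtxt() treating an empty field as missing and on the filling_values parameter to choose the replacement.

```python
import io

import numpy as np

# Hypothetical numeric data; the middle field of the second row is empty.
csv_text = "1,2.5,3\n4,,6\n7,8.5,9\n"

# With the default float dtype, a missing field becomes nan.
with_nan = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# filling_values replaces missing fields with a value of your choice.
with_zero = np.genfromtxt(io.StringIO(csv_text), delimiter=",", filling_values=0.0)

print(with_nan)
print(with_zero)
```

Whether nan or a fixed fill value is the better choice depends on how the array is used afterwards.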
Code
import numpy as np
csv_arr = np.genfromtxt('User_details.csv', delimiter=",", dtype=None, encoding=None)
csv_arr
Output
array([(1, 'Shivam', 'Pandey', 'India', 1),
(2, 'Kumar', 'Ram', 'India', 1),
(3, 'Felix', 'John', 'Germany', 2)],
dtype=[('f0', '<i8'), ('f1', '<U6'), ('f2', '<U6'), ('f3', '<U7'), ('f4', '<i8')])
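The output above is a structured array whose fields get the automatic names f0 to f4. As a rough sketch, you can supply your own field names through the names parameter and then access columns by name. The column names below are assumptions made for illustration (the sample file has no header row), and the rows are copied into an in-memory string so the snippet is self-contained.

```python
import io

import numpy as np

# In-memory stand-in for 'User_details.csv', using the rows shown in the output above.
csv_text = (
    "1,Shivam,Pandey,India,1\n"
    "2,Kumar,Ram,India,1\n"
    "3,Felix,John,Germany,2\n"
)

# Hypothetical column names; they are not part of the sample file.
column_names = ["id", "first_name", "last_name", "country", "group_id"]

users = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    dtype=None,
    encoding=None,
    names=column_names,
)

# Access a column by field name.
print(users["first_name"])

# View the structured array as a record array to get attribute access.
rec = users.view(np.recarray)
print(rec.country)
```

Viewing the result as np.recarray does not copy the data. It only adds attribute-style access to the fields.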
Using NumPy loadtxt()
The loadtxt() function reads data from a text file.
It cannot handle missing values, so all the rows in the file must have the same number of columns.
- By default, the data is read as the float dtype. Unlike genfromtxt(), loadtxt() does not infer the data types when you pass dtype=None. For non-numeric columns, you need to specify an explicit dtype or an appropriate converter function.
- Alternatively, you can read only the columns that contain numerical data using the usecols parameter.
Use this method when your data doesn’t contain any missing values.
Code
The following code demonstrates how to read the first column of the CSV file into a NumPy array.
import numpy as np
csv_arr = np.loadtxt('User_details.csv', usecols=0, delimiter=",")
csv_arr
Output
array([1., 2., 3.])
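If you need the text columns as well, one option is to give loadtxt() an explicit structured dtype, since it does not infer types. The following is a minimal sketch under stated assumptions: the field names and string widths are hypothetical choices for the sample rows, and the data is read from an in-memory string instead of the file.

```python
import io

import numpy as np

# In-memory stand-in for 'User_details.csv'.
csv_text = (
    "1,Shivam,Pandey,India,1\n"
    "2,Kumar,Ram,India,1\n"
    "3,Felix,John,Germany,2\n"
)

# Hypothetical field names and widths; U10 holds strings of up to 10 characters.
user_dtype = np.dtype([
    ("id", "i8"),
    ("first_name", "U10"),
    ("last_name", "U10"),
    ("country", "U10"),
    ("group_id", "i8"),
])

users = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=user_dtype)

print(users["country"])
```

Because the widths are fixed up front, longer strings would be truncated, so choose them to fit your data.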
Using Pandas read_csv()
Pandas read_csv() is an alternative way to read a CSV file into a NumPy array.
It returns a dataframe, and you can use the to_numpy() method to convert the pandas dataframe into a NumPy array.
Advantages
- read_csv() handles commas inside quoted fields.
- No need to worry about the datatypes. It automatically infers the data types based on the data.
- It is typically fast and CPU-efficient, since its default parser is implemented in C.
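To illustrate the first point, here is a small sketch with hypothetical data in which one field contains a comma inside quotes. The rows are made up for this example and read from an in-memory string.

```python
import io

import pandas as pd

# Hypothetical rows: the second field is quoted and contains a comma.
csv_text = '1,"Pandey, Shivam",India\n2,"Ram, Kumar",India\n'

df = pd.read_csv(io.StringIO(csv_text), header=None)

# The quoted field stays in one piece, so each row has three columns.
print(df.shape)
print(df.iloc[0, 1])
```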
Code
The following code demonstrates how to read a CSV file as a dataframe and convert it into a NumPy array.
import pandas as pd
df = pd.read_csv('User_details.csv', header=None)
csv_arr = df.to_numpy()
csv_arr
Output
array([[1, 'Shivam', 'Pandey', 'India', 1],
[2, 'Kumar', 'Ram', 'India', 1],
[3, 'Felix', 'John', 'Germany', 2]], dtype=object)
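Note that the result above is a plain object array, not a record array. If you want a record array from pandas, DataFrame.to_records() is one way to get it. The sketch below assumes hypothetical column names (the sample file has no header) and reads the sample rows from an in-memory string.

```python
import io

import numpy as np
import pandas as pd

# In-memory stand-in for 'User_details.csv'.
csv_text = (
    "1,Shivam,Pandey,India,1\n"
    "2,Kumar,Ram,India,1\n"
    "3,Felix,John,Germany,2\n"
)

# Hypothetical column names; they become the field names of the record array.
column_names = ["id", "first_name", "last_name", "country", "group_id"]
df = pd.read_csv(io.StringIO(csv_text), header=None, names=column_names)

# index=False leaves the dataframe index out of the records.
records = df.to_records(index=False)

print(records.dtype.names)
print(records.country)
```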
Converting a Specific Column from CSV to a NumPy Array
To convert specific columns from the CSV file into a NumPy array, you can pass the usecols
parameter with a single column index or a tuple of column indexes.
Code
import numpy as np
csv_arr = np.genfromtxt('User_details.csv', delimiter=",", usecols=0, dtype=None, encoding=None)
csv_arr
Output
array([1, 2, 3])
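The same parameter also accepts several indexes. As a small sketch, the snippet below reads the first and fourth columns of the sample rows from an in-memory string. Because the selected columns have different types, the result is a structured array, and the field names are looked up from the dtype instead of being assumed.

```python
import io

import numpy as np

# In-memory stand-in for 'User_details.csv'.
csv_text = (
    "1,Shivam,Pandey,India,1\n"
    "2,Kumar,Ram,India,1\n"
    "3,Felix,John,Germany,2\n"
)

# Select columns 0 and 3 (zero-based indexes).
subset = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    usecols=(0, 3),
    dtype=None,
    encoding=None,
)

# One field per selected column, in the order given in usecols.
first_field, second_field = subset.dtype.names
print(subset[first_field])
print(subset[second_field])
```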