CSV files are sometimes stored in online storage, and you may need to read them via URL.
You can read a CSV file from a URL in Pandas by passing the URL to the read_csv() method.
Syntax
import pandas as pd
df = pd.read_csv("https://sampleurl.com/csv/addresses.csv")
df
This tutorial teaches you how to read a CSV file from a URL and how to authenticate read requests while reading.
Using read_csv
The read_csv() method reads a CSV file from either a file path or a URL.
To read the CSV file from a URL,
- Pass the URL to the method instead of the filename or file path.
- Ensure the URL is publicly accessible. Otherwise, you'll get an HTTP Error 401: Unauthorized error.
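If a URL might be protected, the 401 case can be caught instead of crashing the script. The sketch below is one possible approach, not the only one: read_csv_from_url is a hypothetical helper name, and it relies on pandas surfacing urllib.error.HTTPError when the server rejects an http(s) request.

```python
import pandas as pd
from urllib.error import HTTPError


def read_csv_from_url(url):
    """Return a DataFrame, or None if the server rejects the request.

    Hypothetical helper for illustration; not part of pandas.
    """
    try:
        return pd.read_csv(url)
    except HTTPError as err:
        # err.code is the HTTP status (for example 401), err.reason the message
        print(f"Could not read {url}: HTTP {err.code} {err.reason}")
        return None
```

Calling read_csv_from_url("https://sampleurl.com/csv/addresses.csv") would return the dataframe for a public file and None for one that needs credentials.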
Code
The following code demonstrates how to use the read_csv() method to read a public URL.
import pandas as pd
df = pd.read_csv("https://people.sc.fsu.edu/~jburkardt/data/csv/addresses.csv")
df
The CSV file is read, and the dataframe is created.
DataFrame Will Look Like
 | John | Doe | 120 jefferson st. | Riverside | NJ | 08075 |
---|---|---|---|---|---|---|
0 | Jack | McGinnis | 220 hobo Av. | Phila | PA | 9119 |
1 | John “Da Man” | Repici | 120 Jefferson St. | Riverside | NJ | 8075 |
2 | Stephen | Tyler | 7452 Terrace “At the Plaza” road | SomeTown | SD | 91234 |
3 | NaN | Blankman | NaN | SomeTown | SD | 298 |
4 | Joan “the bone”, Anne | Jet | 9th, at Terrace plc | Desert City | CO | 123 |
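In the output above, the first record (John Doe) became the column names because read_csv assumes the first row is a header, and the ZIP codes lost their leading zeros because that column was parsed as numbers. A minimal sketch of one way to handle both, assuming hypothetical column names and using a two-row inline sample in place of the URL so that it runs offline:

```python
import io
import pandas as pd

# Two rows in the same shape as the sample file (no header row)
sample = io.StringIO(
    "John,Doe,120 jefferson st.,Riverside,NJ,08075\n"
    "Jack,McGinnis,220 hobo Av.,Phila,PA,09119\n"
)

# Hypothetical column names; the file itself does not define any
columns = ["first_name", "last_name", "street", "city", "state", "zip"]

df = pd.read_csv(
    sample,
    header=None,         # treat the first row as data, not as column names
    names=columns,       # supply the column names explicitly
    dtype={"zip": str},  # keep ZIP codes as text so leading zeros survive
)
print(df)
```

The same header, names, and dtype arguments can be passed when the first argument is a URL.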
Using read_csv with Authentication
Sometimes CSV files are stored at a location that is protected with authentication credentials.
To read CSV files from protected URLs,
- Encode the username and password using the base64 encoder.
- Pass the authorization details using the storage_options parameter of the read_csv() method. For HTTP(S) URLs, the key-value pairs in storage_options are forwarded to urllib.request.Request as request headers.
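To make the encoding step concrete, here is a small sketch that builds the Authorization header for HTTP Basic authentication using only the standard library. The function name basic_auth_header is hypothetical, and user and pass are placeholder credentials.

```python
from base64 import b64encode


def basic_auth_header(username, password):
    """Build a Basic Authorization header dict (hypothetical helper)."""
    # Basic auth encodes "username:password" with base64
    raw = f"{username}:{password}".encode("utf-8")
    token = b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("user", "pass")
print(headers)  # {'Authorization': 'Basic dXNlcjpwYXNz'}
```

The returned dictionary has the shape that storage_options expects for an HTTP(S) URL. Note that base64 is an encoding, not encryption, so such a header should only be sent over HTTPS.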
Code
The following code demonstrates how to authenticate requests to a private URL using the authorization details.
- Remember to replace user:pass with your own username and password.
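import pandas as pd
from base64 import b64encode

# Encode "username:password" and decode the result to a plain string
credentials = b64encode(b'user:pass').decode('ascii')

df = pd.read_csv(
    'https://people.sc.fsu.edu/~jburkardt/data/csv/addresses.csv',
    # Sent as a request header for HTTP(S) URLs
    storage_options={'Authorization': f'Basic {credentials}'},
)
df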
The request is authenticated using the credentials, and the CSV file is read.
DataFrame Will Look Like
 | John | Doe | 120 jefferson st. | Riverside | NJ | 08075 |
---|---|---|---|---|---|---|
0 | Jack | McGinnis | 220 hobo Av. | Phila | PA | 9119 |
1 | John “Da Man” | Repici | 120 Jefferson St. | Riverside | NJ | 8075 |
2 | Stephen | Tyler | 7452 Terrace “At the Plaza” road | SomeTown | SD | 91234 |
3 | NaN | Blankman | NaN | SomeTown | SD | 298 |
4 | Joan “the bone”, Anne | Jet | 9th, at Terrace plc | Desert City | CO | 123 |