How to Read a CSV file From a URL in Pandas (With Authentication) – Definitive Guide

CSV files are sometimes stored in online storage, and you may need to read them via URL.

You can read a CSV file from a URL in Pandas by passing the URL to the read_csv() function.

Syntax

import pandas as pd

df = pd.read_csv("https://sampleurl.com/csv/addresses.csv")

df

This tutorial teaches you how to read a CSV file from a URL and how to authenticate read requests while reading.

Using read_csv

The read_csv() function reads a CSV file from either a file path or a URL.

To read the CSV file from a URL:

  • Pass the URL to read_csv() instead of a filename or file path.
  • Ensure the URL is publicly accessible. Otherwise, the request fails with an error such as HTTP Error 401: Unauthorized.

Code

The following code demonstrates how to use the read_csv() function to read a public URL.

import pandas as pd

df = pd.read_csv("https://people.sc.fsu.edu/~jburkardt/data/csv/addresses.csv")

df

The CSV file is read, and the dataframe is created.

DataFrame Will Look Like

   John                   Doe       120 jefferson st.                 Riverside    NJ  08075
0  Jack                   McGinnis  220 hobo Av.                      Phila        PA  9119
1  John “Da Man”          Repici    120 Jefferson St.                 Riverside    NJ  8075
2  Stephen                Tyler     7452 Terrace “At the Plaza” road  SomeTown     SD  91234
3  NaN                    Blankman  NaN                               SomeTown     SD  298
4  Joan “the bone”, Anne  Jet       9th, at Terrace plc               Desert City  CO  123
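
The sample file has no header row, which is why the first record (John Doe) appears as the column labels in the output above. A minimal sketch of one way to avoid that, assuming you want to supply your own labels, is to pass header=None together with names. The column names below are hypothetical (the file does not define any), and the inline text stands in for the remote file so the snippet runs without network access.

```python
import io

import pandas as pd

# Two records in the same shape as the sample file: no header row.
# This inline text stands in for the remote CSV so the example runs offline.
csv_text = (
    "John,Doe,120 jefferson st.,Riverside,NJ,08075\n"
    "Jack,McGinnis,220 hobo Av.,Phila,PA,09119\n"
)

# Hypothetical column labels; the sample file does not define any.
columns = ["first_name", "last_name", "street", "city", "state", "zip_code"]

df = pd.read_csv(
    io.StringIO(csv_text),
    header=None,              # do not treat the first record as column labels
    names=columns,            # use our own labels instead
    dtype={"zip_code": str},  # keep leading zeros such as "08075"
)

print(df)
```

The same header, names, and dtype arguments can be used when the first argument is a URL instead of an in-memory buffer.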

Using read_csv with Authentication

Sometimes CSV files are stored at a location protected by authentication credentials.

To read CSV files from protected URLs that use HTTP Basic authentication:

  • Encode the credentials, written as username:password, using the base64 encoder.
  • Pass the authorization details using the storage_options parameter of read_csv(). For HTTP(S) URLs, the key-value pairs in storage_options are forwarded as request headers to urllib.request.Request.
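
The two steps above can be sketched as a small helper that builds the header dictionary. The function name basic_auth_header is hypothetical, not part of pandas, and the sketch assumes the server expects HTTP Basic authentication.

```python
from base64 import b64encode


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header (hypothetical helper)."""
    # Basic auth base64-encodes the string "username:password".
    credentials = f"{username}:{password}".encode("utf-8")
    # Decode back to str so the header value is plain text, not bytes.
    token = b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("user", "pass")
print(headers)
```

The resulting dictionary can be passed to read_csv() as storage_options=headers. Note that base64 is an encoding, not encryption, so this is best used only over HTTPS.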

Code

The following code demonstrates how to authenticate a request to a private URL using the authorization details.

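  • Remember to replace user:pass with your own username and password.

import pandas as pd

from base64 import b64encode

# Base64-encode "username:password" and decode to str for the header value
token = b64encode(b'user:pass').decode('ascii')

df = pd.read_csv(
     'https://people.sc.fsu.edu/~jburkardt/data/csv/addresses.csv',
     # For HTTP(S) URLs, storage_options entries are sent as request headers
     storage_options={'Authorization': f'Basic {token}'},
)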

df

The request is authenticated using the credentials, and the CSV file is read.

DataFrame Will Look Like

   John                   Doe       120 jefferson st.                 Riverside    NJ  08075
0  Jack                   McGinnis  220 hobo Av.                      Phila        PA  9119
1  John “Da Man”          Repici    120 Jefferson St.                 Riverside    NJ  8075
2  Stephen                Tyler     7452 Terrace “At the Plaza” road  SomeTown     SD  91234
3  NaN                    Blankman  NaN                               SomeTown     SD  298
4  Joan “the bone”, Anne  Jet       9th, at Terrace plc               Desert City  CO  123
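
If the credentials are missing or wrong, the server typically answers with an error such as 401 Unauthorized, which surfaces in Python as urllib.error.HTTPError. A minimal sketch of catching it follows; the function name read_csv_or_none and its headers parameter are hypothetical, and returning None on failure is just one possible design.

```python
import urllib.error

import pandas as pd


def read_csv_or_none(url, headers=None):
    """Return a DataFrame, or None if the server rejects the request.

    Hypothetical helper: `headers` is passed through as storage_options.
    """
    try:
        return pd.read_csv(url, storage_options=headers)
    except urllib.error.HTTPError as err:
        # For example, 401 Unauthorized when credentials are missing or wrong.
        print(f"Request failed with HTTP {err.code}: {err.reason}")
        return None
```

A caller can then check the result with `if df is None:` and decide whether to prompt for new credentials or stop.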
