How to Download File From S3 Using Boto3 [Python]?

Introduction

Boto3 is the AWS SDK for Python. It lets you create and manage AWS services such as EC2 and S3, and it offers both a high-level, object-oriented API (resources) and a low-level API (clients) for those services.

In this tutorial, you’ll

  • Create a session in Boto3 [Python]
  • Download a file from S3 using Boto3 [Python]
  • Download all files from an S3 bucket using Boto3 [Python]

Prerequisites

  • Install Boto3 using the command sudo pip3 install boto3
  • If the AWS CLI is installed and configured, you can use the same credentials to create a session using Boto3. You can install and configure the AWS CLI using the guide How to install and Configure AWS Cli on ubuntu.

Create S3 Session in Boto3

In this section, you’ll create an S3 session in Boto3.

  • Create a session by using the boto3.Session() API and passing the access key ID and the secret access key.
  • If you do not pass the credentials explicitly, Boto3 looks at several configuration locations (such as environment variables and the files written when you configure the AWS CLI) until it finds them.

Create a generic session to your AWS service using the following code.

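import boto3

# settings is assumed to be your application's own configuration module
# (for example, Django's settings); it is not part of Boto3.
session = boto3.Session(
    aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY,
    aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY
)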
  • boto3.Session() – API method to create a session
  • aws_access_key_id – Parameter for the access key ID. settings.AWS_SERVER_PUBLIC_KEY refers to a value defined in your own application settings; it is not set by the AWS CLI. If you installed and configured the AWS CLI as specified in the prerequisites, you can omit this parameter and Boto3 reads the key from the CLI configuration
  • aws_secret_access_key – Parameter for the secret access key. settings.AWS_SERVER_SECRET_KEY likewise refers to a value defined in your own application settings. If the AWS CLI is configured, you can omit this parameter as well

You’ve created a generic session for your AWS account.
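As a rough sketch of an alternative to a settings module, the helper below builds the arguments for boto3.Session() from the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, and leaves the lookup to Boto3 when they are missing. The function names session_kwargs_from_env and create_session are hypothetical and chosen only for illustration.

```python
import os


def session_kwargs_from_env(environ=None):
    """Return keyword arguments for boto3.Session() built from environment variables.

    An empty dict means "let Boto3 find the credentials on its own",
    for example from the files written when the AWS CLI was configured.
    """
    if environ is None:
        environ = os.environ
    key_id = environ.get("AWS_ACCESS_KEY_ID")
    secret = environ.get("AWS_SECRET_ACCESS_KEY")
    if key_id and secret:
        return {"aws_access_key_id": key_id, "aws_secret_access_key": secret}
    return {}


def create_session(environ=None):
    """Create a Boto3 session from the environment (hypothetical helper)."""
    # Imported here so session_kwargs_from_env() can be tried without Boto3 installed
    import boto3
    return boto3.Session(**session_kwargs_from_env(environ))
```

Calling create_session() with no arguments then gives you a session you can use exactly like the one created above, without keeping keys in your source code.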

Use the following command to access S3 as a resource using the session.

s3 = session.resource('s3')

If you do not want to create a session and access the resource, you can create an s3 client directly by using the following command.

s3_client = boto3.client('s3', 
                      aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY, 
                      aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY, 
                      region_name=REGION_NAME
                      )
  • boto3.client – API method to create a client directly
  • aws_access_key_id – Parameter for the access key ID. settings.AWS_SERVER_PUBLIC_KEY refers to a value defined in your own application settings; it is not set by the AWS CLI. If the AWS CLI is configured as specified in the prerequisites, you can omit this parameter
  • aws_secret_access_key – Parameter for the secret access key. settings.AWS_SERVER_SECRET_KEY likewise refers to a value defined in your own application settings, and can be omitted if the AWS CLI is configured
  • region_name – Region where your S3 bucket resides, for example 'ap-south-1'. An AWS Region is a separate geographic area. You can learn more about the AWS regions and the list of regions available in this AWS guide

You’ve created a client directly to access the S3 objects.

Download a Single File From S3 Using Boto3

This section teaches you to download a single file from AWS S3 using Boto3.

  • Use the download_file() method of the Bucket object from the Boto3 S3 resource.

Code

import boto3

session = boto3.Session(
    aws_access_key_id='<Access Key ID>',
    aws_secret_access_key='<Secret Access Key>',
)

s3 = session.resource('s3')

s3.Bucket('BUCKET_NAME').download_file('OBJECT_NAME', 'FILE_NAME')

print('success')
  • session – Session created with your AWS account, as explained in the previous section
  • s3 – Resource created from the session
  • s3.Bucket().download_file() – API method to download a file from your S3 bucket.
    • BUCKET_NAME – Name of your S3 bucket, which acts as the root or parent folder
    • OBJECT_NAME – Key of the object in S3, that is, its full path inside the bucket including the sub folders, e.g. folder1/folder2/filename.txt
    • FILE_NAME – Local path where the downloaded file is saved. You can give a name that is different from the object name, e.g. if your file exists in S3 as a.txt, you can download it as b.txt using this parameter
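To make the difference between the object key and the local file name concrete, here is a minimal sketch. It assumes bucket is the result of s3.Bucket('BUCKET_NAME'); the helper names local_name_for_key and download_one are hypothetical, and the download directory is assumed to exist already.

```python
import os


def local_name_for_key(object_key, download_dir="."):
    """Return a local path for an S3 object key, keeping only its file name."""
    return os.path.join(download_dir, os.path.basename(object_key))


def download_one(bucket, object_key, download_dir="."):
    """Download one object through a Boto3 Bucket resource (hypothetical helper)."""
    local_path = local_name_for_key(object_key, download_dir)
    # First argument: the key inside the bucket. Second argument: the local file path.
    bucket.download_file(object_key, local_path)
    return local_path
```

For example, download_one(s3.Bucket('BUCKET_NAME'), 'folder1/folder2/filename.txt') would save the object as filename.txt in the current directory.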

Use the following script to download a single file from S3 using Boto3 Client.

import boto3
 
s3_client = boto3.client('s3', 
                      aws_access_key_id='<Access Key ID>',
                      aws_secret_access_key='<Secret Access Key>',
                      region_name='ap-south-1'
                      )

s3_client.download_file('BUCKET_NAME', 'OBJECT_NAME', 'FILE_NAME')
print('success')
  • s3_client – Client created for S3 using Boto3
  • s3_client.download_file() – API method to download a file from your S3 bucket.
    • BUCKET_NAME – Name of your S3 bucket, which acts as the root or parent folder
    • OBJECT_NAME – Key of the object in S3, that is, its full path inside the bucket including the sub folders, e.g. folder1/folder2/filename.txt
    • FILE_NAME – Local path where the downloaded file is saved. You can give a name that is different from the object name, e.g. if your file exists in S3 as a.txt, you can download it as b.txt using this parameter

You’ve downloaded a single file from AWS S3 using Python Boto3.
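One detail worth sketching: as far as I know, download_file does not create missing local folders, so saving to a path such as downloads/reports/b.txt fails if that folder does not exist. The helper below, with the hypothetical name download_with_client, creates the parent folder first and then calls the client.

```python
import os


def download_with_client(s3_client, bucket_name, object_key, local_path):
    """Download one object with a Boto3 S3 client, creating local folders first."""
    parent = os.path.dirname(local_path)
    if parent:
        # Create the local parent folders if they are missing
        os.makedirs(parent, exist_ok=True)
    # Argument order for the client: bucket name, object key, local file path
    s3_client.download_file(bucket_name, object_key, local_path)
    return local_path
```

You would call it with the client from the script above, for example download_with_client(s3_client, 'BUCKET_NAME', 'OBJECT_NAME', 'downloads/FILE_NAME').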

After downloading a file, you can Read the file Line By Line in Python.

Download All Files From S3 Using Boto3

In this section, you’ll download all files from S3 using Boto3.

  • Create an S3 resource and iterate over the objects returned by the objects.all() API in a for loop.
  • Create the necessary subdirectories so that files with the same name in different sub folders of the bucket do not overwrite each other.
  • Then download the file.
import os
import boto3

# Create Session
# (settings is assumed to be your own configuration module, as in the earlier sections)
session = boto3.Session(
    aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY,
    aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY,
)

# Initiate S3 Resource
s3 = session.resource('s3')

# Select Your S3 Bucket
your_bucket = s3.Bucket('your_bucket_name')

# Iterate All Objects in Your S3 Bucket Over the for Loop
for s3_object in your_bucket.objects.all():

    # Keys ending in "/" are folder placeholders, not files, so skip them
    if s3_object.key.endswith('/'):
        continue

    # Split the object key: parent directories are stored in path
    # and the file name is stored in filename
    path, filename = os.path.split(s3_object.key)

    # Create the sub directories if they do not exist yet.
    # path is empty when the file is available directly in your bucket.
    if path:
        os.makedirs(path, exist_ok=True)

    # Download the file into its sub directory, or the current directory if there is none
    your_bucket.download_file(s3_object.key, os.path.join(path, filename))
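
Before downloading a whole bucket, it can help to preview where each file would land. The following is a small sketch of such a dry run; plan_downloads is a hypothetical helper, not a Boto3 API, and it only works on the key strings.

```python
import os


def plan_downloads(object_keys, dest_dir="."):
    """Map S3 object keys to local paths under dest_dir.

    Keys ending in "/" are treated as folder placeholders and skipped.
    """
    plan = []
    for key in object_keys:
        if key.endswith("/"):
            continue
        # S3 keys conventionally use "/" as the folder separator,
        # so split on it and join with the local OS separator
        local_path = os.path.join(dest_dir, *key.split("/"))
        plan.append((key, local_path))
    return plan
```

With a real bucket you could pass the keys in with plan_downloads((obj.key for obj in your_bucket.objects.all()), 'downloads') and print the pairs before running the download loop.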

Download Folder From S3 Using Boto3

Boto3 has no single API call that downloads a folder from S3, because S3 has no real folders, only object keys that share a common prefix. Instead, you can download all files from a directory by looping over the objects as shown in the previous section. It’s the clean implementation.
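
As a minimal sketch of that idea, the loop can be limited to one folder with objects.filter(Prefix=...) on the Bucket resource. The helper name download_prefix and the dest_dir parameter are assumptions made for this example.

```python
import os


def download_prefix(bucket, prefix, dest_dir="."):
    """Download every object whose key starts with prefix into dest_dir.

    bucket is expected to be a Boto3 Bucket resource, e.g. s3.Bucket('your_bucket_name').
    Returns the list of local paths that were written.
    """
    downloaded = []
    for s3_object in bucket.objects.filter(Prefix=prefix):
        if s3_object.key.endswith("/"):
            continue  # folder placeholder, nothing to download
        local_path = os.path.join(dest_dir, *s3_object.key.split("/"))
        parent = os.path.dirname(local_path)
        if parent:
            # Recreate the folder structure of the key locally
            os.makedirs(parent, exist_ok=True)
        bucket.download_file(s3_object.key, local_path)
        downloaded.append(local_path)
    return downloaded
```

For example, download_prefix(your_bucket, 'folder1/', 'downloads') would fetch only the objects under folder1/ and keep their sub folder layout inside downloads.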

You can also check How to download all files and folders from S3 using AWS Cli.

Running Python File in Terminal

After you’ve created the script in Python3, you may need to run the Python script from the terminal. Refer to the tutorial to learn How to Run Python File in terminal.

If you have any issues, you can also comment below to ask a question.

Conclusion

In this tutorial, you’ve learned

  • How to specify credentials when connecting to AWS using Boto3 Python
  • How to download a file from S3 using Boto3 Python
  • How to download all files from an AWS S3 bucket using Boto3 Python
  • How to download a folder from S3 using Boto3 Python
