How to Retrieve Subfolder Names in an S3 Bucket in Boto3 Python? – Definitive Guide

AWS S3 (Simple Storage Service) is an object storage service. It lets you organise the objects in a bucket using key prefixes. These prefixes act much like subfolders.

You can retrieve the subfolder names in an S3 bucket in Boto3 using the list_objects_v2(Bucket=bucket, Delimiter='/').get('CommonPrefixes') statement.

This tutorial teaches you the different methods to retrieve subfolder names in an S3 bucket using the Boto3 client or Boto3 resource.

To understand the difference between the Boto3 client and resource, read Difference between Boto3 Resource And Client.

Retrieve Subfolder names in An S3 Bucket Using Boto3 Client

In this section, you’ll use the Boto3 client to list all the subfolder names in an S3 bucket.

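Follow the steps below to retrieve the subfolder names in an S3 bucket.

  • Create a Boto3 client object by specifying the security credentials
  • Call the list_objects_v2() method with the bucket name and Delimiter='/'.
  • Iterate through the CommonPrefixes key of the response and print the Prefix of each entry.

When using the Boto3 client with a delimiter, only the immediate subfolders are listed.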

Code

The following code demonstrates how to use the Boto3 client and the list_objects_v2() to list the subfolder names of an S3 bucket.

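import boto3

# Placeholder credentials: replace with your own
s3_client = boto3.client(
    's3',
    aws_access_key_id='Your Access Key ID',
    aws_secret_access_key='Your Secret access key',
)

bucket = 'stackvidhya'

# Delimiter='/' rolls keys up by their first path segment into CommonPrefixes
result = s3_client.list_objects_v2(Bucket=bucket, Delimiter='/')

# CommonPrefixes is missing when there are no subfolders, so default to []
for obj in result.get('CommonPrefixes', []):
    print(obj.get('Prefix'))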

Output

The immediate subfolders in an S3 bucket are displayed.

    csv_files/
    json_files/
    text_files/
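
A single list_objects_v2() call returns at most 1,000 keys and common prefixes combined, so a large bucket may need several requests. The sketch below shows one possible way to collect the subfolders across all pages with the client's paginator. The function name list_subfolders is hypothetical and not part of Boto3; it expects an S3 client such as the s3_client created above.

```python
def list_subfolders(s3_client, bucket, prefix=""):
    """Return the immediate subfolder prefixes under `prefix`, across all pages.

    `list_subfolders` is an illustrative helper, not a Boto3 API.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    subfolders = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
        # A page without subfolders has no 'CommonPrefixes' key at all.
        for entry in page.get("CommonPrefixes", []):
            subfolders.append(entry["Prefix"])
    return subfolders
```

You could call it as list_subfolders(s3_client, 'stackvidhya') and print each returned prefix.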

Retrieve Subfolder names in An S3 Bucket Using Boto3 Resource

In this section, you’ll use the Boto3 Resource to list all the subfolder names in an S3 bucket.

Follow the below steps to retrieve the subfolder names in an S3 bucket.

  • Create a Boto3 Resource object by specifying the security credentials
  • Create a Bucket object with the bucket name and use its objects.all() method to get all the objects available in the bucket. This returns every object, including the zero-byte placeholder objects whose keys end with / (created, for example, when you create a folder in the S3 console).
  • If the object key ends with /, it is a subfolder. You can check this using an if statement and print it.

All the subfolders that have such a placeholder object are displayed, including the subfolders under the subfolder.

Code

The following code demonstrates how to use the Boto3 resource and the objects.all() to list the subfolder names of an S3 bucket.

import boto3

# Placeholder credentials: replace with your own
session = boto3.Session(
    aws_access_key_id='Your Access Key ID',
    aws_secret_access_key='Your Secret access key',
)

s3_resource = session.resource('s3')

s3_bucket = s3_resource.Bucket('stackvidhya')

# Keys ending in '/' are the folder placeholder objects
for obj in s3_bucket.objects.all():
    if obj.key.endswith('/'):
        print(obj.key)

Output

    csv_files/
    csv_files/csv_sub_folder/
    json_files/
    text_files/
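
The endswith('/') check only finds subfolders that exist as placeholder objects. If files were uploaded directly under a prefix (for example csv_files/a.csv) without such a placeholder, that prefix would not be printed. As a rough sketch, you could instead derive the subfolder names from the object keys themselves. The helper name folder_prefixes is hypothetical and works on plain key strings.

```python
def folder_prefixes(keys):
    """Return the sorted folder-like prefixes implied by a list of object keys.

    `folder_prefixes` is an illustrative helper, not a Boto3 API.
    """
    prefixes = set()
    for key in keys:
        # Drop the last segment: the file name, or '' for a placeholder key.
        parts = key.split("/")[:-1]
        # Record every ancestor level, e.g. 'a/' and 'a/b/' for 'a/b/c.csv'.
        for depth in range(1, len(parts) + 1):
            prefixes.add("/".join(parts[:depth]) + "/")
    return sorted(prefixes)
```

With the resource from this section, it could be used as folder_prefixes(obj.key for obj in s3_bucket.objects.all()).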

Retrieve Subfolders Inside a Specific S3 Prefix Using Boto3 Client

This section teaches you how to retrieve subfolders inside an S3 prefix or subfolder using the Boto3 client.

In other words, how to retrieve subfolders inside a subfolder of an S3 bucket.

  • Pass the prefix name using the Prefix parameter to the list_objects_v2() method. The prefix name must end with /.
  • The immediate subfolders under that specific prefix will be returned.
  • You can iterate through the CommonPrefixes key and get the prefixes.

Code

The following code demonstrates how to retrieve the subfolders under a specific subfolder.

import boto3

# Placeholder credentials: replace with your own
s3_client = boto3.client(
    's3',
    aws_access_key_id='Your Access Key ID',
    aws_secret_access_key='Your Secret access key',
)

bucket = 'stackvidhya'

# Important: '/' at the end
prefix = 'csv_files/'

result = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, Delimiter='/')

# CommonPrefixes is missing when the prefix has no subfolders, so default to []
for o in result.get('CommonPrefixes', []):
    print(o.get('Prefix'))

Output

    csv_files/csv_sub_folder/
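
Because each returned prefix can itself be passed as the Prefix of another listing, the same call can be repeated to walk nested subfolders at every depth. The following is a minimal sketch of that idea, assuming an S3 client like the one above. The generator name walk_subfolders is hypothetical, and it issues one listing per subfolder, so it may be slow on deeply nested buckets.

```python
def walk_subfolders(s3_client, bucket, prefix=""):
    """Yield every subfolder prefix below `prefix`, depth-first.

    `walk_subfolders` is an illustrative helper, not a Boto3 API.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
        # 'CommonPrefixes' is absent when this level has no subfolders.
        for entry in page.get("CommonPrefixes", []):
            child = entry["Prefix"]
            yield child
            # Descend: the child becomes the Prefix of the next listing.
            yield from walk_subfolders(s3_client, bucket, child)
```

For example, iterating over walk_subfolders(s3_client, 'stackvidhya', 'csv_files/') would yield the subfolders under csv_files/ and any folders nested inside them.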

Retrieve Subfolders inside an S3 Prefix Using Boto3 Resource

This section teaches you how to retrieve subfolders inside an S3 prefix or subfolder using the Boto3 resource.

The Boto3 resource has no dedicated method that returns only the subfolders of a particular prefix. Hence, you need to get the objects and filter them by key.

  • Get all the objects
  • Check if the object key starts with your desired prefix and ends with /.
  • If both conditions are True, print it. Keys that pass these checks are the specified prefix itself and the subfolders under it.

Code

The following code demonstrates how to retrieve the subfolders under a specific subfolder using the boto3 resource.

import boto3

# Placeholder credentials: replace with your own
session = boto3.Session(
    aws_access_key_id='Your Access Key ID',
    aws_secret_access_key='Your Secret access key',
)

s3_resource = session.resource('s3')

# Important: '/' at the end
prefix = 'csv_files/'

s3_bucket = s3_resource.Bucket('stackvidhya')

# Use the boolean 'and' (not the bitwise '&') to combine the two checks
for obj in s3_bucket.objects.all():
    if obj.key.startswith(prefix) and obj.key.endswith('/'):
        print(obj.key)

Output

    csv_files/
    csv_files/csv_sub_folder/
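
Fetching every object and filtering in Python can be slow for large buckets. The resource collection also offers objects.filter(Prefix=...), which asks S3 to return only the keys under that prefix. Below is a small sketch of that variant, assuming a Bucket object like s3_bucket above. The helper name placeholder_subfolders is hypothetical, and it leaves out the prefix's own placeholder key so that only the folders below it remain.

```python
def placeholder_subfolders(s3_bucket, prefix):
    """Return keys ending in '/' that sit strictly below `prefix`.

    `placeholder_subfolders` is an illustrative helper, not a Boto3 API.
    """
    return [
        obj.key
        # filter(Prefix=...) limits the listing to keys under the prefix.
        for obj in s3_bucket.objects.filter(Prefix=prefix)
        # Keep folder placeholders, but skip the prefix's own placeholder.
        if obj.key.endswith("/") and obj.key != prefix
    ]
```

Calling placeholder_subfolders(s3_bucket, 'csv_files/') returns a list of keys that you can print one by one.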

Conclusion

You’ve learned how to retrieve subfolder names in an S3 bucket using the Boto3 client and the Boto3 resource.

Additionally, you’ve learned how to retrieve the subfolders under a specific prefix.

You can use these steps according to your needs.

If you have any questions, feel free to comment below.
