AWS S3 – List all objects within a given bucket

29 November 2016

I think it is fair to say that Amazon Web Services offers real depth across its products, but with that depth comes, unsurprisingly, some complexity in getting to grips with it. I have spent the last few months working with Lambda, SNS, DynamoDB and S3 on various projects. The documentation does not always make clear what the best-practice way is to perform a given operation with the various AWS products.

One such issue I had was listing all objects within an S3 bucket. While working on uploading objects to S3, I wanted a simple and efficient operation that would list all items, so I could verify that every object had been uploaded correctly. For this, I used Boto3, the AWS SDK for Python. The following code snippet shows how to get a listing of all objects within a given bucket:

import boto3

# Uncomment this to enable botocore debug mode
# boto3.set_stream_logger(name='botocore')

# Create a boto3 session
session = boto3.session.Session()

# Create a resource instance
s3 = session.resource('s3')

# Our S3 bucket for which we want to list all objects
s3_bucket_name = 'my-test-bucket'

# Create an iterable of all ObjectSummary resources
objects = s3.Bucket(s3_bucket_name).objects.all()

# Print out objects
for obj in objects:
    print(obj)

The important thing to note here is that no requests to Amazon S3 are made until the for loop is executed. The first iteration triggers an API call to Amazon S3, and the response contains up to 1000 objects. Once these have been exhausted, another API call is made to retrieve the next batch of up to 1000. This means that for a bucket with 10000 objects, the above code will result in 10 GET requests to S3. You can see this for yourself by uncommenting the debug line in the above snippet and inspecting the output. Be warned, it is very verbose!
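To make the batching a little more concrete, here is a minimal sketch rather than a definitive implementation. It assumes that each page yielded by the collection's pages() method corresponds to one list request, and the helper names (expected_list_requests, count_pages, count_bucket_pages) are my own, not part of Boto3.

```python
def expected_list_requests(object_count, page_size=1000):
    """Estimate how many list requests a full iteration needs.

    Assumes every page except the last is full. An empty bucket
    still costs one request, because the first call must be made
    to discover that there is nothing to return.
    """
    return max(1, (object_count + page_size - 1) // page_size)


def count_pages(collection):
    """Walk a Boto3 resource collection page by page.

    Returns (number of pages, number of objects). Each page is
    assumed to be the result of a single list request.
    """
    page_count = 0
    object_count = 0
    for page in collection.pages():
        page_count += 1
        object_count += len(page)
    return page_count, object_count


def count_bucket_pages(bucket_name):
    """Hypothetical convenience wrapper; needs boto3 and AWS credentials."""
    import boto3  # imported here so the helpers above work without it

    bucket = boto3.session.Session().resource('s3').Bucket(bucket_name)
    return count_pages(bucket.objects.all())
```

Calling count_bucket_pages('my-test-bucket') against a bucket holding 10000 objects should agree with expected_list_requests(10000), which returns 10. These helpers sit alongside the listing loop and do not change it.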

Here is some sample output from the listing loop:

s3.ObjectSummary(bucket_name='my-test-bucket', key=u'pictures/clouds.jpg')
s3.ObjectSummary(bucket_name='my-test-bucket', key=u'pictures/snow.jpg')
s3.ObjectSummary(bucket_name='my-test-bucket', key=u'pictures/mountains.jpg')

Note that only the 's3.ObjectSummary' representation is printed here; each ObjectSummary also exposes other attributes, such as:

  • size
  • e_tag
  • last_modified
  • owner
  • storage_class
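
As a rough illustration of how the attributes listed above might be used for the upload check that motivated this post, the sketch below collects a few of them per key and reports which expected keys are missing from the listing. The function names and the idea of keeping a list of expected keys are assumptions for the example, not something Boto3 provides.

```python
def summarize(objects):
    """Map each object key to a few of its ObjectSummary attributes."""
    return {
        obj.key: {
            'size': obj.size,
            'e_tag': obj.e_tag,
            'last_modified': obj.last_modified,
            'storage_class': obj.storage_class,
        }
        for obj in objects
    }


def missing_keys(expected_keys, objects):
    """Return, in sorted order, the expected keys absent from the listing."""
    listed = set(obj.key for obj in objects)
    return sorted(set(expected_keys) - listed)
```

With the objects iterable from the listing snippet, missing_keys(['pictures/clouds.jpg', 'pictures/rain.jpg'], objects) would return only the keys that never made it into the bucket. Each call iterates the collection, so it triggers its own round of list requests.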