
Improve S3 performance for listing objects in transfer tasks#10293

Open
jamesls wants to merge 5 commits into aws:v2 from jamesls:jmes-bucket-lister

Conversation

@jamesls (Member) commented May 8, 2026

This improves the rate at which we can list objects for S3 transfer tasks such as recursive download, sync, and S3-to-S3 copies. In high-compute environments this has become one of the main bottlenecks affecting the transfer of a large number of objects, particularly when using the CRT transfer client: we aren't able to queue work fast enough. To speed things up I added three changes.

The first is an improvement in parsing the `ListObjectsV2` response. We were previously double-parsing the `LastModified` member, mostly a historical artifact from when the CLI parsed timestamps differently than botocore. Because this custom parsing was left in place in the bucket lister, we were parsing the timestamps twice. To minimize the scope of changes, we keep the existing local-timezone datetime parsing in the bucket lister, but we set the botocore parser used by the bucket lister client to be a noop. This does make the code slightly more complicated: because we only plumb this behavior through for the bucket lister, we need new client factory methods for it, so we should decide whether it's worth making this behavior the default for all S3 client creation in the CLI.
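To illustrate the double-parse problem, here is a minimal sketch (function names are illustrative, not the actual CLI or botocore code): the standard path parses `LastModified` into a datetime once in botocore and then re-derives a local-timezone datetime in the bucket lister, while a noop parser leaves the raw string alone so it is parsed exactly once.

```python
from datetime import datetime

RAW = "2026-05-08T12:34:56.000Z"  # ListObjectsV2 LastModified, ISO-8601

def botocore_style_parse(value):
    # First parse: ISO-8601 string -> timezone-aware datetime.
    # (Python 3.7+ strptime accepts a literal "Z" for %z.)
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")

def noop_parse(value):
    # Noop parser: hand the raw string straight through, so the
    # bucket lister's own parse is the only one that runs.
    return value

def bucket_lister_parse(value):
    # The bucket lister's local-timezone parse. Combined with
    # botocore_style_parse this is a second parse of the same value;
    # combined with noop_parse it is the only one.
    return botocore_style_parse(value).astimezone()
```

With the noop in place, the timestamp crosses the parsing layer untouched and the total work per object drops by one datetime parse.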

The remaining changes move the bucket listing off the main thread to a producer/consumer model, with the main thread now pulling objects off a shared queue.
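The shape of that model can be sketched as follows (a minimal illustration with made-up names, not the actual CLI classes): a producer thread pushes listed keys onto a bounded queue, and the main thread drains it until a sentinel marks the end of the listing.

```python
import queue
import threading

_DONE = object()  # sentinel signaling the end of the listing

def produce(pages, out_queue):
    # Producer thread: walk pages of ListObjectsV2 results and
    # enqueue each key for the consumer.
    for page in pages:
        for key in page:
            out_queue.put(key)
    out_queue.put(_DONE)

def consume(out_queue):
    # Main thread: pull keys off the shared queue until the sentinel.
    results = []
    while True:
        item = out_queue.get()
        if item is _DONE:
            break
        results.append(item)
    return results

pages = [["a/1", "a/2"], ["b/1"]]
q = queue.Queue(maxsize=100)  # bounded to apply backpressure
t = threading.Thread(target=produce, args=(pages, q))
t.start()
keys = consume(q)
t.join()
# keys == ["a/1", "a/2", "b/1"]
```

The bounded queue gives natural backpressure: if the consumer falls behind, the producer blocks on `put()` rather than buffering unboundedly.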

The producer thread is further broken down into a "quick page" feature, where alternating threads retrieve subsequent pages, with a SAX-based XML parser doing a first-pass scan to extract the `NextContinuationToken`. This allows the network I/O to continue as soon as possible while botocore finishes the standard XML parsing of the response body and the subsequent "page drain" of processing the S3 key names and queueing files over to the CRT layer.
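The first-pass scan can be sketched like this (an illustration of the idea using the stdlib `xml.sax`; the PR's actual parser and element handling may differ): a handler that ignores everything except `NextContinuationToken`, so the next page request can be issued before the full response is parsed.

```python
import xml.sax

class TokenScanner(xml.sax.ContentHandler):
    """Capture only NextContinuationToken from a ListObjectsV2 body."""

    def __init__(self):
        super().__init__()
        self.token = None
        self._in_token = False

    def startElement(self, name, attrs):
        self._in_token = (name == "NextContinuationToken")

    def characters(self, content):
        # SAX may deliver text in chunks, so accumulate.
        if self._in_token:
            self.token = (self.token or "") + content

    def endElement(self, name):
        if name == "NextContinuationToken":
            self._in_token = False

# Trimmed-down example response body (not a full ListObjectsV2 payload).
body = (
    '<?xml version="1.0"?>'
    '<ListBucketResult>'
    '<IsTruncated>true</IsTruncated>'
    '<NextContinuationToken>abc123</NextContinuationToken>'
    '</ListBucketResult>'
)
scanner = TokenScanner()
xml.sax.parseString(body.encode("utf-8"), scanner)
# scanner.token == "abc123"
```

Because the scanner touches only one element, it finishes well before a full model-driven parse, which is what lets the next `ListObjectsV2` call start early.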

As for rollout, I've added a new bucket_lister config option under S3, with the default being the existing single threaded behavior. Users can opt-in via:

s3 =
    bucket_lister = threaded

The idea would be that this will flip to the default behavior after some period of bake time.

@hssyoo (Contributor) left a comment


Thanks for the contribution, diff's looking good. Need to play around with it a bit but in the meantime had some small comments.

Can we also add a changelog entry?

Comment thread awscli/topics/s3-config.rst
Comment thread awscli/customizations/s3/filegenerator.py Outdated
Comment thread awscli/customizations/s3/bucketlister.py Outdated
Comment thread awscli/customizations/s3/bucketlister.py Outdated
Comment thread awscli/customizations/s3/bucketlister.py Outdated
Comment thread awscli/customizations/s3/bucketlister.py Outdated
jamesls added 4 commits May 12, 2026 09:16
On early shutdowns, from either the user or non-recoverable errors, the quick page threads can end up blocked on the bare `put()` calls. There's nothing to wake them up for shutdown, so this can block the process from exiting (this requires that the threads have already hit their 10-page look-ahead limit and are waiting on pages to free up). To fix this, we need an explicit poll loop that checks whether a shutdown has been triggered and uses `put(timeout=...)` so we never block indefinitely.
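The fix described above can be sketched as follows (illustrative names, not the PR's actual code): instead of a bare blocking `put()`, loop with a short timeout and re-check a shutdown event on each iteration, so a full queue can never wedge a producer thread forever.

```python
import queue
import threading

def put_with_shutdown(q, item, shutdown_event, timeout=0.25):
    """Enqueue item unless shutdown is requested first.

    Returns True if the item was enqueued, False if shutdown
    was triggered before space freed up.
    """
    while not shutdown_event.is_set():
        try:
            q.put(item, timeout=timeout)
            return True
        except queue.Full:
            continue  # re-check the shutdown flag, then retry
    return False  # shutdown requested; item was not enqueued
```

The timeout bounds how long a thread can sleep before noticing the shutdown flag, trading a little wakeup latency for a guarantee that the process can always exit.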
Seeing `standard` vs. `threaded` was confusing, and even more so if `threaded` ever becomes the default. I think it's clearer to have `single` vs. `threaded`.
@jamesls (Member, Author) commented May 12, 2026

> Can we also add a changelog entry?

Changelog added.

> Thanks for the contribution, diff's looking good. Need to play around with it a bit but in the meantime had some small comments.

One thing I've been using to see only the effects of the bucket lister improvements is to skip the actual transfers and use `--dryrun`, e.g.:

```
$ time aws s3 cp --recursive s3://mybucket/foo/ /tmp/ --dryrun --quiet

real    0m16.369s
user    0m4.287s
sys     0m2.366s

$ aws configure set s3.bucket_lister threaded

$ time aws s3 cp --recursive s3://mybucket/foo/ /tmp/ --dryrun --quiet

real    0m9.999s
user    0m3.692s
sys     0m1.839s
```

@hssyoo (Contributor) left a comment


🏆 I saw comparable improvements with ~20k objects on my Mac. Also ran an internal build against this branch to verify builds and integration tests pass.
