
Improve S3 performance for listing objects in transfer tasks#10293

Open
jamesls wants to merge 5 commits into aws:v2 from jamesls:jmes-bucket-lister

Conversation

@jamesls (Member) commented May 8, 2026

This improves the rate at which we can list objects for S3 transfer tasks such as recursive download, sync, and S3-to-S3 copies. In high-compute environments this has become one of the main bottlenecks affecting the transfer of a large number of objects, particularly when using the CRT transfer client: we aren't able to queue work fast enough. To speed things up I added three changes.

The first is an improvement in parsing the `ListObjectsV2` response. We were previously double-parsing the `LastModified` member, mostly a historical artifact from when the CLI parsed timestamps differently than botocore. Because this custom parsing was left in place in the bucket lister, we were parsing the timestamps twice. To minimize the scope of changes, we keep the existing local-timezone datetime parsing in the bucket lister, but we set the botocore parser used by the bucket lister client to be a noop. This does make the code slightly more complicated: because we only plumb this behavior through for the bucket lister, we need new client factory methods for it, so we should decide whether it's worth making this behavior the default for all S3 client creation in the CLI.
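To illustrate the double-parse problem, here is a minimal sketch (function names are illustrative, not the actual CLI or botocore code): the standard path parses `LastModified` into a datetime once in botocore and then re-derives a local-timezone datetime in the bucket lister, while a noop parser leaves the raw string alone so it is parsed exactly once.

```python
from datetime import datetime

RAW = "2026-05-08T12:34:56.000Z"  # ListObjectsV2 LastModified, ISO-8601

def botocore_style_parse(value):
    # First parse: ISO-8601 string -> timezone-aware datetime.
    # (Python 3.7+ strptime accepts a literal "Z" for %z.)
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")

def noop_parse(value):
    # Noop parser: hand the raw string straight through, so the
    # bucket lister's own parse is the only one that runs.
    return value

def bucket_lister_parse(value):
    # The bucket lister's local-timezone parse. Combined with
    # botocore_style_parse this is a second parse of the same value;
    # combined with noop_parse it is the only one.
    return botocore_style_parse(value).astimezone()
```

With the noop in place, the timestamp crosses the parsing layer untouched and the total work per object drops by one datetime parse.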

The remaining changes move the bucket listing off the main thread to a producer/consumer model, with the main thread now pulling objects off a shared queue.
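The shape of that model can be sketched as follows (a minimal illustration with made-up names, not the actual CLI classes): a producer thread pushes listed keys onto a bounded queue, and the main thread drains it until a sentinel marks the end of the listing.

```python
import queue
import threading

_DONE = object()  # sentinel signaling the end of the listing

def produce(pages, out_queue):
    # Producer thread: walk pages of ListObjectsV2 results and
    # enqueue each key for the consumer.
    for page in pages:
        for key in page:
            out_queue.put(key)
    out_queue.put(_DONE)

def consume(out_queue):
    # Main thread: pull keys off the shared queue until the sentinel.
    results = []
    while True:
        item = out_queue.get()
        if item is _DONE:
            break
        results.append(item)
    return results

pages = [["a/1", "a/2"], ["b/1"]]
q = queue.Queue(maxsize=100)  # bounded to apply backpressure
t = threading.Thread(target=produce, args=(pages, q))
t.start()
keys = consume(q)
t.join()
# keys == ["a/1", "a/2", "b/1"]
```

The bounded queue gives natural backpressure: if the consumer falls behind, the producer blocks on `put()` rather than buffering unboundedly.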

The producer thread is further broken down into a "quick page" feature, where alternating threads retrieve subsequent pages, with a SAX-based XML parser doing a first-pass scan to extract the `NextContinuationToken`. This allows the network I/O to continue as soon as possible while botocore finishes the standard XML parsing of the response body and the subsequent "page drain" of processing the S3 key names and queueing files over to the CRT layer.
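The first-pass scan can be sketched like this (an illustration of the idea using the stdlib `xml.sax`; the PR's actual parser and element handling may differ): a handler that ignores everything except `NextContinuationToken`, so the next page request can be issued before the full response is parsed.

```python
import xml.sax

class TokenScanner(xml.sax.ContentHandler):
    """Capture only NextContinuationToken from a ListObjectsV2 body."""

    def __init__(self):
        super().__init__()
        self.token = None
        self._in_token = False

    def startElement(self, name, attrs):
        self._in_token = (name == "NextContinuationToken")

    def characters(self, content):
        # SAX may deliver text in chunks, so accumulate.
        if self._in_token:
            self.token = (self.token or "") + content

    def endElement(self, name):
        if name == "NextContinuationToken":
            self._in_token = False

# Trimmed-down example response body (not a full ListObjectsV2 payload).
body = (
    '<?xml version="1.0"?>'
    '<ListBucketResult>'
    '<IsTruncated>true</IsTruncated>'
    '<NextContinuationToken>abc123</NextContinuationToken>'
    '</ListBucketResult>'
)
scanner = TokenScanner()
xml.sax.parseString(body.encode("utf-8"), scanner)
# scanner.token == "abc123"
```

Because the scanner touches only one element, it finishes well before a full model-driven parse, which is what lets the next `ListObjectsV2` call start early.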

As for rollout, I've added a new bucket_lister config option under S3, with the default being the existing single threaded behavior. Users can opt-in via:

s3 =
    bucket_lister = threaded

The idea would be that this will flip to the default behavior after some period of bake time.

@hssyoo (Contributor) left a comment


Thanks for the contribution, diff's looking good. Need to play around with it a bit but in the meantime had some small comments.

Can we also add a changelog entry?

Comment thread awscli/topics/s3-config.rst
Comment thread awscli/customizations/s3/filegenerator.py Outdated
Comment thread awscli/customizations/s3/bucketlister.py Outdated
Comment thread awscli/customizations/s3/bucketlister.py Outdated
Comment thread awscli/customizations/s3/bucketlister.py Outdated
Comment thread awscli/customizations/s3/bucketlister.py Outdated
jamesls added 4 commits May 12, 2026 09:16
On early shutdowns, from either the user or non-recoverable errors, the quick page threads can end up blocked on the bare `put()` calls. There's nothing to wake them up for shutdown, so this can block the process from exiting (this requires that the threads have already hit their 10-page look-ahead limit and are waiting on pages to free up). To fix this, we need an explicit poll loop that checks whether a shutdown has been triggered and uses `put(timeout=...)` so we never block indefinitely.
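The fix described above can be sketched as follows (illustrative names, not the PR's actual code): instead of a bare blocking `put()`, loop with a short timeout and re-check a shutdown event on each iteration, so a full queue can never wedge a producer thread forever.

```python
import queue
import threading

def put_with_shutdown(q, item, shutdown_event, timeout=0.25):
    """Enqueue item unless shutdown is requested first.

    Returns True if the item was enqueued, False if shutdown
    was triggered before space freed up.
    """
    while not shutdown_event.is_set():
        try:
            q.put(item, timeout=timeout)
            return True
        except queue.Full:
            continue  # re-check the shutdown flag, then retry
    return False  # shutdown requested; item was not enqueued
```

The timeout bounds how long a thread can sleep before noticing the shutdown flag, trading a little wakeup latency for a guarantee that the process can always exit.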
Seeing `standard` vs. `threaded` was confusing, and even more so if `threaded` ever becomes the default. I think it's clearer to have `single` vs. `threaded`.
@jamesls (Member, Author) commented May 12, 2026

> Can we also add a changelog entry?

Changelog added.

> Thanks for the contribution, diff's looking good. Need to play around with it a bit but in the meantime had some small comments.

One thing I've been using to see only the effects of the bucket lister improvements is to skip the actual transfers and use `--dryrun`, e.g.:

```
$ time aws s3 cp --recursive s3://mybucket/foo/ /tmp/ --dryrun --quiet

real    0m16.369s
user    0m4.287s
sys     0m2.366s

$ aws configure set s3.bucket_lister threaded

$ time aws s3 cp --recursive s3://mybucket/foo/ /tmp/ --dryrun --quiet

real    0m9.999s
user    0m3.692s
sys     0m1.839s
```

@hssyoo (Contributor) left a comment


🏆 I saw comparable improvements with ~20k objects on my Mac. Also ran an internal build against this branch to verify builds and integration tests pass.
