RIFM: Best Practices Design Patterns: Optimizing Amazon S3 Performance


If you walk away with anything, it should be this: Amazon S3 can handle whatever you throw at it, as long as you follow the rules. It can be as fast as you need (at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix), scales horizontally across a wide pool of IP addresses and, paired with a cache, can provide single-digit millisecond latencies. Use the SDK if you value your time and sanity.


Amazon S3 scales. Big time. It can achieve at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix, and there is no limit on the number of prefixes in a bucket. Scale performance by parallelizing requests across prefixes. (Use the SDK.)

Latency? What’s that? Even if you’re working on the next Facebook, you can expect consistent latencies of roughly 100-200 milliseconds on small objects. Optionally pair S3 with your favorite AWS caching service, like CloudFront or ElastiCache, to get as low as single-digit millisecond latencies.

Be sure to check the AWS KMS request rate limits if you’re using server-side encryption with AWS Key Management Service, since they can cap your throughput before S3 does.

You can now use sequential (like date-based) naming for prefixes. This is a positive shift: make the computers do the hard work instead of trying to reason about randomized prefix naming.
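
To put the per-prefix numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes load is spread evenly across prefixes, which real traffic rarely manages, and the `logs/YYYY/MM/DD/` layout is a made-up example rather than anything the whitepaper prescribes.

```python
# Baseline per-prefix request rates quoted in the summary above ("at least").
WRITE_RPS_PER_PREFIX = 3_500  # PUT/COPY/POST/DELETE
READ_RPS_PER_PREFIX = 5_500   # GET/HEAD


def aggregate_rates(prefix_count: int) -> tuple:
    """Return (write_rps, read_rps), assuming an even spread over prefix_count prefixes."""
    if prefix_count < 1:
        raise ValueError("prefix_count must be at least 1")
    return (WRITE_RPS_PER_PREFIX * prefix_count,
            READ_RPS_PER_PREFIX * prefix_count)


def date_prefix(year: int, month: int, day: int) -> str:
    """Build a sequential, date-based prefix (hypothetical layout)."""
    return f"logs/{year:04d}/{month:02d}/{day:02d}/"


print(aggregate_rates(10))      # (35000, 55000)
print(date_prefix(2019, 3, 7))  # logs/2019/03/07/
```

Ten prefixes read in parallel would give a baseline of 55,000 GET/HEAD requests per second. S3 scales up to these rates gradually, so treat the result as a ceiling to plan around, not a guarantee on the first request.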

Part 1: Performance Guidelines

Measure Everything

If you’re at the point where performance is an issue on S3, start measuring everything you can to find bottlenecks. Evaluate EC2 instance types, and measure network throughput, DRAM usage, DNS lookup times, etc.

Scale Connections Horizontally

Amazon S3 is a huge distributed system, not a single network endpoint. Issue multiple concurrent requests to the S3 API to get the best performance. The SDK is your friend here. S3 places no limit on the number of connections made to your bucket.

Use the SDK (and byte-range fetches)

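If you can’t use the SDK, you can fetch different byte ranges from the same object using the Range HTTP header in the GET request. Fetching ranges over concurrent connections gives higher aggregate throughput than one whole-object request, and a failed range can be retried on its own. If you PUT an object using a multipart upload, it is recommended that you GET it in the same part sizes, or at least aligned to the part boundaries.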
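
If you do end up rolling your own, the fiddly bit is the arithmetic: HTTP byte ranges are inclusive on both ends. Here is a minimal sketch of splitting an object into Range header values. The function name and the sizes are my own illustration, and actually sending the GET requests is left out.

```python
def byte_ranges(object_size: int, part_size: int) -> list:
    """Split an object of object_size bytes into HTTP Range header values.

    Range offsets are inclusive, so the first 8 bytes are "bytes=0-7".
    """
    if object_size < 0:
        raise ValueError("object_size must not be negative")
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    ranges = []
    for start in range(0, object_size, part_size):
        # The last part may be shorter than part_size.
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


print(byte_ranges(20, 8))  # ['bytes=0-7', 'bytes=8-15', 'bytes=16-19']
```

Each string goes into the Range header of its own GET request. If the object was uploaded in 8 MB parts, passing the same 8 MB as `part_size` keeps your reads aligned with the upload.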

Use the SDK (and retry requests as needed)

Because of how Amazon S3 load balances requests, a retry of a slow request will likely be routed differently and succeed more quickly. Use the SDK to configure timeout and retry values.

Place Storage and Compute in the Same Region

You can reduce latency and data transfer costs by placing your S3 bucket and EC2 instance in the same region.

Use Transfer Acceleration to Reduce Distance-Based Latency

Transfer Acceleration uses CloudFront’s edge network to automatically optimize the route an object takes to S3. It is best used when you regularly transfer large amounts of data (gigabytes to terabytes) across large geographic spans. Amazon has built a tool to compare accelerated and non-accelerated upload speeds; check it before turning this on. Transfer Acceleration can be used on new or existing buckets.

Part 2: Performance Design Patterns

Cache Frequently Accessed Content

If your application has content that is frequently requested, consider using a caching service such as CloudFront, ElastiCache, or Elemental MediaStore. Successful caching can bring latency down to single-digit milliseconds and reduce costs, because fewer requests go to S3 directly.

CloudFront is a CDN (content delivery network) that caches data at different geographic points around the world. Consider using it when you want to speed up access to your application’s data with few changes to your application logic.

ElastiCache is a managed, in-memory cache service. It creates and manages EC2 instances that cache objects in memory. To use it, you need to update your application logic to include a caching strategy: when do you send objects to the cache, and when do you retrieve objects from it?

Elemental MediaStore is a caching and content distribution system built just for video.
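
The caching-strategy question is easier to reason about with something concrete. Below is a minimal sketch of one common answer, cache-aside: check the cache first, and only go to S3 on a miss. A plain dict stands in for ElastiCache and `fake_origin` stands in for an S3 GET. Both are assumptions for illustration, not real client code.

```python
from typing import Callable, Dict


class CacheAside:
    """Check the cache first; on a miss, fetch from the origin and remember it."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        self._fetch = fetch_from_origin     # hypothetical wrapper around an S3 GET
        self._cache: Dict[str, bytes] = {}  # stand-in for an in-memory cache service
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        value = self._fetch(key)
        self._cache[key] = value
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key, for example after the object is overwritten in S3."""
        self._cache.pop(key, None)


def fake_origin(key: str) -> bytes:
    # Stand-in for the slow path: a real version would read the object from S3.
    return f"body of {key}".encode()


cache = CacheAside(fake_origin)
cache.get("report.csv")  # miss: goes to the origin
cache.get("report.csv")  # hit: served from memory
print(cache.hits, cache.misses)  # 1 1
```

A real deployment also has to decide on expiry and on what happens when the object changes in S3. The `invalidate` method hints at the second problem but does not solve it.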

Timeouts and Retries Can Improve Latency (use the SDK)

Amazon S3 dynamically scales based on request rates. While it’s doing its scaling thing, you may receive HTTP 503 (Slow Down) responses until the optimization is complete. You should implement retries with exponential backoff to account for 503 responses.

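Different methods are recommended for large, variably sized requests and for fixed-size requests. For large, variably sized requests (say, over 128 MB), track the throughput you are getting and retry the slowest 5 percent; for fixed-size requests, retry the slowest 1 percent. Generally, you should expect more consistent response times for fixed-size requests because Amazon can optimize its scaling more reliably. If you can use the SDK, it will handle most of this for you, including exponential backoff and retries.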
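
If you are curious what the SDK is doing on your behalf, here is a minimal sketch of retries with exponential backoff and jitter. `SlowDown` is a hypothetical exception standing in for an HTTP 503, and the delay values are illustrative defaults, not anything Amazon recommends.

```python
import random
import time


class SlowDown(Exception):
    """Hypothetical stand-in for an HTTP 503 response from S3."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying on SlowDown with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except SlowDown:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: wait a random time up to the capped exponential delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))


attempts = {"count": 0}


def flaky_get():
    # Fails twice with a simulated 503, then succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise SlowDown("503 Slow Down")
    return "object body"


waits = []
result = call_with_backoff(flaky_get, sleep=waits.append)  # record waits, don't sleep
print(result, attempts["count"], len(waits))  # object body 3 2
```

The jitter matters: without it, every client that got a 503 at the same moment retries at the same moment too.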

Parallelize Requests for High Throughput

Amazon S3 Transfer Manager is included in some SDKs, and similar concepts are found in most others. Try to create parallel connections by launching multiple requests concurrently to spread out your requests in S3. Amazon S3 will load balance your requests across multiple connections to your S3 bucket, improving performance. Remember, S3 places no limit on the number of connections to your bucket.

It is possible to tune your application by using the SDKs to make direct GET and PUT requests instead of using Transfer Manager, but remember to measure your performance before you do. Use a network utility tool like netstat to ensure you are connecting to a variety of IP addresses.
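
For a feel of what launching multiple requests concurrently looks like without Transfer Manager, here is a minimal sketch using a thread pool. `fetch_one` is a hypothetical callable that would wrap a single GET; here it is faked so the example runs without a bucket, and the worker count is an arbitrary starting point to tune by measuring.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(keys, fetch_one, max_workers=8):
    """Fetch every key concurrently and return a {key: result} dict."""
    keys = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input keys.
        results = pool.map(fetch_one, keys)
        return dict(zip(keys, results))


def fake_fetch(key):
    # Stand-in for a real S3 GET: returns the key's length instead of a body.
    return len(key)


print(fetch_all(["a.txt", "bb.txt", "ccc.txt"], fake_fetch))
# {'a.txt': 5, 'bb.txt': 6, 'ccc.txt': 7}
```

Threads suit this because the work is waiting on the network, not burning CPU. If any single fetch raises, the exception surfaces when its result is consumed, so wrap `fetch_one` with your retry logic first.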

Use Transfer Acceleration When Location is an Issue

Transfer Acceleration reduces the latency caused by geographic distance. It uses CloudFront’s edge network of over 50 locations. It can be used on new or existing buckets. If you need to regularly transfer large objects around the world, or even just across the country, use the Transfer Acceleration Speed Comparison Tool to see if it is worth the cost.


There is no substitute for reading the original, and this one is only about 8 pages long. If you have the time, read it straight from Amazon.
