How Amazon S3 Works ✨

The System Design Newsletter

Neo Kim • Published 8 months ago • 1 min read

Core Technical Concepts/Technologies Discussed

  • Amazon S3 (Simple Storage Service)
  • Object storage architecture
  • Data durability and availability
  • Storage classes (Standard, Intelligent-Tiering, Glacier)
  • Consistency models (read-after-write, eventual)
  • Scalability and performance optimizations

Main Points

  • S3 Basics:

    • Fully managed object storage service for unstructured data (images, videos, logs).
    • Uses a flat namespace: objects live in buckets and are addressed by key. There is no real directory tree; "folders" are just key prefixes.
  • Durability & Availability:

    • Designed for 99.999999999% (11 nines) durability by storing data redundantly across multiple Availability Zones (AZs).
    • Designed for 99.99% availability in the Standard storage class.
  • Storage Classes:

    • Standard: Low-latency, high-throughput for frequent access.
    • Intelligent-Tiering: Automatically moves objects between tiers based on usage.
    • Glacier: Low-cost archival storage with retrieval delays (minutes to hours).
  • Consistency Models:

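    • Read-after-write: A successful PUT of a new object is immediately visible to subsequent reads.
    • Eventual consistency: Overwrites and deletes originally could take time to propagate. Since December 2020, S3 provides strong read-after-write consistency for all object PUTs and DELETEs; bucket configuration changes remain eventually consistent.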
  • Performance:

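    • Scales to thousands of requests/sec per prefix (at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD); supports multipart uploads for large files (required above 5 GB, recommended from about 100 MB).
    • Spreading keys across multiple prefixes and parallelizing requests raises aggregate throughput for high-request-rate workloads.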
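
Two of the figures above lend themselves to quick arithmetic. The sketch below is only illustrative: it treats the 11-nines durability figure as an annual per-object loss probability (an assumption about how to read the number), and it uses the documented multipart limits of 5 MiB to 5 GiB per part and at most 10,000 parts. The function names `years_per_lost_object` and `plan_parts` are hypothetical helpers, not part of any AWS SDK.

```python
from fractions import Fraction

MIB = 1024 ** 2
GIB = 1024 ** 3
MIN_PART = 5 * MIB     # documented minimum part size (the last part may be smaller)
MAX_PART = 5 * GIB     # documented maximum part size
MAX_PARTS = 10_000     # documented maximum number of parts per upload


def years_per_lost_object(num_objects, durability_nines=11):
    """Average years between single-object losses.

    Assumes durability is an independent, annual, per-object figure.
    """
    annual_loss_probability = Fraction(1, 10 ** durability_nines)
    expected_losses_per_year = num_objects * annual_loss_probability
    return 1 / expected_losses_per_year


def plan_parts(object_size, part_size=100 * MIB):
    """Return (number_of_parts, size_of_last_part) for a multipart upload."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if not MIN_PART <= part_size <= MAX_PART:
        raise ValueError("part_size must be between 5 MiB and 5 GiB")
    full_parts, remainder = divmod(object_size, part_size)
    num_parts = full_parts + (1 if remainder else 0)
    if num_parts > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    return num_parts, remainder or part_size


if __name__ == "__main__":
    print(years_per_lost_object(10_000_000))   # years between losses
    print(plan_parts(6 * GIB))                 # parts for a 6 GiB object
```

Under these assumptions, storing ten million objects works out to roughly one lost object every 10,000 years, and a 6 GiB object split into 100 MiB parts needs 62 parts, the last of which is 44 MiB.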

Technical Specifications/Implementation

  • Bucket Naming: Globally unique, DNS-compliant (no underscores).
  • Data Partitioning: Uses a distributed key-value store; partitions objects by bucket + key.
  • Security: IAM policies, bucket policies, ACLs, and encryption (SSE-S3, SSE-KMS).
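
The bucket naming rule can be checked before calling the API. The following is a minimal sketch that covers only a subset of the documented rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starts and ends with a letter or digit; no adjacent dots; not shaped like an IPv4 address). It does not check reserved prefixes or suffixes, and it cannot check global uniqueness. The function name `looks_like_valid_bucket_name` is hypothetical.

```python
import re

# First and last characters must be a lowercase letter or digit; total length 3-63.
_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Names formatted like an IPv4 address (for example 192.168.0.1) are not allowed.
_IPV4_LIKE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Partial check of S3 bucket naming rules (a subset, not the full spec)."""
    if not _NAME.fullmatch(name):
        return False
    if ".." in name:
        return False
    if _IPV4_LIKE.fullmatch(name):
        return False
    return True


if __name__ == "__main__":
    for candidate in ["my-logs-2024", "my_bucket", "ab", "192.168.0.1"]:
        print(candidate, looks_like_valid_bucket_name(candidate))
```

A name that passes this check can still be rejected by S3, for example because another account already owns it.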

Key Takeaways

  1. S3 is designed for massive scalability, durability, and low-cost storage across diverse use cases.
  2. Storage class selection balances cost vs. access frequency (e.g., Glacier for archives).
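  3. Prefix design impacts performance: request-rate limits apply per prefix, so concentrating traffic on one sequential key range can cause hotspotting; spread hot keys across several prefixes.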
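
One way to act on the third takeaway is to derive a short shard prefix from a hash of the key, so that sequential keys land under different prefixes. This is a sketch of one possible scheme, not something the original article prescribes; current AWS guidance says randomized prefixes are no longer required, so treat it as an option for very high request rates. The function name `sharded_key` and the default of 16 shards are assumptions made for illustration.

```python
import hashlib


def sharded_key(key: str, shards: int = 16) -> str:
    """Prepend a deterministic, hash-derived shard prefix to an object key."""
    if not 1 <= shards <= 256:
        raise ValueError("shards must be between 1 and 256")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shards
    # Two hex digits keep the prefix fixed-width and easy to enumerate.
    return f"{shard:02x}/{key}"


if __name__ == "__main__":
    for i in range(5):
        print(sharded_key(f"logs/2024-01-01/{i:06d}.json"))
```

The trade-off is that listing objects in their original order now requires querying every shard prefix and merging the results.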

Limitations/Considerations

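  • Consistency: Although S3 has offered strong read-after-write consistency since December 2020, concurrent writes to the same key resolve as last-writer-wins by default, so S3 is a poor fit for real-time sync across distributed systems.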
  • Costs: API requests and data transfer fees can add up for high-throughput workloads.
  • Further Exploration: Integration with AWS services (Lambda, CloudFront) for advanced workflows.
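
As a starting point for the Lambda integration mentioned above, here is a minimal sketch of a handler that reads bucket and key names from an S3 event notification. The `Records[].s3.bucket.name` and `Records[].s3.object.key` fields follow the S3 event message structure, and object keys arrive URL-encoded; the names `handler` and `sample_event`, and the bucket and key values, are made up for illustration.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Return (bucket, key) pairs for every S3 record in the event."""
    objects = []
    for record in event.get("Records", []):
        s3 = record.get("s3")
        if s3 is None:
            continue  # not an S3 record; skip it
        bucket = s3["bucket"]["name"]
        # Keys are URL-encoded in notifications (spaces arrive as '+').
        key = unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects


# Hypothetical, trimmed-down event for local experimentation.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "reports/2024+q1.csv"},
            },
        }
    ]
}

if __name__ == "__main__":
    print(handler(sample_event))
```

A real handler would go on to fetch or process each object; that part is omitted here because it depends on credentials and the workload.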

#59: Break Into Amazon Engineering (4 Minutes)

This article was originally published on The System Design Newsletter
