S3 notes for developers

Master S3 with a curated set of 8 developer notes — core concepts, patterns, and interview prep. Maintained by the DevRecall team.

S3 Fundamentals

Amazon S3 is an object storage service with virtually unlimited capacity. Objects are stored in buckets, identified by keys, and organized into storage classes. The AWS CLI makes it easy to interact with S3 programmatically.

Buckets, Objects & Keys

# Bucket names: globally unique, 3-63 chars; lowercase letters, numbers, hyphens, and periods; must start and end with a letter or number
# Object key: the full path within the bucket (e.g. "images/2024/photo.jpg")
# URL: https://<bucket>.s3.<region>.amazonaws.com/<key>

# Create a bucket (in a specific region)
aws s3 mb s3://my-app-assets-prod --region us-east-1
aws s3 mb s3://my-app-backups --region us-west-2

# List buckets
aws s3 ls

# List objects in a bucket
aws s3 ls s3://my-app-assets-prod
aws s3 ls s3://my-app-assets-prod/images/            # List a "folder" (prefix)
aws s3 ls s3://my-app-assets-prod --recursive        # All objects recursively
aws s3 ls s3://my-app-assets-prod --recursive --human-readable --summarize

# Copy objects
aws s3 cp local-file.txt s3://my-app-assets-prod/uploads/file.txt
aws s3 cp s3://my-bucket/file.txt ./local-file.txt   # Download
aws s3 cp s3://src-bucket/key.txt s3://dst-bucket/key.txt  # Copy between buckets

# Copy with metadata and storage class
aws s3 cp large-file.zip s3://my-bucket/ \
  --storage-class STANDARD_IA \
  --metadata "version=1.2,author=alice" \
  --content-type "application/zip"

# Sync local directory to S3 (delta — only changed files)
aws s3 sync ./dist s3://my-app-assets-prod/ \
  --delete \
  --exclude ".DS_Store" \
  --exclude "*.map"

# Sync S3 to local (backup)
aws s3 sync s3://my-app-backups/db/ ./backups/

# Move (copy + delete)
aws s3 mv s3://my-bucket/old-key.txt s3://my-bucket/new-key.txt

# Delete objects
aws s3 rm s3://my-bucket/file.txt
aws s3 rm s3://my-bucket/uploads/ --recursive        # Delete prefix

# Remove empty bucket
aws s3 rb s3://my-bucket
# Remove non-empty bucket (force)
aws s3 rb s3://my-bucket --force
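The naming rules and URL format noted at the top of this listing can be checked locally before calling `aws s3 mb`. The following is a minimal Python sketch, not the full AWS rule set: the helper names `is_plausible_bucket_name` and `object_url` are hypothetical, and the check covers only length, allowed characters, first/last character, and adjacent periods.

```python
import re
from urllib.parse import quote

# Partial check only: 3-63 chars, lowercase letters/digits/hyphens/periods,
# starting and ending with a letter or digit. AWS applies further rules.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_plausible_bucket_name(name: str) -> bool:
    """Return True if `name` passes the basic naming rules listed above."""
    if _BUCKET_RE.fullmatch(name) is None:
        return False
    # Two adjacent periods are not allowed.
    return ".." not in name


def object_url(bucket: str, region: str, key: str) -> str:
    """Build the virtual-hosted-style URL, percent-encoding the key (slashes kept)."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


print(is_plausible_bucket_name("my-app-assets-prod"))
print(object_url("my-app-assets-prod", "us-east-1", "images/2024/photo.jpg"))
```

A name that passes this check can still be rejected by S3 (for example because it is already taken), so treat it as a quick pre-flight filter only.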

Storage Classes

# Storage class   | Durability | Availability | Min duration | Use case
# ─────────────────────────────────────────────────────────────────────────────
# Standard         | 11 9s      | 99.99%       | None         | Frequently accessed data
# Standard-IA      | 11 9s      | 99.9%        | 30 days      | Infrequent access, retrieval fee
# One Zone-IA      | 11 9s      | 99.5%        | 30 days      | Non-critical, infrequent access
# Intelligent-Tier | 11 9s      | 99.9%        | None         | Unknown access patterns (auto-moves)
# Glacier Instant  | 11 9s      | 99.9%        | 90 days      | Archives, millisecond retrieval
# Glacier Flexible | 11 9s      | 99.99%       | 90 days      | Archives, 1min-12hr retrieval
# Glacier Deep Arc | 11 9s      | 99.99%       | 180 days     | Long-term archive, 12-48hr retrieval

# Upload to Glacier Instant Retrieval
aws s3 cp old-data.tar.gz s3://my-archive-bucket/ \
  --storage-class GLACIER_IR

# Change storage class of existing object
aws s3 cp s3://my-bucket/file.txt s3://my-bucket/file.txt \
  --storage-class STANDARD_IA --metadata-directive COPY

# Check storage class of an object
aws s3api head-object \
  --bucket my-bucket \
  --key file.txt \
  --query "StorageClass"
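The "Min duration" column in the table above matters for cost: an object deleted or transitioned before the minimum is still billed for the full minimum. A small Python sketch of that rule, using the durations from the table (the function name `billable_days` is hypothetical, and real bills also include retrieval and request fees that are ignored here):

```python
# Minimum storage durations in days, copied from the table above.
# Keys are the CLI --storage-class values.
MIN_DURATION_DAYS = {
    "STANDARD": 0,
    "STANDARD_IA": 30,
    "ONEZONE_IA": 30,
    "INTELLIGENT_TIERING": 0,
    "GLACIER_IR": 90,
    "GLACIER": 90,        # Glacier Flexible Retrieval
    "DEEP_ARCHIVE": 180,
}


def billable_days(storage_class: str, days_stored: int) -> int:
    """Days of storage that are charged, given how long the object actually lived."""
    return max(days_stored, MIN_DURATION_DAYS[storage_class])


print(billable_days("STANDARD_IA", 10))    # stored 10 days, billed for the 30-day minimum
print(billable_days("DEEP_ARCHIVE", 200))  # already past the minimum
```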

Presigned URLs & Metadata

# Presigned URLs — temporary, signed URLs for private objects (no auth required by caller)
# Use for: client-side downloads of private files, client-side uploads directly to S3

# Generate presigned GET URL (default 1 hour, max 7 days with SigV4)
aws s3 presign s3://my-private-bucket/report.pdf --expires-in 3600

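# Short-lived presigned GET URL with an explicit region
aws s3 presign s3://my-uploads-bucket/user/123/avatar.jpg \
  --expires-in 300 \
  --region us-east-1

# Presigned PUT URLs (client-side uploads that keep AWS credentials server-side)
# cannot be created with `aws s3 presign`, which signs GET requests only.
# Generate them with an SDK instead (e.g. getSignedUrl + PutObjectCommand in the JS SDK v3).

# Using a presigned PUT URL for upload from the client (curl example)
# curl -X PUT "<presigned-put-url>" \
#   -H "Content-Type: image/jpeg" \
#   --upload-file avatar.jpg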

# Node.js SDK presigned URL
# import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
# import { getSignedUrl } from "@aws-sdk/s3-request-presigner";
# const url = await getSignedUrl(
#   s3Client,
#   new GetObjectCommand({ Bucket: "my-bucket", Key: "report.pdf" }),
#   { expiresIn: 3600 }
# );
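A SigV4 presigned URL carries its own signing time and lifetime in the `X-Amz-Date` and `X-Amz-Expires` query parameters, so the expiry can be read back without calling AWS. The sketch below is illustrative: `presigned_expiry` is a hypothetical helper, and the example URL is made up and contains only the parameters needed here.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse

MAX_SIGV4_EXPIRES = 7 * 24 * 3600  # 7 days, the SigV4 maximum noted above


def presigned_expiry(url: str) -> datetime:
    """Return the UTC time at which a SigV4 presigned URL stops working."""
    params = parse_qs(urlparse(url).query)
    signed_at = datetime.strptime(params["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    expires_in = int(params["X-Amz-Expires"][0])
    if not 0 < expires_in <= MAX_SIGV4_EXPIRES:
        raise ValueError(f"X-Amz-Expires out of range: {expires_in}")
    return signed_at.replace(tzinfo=timezone.utc) + timedelta(seconds=expires_in)


# Hypothetical URL; a real one also carries credential and signature parameters.
example = (
    "https://my-private-bucket.s3.us-east-1.amazonaws.com/report.pdf"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20240101T120000Z&X-Amz-Expires=3600"
)
print(presigned_expiry(example).isoformat())
```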

# Object metadata: key-value pairs stored with the object
# User-defined metadata travels as x-amz-meta-* headers; the CLI adds the prefix, so pass bare keys
aws s3api put-object \
  --bucket my-bucket \
  --key docs/report.pdf \
  --body report.pdf \
  --content-type "application/pdf" \
  --metadata "author=alice,version=2.1,project=myapp"

# Retrieve metadata (HEAD request — no body download)
aws s3api head-object --bucket my-bucket --key docs/report.pdf

S3 Permissions & Policies

S3 access control has multiple layers: bucket policies (resource-based), IAM policies (identity-based), Block Public Access settings, and Access Points. Understanding when to use each is critical for both security and correct functionality.

Bucket Policies & Block Public Access

// Bucket policy — applied at the bucket level, allows cross-account access
// Evaluated alongside IAM policies; DENY always wins

// Allow public read for static website bucket (ACLs disabled, use bucket policy)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-website-bucket/*"
    }
  ]
}

// Allow only CloudFront OAC (Origin Access Control) to read objects
// (Replace PUBLIC read policy once CloudFront is set up)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCloudFrontOAC",
      "Effect": "Allow",
      "Principal": {
        "Service": "cloudfront.amazonaws.com"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-website-bucket/*",
      "Condition": {
        "StringEquals": {
          "AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/EDFDVBD6EXAMPLE"
        }
      }
    }
  ]
}

// Allow a specific IAM role to read/write to a prefix
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::123456789012:role/my-app-role"},
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::my-app-bucket/uploads/*"
    },
    {
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::123456789012:role/my-app-role"},
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-app-bucket",
      "Condition": {"StringLike": {"s3:prefix": "uploads/*"}}
    }
  ]
}

# Apply bucket policy
aws s3api put-bucket-policy \
  --bucket my-website-bucket \
  --policy file://bucket-policy.json

# Block Public Access — 4 independent settings, all ON by default for new buckets
# BlockPublicAcls:        Reject requests to PUT public ACLs
# IgnorePublicAcls:       Ignore public ACLs already applied
# BlockPublicPolicy:      Reject requests to PUT bucket policy that grants public access
# RestrictPublicBuckets:  Restrict public/cross-account access even if policy allows it

# Check Block Public Access settings
aws s3api get-public-access-block --bucket my-bucket

# Disable Block Public Access for a static website bucket
# (Only do this if the bucket must be public; otherwise keep it private and serve it through CloudFront with OAC)
aws s3api put-public-access-block \
  --bucket my-website-bucket \
  --public-access-block-configuration \
    "BlockPublicAcls=false,IgnorePublicAcls=false,BlockPublicPolicy=false,RestrictPublicBuckets=false"

# Legacy ACLs — avoid for new buckets; use bucket policies instead
# "Bucket owner enforced" setting disables ACLs (recommended for new buckets)
aws s3api put-bucket-ownership-controls \
  --bucket my-bucket \
  --ownership-controls "Rules=[{ObjectOwnership=BucketOwnerEnforced}]"
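The note at the top of this section that an explicit Deny always wins can be illustrated with a toy evaluator. This is a deliberately simplified Python sketch and not how AWS implements policy evaluation: it ignores `Principal`, `Condition`, and every other policy type, and the `POLICY` statements (including the `private/` prefix) are invented for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical statements: public read on the bucket, but never under private/.
POLICY = [
    {"Effect": "Allow", "Action": "s3:GetObject",
     "Resource": "arn:aws:s3:::my-website-bucket/*"},
    {"Effect": "Deny", "Action": "s3:*",
     "Resource": "arn:aws:s3:::my-website-bucket/private/*"},
]


def evaluate(statements, action, resource):
    """Return "Allow" or "Deny" for one request (toy model, see caveats above)."""
    decision = "Deny"  # implicit deny when no statement matches
    for stmt in statements:
        actions = stmt["Action"] if isinstance(stmt["Action"], list) else [stmt["Action"]]
        action_ok = any(fnmatchcase(action, pattern) for pattern in actions)
        if not (action_ok and fnmatchcase(resource, stmt["Resource"])):
            continue
        if stmt["Effect"] == "Deny":
            return "Deny"  # an explicit deny ends evaluation immediately
        decision = "Allow"
    return decision


print(evaluate(POLICY, "s3:GetObject", "arn:aws:s3:::my-website-bucket/index.html"))
print(evaluate(POLICY, "s3:GetObject", "arn:aws:s3:::my-website-bucket/private/key.pem"))
```

The second request matches both statements, and the Deny takes precedence regardless of statement order.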

IAM Policies & Access Points

// IAM policy for an application role — read/write to specific bucket prefix
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::my-app-uploads/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-app-uploads",
      "Condition": {
        "StringLike": {
          "s3:prefix": ["", "uploads/", "uploads/*"]
        }
      }
    }
  ]
}

# VPC Endpoint for S3: route S3 traffic within the AWS network (no internet)
# Gateway endpoint: free; affects route tables; only S3 and DynamoDB
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0abc123 \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-0abc123

# S3 Access Points — per-application access control on a shared bucket
# Useful when multiple teams/apps share a bucket but need isolated permissions
aws s3control create-access-point \
  --account-id 123456789012 \
  --name data-science-access-point \
  --bucket my-shared-data-lake \
  --vpc-configuration VpcId=vpc-0abc123

# Access via Access Point ARN
# s3://arn:aws:s3:us-east-1:123:accesspoint/data-science-access-point/key.csv

# Cross-account access: both sides must allow the request
# Example: a role in Account B accesses Account A's bucket
#   1. Account A's bucket policy allows Account B's role as Principal
#   2. Account B's IAM policy allows the needed S3 actions (e.g. s3:GetObject) on Account A's bucket ARN

Static Hosting & CloudFront

S3 static website hosting combined with CloudFront is the standard low-cost, high-performance deployment for SPAs and static sites. Using OAC (Origin Access Control) keeps the S3 bucket private while CloudFront serves it globally over HTTPS.

S3 Static Website Hosting

# Enable static website hosting
aws s3 website s3://my-website-bucket \
  --index-document index.html \
  --error-document 404.html

# Or via API (for more control)
aws s3api put-bucket-website \
  --bucket my-website-bucket \
  --website-configuration '{
    "IndexDocument": {"Suffix": "index.html"},
    "ErrorDocument": {"Key": "404.html"},
    "RoutingRules": [
      {
        "Condition": {"HttpErrorCodeReturnedEquals": "404"},
        "Redirect": {"ReplaceKeyWith": "index.html"}
      }
    ]
  }'

# SPA routing fix: redirect 404 to index.html (React Router, Vue Router etc.)
# In routing rules above, redirect 404s to index.html
# OR use CloudFront custom error response (preferred — more control)

# Inspect the website configuration (this call does not return the endpoint URL)
aws s3api get-bucket-website --bucket my-website-bucket
# The endpoint is derived from the bucket name and region:
# http://my-website-bucket.s3-website-us-east-1.amazonaws.com
# NOTE: S3 website endpoints do NOT support HTTPS; use CloudFront for HTTPS

# Deploy a React/Vue/Next.js static build
npm run build
aws s3 sync ./dist s3://my-website-bucket/ \
  --delete \
  --cache-control "max-age=31536000,immutable" \
  --exclude "index.html"
# Upload index.html separately with no-cache
aws s3 cp ./dist/index.html s3://my-website-bucket/index.html \
  --cache-control "no-cache,no-store,must-revalidate" \
  --content-type "text/html"
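The two-step deploy above boils down to one rule: the entry point must always be revalidated, while everything else can be cached for a year. A tiny Python sketch of that rule follows; `cache_control_for` is a hypothetical helper, and the long-lived setting is only safe under the assumption that the build emits content-hashed asset filenames.

```python
from pathlib import PurePosixPath

LONG_LIVED = "max-age=31536000,immutable"      # used by the sync command above
NO_CACHE = "no-cache,no-store,must-revalidate"  # used for index.html above


def cache_control_for(key: str) -> str:
    """Pick the Cache-Control value for an object key, mirroring the deploy commands."""
    if PurePosixPath(key).name == "index.html":
        return NO_CACHE
    # Assumption: every other file has a content hash in its name.
    return LONG_LIVED


for key in ["index.html", "assets/app.3f9a1c.js"]:
    print(key, "->", cache_control_for(key))
```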

CloudFront Distribution Setup

# Create a CloudFront distribution with S3 origin (using OAC — recommended)
# Step 1: Create OAC (Origin Access Control)
aws cloudfront create-origin-access-control \
  --origin-access-control-config '{
    "Name": "my-website-oac",
    "Description": "OAC for my website bucket",
    "SigningProtocol": "sigv4",
    "SigningBehavior": "always",
    "OriginAccessControlOriginType": "s3"
  }'

# Step 2: Create distribution (simplified — use Console or CloudFormation for full config)
aws cloudfront create-distribution \
  --distribution-config file://cf-config.json

# cf-config.json key fields:
# {
#   "Origins": {
#     "Items": [{
#       "Id": "s3-origin",
#       "DomainName": "my-website-bucket.s3.us-east-1.amazonaws.com",
#       "S3OriginConfig": {"OriginAccessIdentity": ""},
#       "OriginAccessControlId": "<oac-id>"
#     }]
#   },
#   "DefaultCacheBehavior": {
#     "ViewerProtocolPolicy": "redirect-to-https",
#     "CachePolicyId": "658327ea-f89d-4fab-a63d-7e88639e58f6",  // CachingOptimized
#     "Compress": true
#   },
#   "CustomErrorResponses": {
#     "Items": [{"ErrorCode": 404, "ResponsePagePath": "/index.html", "ResponseCode": "200", "ErrorCachingMinTTL": 300}]
#   },
#   "Aliases": {"Items": ["www.example.com"]},
#   "ViewerCertificate": {"AcmCertificateArn": "<cert-arn>", "SslSupportMethod": "sni-only"}
# }

# Invalidate CloudFront cache (after deploy)
aws cloudfront create-invalidation \
  --distribution-id EDFDVBD6EXAMPLE \
  --paths "/*"

# Targeted invalidation (first 1,000 paths per month are free, then $0.005 per path; a wildcard path such as /* counts as one path)
aws cloudfront create-invalidation \
  --distribution-id EDFDVBD6EXAMPLE \
  --paths "/index.html" "/assets/app.*.js"

# Route 53 custom domain → CloudFront (A record alias)
# In Route 53: Create A record with "Alias to CloudFront distribution"
# Alias records are free; route to CloudFront distribution domain (xxx.cloudfront.net)

CloudFront Cache Policies & Behaviors

# AWS managed cache policies (use these before creating custom ones)
# CachingOptimized:     min TTL 1 s, default 1 day, max 1 year; compresses; best for immutable assets
# CachingDisabled:      No caching; good for API origins
# CachingOptimizedForUncompressedObjects: like Optimized but no compression

# Managed policy IDs:
# CachingOptimized:    658327ea-f89d-4fab-a63d-7e88639e58f6
# CachingDisabled:     4135ea2d-6df8-44a3-9df3-4b5a84be39ad

# Multiple behaviors — route by path pattern
# Default (*): S3 origin → CachingOptimized → serve static files
# /api/*:       ALB origin → CachingDisabled → proxy to backend (no caching)

# Check cache hit rate in CloudWatch
# Metric: CacheHitRate in namespace AWS/CloudFront
# Low hit rate: check Vary headers, Cache-Control headers, query string forwarding

# Real-time logs → Kinesis Data Streams (for live analysis)
# Standard access logs → S3 (15 min delay, free)

# Geo restriction — block/allow by country
aws cloudfront update-distribution \
  --id EDFDVBD6EXAMPLE \
  --distribution-config file://updated-config.json
# In config: "Restrictions": {"GeoRestriction": {"RestrictionType": "blacklist", "Locations": ["CN","RU"]}}

# Signed URLs vs Signed Cookies:
# Signed URL:    restrict access to individual files (e.g. paid video files)
# Signed Cookie: restrict access to multiple files (e.g. all files in a subscription tier)
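The "multiple behaviors" routing described above can be pictured as a first-match lookup: CloudFront checks path patterns in the order they are listed and falls back to the default behavior when none match. A rough Python sketch, where the origin names are illustrative labels and `fnmatchcase` stands in for CloudFront's own wildcard matching:

```python
from fnmatch import fnmatchcase

# (path pattern, origin, cache policy) in precedence order; names are illustrative.
BEHAVIORS = [
    ("/api/*", "alb-origin", "CachingDisabled"),
]
DEFAULT_BEHAVIOR = ("*", "s3-origin", "CachingOptimized")


def select_behavior(path: str):
    """Return the first behavior whose pattern matches, else the default."""
    for behavior in BEHAVIORS:
        if fnmatchcase(path, behavior[0]):
            return behavior
    return DEFAULT_BEHAVIOR


print(select_behavior("/api/users"))
print(select_behavior("/assets/app.js"))
```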

S3 Advanced Features

Versioning, lifecycle policies, event notifications, and replication are the production-grade S3 features that enable data protection, cost management, and event-driven architectures.

Versioning & Lifecycle Policies

# Versioning — keeps all versions of every object
# Once enabled, cannot be fully disabled (only suspended)
# Deleted objects get a "delete marker", not actually removed
# Previous versions can be restored by removing the delete marker

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket my-app-uploads \
  --versioning-configuration Status=Enabled

# List versions of an object
aws s3api list-object-versions \
  --bucket my-app-uploads \
  --prefix "user/123/avatar.jpg"

# Restore a specific version
aws s3api copy-object \
  --bucket my-app-uploads \
  --copy-source "my-app-uploads/user/123/avatar.jpg?versionId=abc123" \
  --key "user/123/avatar.jpg"

# Lifecycle policy — automate storage transitions and expiration
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-app-uploads \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "move-old-to-ia",
        "Status": "Enabled",
        "Filter": {"Prefix": "uploads/"},
        "Transitions": [
          {"Days": 30, "StorageClass": "STANDARD_IA"},
          {"Days": 90, "StorageClass": "GLACIER_IR"},
          {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
        ],
        "Expiration": {"Days": 2555}
      },
      {
        "ID": "delete-incomplete-multipart",
        "Status": "Enabled",
        "Filter": {},
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}
      },
      {
        "ID": "expire-old-versions",
        "Status": "Enabled",
        "Filter": {},
        "NoncurrentVersionExpiration": {"NoncurrentDays": 90}
      }
    ]
  }'
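To sanity-check a lifecycle rule before applying it, it can help to compute which storage class an object would be in at a given age. The Python sketch below replays the `move-old-to-ia` rule from the configuration above; `storage_class_at` is a hypothetical helper, and it assumes transitions happen exactly on the configured day, whereas S3 runs lifecycle actions asynchronously.

```python
# Values copied from the "move-old-to-ia" rule above.
TRANSITIONS = [(30, "STANDARD_IA"), (90, "GLACIER_IR"), (365, "DEEP_ARCHIVE")]
EXPIRATION_DAYS = 2555


def storage_class_at(age_days, transitions=TRANSITIONS,
                     expiration_days=EXPIRATION_DAYS, initial="STANDARD"):
    """Return the storage class at `age_days`, or None once the object has expired."""
    if expiration_days is not None and age_days >= expiration_days:
        return None
    current = initial
    for days, storage_class in sorted(transitions):
        if age_days >= days:
            current = storage_class
    return current


for age in (0, 45, 100, 400, 2555):
    print(age, storage_class_at(age))
```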

Event Notifications & S3 Select

# S3 Event Notifications — trigger on object create/delete/restore
# Destinations: Lambda, SQS, SNS, EventBridge (EventBridge supports more event types)

# Configure event notification (send to SQS on any object creation)
aws s3api put-bucket-notification-configuration \
  --bucket my-uploads-bucket \
  --notification-configuration '{
    "QueueConfigurations": [
      {
        "Id": "NewUploadNotification",
        "QueueArn": "arn:aws:sqs:us-east-1:123:upload-processing-queue",
        "Events": ["s3:ObjectCreated:*"],
        "Filter": {
          "Key": {
            "FilterRules": [
              {"Name": "prefix", "Value": "uploads/"},
              {"Name": "suffix", "Value": ".jpg"}
            ]
          }
        }
      }
    ]
  }'
# NOTE: SQS queue policy must allow s3.amazonaws.com to send messages
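The `prefix` and `suffix` filter rules in the configuration above select keys by plain string matching. A minimal Python sketch of that selection, useful for checking which uploads would produce a message (`matches_filter` is a hypothetical helper, and the comparison in this sketch is case-sensitive):

```python
def matches_filter(key: str, prefix: str = "", suffix: str = "") -> bool:
    """True if `key` starts with `prefix` and ends with `suffix`."""
    return key.startswith(prefix) and key.endswith(suffix)


# Filter values from the notification configuration above.
for key in ["uploads/cat.jpg", "uploads/cat.png", "other/cat.jpg"]:
    print(key, matches_filter(key, prefix="uploads/", suffix=".jpg"))
```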

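# Enable EventBridge integration (more flexible: pattern matching, multiple targets)
# NOTE: this call replaces the bucket's entire notification configuration; include any
# existing queue/topic/Lambda configurations in the same JSON if you want to keep them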
aws s3api put-bucket-notification-configuration \
  --bucket my-uploads-bucket \
  --notification-configuration '{"EventBridgeConfiguration": {}}'

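# S3 Select: query structured data in place (CSV, JSON, Parquet) and return only matching rows
# Reduces data transfer by up to 80% compared to downloading whole object
# NOTE: AWS closed S3 Select to new customers in July 2024; existing customers can still use it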
aws s3api select-object-content \
  --bucket my-data-bucket \
  --key data/users.csv \
  --expression "SELECT s.name, s.email FROM S3Object s WHERE s.country = 'US'" \
  --expression-type SQL \
  --input-serialization '{"CSV": {"FileHeaderInfo": "USE", "FieldDelimiter": ","}}' \
  --output-serialization '{"CSV": {}}' \
  /dev/stdout

# S3 Multipart Upload — required for objects > 5 GB, recommended for > 100 MB
# Enables parallelism and retry of failed parts
aws s3api create-multipart-upload \
  --bucket my-bucket --key large-file.zip
# Upload parts (minimum 5 MB each except last)
# Complete multipart upload
aws s3api complete-multipart-upload \
  --bucket my-bucket --key large-file.zip \
  --upload-id abc123 \
  --multipart-upload file://parts.json

# aws s3 cp and sync handle multipart automatically for large files
# Control multipart threshold and chunk size
aws configure set default.s3.multipart_threshold 64MB
aws configure set default.s3.multipart_chunksize 16MB
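The threshold and chunk size set above determine how many parts an upload is split into. The Python sketch below estimates the part count under a few stated assumptions: `plan_parts` is a hypothetical helper, "MB" is treated as MiB, files below the threshold go up in a single request, and the sketch raises an error where a real client might instead adjust the chunk size. The 5 MB minimum part size comes from the notes above; the 10,000-part ceiling is an S3 limit.

```python
import math

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB  # every part except the last must be at least this large
MAX_PARTS = 10_000       # S3 limit on parts per multipart upload


def plan_parts(size_bytes: int, threshold: int = 64 * MIB, chunk: int = 16 * MIB) -> int:
    """Return how many requests the upload needs (1 means a single PUT)."""
    if size_bytes < threshold:
        return 1
    if chunk < MIN_PART_SIZE:
        raise ValueError("chunk size is below the 5 MiB minimum part size")
    parts = math.ceil(size_bytes / chunk)
    if parts > MAX_PARTS:
        raise ValueError(f"{parts} parts exceeds the {MAX_PARTS}-part limit; use a larger chunk")
    return parts


print(plan_parts(10 * MIB))    # below the threshold: one request
print(plan_parts(100 * MIB))   # split into 16 MiB parts
```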

Replication & Object Lock

# Replication — copies objects to another bucket (asynchronously)
# CRR (Cross-Region Replication): source and destination in different regions
# SRR (Same-Region Replication): same region (e.g. copy prod to dev account)
# Requires: versioning on BOTH source and destination buckets

# Create replication configuration
aws s3api put-bucket-replication \
  --bucket my-source-bucket \
  --replication-configuration '{
    "Role": "arn:aws:iam::123:role/s3-replication-role",
    "Rules": [
      {
        "ID": "replicate-uploads",
        "Status": "Enabled",
        "Priority": 1,
        "Filter": {"Prefix": "uploads/"},
        "Destination": {
          "Bucket": "arn:aws:s3:::my-dr-bucket-us-west-2",
          "StorageClass": "STANDARD_IA",
          "ReplicationTime": {
            "Status": "Enabled",
            "Time": {"Minutes": 15}
          },
          "Metrics": {"Status": "Enabled", "EventThreshold": {"Minutes": 15}}
        },
        "DeleteMarkerReplication": {"Status": "Enabled"}
      }
    ]
  }'
# ReplicationTime (S3 RTC): designed to replicate 99.99% of new objects within 15 minutes, backed by an SLA (costs extra)

# Object Lock — WORM (Write Once Read Many) protection
# Prevent deletion or overwrite for a fixed retention period
# Use for: regulatory compliance, ransomware protection
# Enabled here at bucket creation with --object-lock-enabled-for-bucket (this also turns on versioning)
aws s3api create-bucket \
  --bucket my-compliant-bucket \
  --region us-east-1 \
  --object-lock-enabled-for-bucket
aws s3api put-object-lock-configuration \
  --bucket my-compliant-bucket \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
      "DefaultRetention": {
        "Mode": "COMPLIANCE",
        "Days": 2555
      }
    }
  }'
# COMPLIANCE mode: cannot be deleted or shortened even by root user
# GOVERNANCE mode: can be overridden by users with s3:BypassGovernanceRetention permission

# S3 Batch Operations — apply an operation to billions of objects
# Operations: copy, PUT tags, invoke Lambda, restore from Glacier, replicate
aws s3control create-job \
  --account-id 123456789012 \
  --operation '{"S3PutObjectTagging": {"TagSet": [{"Key": "processed", "Value": "true"}]}}' \
  --manifest '{"Spec": {"Format": "S3BatchOperations_CSV_20180820"}, "Location": {"ObjectArn": "arn:aws:s3:::my-bucket/manifest.csv", "ETag": "abc"}}' \
  --report '{"Bucket": "arn:aws:s3:::my-reports", "ReportScope": "AllTasks", "Enabled": true}' \
  --role-arn arn:aws:iam::123:role/batch-ops-role \
  --priority 10 \
  --confirmation-required

AWS S3: Fundamentals & Bucket Operations

Amazon S3 (Simple Storage Service) is object storage built for any amount of data. Objects are stored in buckets. Each object has a key (path-like name), data, and metadata. S3 is not a filesystem — there are no real directories, only key prefixes.

Core Concepts

  • Bucket: globally unique container; region-specific but namespace is global

  • Object: key + data + metadata; max object size 5TB (multipart upload required above 5GB, recommended above 100MB)

  • Key: full "path" like images/2024/photo.jpg — the slash is just part of the name

  • S3 URI format: s3://my-bucket/images/photo.jpg

  • Strong consistency: all operations (GET/PUT/DELETE) are strongly consistent since Dec 2020
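The point that "folders" are only key prefixes is easier to see with a small simulation of a delimiter-based listing, which is roughly what `aws s3 ls` shows. This Python sketch works on an in-memory list of keys; `list_prefix` and the `KEYS` sample are hypothetical, and no S3 call is made.

```python
def list_prefix(keys, prefix="", delimiter="/"):
    """Mimic a delimiter-based listing: return (objects, common_prefixes)."""
    objects, common = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter is rolled up into one "folder".
            common.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common)


KEYS = ["images/2024/photo.jpg", "images/logo.png", "index.html"]
print(list_prefix(KEYS))             # top level: one object, one "folder"
print(list_prefix(KEYS, "images/"))  # inside images/
```

The bucket itself only ever holds the three flat keys; the "folders" exist purely in how the listing groups them.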

AWS CLI — Bucket & Object Operations

# Create a bucket (region required outside us-east-1)
aws s3 mb s3://my-bucket --region eu-west-1

# List buckets / objects
aws s3 ls
aws s3 ls s3://my-bucket/
aws s3 ls s3://my-bucket/images/ --recursive --human-readable

# Copy files
aws s3 cp file.txt s3://my-bucket/uploads/file.txt
aws s3 cp s3://my-bucket/file.txt ./local-file.txt
aws s3 cp s3://src-bucket/ s3://dst-bucket/ --recursive

# Sync local dir to S3 (only changed files)
aws s3 sync ./dist s3://my-bucket/website/ --delete
aws s3 sync s3://my-bucket/backups/ ./backups/

# Move and delete
aws s3 mv s3://my-bucket/old.txt s3://my-bucket/new.txt
aws s3 rm s3://my-bucket/file.txt
aws s3 rm s3://my-bucket/folder/ --recursive

Object Metadata & Content-Type

# Upload with metadata and content-type
aws s3 cp index.html s3://my-bucket/ \
  --content-type "text/html; charset=utf-8" \
  --cache-control "max-age=31536000" \
  --metadata '{"x-app-version":"1.2.0"}'

# Set content-type on existing objects (requires copy to itself)
aws s3 cp s3://my-bucket/file.js s3://my-bucket/file.js \
  --metadata-directive REPLACE \
  --content-type "application/javascript" \
  --cache-control "max-age=86400"

Presigned URLs

# Generate a presigned GET URL (expires in 1 hour)
aws s3 presign s3://my-bucket/private/report.pdf --expires-in 3600

# Presigned PUT URLs need an SDK: `aws s3 presign` only signs GET requests

// Node.js SDK v3
import { randomUUID } from 'node:crypto';
import { S3Client, GetObjectCommand, PutObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';

const client = new S3Client({ region: 'eu-west-1' });

// Presigned GET: share a private file for 1 hour
const getUrl = await getSignedUrl(client, new GetObjectCommand({
  Bucket: 'my-bucket',
  Key: 'private/report.pdf',
}), { expiresIn: 3600 });

// Presigned PUT: let the client upload directly to S3 for 5 minutes
const putUrl = await getSignedUrl(client, new PutObjectCommand({
  Bucket: 'my-bucket',
  Key: `uploads/${randomUUID()}.jpg`,
  ContentType: 'image/jpeg',
}), { expiresIn: 300 });

Multipart Upload (large files)

# aws s3 cp handles multipart automatically for files > 8MB
# For fine-grained control use s3api:

# 1. Initiate
aws s3api create-multipart-upload --bucket my-bucket --key large-file.zip

# 2. Upload parts (each must be >= 5MB except last)
aws s3api upload-part --bucket my-bucket --key large-file.zip \
  --part-number 1 --upload-id <UploadId> --body part1.bin

# 3. Complete
aws s3api complete-multipart-upload --bucket my-bucket --key large-file.zip \
  --upload-id <UploadId> \
  --multipart-upload '{"Parts":[{"ETag":"...","PartNumber":1}]}'

# List and abort stuck uploads (important — they cost money)
aws s3api list-multipart-uploads --bucket my-bucket
aws s3api abort-multipart-upload --bucket my-bucket --key large-file.zip --upload-id <UploadId>

S3 Storage Classes, Versioning & Lifecycle

S3 offers multiple storage classes optimized for different access patterns. Choosing the right class is the biggest cost lever. Lifecycle rules automate transitions and expirations.

Storage Classes Comparison

Class                         | Min Duration | Retrieval   | Use Case
------------------------------|--------------|-------------|---------------------------
S3 Standard                   | none         | immediate   | Frequently accessed data
S3 Intelligent-Tiering        | none         | immediate   | Unknown/changing access
S3 Standard-IA                | 30 days      | immediate   | Infrequent, still fast
S3 One Zone-IA                | 30 days      | immediate   | Reproducible infrequent
S3 Glacier Instant Retrieval  | 90 days      | milliseconds| Archives, quarterly access
S3 Glacier Flexible Retrieval | 90 days      | min–hours   | Backups, flexible timing
S3 Glacier Deep Archive       | 180 days     | 12–48 hours | Compliance, rarely needed

Setting Storage Class on Upload

# Upload directly to a specific class
aws s3 cp logs.tar.gz s3://my-bucket/archives/ \
  --storage-class GLACIER_IR   # Instant Retrieval

# Storage class flags:
# STANDARD | REDUCED_REDUNDANCY | STANDARD_IA | ONEZONE_IA
# INTELLIGENT_TIERING | GLACIER | DEEP_ARCHIVE | GLACIER_IR

Versioning

Versioning keeps all versions of an object. Once enabled on a bucket it cannot be fully disabled, only suspended. Deleting a versioned object adds a delete marker; the actual data remains.

# Enable versioning
aws s3api put-bucket-versioning --bucket my-bucket \
  --versioning-configuration Status=Enabled

# List all versions of an object
aws s3api list-object-versions --bucket my-bucket --prefix images/photo.jpg

# Download a specific version
aws s3api get-object --bucket my-bucket --key images/photo.jpg \
  --version-id abc123XYZ output.jpg

# Permanently delete a specific version (not just add delete marker)
aws s3api delete-object --bucket my-bucket --key images/photo.jpg \
  --version-id abc123XYZ

# Restore an accidentally deleted object by removing the delete marker
aws s3api delete-object --bucket my-bucket --key images/photo.jpg \
  --version-id <delete-marker-version-id>
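The delete-marker behavior described above can be modeled in a few lines. The following Python sketch is an in-memory toy, not an S3 client: the `VersionedBucket` class and its version IDs are invented for illustration, and it only covers put, delete, and get on a versioning-enabled bucket.

```python
import itertools

_DELETE_MARKER = object()  # sentinel standing in for a delete marker


class VersionedBucket:
    """Toy model of a versioning-enabled bucket."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, data), newest last
        self._ids = itertools.count(1)

    def put(self, key, data):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key, version_id=None):
        if version_id is None:
            # A plain delete only stacks a delete marker on top.
            return self.put(key, _DELETE_MARKER)
        # Deleting a specific version (or marker) removes it permanently.
        self._versions[key] = [v for v in self._versions[key] if v[0] != version_id]
        return version_id

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1][1] is _DELETE_MARKER:
            return None  # behaves like "not found"
        return versions[-1][1]


bucket = VersionedBucket()
bucket.put("images/photo.jpg", b"original")
marker_id = bucket.delete("images/photo.jpg")
print(bucket.get("images/photo.jpg"))   # hidden behind the delete marker
bucket.delete("images/photo.jpg", version_id=marker_id)
print(bucket.get("images/photo.jpg"))   # visible again
```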

Lifecycle Rules

Lifecycle rules automate transitions between storage classes and deletion of old objects. Apply them via the console or JSON policy.

{
  "Rules": [
    {
      "ID": "archive-logs",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER_IR" },
        { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
      ],
      "Expiration": { "Days": 2555 }
    },
    {
      "ID": "clean-incomplete-uploads",
      "Status": "Enabled",
      "Filter": {},
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 }
    },
    {
      "ID": "expire-old-versions",
      "Status": "Enabled",
      "Filter": {},
      "NoncurrentVersionTransitions": [
        { "NoncurrentDays": 30, "StorageClass": "GLACIER_IR" }
      ],
      "NoncurrentVersionExpiration": { "NoncurrentDays": 90 }
    }
  ]
}

# Apply lifecycle config
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-bucket \
  --lifecycle-configuration file://lifecycle.json

# View current lifecycle config
aws s3api get-bucket-lifecycle-configuration --bucket my-bucket

S3 Intelligent-Tiering

Intelligent-Tiering automatically moves objects between access tiers based on usage. No retrieval fees, no minimum duration penalties. Small per-object monitoring fee. Best for data with unknown or changing access patterns.

  • Frequent Access tier: actively used objects

  • Infrequent Access tier: objects not accessed for 30 days

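  • Archive Instant Access tier: objects not accessed for 90 days (automatic)

  • Archive Access tier: not accessed for a configurable 90–730 days (optional; objects must be restored before access)

  • Deep Archive Access tier: not accessed for a configurable 180–730 days (optional; objects must be restored before access)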

S3 Security, IAM Policies & Access Control

S3 security has multiple layers: Block Public Access settings (account and bucket level), bucket policies, IAM policies, ACLs (legacy), and encryption. The default is private — everything requires explicit grants.

Block Public Access — Always Check This First

# View account-level Block Public Access settings
aws s3control get-public-access-block --account-id 123456789012

# Bucket-level
aws s3api get-public-access-block --bucket my-bucket

# These 4 settings block public access regardless of bucket policy:
# BlockPublicAcls        - rejects requests that set ACLs granting public access
# IgnorePublicAcls       - ignores any existing public ACLs
# BlockPublicPolicy      - rejects bucket policies that grant public access
# RestrictPublicBuckets  - limits a bucket with a public policy to AWS services and authorized users in the owner's account

Bucket Policies

Bucket policies are resource-based IAM policies attached to the bucket. They can grant access to other AWS accounts, specific IAM roles, CloudFront, and anonymous users.

// Allow CloudFront OAC to read objects (modern approach)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCloudFrontServicePrincipal",
      "Effect": "Allow",
      "Principal": {
        "Service": "cloudfront.amazonaws.com"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "StringEquals": {
          "AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/ABCDEF123456"
        }
      }
    }
  ]
}

// Grant another AWS account read access
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::999999999999:root"
      },
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ]
    }
  ]
}

# Apply bucket policy
aws s3api put-bucket-policy --bucket my-bucket --policy file://policy.json

# Get current policy
aws s3api get-bucket-policy --bucket my-bucket | jq '.Policy | fromjson'

IAM Policies for S3 Access

// IAM policy: allow each IAM user to read/write only their own prefix
// (${aws:username} resolves for IAM users; it is not set for assumed roles)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::my-bucket/app-data/${aws:username}/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-bucket",
      "Condition": {
        "StringLike": {
          "s3:prefix": ["app-data/${aws:username}/*"]
        }
      }
    }
  ]
}
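
To make the `${aws:username}` policy variable concrete, here is a deliberately simplified model of how a resource pattern might be resolved and matched. It is a teaching sketch, not IAM's actual evaluation logic: the helper names are hypothetical, and `fnmatchcase` only approximates IAM's `*` and `?` wildcards.

```python
import re
from fnmatch import fnmatchcase


def resolve_policy_variables(pattern: str, context: dict) -> str:
    """Replace each ${name} in the pattern with its value from the request context."""
    def substitute(match):
        # Raises KeyError when the request context lacks the variable.
        return context[match.group(1)]

    return re.sub(r"\$\{([^}]+)\}", substitute, pattern)


def resource_matches(pattern: str, arn: str, context: dict) -> bool:
    """Simplified check: does the ARN match the pattern after substitution?"""
    try:
        resolved = resolve_policy_variables(pattern, context)
    except KeyError:
        # Assumption for this sketch: an unresolved variable means no match.
        return False
    return fnmatchcase(arn, resolved)


if __name__ == "__main__":
    pattern = "arn:aws:s3:::my-bucket/app-data/${aws:username}/*"
    alice = {"aws:username": "alice"}
    print(resource_matches(pattern, "arn:aws:s3:::my-bucket/app-data/alice/report.csv", alice))
    print(resource_matches(pattern, "arn:aws:s3:::my-bucket/app-data/bob/report.csv", alice))
```

The point of the design is that one policy serves every user: the prefix is derived from the caller's identity at request time rather than hard-coded per user.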

Encryption

  • SSE-S3 (default since Jan 2023): AWS manages keys, AES-256, no extra cost

  • SSE-KMS: your KMS key, audit trail via CloudTrail, KMS per-request charges (S3 Bucket Keys reduce them)

  • SSE-C: you provide the key with every request, AWS never stores it

  • DSSE-KMS: dual-layer encryption with KMS (compliance use cases)

  • Client-side: encrypt before upload, AWS sees only ciphertext

# Upload with SSE-KMS
aws s3 cp sensitive.dat s3://my-bucket/ \
  --sse aws:kms \
  --sse-kms-key-id arn:aws:kms:us-east-1:123456789012:key/mrk-abc123

# Require encryption via bucket policy (deny uploads that don't request SSE-KMS)
# Add a statement with "Effect": "Deny", "Principal": "*", "Action": "s3:PutObject" and this condition:
# "Condition": { "StringNotEquals": { "s3:x-amz-server-side-encryption": "aws:kms" } }
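
The condition fragment above only works inside a complete Deny statement. The sketch below builds one such statement and wraps it in a policy document; the function name and the `Sid` value are illustrative choices, not anything prescribed by AWS.

```python
import json


def deny_unencrypted_uploads(bucket: str, algorithm: str = "aws:kms") -> dict:
    """Build a Deny statement for PutObject requests that don't ask for `algorithm`."""
    return {
        "Sid": "DenyUploadsWithoutRequiredSSE",  # illustrative name
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:PutObject",
        "Resource": f"arn:aws:s3:::{bucket}/*",
        "Condition": {
            "StringNotEquals": {"s3:x-amz-server-side-encryption": algorithm}
        },
    }


if __name__ == "__main__":
    policy = {
        "Version": "2012-10-17",
        "Statement": [deny_unencrypted_uploads("my-bucket")],
    }
    # Save the output as policy.json and apply it with put-bucket-policy.
    print(json.dumps(policy, indent=2))
```

One thing to watch: the condition looks at the request header, so uploads that omit it and rely on the bucket's default encryption would generally be denied too. Merge the statement into any existing policy rather than replacing it.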

S3 Object Lock (WORM)

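Object Lock prevents object versions from being deleted or overwritten for a fixed period or indefinitely. It requires versioning. It originally had to be enabled at bucket creation, but AWS has since added support for enabling it on existing buckets.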

  • Governance mode: only users with s3:BypassGovernanceRetention can delete

  • Compliance mode: nobody can delete, not even root — use for strict compliance

  • Legal hold: no expiry date, independent of any retention period, stays in place until explicitly removed
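
As an illustration of the two retention modes, the sketch below builds the kind of retention document that could be passed to `aws s3api put-object-retention --retention`. The helper name `retention_config` is hypothetical, and the caller is assumed to supply a timezone-aware `now`.

```python
from datetime import datetime, timedelta, timezone

VALID_MODES = ("GOVERNANCE", "COMPLIANCE")


def retention_config(mode: str, days: int, now: datetime) -> dict:
    """Return a retention document that locks an object for `days` from `now`."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    if days <= 0:
        raise ValueError("days must be positive")
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    retain_until = now.astimezone(timezone.utc) + timedelta(days=days)
    return {
        "Mode": mode,
        "RetainUntilDate": retain_until.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


if __name__ == "__main__":
    print(retention_config("GOVERNANCE", 30, datetime.now(timezone.utc)))
```

Starting with GOVERNANCE while testing is the safer choice, since a COMPLIANCE retention date cannot be shortened once set.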

VPC Endpoints & Access Logging

# Create a Gateway endpoint so S3 traffic stays on the AWS network instead of the public internet (no extra charge)
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-abc123 \
  --service-name com.amazonaws.eu-west-1.s3 \
  --route-table-ids rtb-xyz789

# Enable S3 access logging to another bucket
aws s3api put-bucket-logging --bucket my-bucket \
  --bucket-logging-status '{
    "LoggingEnabled": {
      "TargetBucket": "my-logs-bucket",
      "TargetPrefix": "s3-access-logs/my-bucket/"
    }
  }'

Static Hosting, CloudFront & Interview Questions

S3 Static Website Hosting

S3 can serve static websites directly. The website endpoint differs from the REST API endpoint and supports index/error documents and redirect rules.

# Enable static website hosting
aws s3api put-bucket-website --bucket my-bucket \
  --website-configuration '{
    "IndexDocument": {"Suffix": "index.html"},
    "ErrorDocument": {"Key": "404.html"}
  }'

# Website endpoint format (not the REST API endpoint):
# http://my-bucket.s3-website-eu-west-1.amazonaws.com

# Must also allow public reads (for a truly public site).
# Turn off Block Public Access for this bucket first, or the policy is rejected.
aws s3api put-bucket-policy --bucket my-bucket --policy '{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-bucket/*"
  }]
}'

CloudFront + S3 (Production Setup)

CloudFront distributes content globally via edge locations, adds HTTPS with custom domains, caching, and WAF integration. Use OAC (Origin Access Control) to keep S3 private.

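# Modern approach: OAC (replaces OAI)
# 1. Create a CloudFront distribution pointing to the S3 origin
# 2. Enable OAC on the origin (the bucket stays private, no public policy needed)
# 3. Add a bucket policy allowing the cloudfront.amazonaws.com service principal for this distribution
# 4. CloudFront signs its requests to S3 with SigV4, so only the distribution can read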

# Invalidate CloudFront cache after deployment
aws cloudfront create-invalidation \
  --distribution-id ABCDEF123456 \
  --paths "/*"

# Invalidate specific paths
aws cloudfront create-invalidation \
  --distribution-id ABCDEF123456 \
  --paths "/index.html" "/static/js/*"

S3 Replication

# Cross-Region Replication (CRR) — requires versioning on both buckets
aws s3api put-bucket-replication --bucket source-bucket \
  --replication-configuration '{
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
    "Rules": [{
      "Priority": 1,
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "DeleteMarkerReplication": {"Status": "Disabled"},
      "Destination": {
        "Bucket": "arn:aws:s3:::dest-bucket-eu",
        "StorageClass": "STANDARD_IA"
      }
    }]
  }'

  • CRR (Cross-Region): disaster recovery, data sovereignty, latency reduction

  • SRR (Same-Region): log aggregation across accounts, dev/prod data sync

  • Replication does not apply to objects that existed before the rule was created (use S3 Batch Replication for those)

  • Delete markers are not replicated by default (set DeleteMarkerReplication to Enabled in the rule)

S3 Event Notifications

// Trigger Lambda on object upload
{
  "LambdaFunctionConfigurations": [{
    "LambdaFunctionArn": "arn:aws:lambda:eu-west-1:123456789012:function:process-upload",
    "Events": ["s3:ObjectCreated:*"],
    "Filter": {
      "Key": {
        "FilterRules": [
          {"Name": "prefix", "Value": "uploads/"},
          {"Name": "suffix", "Value": ".jpg"}
        ]
      }
    }
  }]
}
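
On the receiving side, a Lambda function gets an event containing one record per object. The sketch below shows one way a handler might pull out the bucket and key; the names `extract_objects` and `handler` are illustrative. Object keys arrive URL-encoded in the event, so they are decoded before use.

```python
from urllib.parse import unquote_plus


def extract_objects(event: dict) -> list:
    """Return (bucket, key) pairs for every record in an S3 notification event."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Keys are URL-encoded in the event (spaces arrive as "+").
        key = unquote_plus(s3_info["object"]["key"])
        objects.append((bucket, key))
    return objects


def handler(event, context):
    """Lambda entry point: log each new object and report how many were seen."""
    objects = extract_objects(event)
    for bucket, key in objects:
        print(f"New object: s3://{bucket}/{key}")
    return {"processed": len(objects)}


if __name__ == "__main__":
    # Minimal hand-written event for local testing; real events carry more fields.
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "my-bucket"},
                    "object": {"key": "uploads/my+photo%281%29.jpg"}}}
        ]
    }
    print(handler(sample_event, None))
```

Keeping the parsing in its own function makes it easy to unit test without deploying anything.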

S3 Transfer Acceleration

# Enable Transfer Acceleration (uses CloudFront edge for upload path)
aws s3api put-bucket-accelerate-configuration \
  --bucket my-bucket \
  --accelerate-configuration Status=Enabled

# Use the accelerated endpoint for uploads from distant clients
aws s3 cp large-file.zip s3://my-bucket/ \
  --endpoint-url https://s3-accelerate.amazonaws.com

Interview Questions

  • S3 vs EFS vs EBS? S3=object storage (web scale, any size), EFS=shared NFS for EC2 (managed filesystem), EBS=block storage attached to single EC2 (like a hard drive)

  • How to make S3 access faster globally? CloudFront CDN for reads; S3 Transfer Acceleration for uploads; choose bucket region close to users

  • S3 consistency model? Strong read-after-write consistency for all operations since Dec 2020 — no need for consistency workarounds anymore

  • How to prevent accidental deletion? Enable versioning + Object Lock; use MFA Delete; restrict DeleteObject via IAM; lifecycle rules for noncurrent version retention

  • Cost optimization in S3? Lifecycle rules to cheaper storage classes; S3 Intelligent-Tiering for unknown access; analyze with S3 Storage Lens; delete incomplete multipart uploads; enable S3 Inventory to find unused objects

  • Cross-account S3 access? Bucket policy grants the other account; that account's IAM must also allow it (both resource and identity policy needed)

  • What is an S3 presigned URL? A time-limited signed URL allowing temporary GET or PUT without AWS credentials; generated server-side, used client-side
