Scalability & Architecture
Scalability Patterns
Horizontal scaling — add more servers; requires stateless app (store sessions in Redis, not in-process)
Load balancing — distribute traffic (round-robin, least-connections, IP hash for sticky sessions)
Caching — CDN (static), Redis/Memcached (app), browser cache, HTTP cache headers
Database scaling — read replicas (read from replica, write to primary), sharding (horizontal partitioning), vertical scaling
Async processing — offload slow tasks (email, image processing) to queues (SQS, RabbitMQ, Bull)
CDN — serve static assets and cacheable content from edge locations near users
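The three load-balancing strategies named above can be sketched in a few lines. This is a minimal in-memory illustration, not how ALB or Nginx implement them; the class names and the server names (`app-1` and so on) are hypothetical.

```python
import hashlib
import itertools


class RoundRobin:
    """Cycle through the servers in a fixed order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self, client_ip=None):
        return next(self._cycle)


class LeastConnections:
    """Pick the server with the fewest in-flight requests."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self, client_ip=None):
        # Ties resolve to the first server in insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the request finishes so the count stays accurate.
        self.active[server] -= 1


class IpHash:
    """Map a client IP to the same server every time (sticky sessions)."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick(self, client_ip):
        digest = hashlib.sha256(client_ip.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.servers)
        return self.servers[index]


servers = ["app-1", "app-2", "app-3"]  # hypothetical server names
balancer = IpHash(servers)
sticky_target = balancer.pick("10.0.0.7")  # same IP always maps to this server
```

With plain modulo hashing, adding or removing a server remaps most clients to a different server; consistent hashing is the usual way to limit that churn.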
CAP Theorem
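A distributed system can guarantee at most 2 of these 3 properties at once: Consistency (every read returns the most recent write), Availability (every request receives a non-error response), and Partition Tolerance (the system keeps working despite network partitions). Because network partitions cannot be ruled out, the practical choice is how the system behaves during a partition: CP (refuse some requests to stay consistent) or AP (keep answering and reconcile later). Systems commonly classified as CP include MongoDB and PostgreSQL with synchronous replication; systems commonly classified as AP include DynamoDB and Cassandra in eventual consistency mode. The classification often depends on configuration, not only on the product.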
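The CP versus AP trade-off can be illustrated with a toy model. The sketch below assumes a single value replicated on two nodes and a `partitioned` flag that cuts the link between them; `Replica`, `Cluster`, and `mode` are hypothetical names, and real systems use quorums and conflict resolution that this model leaves out.

```python
class Replica:
    """One node holding a copy of a single value."""

    def __init__(self):
        self.value = None


class Cluster:
    """Two replicas of one value; `mode` is "CP" or "AP"."""

    def __init__(self, mode):
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False  # True means the replicas cannot talk

    def write(self, replica, value):
        if not self.partitioned:
            # Healthy network: update both copies together.
            self.a.value = value
            self.b.value = value
            return
        if self.mode == "CP":
            # Refuse the write rather than let the copies diverge.
            raise RuntimeError("unavailable during partition")
        # AP: accept the write locally; the copies now disagree.
        replica.value = value

    def read(self, replica):
        if self.partitioned and self.mode == "CP":
            # Refuse the read rather than risk returning stale data.
            raise RuntimeError("unavailable during partition")
        return replica.value
```

In CP mode the cluster gives up availability while partitioned; in AP mode it keeps answering, but a read from the other replica may return an older value until the copies are reconciled.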
Common Components
Client → DNS → CDN (static/cached) → Load Balancer → App Servers → Cache → Database
                                                          ↓
                                                    Message Queue → Workers
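The two paths out of the app servers in the flow above can be sketched with in-memory stand-ins: a cache-aside read (Cache → Database) and a job handed off to a worker (Message Queue → Workers). The dictionaries, the `queue.Queue`, and the function names are assumptions for illustration; a real deployment would use Redis, a database client, and a broker such as SQS or RabbitMQ.

```python
import queue
import threading

database = {"user:1": {"name": "Ada"}}  # stand-in for the primary database
cache = {}                               # stand-in for Redis/Memcached
jobs = queue.Queue()                     # stand-in for SQS/RabbitMQ
processed = []                           # records the jobs the worker handled


def get_user(key):
    """Cache-aside read: try the cache, fall back to the database, fill the cache."""
    if key in cache:
        return cache[key]
    row = database.get(key)
    if row is not None:
        cache[key] = row
    return row


def handle_signup(key, name):
    """Write to the database, drop any stale cache entry, enqueue the slow work."""
    database[key] = {"name": name}
    cache.pop(key, None)
    jobs.put(("send_welcome_email", key))


def worker():
    """Consume jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                return
            processed.append(job)  # a real worker would send the email here
        finally:
            jobs.task_done()
```

To try it, start `worker` in a `threading.Thread`, call `handle_signup`, wait with `jobs.join()`, then put `None` on the queue to stop the worker. The request handler returns as soon as the job is enqueued, which is the point of offloading slow tasks.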
Key components:
┌───────────────────────────────────────────────────────────────┐
│ DNS            — Route 53, Cloudflare (geo-routing, failover) │
│ CDN            — CloudFront, Fastly (static assets, caching)  │
│ Load Balancer  — ALB, Nginx (health checks, SSL termination)  │
│ App Servers    — stateless, auto-scaling group                │
│ Cache Layer    — Redis (sessions, hot data, rate limiting)    │
│ Primary DB     — PostgreSQL, MySQL (writes)                   │
│ Read Replica   — for read-heavy queries                       │
│ Object Store   — S3 (uploads, large files, backups)           │
│ Message Queue  — SQS, RabbitMQ (async jobs, decoupling)       │
│ Workers        — process jobs (emails, image resize, reports) │
│ Search         — Elasticsearch (full-text, faceted search)    │
│ Monitoring     — Datadog, CloudWatch (metrics, alerts)        │
│ Log Aggregator — Elasticsearch/Loki + Grafana                 │
└───────────────────────────────────────────────────────────────┘
Database Selection Guide
PostgreSQL/MySQL — structured data, complex queries, ACID, strong consistency
MongoDB — flexible schema, documents with nested data, rapid iteration
Redis — caching, sessions, rate limiting, leaderboards, pub/sub
Elasticsearch — full-text search, log analytics, faceted filtering
Cassandra/DynamoDB — write-heavy workloads at massive scale, time-series data, multi-region deployments
S3/Object Store — files, images, videos, backups, data lake