EC2

EC2 Fundamentals

Amazon EC2 (Elastic Compute Cloud) provides resizable compute capacity in the cloud. Understanding the instance lifecycle, metadata service, and user data scripts is the baseline for working with EC2 programmatically.

Instance Lifecycle & AWS CLI Launch

# Instance states:
# pending → running → stopping → stopped → (start) pending → running
# running or stopped → shutting-down → terminated
# running → rebooting → running (OS reboot, instance stays on same host)
# Stopped instances do NOT incur compute charges, but EBS volumes (and Elastic IPs) still cost money

# Launch an instance with AWS CLI
aws ec2 run-instances \
  --image-id ami-0c02fb55956c7d316 \
  --instance-type t3.micro \
  --key-name my-key-pair \
  --security-group-ids sg-0a1b2c3d4e5f67890 \
  --subnet-id subnet-0bb1c79de3EXAMPLE \
  --associate-public-ip-address \
  --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=web-server},{Key=Env,Value=prod}]" \
  --count 1

# Start / Stop / Reboot / Terminate
aws ec2 start-instances --instance-ids i-1234567890abcdef0
aws ec2 stop-instances --instance-ids i-1234567890abcdef0
aws ec2 reboot-instances --instance-ids i-1234567890abcdef0
aws ec2 terminate-instances --instance-ids i-1234567890abcdef0

# Describe running instances with filter
aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running" "Name=tag:Env,Values=prod" \
  --query "Reservations[*].Instances[*].{ID:InstanceId,IP:PublicIpAddress,Type:InstanceType}"

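# Connect via EC2 Instance Connect (no launch key pair needed; access is controlled by IAM)
# The pushed public key is only valid for 60 seconds, so connect right away with the matching private key
aws ec2-instance-connect send-ssh-public-key \
  --instance-id i-1234567890abcdef0 \
  --instance-os-user ec2-user \
  --ssh-public-key file://~/.ssh/id_rsa.pub
ssh -i ~/.ssh/id_rsa ec2-user@<public-ip>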

Instance Metadata Service (IMDS)

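Every EC2 instance can query its own metadata at the link-local address 169.254.169.254. Prefer IMDSv2 (session-token based): IMDSv1 answers plain GET requests, so an SSRF bug in an application can be enough to leak the instance role's credentials.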

# IMDSv2 — get a session token first (TTL in seconds)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

# Then use the token in subsequent requests
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/

# Useful metadata endpoints (prefix all with http://169.254.169.254/latest/meta-data/)
# instance-id                — i-1234567890abcdef0
# instance-type              — t3.micro
# local-ipv4                 — private IP
# public-ipv4                — public IP (absent if no public IP)
# public-hostname            — ec2-x-x-x-x.compute-1.amazonaws.com
# placement/availability-zone — us-east-1a
# iam/security-credentials/  — temporary credentials from instance role
# ami-id                     — the AMI used to launch

# Get IAM role credentials from instance (for use in scripts)
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/iam/security-credentials/MyRoleName

# Enforce IMDSv2 only (disable IMDSv1) — do this on all instances
aws ec2 modify-instance-metadata-options \
  --instance-id i-1234567890abcdef0 \
  --http-tokens required \
  --http-put-response-hop-limit 1
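
The token-then-request pattern above can be wrapped in a small helper. This is a minimal sketch rather than an AWS-provided tool: the function name imds_get and the IMDS_BASE override variable are hypothetical, and the call only returns data when it runs on an EC2 instance.

```shell
# imds_get PATH: fetch one metadata path using IMDSv2.
# IMDS_BASE is a hypothetical override (handy for testing); it defaults to the real endpoint.
imds_get() (
  base="${IMDS_BASE:-http://169.254.169.254}"
  # Step 1: request a short-lived session token (PUT, TTL in seconds)
  token=$(curl -sf --max-time 2 -X PUT "$base/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 300") || exit 1
  # Step 2: send the token with the actual metadata request
  curl -sf --max-time 2 -H "X-aws-ec2-metadata-token: $token" \
    "$base/latest/meta-data/$1"
)
```

On an instance, `imds_get instance-id` or `imds_get placement/availability-zone` would print the value; anywhere else the function exits non-zero after the short timeout.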

User Data Scripts

# User data runs as root on first boot only (re-run on every boot with cloud-init config)
# Pass as a script when launching:
aws ec2 run-instances \
  --image-id ami-0c02fb55956c7d316 \
  --instance-type t3.micro \
  --key-name my-key \
  --user-data file://bootstrap.sh

# bootstrap.sh: typical web server setup (in the real file the shebang must be the very first line)
#!/bin/bash
set -e

# Log user data output; set this up first so everything below is captured
exec > >(tee /var/log/user-data.log | logger -t user-data -s 2>/dev/console) 2>&1

yum update -y
yum install -y nginx
systemctl enable nginx
systemctl start nginx

# Install Node.js 20 via nvm
# HOME is not always set when cloud-init runs user data, so set it explicitly
export HOME=/root
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
export NVM_DIR="/root/.nvm"
source "$NVM_DIR/nvm.sh"
nvm install 20
nvm use 20

# View user data logs on the instance
cat /var/log/cloud-init-output.log
tail -f /var/log/user-data.log

# View user data that was passed to an instance
aws ec2 describe-instance-attribute \
  --instance-id i-1234567890abcdef0 \
  --attribute userData \
  --query "UserData.Value" --output text | base64 --decode

Instance Types & AMIs

Choosing the right instance type and AMI strategy directly impacts cost, performance, and operational complexity. Understanding instance families, pricing models, and AMI lifecycle is essential for production deployments.

Instance Families

# Instance type naming: <family><generation>[<attributes>].<size>
# e.g. t3.medium, m6i.xlarge, c7g.2xlarge, r6a.4xlarge

# General Purpose
# t*  — burstable (t2/t3/t4g); cheapest, throttled when CPU credits exhaust; great for low-traffic apps
# m*  — balanced CPU/memory (m5/m6i/m7i); the workload baseline

# Compute Optimized
# c*  — high CPU-to-memory ratio (c5/c6i/c7g); web serving, batch, gaming servers

# Memory Optimized
# r*  — high memory (r5/r6i); databases, in-memory caches
# x*  — extreme memory (x1e/x2gd); SAP HANA, in-memory analytics
# z1d — high frequency + NVMe local storage

# Storage Optimized
# i*  — NVMe SSD instance storage (i3/i4i); high IOPS databases
# d*  — HDD instance storage; data warehouses, Hadoop

# Accelerated Computing
# g*  — NVIDIA GPU (g4dn/g5); ML inference, video encoding
# p*  — NVIDIA GPU high-end (p3/p4); ML training
# inf* — AWS Inferentia chips; cheapest ML inference

# Processor suffixes
# (none) = Intel Xeon (older generations, e.g. m5, c5)
# a      = AMD EPYC
# g      = AWS Graviton (ARM), often the best price/performance
# i      = Intel (Ice Lake in the 6th generation, e.g. m6i)

# Find instances by attribute using CLI
aws ec2 describe-instance-types \
  --filters "Name=memory-info.size-in-mib,Values=8192" \
  --query "InstanceTypes[*].{Type:InstanceType,vCPU:VCpuInfo.DefaultVCpus}" \
  --output table

Pricing Models

# On-Demand — pay per hour/second, no commitment; highest cost
# Use for: unpredictable workloads, dev/test, short-term

# Reserved Instances — 1 or 3 year commitment; up to 72% cheaper than on-demand
# Standard RI:    fixed instance type; biggest discount
# Convertible RI: can change instance family/OS; smaller discount (~54%)

# Savings Plans: commit to a $/hr spend for 1 or 3 years
# Compute SP: applies to any EC2, Lambda, Fargate; most flexible; up to 66% savings
# EC2 Instance SP: specific region+family commitment; cheaper than Compute SP, up to 72% savings

# Spot Instances: unused EC2 capacity; up to 90% cheaper; can be interrupted with a 2-minute notice
# Use for: batch jobs, stateless workers, CI/CD runners, fault-tolerant workloads

# Launch a Spot instance
aws ec2 run-instances \
  --instance-type c5.xlarge \
  --image-id ami-0c02fb55956c7d316 \
  --instance-market-options "MarketType=spot,SpotOptions={MaxPrice=0.05,SpotInstanceType=one-time}"

# Get current spot prices
aws ec2 describe-spot-price-history \
  --instance-types c5.xlarge \
  --product-descriptions "Linux/UNIX" \
  --start-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --query "SpotPriceHistory[*].{AZ:AvailabilityZone,Price:SpotPrice}" \
  --output table

AMI Creation & Management

# Create an AMI from a running instance (creates EBS snapshot)
aws ec2 create-image \
  --instance-id i-1234567890abcdef0 \
  --name "my-app-server-2024-01-15" \
  --description "Production app server with v2.3.1" \
  --no-reboot   # Skip the pre-snapshot reboot; filesystem integrity of the image is then not guaranteed

# Wait until AMI is available
aws ec2 wait image-available --image-ids ami-0abc123def456789

# List your own AMIs
aws ec2 describe-images --owners self \
  --query "Images[*].{ID:ImageId,Name:Name,State:State,Date:CreationDate}" \
  --output table

# Copy AMI to another region (for DR or multi-region deploy)
aws ec2 copy-image \
  --source-region us-east-1 \
  --source-image-id ami-0abc123def456789 \
  --name "my-app-server-us-west-2" \
  --region us-west-2

# Share AMI with another AWS account
aws ec2 modify-image-attribute \
  --image-id ami-0abc123def456789 \
  --launch-permission "Add=[{UserId=123456789012}]"

# Find latest Amazon Linux 2023 AMI in current region
aws ec2 describe-images \
  --owners amazon \
  --filters "Name=name,Values=al2023-ami-*" "Name=architecture,Values=x86_64" \
  --query "sort_by(Images, &CreationDate)[-1].ImageId" \
  --output text

# Deregister old AMI, then delete its snapshots to stop storage charges
# (a snapshot cannot be deleted while a registered AMI still references it)
aws ec2 deregister-image --image-id ami-oldimage123
aws ec2 delete-snapshot --snapshot-id snap-0abc123def456789

Security Groups & Networking

EC2 networking is built on VPC. Security groups are the primary firewall at the instance level. Understanding Elastic IPs, ENIs, and subnet placement is required for reliable, secure architectures.

Security Groups

# Security groups are STATEFUL firewalls — return traffic is automatically allowed
# All outbound traffic is allowed by default
# All inbound traffic is denied by default (you add explicit allow rules)
# Multiple SGs can be attached to one instance (rules are unioned)

# Create a security group
aws ec2 create-security-group \
  --group-name web-server-sg \
  --description "HTTP/HTTPS and SSH access" \
  --vpc-id vpc-0abc123def456789

# Add inbound rules
# Allow HTTP from anywhere
aws ec2 authorize-security-group-ingress \
  --group-id sg-0a1b2c3d4e5f67890 \
  --protocol tcp --port 80 --cidr 0.0.0.0/0

# Allow HTTPS from anywhere
aws ec2 authorize-security-group-ingress \
  --group-id sg-0a1b2c3d4e5f67890 \
  --protocol tcp --port 443 --cidr 0.0.0.0/0

# Allow SSH from your IP only
aws ec2 authorize-security-group-ingress \
  --group-id sg-0a1b2c3d4e5f67890 \
  --protocol tcp --port 22 --cidr "$(curl -s https://checkip.amazonaws.com)/32"

# Allow traffic from another security group (e.g. app tier to DB tier)
aws ec2 authorize-security-group-ingress \
  --group-id sg-db123 \
  --protocol tcp --port 5432 \
  --source-group sg-app456

# SG vs NACL:
# Security Group: instance-level, stateful, allow rules only
# NACL (Network ACL): subnet-level, stateless, allow + deny rules, processed in order by rule number
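
The difference in rule evaluation can be made concrete with a toy simulation. This is a sketch under simplifying assumptions: the "NUMBER ACTION PORT" rule format and the function names nacl_eval and sg_eval are made up for the demo, and matching is on port only (real rules also match protocol and CIDR).

```shell
# nacl_eval PORT: reads "NUMBER ACTION PORT" rules on stdin.
# The lowest-numbered matching rule wins; no match falls through to the implicit deny.
nacl_eval() {
  sort -n | awk -v port="$1" '$3 == port || $3 == "all" { print $2; found = 1; exit }
                              END { if (!found) print "deny" }'
}

# sg_eval PORT: reads allowed ports on stdin, one per line.
# Security groups only have allow rules, so any matching rule allows the traffic.
sg_eval() {
  awk -v port="$1" '$1 == port { found = 1 } END { print (found ? "allow" : "deny") }'
}

nacl_rules='200 allow 22
100 deny 22
300 allow 443'

printf '%s\n' "$nacl_rules" | nacl_eval 22    # rule 100 is checked before rule 200
printf '%s\n' "$nacl_rules" | nacl_eval 443
printf '22\n443\n' | sg_eval 80               # no allow rule for port 80
```

The first call prints deny even though an allow rule for port 22 exists, because the lower-numbered deny is evaluated first; a security group has no equivalent of that ordering.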

Elastic IP & ENI

# Elastic IP: static public IPv4 address allocated to your account
# Since February 2024 every public IPv4 address is billed hourly, whether or not it is associated with a running instance

# Allocate an Elastic IP
aws ec2 allocate-address --domain vpc

# Associate with an instance
aws ec2 associate-address \
  --instance-id i-1234567890abcdef0 \
  --allocation-id eipalloc-0abc123def456789

# Disassociate and release
aws ec2 disassociate-address --association-id eipassoc-0abc123def456789
aws ec2 release-address --allocation-id eipalloc-0abc123def456789

# ENI (Elastic Network Interface) — virtual NIC
# Each instance has a primary ENI; you can attach additional ones
# Secondary ENIs can be moved between instances (useful for failover)

# Create a secondary ENI in a specific subnet
aws ec2 create-network-interface \
  --subnet-id subnet-0bb1c79de3EXAMPLE \
  --groups sg-0a1b2c3d4e5f67890 \
  --description "secondary interface"

# Attach to instance
aws ec2 attach-network-interface \
  --network-interface-id eni-0abc123def456789 \
  --instance-id i-1234567890abcdef0 \
  --device-index 1

# Describe network interfaces
aws ec2 describe-network-interfaces \
  --filters "Name=attachment.instance-id,Values=i-1234567890abcdef0"

VPC Placement & NAT Gateway

# Public subnet: has route to Internet Gateway (IGW) — instances can have public IPs
# Private subnet: no route to IGW — instances need NAT Gateway for outbound internet

# Typical 3-tier VPC layout:
# Public subnets:  Load balancers, bastion hosts
# Private subnets: Application servers (EC2)
# Private subnets: Databases (RDS), caches (ElastiCache)

# Create a NAT Gateway in a public subnet (for private subnet outbound access)
aws ec2 create-nat-gateway \
  --subnet-id subnet-public-0abc123 \
  --allocation-id eipalloc-0abc123def456789   # Elastic IP for the NAT GW

# Update route table for private subnet to use NAT GW
aws ec2 create-route \
  --route-table-id rtb-0abc123def456789 \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id nat-0abc123def456789

# Check reachability from a private instance via NAT
# (from inside the instance)
curl -s https://checkip.amazonaws.com   # Should return NAT GW's EIP

# Placement Groups — control physical placement of instances
# cluster:  packed close together in one AZ (lowest latency, up to 10 Gbps per flow)
# spread:   different hardware (max HA, max 7 instances per AZ per group)
# partition: separate racks per partition (Hadoop, Cassandra, Kafka)
aws ec2 create-placement-group --group-name my-cluster --strategy cluster
aws ec2 run-instances --placement "GroupName=my-cluster" ...

Auto Scaling & Load Balancing

Auto Scaling Groups (ASG) combined with Application Load Balancers (ALB) are the standard pattern for elastic, highly available EC2 deployments. Understanding Launch Templates, scaling policies, and target groups is essential.

Launch Templates & Auto Scaling Groups

# Create a Launch Template (replaces Launch Configurations — use LT)
aws ec2 create-launch-template \
  --launch-template-name web-server-lt \
  --version-description "v1" \
  --launch-template-data '{
    "ImageId": "ami-0c02fb55956c7d316",
    "InstanceType": "t3.medium",
    "KeyName": "my-key",
    "SecurityGroupIds": ["sg-0a1b2c3d4e5f67890"],
    "UserData": "'$(base64 -w0 bootstrap.sh)'",
    "IamInstanceProfile": {"Name": "my-ec2-role"},
    "TagSpecifications": [{
      "ResourceType": "instance",
      "Tags": [{"Key": "Name", "Value": "web-server"}]
    }]
  }'

# Create Auto Scaling Group
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name web-asg \
  --launch-template "LaunchTemplateName=web-server-lt,Version=1" \
  --min-size 2 \
  --max-size 10 \
  --desired-capacity 3 \
  --vpc-zone-identifier "subnet-aaa111,subnet-bbb222,subnet-ccc333" \
  --target-group-arns arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web-tg/abc123 \
  --health-check-type ELB \
  --health-check-grace-period 300

# Update desired capacity manually
aws autoscaling set-desired-capacity \
  --auto-scaling-group-name web-asg \
  --desired-capacity 5

# Describe instances in ASG
aws autoscaling describe-auto-scaling-instances \
  --query "AutoScalingInstances[?AutoScalingGroupName=='web-asg']"

Scaling Policies

# Target Tracking — simplest, most common (like a thermostat)
# Automatically adds/removes instances to keep a metric at a target value
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name web-asg \
  --policy-name cpu-target-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 60.0,
    "ScaleInCooldown": 300,
    "ScaleOutCooldown": 60
  }'
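
To build intuition for what a target value means, a proportional estimate is useful. The sketch below is a back-of-envelope calculation only: the function name estimate_desired is hypothetical and the formula is an assumption, not the service's actual algorithm, which also accounts for cooldowns, warm-up, and the group's min/max size.

```shell
# estimate_desired CURRENT_INSTANCES CURRENT_METRIC TARGET
# Proportional estimate of the capacity needed to bring the metric back to TARGET, rounded up.
estimate_desired() {
  awk -v n="$1" -v m="$2" -v t="$3" 'BEGIN {
    d = n * m / t
    r = int(d); if (r < d) r++     # ceiling
    print r
  }'
}

estimate_desired 3 90 60   # 3 instances at 90% CPU, target 60%
estimate_desired 6 30 60   # 6 instances at 30% CPU, target 60%
```

With the 60% target from the policy above, three instances running at 90% CPU suggest about five instances, and six instances idling at 30% suggest about three.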

# Step Scaling — fine-grained control with CloudWatch alarms
# Scale out by 2 when CPU > 70%, by 4 when CPU > 85%

# Scheduled Scaling: predictable load patterns (cron expressions are in UTC unless --time-zone is given)
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name web-asg \
  --scheduled-action-name scale-up-morning \
  --recurrence "0 8 * * MON-FRI" \
  --desired-capacity 6 \
  --min-size 4

aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name web-asg \
  --scheduled-action-name scale-down-night \
  --recurrence "0 20 * * *" \
  --desired-capacity 2 \
  --min-size 2

# Instance refresh — rolling update of instances (e.g. after LT change)
aws autoscaling start-instance-refresh \
  --auto-scaling-group-name web-asg \
  --preferences "MinHealthyPercentage=90,InstanceWarmup=300"
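
The MinHealthyPercentage setting determines how many instances can be replaced at once. The arithmetic can be sketched as follows; the function name refresh_batch_size is hypothetical, and the rounding and the "at least one instance" rule are stated assumptions rather than documented behavior.

```shell
# refresh_batch_size DESIRED MIN_HEALTHY_PCT: how many instances can be out of service at once.
# Assumption: the group keeps ceil(DESIRED * PCT / 100) instances healthy, and at least
# one instance is always replaced so the refresh can make progress.
refresh_batch_size() {
  must_stay=$(( ($1 * $2 + 99) / 100 ))   # integer ceiling
  batch=$(( $1 - must_stay ))
  if [ "$batch" -lt 1 ]; then batch=1; fi
  echo "$batch"
}

refresh_batch_size 10 90    # the preferences used above, with 10 instances
refresh_batch_size 10 50
refresh_batch_size 3 100
```

Under these assumptions a 90% setting on a 10-instance group replaces one instance at a time, which is safe but slow; lowering the percentage trades spare capacity for a faster rollout.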

ALB, Target Groups & Blue/Green Deploy

# Load Balancer types:
# ALB (Application LB) — HTTP/HTTPS, path/host-based routing, WebSockets, gRPC
# NLB (Network LB)     — TCP/UDP, ultra-low latency, static IPs, millions of requests/sec
# CLB (Classic LB)     — legacy, do not use for new projects

# Create ALB
aws elbv2 create-load-balancer \
  --name web-alb \
  --subnets subnet-public-aaa111 subnet-public-bbb222 \
  --security-groups sg-alb123 \
  --scheme internet-facing \
  --type application

# Create Target Group
aws elbv2 create-target-group \
  --name web-tg \
  --protocol HTTP \
  --port 8080 \
  --vpc-id vpc-0abc123 \
  --health-check-path /health \
  --health-check-interval-seconds 15 \
  --healthy-threshold-count 2 \
  --unhealthy-threshold-count 3 \
  --target-type instance

# Create Listener with forward rule
aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:...:loadbalancer/app/web-alb/abc \
  --protocol HTTPS --port 443 \
  --certificates "CertificateArn=arn:aws:acm:us-east-1:123:certificate/abc" \
  --default-actions "Type=forward,TargetGroupArn=arn:aws:...:targetgroup/web-tg/abc"

# Blue/Green deployment with ASG:
# 1. Create green ASG with new Launch Template version
# 2. Attach green ASG to its own target group (green-tg) behind the same ALB
# 3. Wait for green instances to pass health checks
# 4. Shift traffic by updating listener rules (weighted target groups)
# 5. Once blue's weight is 0, detach the blue ASG from its target group, then terminate it

# Weighted target groups (canary / gradual shift)
aws elbv2 modify-listener \
  --listener-arn arn:aws:...:listener/app/web-alb/abc \
  --default-actions '[{
    "Type": "forward",
    "ForwardConfig": {
      "TargetGroups": [
        {"TargetGroupArn": "arn:...:targetgroup/blue-tg/111", "Weight": 80},
        {"TargetGroupArn": "arn:...:targetgroup/green-tg/222", "Weight": 20}
      ]
    }
  }]'
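
A gradual shift is a series of modify-listener calls with different weights, so it helps to generate the action document instead of editing it by hand. This is a minimal sketch: the helper name forward_config and the 10/50/100 rollout schedule are hypothetical, and the ARNs are the same elided placeholders used above.

```shell
# forward_config BLUE_ARN GREEN_ARN GREEN_WEIGHT: print a default-actions JSON document
# that sends GREEN_WEIGHT percent of traffic to green and the rest to blue.
forward_config() {
  blue_weight=$(( 100 - $3 ))
  printf '[{"Type":"forward","ForwardConfig":{"TargetGroups":[{"TargetGroupArn":"%s","Weight":%d},{"TargetGroupArn":"%s","Weight":%d}]}}]\n' \
    "$1" "$blue_weight" "$2" "$3"
}

# Hypothetical rollout schedule: 10% canary, then half, then everything
for green in 10 50 100; do
  forward_config "arn:...:targetgroup/blue-tg/111" "arn:...:targetgroup/green-tg/222" "$green"
done
```

Each printed document could be passed as the value of --default-actions, with a pause and a health check between steps.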

Instances, AMIs & Instance Types

EC2 (Elastic Compute Cloud) provides resizable virtual machines in the cloud. You choose the OS, CPU, memory, and storage. Unlike Lambda or Fargate, you manage the server.

AMIs (Amazon Machine Images)

An AMI is a template containing the OS, pre-installed software, and configuration used to launch an instance. AMIs are region-specific.

AWS-provided AMIs: Amazon Linux 2023, Ubuntu 22.04/24.04 LTS, Windows Server, RHEL, SUSE
AWS Marketplace: vendor AMIs (pre-configured Nginx, Bitnami stacks, SAP, etc.) — may have extra licensing cost
Custom AMIs: create from a running instance after installing your software — fastest way to launch identical instances
Community AMIs: public AMIs from the community — use with caution (verify publisher)

# Find latest Amazon Linux 2023 AMI
aws ssm get-parameter \
  --name /aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64 \
  --query Parameter.Value --output text

# Find Ubuntu 24.04 LTS
aws ec2 describe-images \
  --owners 099720109477 \
  --filters "Name=name,Values=ubuntu/images/hvm-ssd-gp3/ubuntu-noble-24.04-amd64-server-*" \
  --query 'sort_by(Images, &CreationDate)[-1].ImageId'

# Create AMI from running instance
aws ec2 create-image \
  --instance-id i-abc123 \
  --name "my-app-ami-v2" \
  --description "App server with Node.js 20 and nginx" \
  --no-reboot

Instance Type Naming

Format: <family><generation><attributes>.<size>

c7g.xlarge:
  c  = Compute-optimized family
  7  = 7th generation
  g  = Graviton (ARM) processor
  xlarge = size (4 vCPU, 8 GiB RAM)

Families:
  t  - Burstable general purpose (t3, t4g): dev/test, low baseline, burst CPU credits
  m  - General purpose (m6i, m7g): balanced CPU/RAM, most workloads
  c  - Compute optimized (c7g, c7i): CPU-intensive (ML inference, encoding, gaming)
  r  - Memory optimized (r7g, r7i): in-memory DBs, big data (up to 1.5 TiB RAM on r7i)
  x  - Memory extreme (x2idn, x2iedn): SAP HANA, huge in-memory (up to 2 TiB RAM on x2idn, 4 TiB on x2iedn)
  i  - Storage optimized (i4i): high IOPS NVMe SSDs, Cassandra, Redis
  g/p - GPU instances: ML training (p4), graphics rendering (g5)
  inf - AWS Inferentia: ML inference at low cost

Suffixes:
  a = AMD EPYC, g = Graviton (ARM), i = Intel
  d = local NVMe SSD, n = enhanced network, e = extra storage/RAM
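
The naming format can be checked mechanically. The following sketch splits a type name into its parts; the function name parse_instance_type is hypothetical, and the pattern only covers the common letters-digits-letters shape described above (names such as u-6tb1.metal would need extra handling).

```shell
# parse_instance_type TYPE: split an instance type name into its parts,
# following the <family><generation><attributes>.<size> format.
parse_instance_type() {
  name=${1%%.*}   # part before the dot, e.g. c7g
  size=${1#*.}    # part after the dot, e.g. xlarge
  family=$(printf '%s' "$name" | sed -E 's/^([a-z]+)[0-9]+.*$/\1/')
  generation=$(printf '%s' "$name" | sed -E 's/^[a-z]+([0-9]+).*$/\1/')
  attributes=$(printf '%s' "$name" | sed -E 's/^[a-z]+[0-9]+//')
  printf 'family=%s generation=%s attributes=%s size=%s\n' \
    "$family" "$generation" "${attributes:-none}" "$size"
}

parse_instance_type c7g.xlarge
parse_instance_type t3.micro
parse_instance_type x2idn.32xlarge
```

For c7g.xlarge this prints family=c generation=7 attributes=g size=xlarge, matching the breakdown given earlier.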

Launching Instances

# Launch an instance
aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type t3.medium \
  --key-name my-keypair \
  --security-group-ids sg-abc123 \
  --subnet-id subnet-private-1a \
  --iam-instance-profile Name=my-app-instance-profile \
  --block-device-mappings '[{
    "DeviceName": "/dev/xvda",
    "Ebs": {"VolumeSize": 30, "VolumeType": "gp3", "DeleteOnTermination": true}
  }]' \
  --user-data file://init.sh \
  --metadata-options "HttpTokens=required,HttpEndpoint=enabled" \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=app-server},{Key=Env,Value=prod}]' \
  --count 1

# Wait until running
aws ec2 wait instance-running --instance-ids i-abc123

User Data (Cloud-Init)

#!/bin/bash
# user-data script: runs as root on first boot
set -euo pipefail

# Signal CloudFormation or ASG when init ends. With set -e a failing command stops the
# script early, so a trap is what lets a non-zero exit status reach cfn-signal.
trap '/opt/aws/bin/cfn-signal -e $? --stack my-stack --resource MyASG --region eu-west-1' EXIT

# Update and install
dnf update -y
dnf install -y nginx git

# Install Node.js via nvm
# HOME is not always set under cloud-init, and nvm.sh can trip over set -u
export HOME=/root
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
set +u
source /root/.nvm/nvm.sh
nvm install 20
nvm use 20
set -u

# Start nginx
systemctl enable --now nginx

Instance Metadata Service (IMDSv2)

# IMDSv2: token-based (required, more secure than v1)
# Get token
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

# Use token to query metadata
curl -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/instance-id

curl -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/iam/security-credentials/my-role

# Get region from metadata
REGION=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/placement/region)

Networking: VPC, Security Groups & Elastic IPs

VPC & Subnet Architecture

VPC: your private network in AWS, spans all AZs in a region, CIDR block e.g. 10.0.0.0/16
Public subnet: has route to Internet Gateway, instances can have public IPs
Private subnet: no IGW route, use NAT Gateway for outbound internet, no inbound from internet
Default VPC: exists in every account/region, all subnets are public — do not use for production
Best practice: load balancers (and bastion hosts, if any) in public subnets; app and DB servers in private subnets
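
When carving a VPC CIDR into subnets, the number of addresses actually available to instances is slightly smaller than the block size, because AWS reserves five addresses in every subnet. A small sketch of the arithmetic (the function name usable_ips is hypothetical):

```shell
# usable_ips PREFIX_LENGTH: addresses available to instances in an IPv4 subnet of that size.
# AWS reserves 5 addresses in every subnet (the first four and the last one).
usable_ips() {
  total=$(( 1 << (32 - $1) ))
  echo $(( total - 5 ))
}

usable_ips 24   # e.g. 10.0.1.0/24
usable_ips 28   # smallest subnet size a VPC allows
usable_ips 16   # largest
```

A /24 therefore holds 251 instances' worth of addresses rather than 256, and a /28 only 11, which matters when sizing subnets for Auto Scaling groups or load balancers.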

# Inspect a subnet (MapPublicIpOnLaunch only says whether new instances get a public IP automatically)
aws ec2 describe-subnets --subnet-ids subnet-abc123 \
  --query 'Subnets[0].{CIDR:CidrBlock,AutoPublicIp:MapPublicIpOnLaunch,AZ:AvailabilityZone}'

# The route table decides public vs private: look for a 0.0.0.0/0 route whose target is igw-*
# (an empty result means the subnet uses the VPC's main route table)
aws ec2 describe-route-tables --filters Name=association.subnet-id,Values=subnet-abc123

Security Groups

Security groups are stateful firewalls at the instance level. Inbound and outbound rules are separate. Return traffic is automatically allowed. Rules specify port, protocol, and source (IP or another SG).

# Create security group
aws ec2 create-security-group \
  --group-name app-server-sg \
  --description "App server security group" \
  --vpc-id vpc-abc123

# Allow HTTPS from ALB security group
aws ec2 authorize-security-group-ingress \
  --group-id sg-appserver \
  --protocol tcp --port 443 \
  --source-group sg-alb   # Only from ALB, not the internet

# Allow SSH from bastion only
aws ec2 authorize-security-group-ingress \
  --group-id sg-appserver \
  --protocol tcp --port 22 \
  --source-group sg-bastion

# Allow PostgreSQL from app servers to DB
aws ec2 authorize-security-group-ingress \
  --group-id sg-database \
  --protocol tcp --port 5432 \
  --source-group sg-appserver

# View security group rules
aws ec2 describe-security-groups --group-ids sg-appserver \
  --query 'SecurityGroups[0].IpPermissions'

Elastic IPs

# Allocate Elastic IP
aws ec2 allocate-address --domain vpc
# Returns AllocationId and PublicIp

# Associate with instance
aws ec2 associate-address \
  --instance-id i-abc123 \
  --allocation-id eipalloc-abc123

# Disassociate (keep the EIP for later)
aws ec2 disassociate-address --association-id eipassoc-abc123

# Release EIP (stop being charged)
aws ec2 release-address --allocation-id eipalloc-abc123

# Note: since February 2024 all public IPv4 addresses, including EIPs attached to running instances, are billed hourly; release the ones you do not need

Session Manager (SSH Alternative)

SSM Session Manager provides browser-based or CLI shell access to instances without opening port 22, managing key pairs, or running bastion hosts. It requires the SSM Agent (pre-installed on Amazon Linux 2023 and Ubuntu AMIs), an instance profile with SSM permissions, and, for CLI use, the Session Manager plugin on your machine.

# Start session (no SSH required)
aws ssm start-session --target i-abc123

# Port forwarding via SSM (tunnel to RDS without VPN)
aws ssm start-session \
  --target i-abc123 \
  --document-name AWS-StartPortForwardingSessionToRemoteHost \
  --parameters '{"portNumber":["5432"],"localPortNumber":["5432"],"host":["rds-endpoint.eu-west-1.rds.amazonaws.com"]}'
# Now connect: psql -h localhost -p 5432 -U admin mydb

# Required IAM policy for instance profile
# AmazonSSMManagedInstanceCore (AWS managed policy)

Enhanced Networking & Placement Groups

Enhanced Networking (ENA): up to 100 Gbps, enabled by default on modern instance types
Placement Group - Cluster: instances in same AZ, low latency network — HPC, tightly coupled apps
Placement Group - Spread: instances on different hardware — max 7 per AZ, critical apps
Placement Group - Partition: groups of instances on separate hardware — Hadoop, Kafka, Cassandra

# Create cluster placement group
aws ec2 create-placement-group \
  --group-name my-hpc-cluster \
  --strategy cluster

# Launch into placement group
aws ec2 run-instances \
  --placement GroupName=my-hpc-cluster \
  ... # other params

Auto Scaling, Load Balancing & Cost Optimization

Launch Templates

# Create launch template (replaces launch configurations)
aws ec2 create-launch-template \
  --launch-template-name my-app-lt \
  --version-description "v1" \
  --launch-template-data '{
    "ImageId": "ami-abc123",
    "InstanceType": "t3.medium",
    "IamInstanceProfile": {"Name": "my-app-role"},
    "SecurityGroupIds": ["sg-appserver"],
    "UserData": "'$(base64 -w 0 init.sh)'",
    "MetadataOptions": {"HttpTokens": "required"},
    "TagSpecifications": [{
      "ResourceType": "instance",
      "Tags": [{"Key": "Env", "Value": "prod"}]
    }]
  }'
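
Launch template data expects UserData that is already base64-encoded, whereas run-instances with --user-data file://... lets the CLI do the encoding. A local round trip is a quick way to confirm the encoded value is what you expect before creating the template. This is a sketch: the helper name encode_user_data and the sample script contents are illustrative.

```shell
# encode_user_data FILE: base64-encode a script on a single line, as launch template data expects.
# tr strips the line wrapping that some base64 implementations add by default.
encode_user_data() {
  base64 < "$1" | tr -d '\n'
}

# Illustrative script; any user data file works the same way
tmp=$(mktemp)
printf '#!/bin/bash\necho hello from user data\n' > "$tmp"

encoded=$(encode_user_data "$tmp")

# Decoding should give back the original script. The same decode step applies to the
# value returned by describe-instance-attribute --attribute userData.
decoded=$(printf '%s' "$encoded" | base64 --decode)
printf '%s\n' "$decoded"
rm -f "$tmp"
```

Piping through tr is a portable alternative to the GNU-only -w 0 flag used in the template command.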

Auto Scaling Groups (ASG)

# Create ASG
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name my-app-asg \
  --launch-template LaunchTemplateName=my-app-lt,Version='$Latest' \
  --min-size 2 \
  --max-size 10 \
  --desired-capacity 3 \
  --vpc-zone-identifier "subnet-private-1a,subnet-private-1b,subnet-private-1c" \
  --target-group-arns arn:aws:elasticloadbalancing:...:targetgroup/my-tg/abc \
  --health-check-type ELB \
  --health-check-grace-period 120

# Target tracking scaling — keeps CPU at 50%
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name my-app-asg \
  --policy-name cpu-target-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
    "TargetValue": 50.0,
    "ScaleInCooldown": 300,
    "ScaleOutCooldown": 60
  }'

# Scheduled scaling (e.g., scale up before business hours)
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name my-app-asg \
  --scheduled-action-name morning-scale-up \
  --recurrence "0 7 * * MON-FRI" \
  --desired-capacity 6 \
  --min-size 4

Application Load Balancer (ALB)

# Create ALB
aws elbv2 create-load-balancer \
  --name my-app-alb \
  --subnets subnet-public-1a subnet-public-1b \
  --security-groups sg-alb \
  --scheme internet-facing \
  --type application

# Create target group
aws elbv2 create-target-group \
  --name my-app-tg \
  --protocol HTTP --port 3000 \
  --vpc-id vpc-abc123 \
  --target-type instance \
  --health-check-path /health \
  --health-check-interval-seconds 30 \
  --healthy-threshold-count 2 \
  --unhealthy-threshold-count 3

# Create HTTPS listener with routing rules
aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:...:loadbalancer/app/my-app-alb/abc \
  --protocol HTTPS --port 443 \
  --certificates CertificateArn=arn:aws:acm:...:certificate/xyz \
  --default-actions Type=forward,TargetGroupArn=arn:...:targetgroup/my-app-tg/abc

Spot Instances

Spot Instances use spare EC2 capacity at up to 90% discount. AWS can reclaim them with 2-minute warning. Best for fault-tolerant workloads: batch jobs, CI/CD workers, ML training, video encoding.

# Launch Spot instance (c7g is Graviton, so the AMI must be an arm64 image)
aws ec2 run-instances \
  --image-id ami-abc123 \
  --instance-type c7g.xlarge \
  --instance-market-options '{"MarketType":"spot","SpotOptions":{"MaxPrice":"0.05","SpotInstanceType":"one-time"}}'

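# Spot Fleet: mix of instance types for resilience
# (IamFleetRole is required; the ARN below is a placeholder)
aws ec2 request-spot-fleet --spot-fleet-request-config '{
  "IamFleetRole": "arn:aws:iam::123456789012:role/aws-ec2-spot-fleet-tagging-role",
  "TargetCapacity": 10,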
  "AllocationStrategy": "priceCapacityOptimized",
  "LaunchTemplateConfigs": [{
    "LaunchTemplateSpecification": {"LaunchTemplateName": "my-app-lt", "Version": "$Latest"},
    "Overrides": [
      {"InstanceType": "c7g.xlarge"},
      {"InstanceType": "c6g.xlarge"},
      {"InstanceType": "m7g.large"}
    ]
  }]
}'

Pricing Models

Model              | Discount  | Commitment  | Use Case
-------------------|-----------|-------------|------------------------------------
On-Demand          | 0%        | None        | Short-term, unpredictable workloads
Spot               | up to 90% | None        | Fault-tolerant batch/stateless
Reserved (1yr)     | ~40%      | 1 year      | Steady-state production servers
Reserved (3yr)     | ~60%      | 3 years     | Long-running stable workloads
Savings Plans      | up to 72% | 1 or 3 yr   | Flexible (Compute SP covers any instance family)
Dedicated Host     | varies    | 1 or 3 yr   | License compliance (Oracle/Windows)
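
To see what these discounts mean in money, the figures from the table can be applied to an hourly price. This is a sketch under stated assumptions: the $0.10 per hour On-Demand price is hypothetical (not a real quote), 730 hours approximates one month, and the function name monthly_cost is made up for the example.

```shell
# monthly_cost HOURLY_ON_DEMAND DISCOUNT_PCT: cost of one instance running 730 hours (about a month)
monthly_cost() {
  awk -v p="$1" -v d="$2" 'BEGIN { printf "%.2f\n", p * 730 * (1 - d / 100) }'
}

price=0.10   # hypothetical On-Demand price in USD per hour, not a real quote
echo "On-Demand:      $(monthly_cost "$price" 0)"
echo "Reserved 1yr:   $(monthly_cost "$price" 40)"
echo "Reserved 3yr:   $(monthly_cost "$price" 60)"
echo "Spot (at best): $(monthly_cost "$price" 90)"
```

The Spot line uses the maximum discount; actual Spot prices vary by instance type, Availability Zone, and time.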

Storage, Monitoring & Interview Questions

EBS (Elastic Block Store)

EBS is network-attached block storage for EC2. Persists independently of instance lifecycle (unless DeleteOnTermination is set). Tied to one AZ. Can be detached and reattached to another instance in the same AZ.

Volume Type       | Max IOPS  | Max Throughput | Use Case
------------------|-----------|----------------|----------------------------------
gp3               | 16,000    | 1,000 MB/s     | General purpose, the default choice
gp2               | 16,000    | 250 MB/s       | Legacy, prefer gp3
io2 Block Express | 256,000   | 4,000 MB/s     | Latency-critical DBs (Oracle, SQL)
st1 (HDD)         | 500       | 500 MB/s       | Big data, sequential reads
sc1 (HDD)         | 250       | 250 MB/s       | Archive, infrequent access (cheapest)
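
For gp3, IOPS and throughput are provisioned separately from size, and the combinations are constrained. The sketch below checks a requested configuration before calling the API. The limits are assumptions to verify against current documentation: the caps from the table above, at most 500 IOPS per GiB, and at most 0.25 MB/s of throughput per provisioned IOPS. The function name check_gp3 is hypothetical.

```shell
# check_gp3 SIZE_GIB IOPS THROUGHPUT_MBPS: print "ok" or the first limit that is exceeded.
check_gp3() {
  awk -v size="$1" -v iops="$2" -v tput="$3" 'BEGIN {
    if (iops > 16000)       { print "iops above volume maximum"; exit }
    if (iops > size * 500)  { print "iops too high for volume size"; exit }
    if (tput > 1000)        { print "throughput above volume maximum"; exit }
    if (tput > iops * 0.25) { print "throughput too high for iops"; exit }
    print "ok"
  }'
}

check_gp3 100 4000 250    # 100 GiB, 4000 IOPS, 250 MB/s
check_gp3 10 6000 125     # too many IOPS for a 10 GiB volume
check_gp3 100 3000 1000   # throughput needs more provisioned IOPS
```

The first configuration passes; the second exceeds the per-GiB IOPS ratio and the third exceeds the throughput-to-IOPS ratio.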

# Create and attach EBS volume
aws ec2 create-volume \
  --availability-zone eu-west-1a \
  --volume-type gp3 \
  --size 100 \
  --iops 4000 \
  --throughput 250 \
  --encrypted \
  --tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=app-data}]'

aws ec2 attach-volume \
  --volume-id vol-abc123 \
  --instance-id i-abc123 \
  --device /dev/xvdf

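# Format and mount on Linux (on Nitro instances the device appears as /dev/nvme1n1 or similar; check with lsblk)
sudo mkfs.ext4 /dev/xvdf    # new, empty volumes only: this erases existing data
sudo mkdir -p /data
sudo mount /dev/xvdf /data
# Add to /etc/fstab for persistence across reboots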

# Create snapshot for backup
aws ec2 create-snapshot \
  --volume-id vol-abc123 \
  --description "Before migration snapshot"

# Resize volume (online, no downtime; not specific to gp3)
aws ec2 modify-volume --volume-id vol-abc123 --size 200
# Then extend the filesystem: sudo resize2fs /dev/xvdf (ext4) or sudo xfs_growfs -d /data (XFS)

Instance Store

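Disks physically attached to the host (NVMe SSD on most types, HDD on d3), NOT network-attached like EBS
Extremely fast (millions of IOPS, GB/s throughput)
Data is lost when the instance stops, hibernates, or terminates, or when the underlying disk fails; it survives a reboot. Ephemeral by design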
Cannot be detached or snapshotted
Use for: temporary data, cache, scratch space, buffers
Available on: i4i, i3, d3, c5d, m5d, r5d instance types

EFS (Elastic File System)

Fully managed NFS shared across multiple EC2 instances and AZs
Scales automatically, pay per GB used (not provisioned)
Mount targets in each AZ — instances mount via DNS name
Use cases: shared config, content management, ML training data shared across nodes
EFS Standard vs EFS-IA (Infrequent Access): lifecycle policies move files automatically
Performance modes: General Purpose (default) vs Max I/O (high concurrency)

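# Mount EFS (amazon-efs-utils is in the Amazon Linux repos; on Ubuntu/Debian build it from the aws/efs-utils source)
sudo dnf install -y amazon-efs-utils
sudo mkdir -p /mnt/efs
sudo mount -t efs -o tls fs-abc123:/ /mnt/efs

# Or via /etc/fstab (tee -a, because a plain >> redirect would not run with root privileges)
echo "fs-abc123:/ /mnt/efs efs _netdev,tls 0 0" | sudo tee -a /etc/fstab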

CloudWatch & Systems Manager

# CloudWatch Agent for memory/disk metrics (not available by default)
# Install on Amazon Linux 2023:
sudo dnf install -y amazon-cloudwatch-agent

# Minimal config for memory and disk (written via sudo tee because the directory is root-owned)
sudo tee /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json > /dev/null << 'EOF'
{
  "metrics": {
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "mem": {"measurement": ["mem_used_percent"],"metrics_collection_interval": 60},
      "disk": {"measurement": ["disk_used_percent"],"resources": ["/"],"metrics_collection_interval": 60}
    }
  }
}
EOF
sudo systemctl enable --now amazon-cloudwatch-agent

# SSM Run Command — run commands on instances without SSH
aws ssm send-command \
  --instance-ids i-abc123 i-def456 \
  --document-name AWS-RunShellScript \
  --parameters commands='["sudo systemctl restart nginx"]'

Interview Questions

EC2 vs Lambda vs ECS vs EKS? EC2: full control, long-running, stateful. Lambda: event-driven, short tasks, no server management. ECS/EKS: containerized services, better for microservices.
What happens when you stop vs terminate? Stop: instance halted, EBS volumes and Elastic IP preserved; no compute charges, but storage and public IPv4 addresses are still billed. Terminate: instance deleted, EBS volumes with DeleteOnTermination=true deleted, instance store data gone.
How to troubleshoot a failing SSH connection? Timeout: check that the security group and NACL allow port 22 from your IP and that the instance has a public IP and a route. Connection refused: check the instance is running and sshd is up (console output). Permission denied: check the key pair and user name. If using Session Manager instead, verify the instance profile has SSM permissions.
What are placement groups for? Cluster: lowest latency between instances (HPC). Spread: separate hardware to reduce correlated failures. Partition: groups on separate hardware for distributed systems.
How does Auto Scaling decide when and what to scale in? A scaling policy triggers scale-in when the metric stays below its target; cooldown periods, scale-in protection, and the ALB deregistration delay control the pace. The termination policy then picks the instance (default: balance across AZs first, then oldest launch template or configuration, then closest to the next billing hour).
Reserved vs Savings Plans? Reserved: specific instance family/size/region. Savings Plans: commit to $/hour compute spend, applies across instance types, Lambda, and Fargate — more flexible.
