One of the most powerful capabilities of cloud computing is the ability to automatically add or remove resources based on demand. AWS Auto Scaling does exactly that — it monitors your applications and automatically adjusts capacity to maintain steady, predictable performance at the lowest possible cost. Instead of over-provisioning servers to handle peak load, you scale out when traffic spikes and scale in when it drops, paying only for what you actually use.
Types of Auto Scaling in AWS
- EC2 Auto Scaling — Automatically adds or removes EC2 instances in an Auto Scaling Group (ASG)
- Application Auto Scaling — Scales other AWS resources: ECS tasks, DynamoDB tables, Lambda concurrency, Aurora replicas, and more
- AWS Auto Scaling (the service) — A unified interface to manage scaling across multiple services
EC2 Auto Scaling Groups (ASG)
An Auto Scaling Group is a collection of EC2 instances treated as a logical unit for scaling and management. You define:
- Launch Template — The configuration used to launch new instances (AMI, instance type, key pair, security groups, IAM role)
- Minimum capacity — The fewest instances that must always be running
- Maximum capacity — The hard ceiling on instance count
- Desired capacity — The current target number of instances
# Create a Launch Template
# (the AMI, key pair, security group, and instance profile are placeholders)
aws ec2 create-launch-template \
  --launch-template-name my-web-server \
  --version-description "v1" \
  --launch-template-data '{
    "ImageId": "ami-0c94855ba95c71c99",
    "InstanceType": "t3.micro",
    "KeyName": "my-key-pair",
    "SecurityGroupIds": ["sg-0abc12345def67890"],
    "IamInstanceProfile": {"Name": "MyEC2WebRole"}
  }'
# Create an Auto Scaling Group
# (replace the subnet IDs and ACCOUNT with your own values)
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name my-web-asg \
  --launch-template "LaunchTemplateName=my-web-server,Version=1" \
  --min-size 2 \
  --max-size 10 \
  --desired-capacity 3 \
  --vpc-zone-identifier "subnet-aaa,subnet-bbb" \
  --target-group-arns arn:aws:elasticloadbalancing:us-east-1:ACCOUNT:targetgroup/my-tg/abc
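The relationship between minimum, maximum, and desired capacity can be sketched in a few lines of shell. This is an illustration only: `clamp_capacity` is a hypothetical helper, not an AWS CLI command, and it models just one rule, namely that a scaling policy cannot move the group outside its minimum and maximum size.

```shell
# Hypothetical helper (not part of the AWS CLI): bounds a requested
# capacity by the group's minimum and maximum size.
clamp_capacity() {
  requested=$1; min=$2; max=$3
  if [ "$requested" -lt "$min" ]; then
    echo "$min"
  elif [ "$requested" -gt "$max" ]; then
    echo "$max"
  else
    echo "$requested"
  fi
}

# Using the bounds from the group above (min 2, max 10):
clamp_capacity 12 2 10   # a policy asks for 12 instances
clamp_capacity 1 2 10    # a policy asks for 1 instance
clamp_capacity 3 2 10    # already within bounds
```

With these inputs the three calls print 10, 2, and 3.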
Scaling Policies
Scaling policies define when and how to scale. AWS offers three main types:
1. Target Tracking Scaling (Recommended)
You set a target metric value (e.g. keep average CPU at 50%), and AWS automatically adds or removes instances to maintain that target. This is the simplest and most effective policy for most use cases:
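# Note: ScaleInCooldown and ScaleOutCooldown are Application Auto Scaling
# settings and are not accepted in an EC2 Auto Scaling target tracking
# configuration. EC2 Auto Scaling uses the instance warmup time instead.
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name my-web-asg \
  --policy-name cpu-target-tracking \
  --policy-type TargetTrackingScaling \
  --estimated-instance-warmup 60 \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 50.0
  }'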
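To build intuition for how a target value translates into instance counts, here is a rough sketch. It assumes a simplified proportional model (capacity grows in proportion to how far the metric sits above its target, rounded up). This is an assumption for illustration, not AWS's exact algorithm, and real scale-in behaviour is typically more conservative. `estimate_capacity` is a hypothetical helper that works on integer percentages.

```shell
# Simplified model (an assumption, not AWS's exact algorithm): capacity
# scales in proportion to metric / target, rounded up.
estimate_capacity() {
  current=$1; metric=$2; target=$3
  # Integer ceiling of current * metric / target
  echo $(( (current * metric + target - 1) / target ))
}

estimate_capacity 3 80 50   # 3 instances at 80% CPU, target 50%
estimate_capacity 6 25 50   # 6 instances at 25% CPU, target 50%
```

Under this model the first call prints 5 (scale out from 3) and the second prints 3 (scale in from 6).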
2. Step Scaling
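Triggers scaling actions whose size depends on how far the metric has breached its alarm threshold. For example, add 1 instance when CPU is 60–80%, and add 3 instances when CPU exceeds 80%. This gives finer control than target tracking but takes more effort to configure, because you define the CloudWatch alarm and the step boundaries yourself.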
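The example thresholds above can be sketched as follows. `step_adjustment` is a hypothetical helper that mirrors the mapping locally, and `create_step_policy` shows one way the same steps might be expressed with `put-scaling-policy`. The policy name and the 60% alarm threshold are assumptions, and a CloudWatch alarm still has to be created separately to invoke the policy.

```shell
# Hypothetical helper mirroring the example thresholds. Assumes integer CPU
# percentages; a step's lower bound is inclusive and its upper bound is
# exclusive, so exactly 80% falls into the larger step.
step_adjustment() {
  cpu=$1
  if [ "$cpu" -ge 80 ]; then
    echo 3
  elif [ "$cpu" -ge 60 ]; then
    echo 1
  else
    echo 0
  fi
}

step_adjustment 55
step_adjustment 70
step_adjustment 90

# The same steps as a step scaling policy. Bounds are offsets from the alarm
# threshold (assumed to be 60% CPU). Wrapped in a function so nothing is sent
# to AWS until you call it.
create_step_policy() {
  aws autoscaling put-scaling-policy \
    --auto-scaling-group-name my-web-asg \
    --policy-name cpu-step-scaling \
    --policy-type StepScaling \
    --adjustment-type ChangeInCapacity \
    --metric-aggregation-type Average \
    --step-adjustments \
      MetricIntervalLowerBound=0,MetricIntervalUpperBound=20,ScalingAdjustment=1 \
      MetricIntervalLowerBound=20,ScalingAdjustment=3
}
```

The three sample calls print 0, 1, and 3.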
3. Scheduled Scaling
Scale proactively based on known traffic patterns. If you know your application gets heavy traffic every weekday morning, pre-scale before the traffic arrives:
# Scale out every weekday morning at 7 AM UTC
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name my-web-asg \
  --scheduled-action-name scale-up-morning \
  --recurrence "0 7 * * MON-FRI" \
  --desired-capacity 6 \
  --min-size 4

# Scale in every weekday evening at 8 PM UTC
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name my-web-asg \
  --scheduled-action-name scale-down-evening \
  --recurrence "0 20 * * MON-FRI" \
  --desired-capacity 2 \
  --min-size 2
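It can help to reason about which capacity the two schedules leave in effect at a given moment, weekends included. The sketch below is a local model only. `scheduled_capacity` is a hypothetical helper, and it assumes both actions have already run at least once and that no other scaling policy has changed the desired capacity in the meantime.

```shell
# Hypothetical helper: desired capacity left in effect by the two scheduled
# actions above. dow: 1=Monday ... 7=Sunday; hour: 0-23 (UTC).
scheduled_capacity() {
  dow=$1; hour=$2
  if [ "$dow" -ge 1 ] && [ "$dow" -le 5 ] && [ "$hour" -ge 7 ] && [ "$hour" -lt 20 ]; then
    echo 6   # between the weekday 07:00 and 20:00 actions
  else
    echo 2   # nights and weekends keep the value set by the 20:00 action
  fi
}

scheduled_capacity 3 9    # Wednesday 09:00 UTC
scheduled_capacity 3 21   # Wednesday 21:00 UTC
scheduled_capacity 6 9    # Saturday 09:00 UTC
```

The calls print 6, 2, and 2. The Saturday result follows from Friday's 8 PM action being the last one to run before the weekend.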
Attach a Load Balancer
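Auto Scaling Groups integrate with Application Load Balancers (ALB). New instances are registered with the target group automatically, and instances being removed are deregistered before termination so that in-flight requests can drain. This helps avoid dropped requests during scaling. The group created earlier already passes --target-group-arns; for an existing group without one, attach the target group afterwards: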
# Attach an ALB target group to the ASG
aws autoscaling attach-load-balancer-target-groups \
  --auto-scaling-group-name my-web-asg \
  --target-group-arns arn:aws:elasticloadbalancing:us-east-1:ACCOUNT:targetgroup/my-tg/abc
Health Checks
Auto Scaling monitors instance health via EC2 status checks (default) or ELB health checks (recommended when using a load balancer). If an instance fails health checks, it is terminated and replaced automatically. Enable ELB health checks:
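# Use the load balancer's health checks, and give new instances
# 120 seconds to start before a failed check counts against them
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-web-asg \
  --health-check-type ELB \
  --health-check-grace-period 120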
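The grace period gives a new instance time to boot before failed health checks count against it. A minimal sketch of that decision follows, assuming a simplified model in which only the reported status and the time in service matter. `should_replace` is a hypothetical helper, not an AWS command.

```shell
# Hypothetical helper: is an instance eligible for replacement?
# status: "healthy" or "unhealthy"
# in_service_secs: seconds since the instance entered service
# grace: health check grace period in seconds
should_replace() {
  status=$1; in_service_secs=$2; grace=$3
  if [ "$status" = "unhealthy" ] && [ "$in_service_secs" -ge "$grace" ]; then
    echo yes
  else
    echo no
  fi
}

should_replace unhealthy 45 120    # still inside the grace period
should_replace unhealthy 300 120   # grace period has passed
should_replace healthy 300 120     # healthy instances are left alone
```

These calls print no, yes, and no. A grace period that is too short for your application's startup time can cause instances to be replaced before they ever become healthy.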
Cooldown Periods
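Cooldowns prevent Auto Scaling from launching or terminating additional instances before the previous scaling activity has taken effect. In EC2 Auto Scaling, the default cooldown (300 seconds) applies to simple scaling policies; target tracking and step scaling policies rely on the instance warmup time instead. Application Auto Scaling target tracking policies expose separate ScaleOutCooldown and ScaleInCooldown settings. Where the two directions can be tuned separately, keep the scale-out cooldown short (for example, 60 seconds) to respond quickly to spikes, and the scale-in cooldown longer (for example, 300 seconds) to avoid premature termination.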
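The cooldown rule itself is a simple time comparison, sketched below. `cooldown_elapsed` is a hypothetical helper, and the timestamps are arbitrary values in seconds chosen for illustration.

```shell
# Hypothetical helper: may a new scaling activity start?
# now and last_activity are timestamps in seconds; cooldown is in seconds.
cooldown_elapsed() {
  now=$1; last_activity=$2; cooldown=$3
  if [ $(( now - last_activity )) -ge "$cooldown" ]; then
    echo yes
  else
    echo no
  fi
}

cooldown_elapsed 1000 900 300   # 100s since last activity, 300s cooldown
cooldown_elapsed 1000 900 60    # 100s since last activity, 60s cooldown
```

The first call prints no and the second prints yes, which is why a short scale-out cooldown lets the group react to a sustained spike sooner.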
Warm Pools (Reduce Scale-Out Latency)
For applications with slow startup times, use Warm Pools — a pool of pre-initialized, stopped instances ready to be started quickly when needed. This dramatically reduces the time from scaling trigger to serving traffic.
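A warm pool can be attached to an existing group with `aws autoscaling put-warm-pool`. The sketch below wraps that call in a hypothetical `put_warm_pool` function that only prints the command by default (set `DRY_RUN=0` to execute it). The group name and the pool size of 2 are assumptions for illustration.

```shell
# Hypothetical wrapper around `aws autoscaling put-warm-pool`.
# Prints the command unless DRY_RUN=0 is set in the environment.
put_warm_pool() {
  # $1 = Auto Scaling group name, $2 = minimum instances kept in the pool
  set -- aws autoscaling put-warm-pool \
    --auto-scaling-group-name "$1" \
    --pool-state Stopped \
    --min-size "$2"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

put_warm_pool my-web-asg 2
```

Keeping instances in the Stopped state means you pay for their attached storage but not for running compute while they wait in the pool.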
Summary
AWS Auto Scaling keeps compute capacity in line with demand, adding instances when load rises and removing them when it falls. Target Tracking policies are the right default for most web applications. Pair your ASG with an Application Load Balancer and ELB health checks, and the group will absorb most traffic spikes and replace failed instances without manual intervention.