AWS Auto Scaling Groups

An Auto Scaling Group (ASG) is the mechanism that turns a static EC2 fleet into an elastic, self-healing one. ASGs launch and terminate instances on demand, replace unhealthy ones, and distribute them across Availability Zones (AZs). Combined with a load balancer and a launch template, an ASG forms the canonical “web tier” on AWS.

What an ASG is

An ASG is a controller that maintains a defined number of running EC2 instances matching a template, across a set of subnets (AZs). You give it:

What to launch → a Launch Template (or the older Launch Configuration)
Where to launch → subnets (each in one AZ)
How many → min / desired / max capacity
How to decide → scaling policies
How to check → health check config

And the ASG continually reconciles: if an instance dies, launch a replacement. If load grows, scale out. If load shrinks, scale in.
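The inputs listed above map fairly directly onto the parameters of the CreateAutoScalingGroup API. The following is a minimal sketch that only assembles keyword arguments shaped like those of boto3's `create_auto_scaling_group`; the helper name, group name, template name, and subnet IDs are placeholders, and scaling policies would be attached in a separate call.

```python
def build_asg_request(name, template_name, subnet_ids, min_size, desired, max_size):
    """Assemble keyword arguments shaped like boto3's create_auto_scaling_group."""
    if not (min_size <= desired <= max_size):
        raise ValueError("expected min <= desired <= max")
    return {
        "AutoScalingGroupName": name,
        # What to launch
        "LaunchTemplate": {"LaunchTemplateName": template_name, "Version": "$Default"},
        # Where to launch: comma-separated subnet IDs, each in one AZ
        "VPCZoneIdentifier": ",".join(subnet_ids),
        # How many
        "MinSize": min_size,
        "DesiredCapacity": desired,
        "MaxSize": max_size,
        # How to check
        "HealthCheckType": "EC2",
        "HealthCheckGracePeriod": 300,
    }


# Placeholder values for illustration only
request = build_asg_request("web-asg", "web-template", ["subnet-aaa", "subnet-bbb"], 2, 2, 6)
```

With real identifiers, the resulting dictionary could be passed as `client.create_auto_scaling_group(**request)` on a boto3 `autoscaling` client.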

The three capacity numbers

Setting	Meaning
Min	Lower bound — never scale below this
Desired	Current target count
Max	Upper bound — never scale above this

Scaling policies adjust Desired; the ASG then launches or terminates to match. Min/Max act as guardrails.
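The guardrail behaviour can be modelled in a few lines. This is a simplified sketch of the idea, not AWS's implementation; the function names are made up for illustration.

```python
def clamp_desired(requested, min_size, max_size):
    """Keep a policy's requested capacity inside the Min/Max guardrails."""
    return max(min_size, min(requested, max_size))


def capacity_delta(running, desired):
    """Positive: instances to launch. Negative: instances to terminate."""
    return desired - running


# A policy asks for 12 instances, but Max is 10
desired = clamp_desired(requested=12, min_size=2, max_size=10)
# Seven instances are running, so three more need to launch
delta = capacity_delta(running=7, desired=desired)
```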

Launch Templates vs Launch Configurations

Launch Template: modern, versioned, supports mixed instance types, Spot, and the newest features. Use this.
Launch Configuration: legacy and immutable (you can’t edit one; you create a new one and swap it in), with limited features. AWS no longer adds new features to launch configurations and recommends migrating to launch templates.

A launch template captures:

AMI ID, instance type(s), key pair, security groups
IAM instance profile
User data (first-boot script)
EBS volume specs
Network interface config, metadata options (IMDSv2 enforcement)

Templates are versioned — point the ASG at a specific version or at $Latest / $Default.
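As an illustration of what a template captures, the sketch below assembles a dictionary shaped like the `LaunchTemplateData` argument of the EC2 `create_launch_template` call. The helper name and every value passed to it (AMI ID, security group, profile name, script) are placeholders.

```python
import base64


def build_launch_template_data(ami_id, instance_type, security_group_ids,
                               profile_name, user_data_script):
    """Assemble a dict shaped like EC2's LaunchTemplateData."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "SecurityGroupIds": list(security_group_ids),
        "IamInstanceProfile": {"Name": profile_name},
        # The EC2 API expects user data to be base64-encoded
        "UserData": base64.b64encode(user_data_script.encode("utf-8")).decode("ascii"),
        # Requiring session tokens is what enforces IMDSv2
        "MetadataOptions": {"HttpTokens": "required", "HttpEndpoint": "enabled"},
    }


script = "#!/bin/bash\necho bootstrapping\n"
template_data = build_launch_template_data(
    "ami-PLACEHOLDER", "m5.large", ["sg-PLACEHOLDER"], "web-instance-profile", script
)
```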

Multi-AZ by default

You specify multiple subnets across AZs. The ASG spreads instances as evenly as it can across them and rebalances when the distribution drifts. If an AZ goes down, surviving AZs pick up the load; new instances skip the failed AZ until it returns.

This is the fundamental HA unit on AWS. Combined with a load balancer, it gives zero-touch recovery from instance and AZ failures.
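A toy model of the even spread makes the AZ-failure behaviour concrete. It is a sketch of the idea only: the real service also weighs capacity availability and rebalancing activity, and the AZ names are placeholders.

```python
def spread_across_azs(desired, azs, unavailable=()):
    """Distribute `desired` instances as evenly as possible over the usable AZs."""
    usable = [az for az in azs if az not in unavailable]
    if not usable:
        raise ValueError("no usable Availability Zone")
    base, extra = divmod(desired, len(usable))
    # The first `extra` AZs take one additional instance each
    return {az: base + (1 if i < extra else 0) for i, az in enumerate(usable)}


healthy = spread_across_azs(7, ["az-a", "az-b", "az-c"])
degraded = spread_across_azs(7, ["az-a", "az-b", "az-c"], unavailable={"az-b"})
```

In the degraded case the same desired capacity is packed into the two surviving AZs, which is why each AZ needs headroom for its share of a failed neighbour's load.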

Health checks — what “unhealthy” means

Two sources:

EC2 status checks: system-level (host hardware, network reachability) and instance-level checks reported by EC2. Always on.
ELB health checks: the load balancer’s view of the target’s health. You must enable the “ELB” health check type on the ASG for this to propagate.

If a check fails, the ASG marks the instance unhealthy and terminates + replaces it. That’s the self-healing loop.

Health check grace period — how long after launch before checks count. Gives instances time to boot and bootstrap before being judged. Set it long enough for user-data + app startup.
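The decision described above can be sketched as a small predicate. This is a simplified model under the assumption that all check failures are ignored during the grace period; the function and parameter names are hypothetical.

```python
def is_unhealthy(age_seconds, grace_period, ec2_status_ok, elb_ok, health_check_type="EC2"):
    """Return True if the group should replace the instance (simplified model)."""
    # Inside the grace period, failures are not counted yet
    if age_seconds < grace_period:
        return False
    # EC2 status checks always count
    if not ec2_status_ok:
        return True
    # The load balancer's view counts only when the ELB check type is enabled
    return health_check_type == "ELB" and not elb_ok


still_booting = is_unhealthy(60, 300, True, False, "ELB")
failing_behind_lb = is_unhealthy(400, 300, True, False, "ELB")
elb_view_ignored = is_unhealthy(400, 300, True, False, "EC2")
```

The last case shows why the ELB type matters: with only EC2 checks, an instance whose application is down but whose host is fine is never replaced.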

Scaling policies

Target tracking (recommended default)

You set a metric and a target value; ASG does the math.

“Keep average CPU at 50%.” → ASG scales out when average climbs above, in when it drops below.

Works with CPU, network in/out, ALB request count per target, and custom metrics. It is the simplest and most predictable option, and it paces its own scaling actions, so there are no cooldowns to tune.
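A sketch of both halves: a policy definition shaped like the arguments to boto3's `put_scaling_policy`, and a rough proportional estimate of where the group would land. The estimate is an assumption for intuition only; AWS does not publish its exact algorithm, and real scale-in is more conservative than this.

```python
import math


def target_tracking_policy(asg_name, target_cpu_percent):
    """Keyword arguments shaped like boto3's put_scaling_policy."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": "keep-cpu-at-target",  # placeholder name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": float(target_cpu_percent),
        },
    }


def estimate_desired(current_instances, observed_metric, target_metric):
    """Proportional approximation: capacity needed to bring the average back to target."""
    return math.ceil(current_instances * observed_metric / target_metric)


policy = target_tracking_policy("web-asg", 50)
scale_out_to = estimate_desired(4, observed_metric=75, target_metric=50)
scale_in_to = estimate_desired(4, observed_metric=25, target_metric=50)
```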

Step scaling

Finer control — “if CPU > 70%, add 2 instances; if > 85%, add 4.” Useful when you want asymmetric behaviour or multiple threshold bands.
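The bands can be written as step adjustments whose bounds are offsets from the alarm threshold, which is how the `StepAdjustments` parameter of `put_scaling_policy` expresses them. The selection function below is an illustrative model of how a band is chosen, assuming an alarm threshold of 70% CPU; it is not the service's code.

```python
# Bounds are relative to the alarm threshold (70% here):
# 70 up to 85 adds 2 instances, 85 and above adds 4.
STEP_ADJUSTMENTS = [
    {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 15.0, "ScalingAdjustment": 2},
    {"MetricIntervalLowerBound": 15.0, "ScalingAdjustment": 4},
]


def pick_adjustment(metric_value, alarm_threshold, steps=STEP_ADJUSTMENTS):
    """Return the capacity change for the band the metric falls into, or 0 if none."""
    breach = metric_value - alarm_threshold
    for step in steps:
        lower = step.get("MetricIntervalLowerBound", float("-inf"))
        upper = step.get("MetricIntervalUpperBound", float("inf"))
        # Lower bound inclusive, upper bound exclusive
        if lower <= breach < upper:
            return step["ScalingAdjustment"]
    return 0


moderate = pick_adjustment(78, alarm_threshold=70)
severe = pick_adjustment(90, alarm_threshold=70)
quiet = pick_adjustment(60, alarm_threshold=70)
```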

Simple scaling

“If CPU > 70%, add 1 instance.” One rule, no bands. Older style; step scaling is strictly more flexible.

Scheduled actions

“At 8am weekdays, set desired to 20. At 8pm, set to 5.” Cron-like capacity changes. Stack with dynamic policies.
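The two actions in that example could be described as follows. This sketch builds keyword arguments shaped like boto3's `put_scheduled_update_group_action`; the group and action names are placeholders, and the evening rule is assumed to run every day since the text does not say otherwise.

```python
def scheduled_action(asg_name, action_name, cron, desired):
    """Keyword arguments shaped like boto3's put_scheduled_update_group_action."""
    return {
        "AutoScalingGroupName": asg_name,
        "ScheduledActionName": action_name,
        # Standard five-field cron: minute hour day-of-month month day-of-week
        "Recurrence": cron,
        "DesiredCapacity": desired,
    }


actions = [
    scheduled_action("web-asg", "weekday-morning-scale-out", "0 8 * * 1-5", 20),
    scheduled_action("web-asg", "evening-scale-in", "0 20 * * *", 5),
]
```

Recurrence expressions are evaluated in UTC unless a `TimeZone` is supplied, so "8am" needs either a time zone or a shifted hour.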

Predictive scaling

ML-driven: uses up to 14 days of metric history to forecast load and launch capacity ahead of it. Good for predictable daily/weekly patterns. Combines with reactive policies.

Instance types — single vs mixed

Modern ASGs support mixed instance types and mixed purchase options in one group:

Pick several instance types (e.g. m5.large, m6i.large, m5a.large) — ASG picks what’s available
Split between On-Demand and Spot with a base count + percentage split
Capacity Rebalancing — proactively replace Spot instances before they’re reclaimed

Typical pattern for cost efficiency: “2 On-Demand baseline, everything else Spot, 4 instance types permitted.”
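That pattern might be expressed as a mixed instances policy like the one below, shaped like the `MixedInstancesPolicy` argument of `create_auto_scaling_group`. The template name and the fourth instance type are placeholders, and the rounding in `purchase_split` is an assumption (it does not matter for the 0% case shown).

```python
import math


def mixed_instances_policy(template_name, instance_types, on_demand_base, on_demand_pct_above_base):
    """Dict shaped like the MixedInstancesPolicy argument of create_auto_scaling_group."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": template_name,
                "Version": "$Default",
            },
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": on_demand_base,
            "OnDemandPercentageAboveBaseCapacity": on_demand_pct_above_base,
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    }


def purchase_split(desired, on_demand_base, on_demand_pct_above_base):
    """Return (on_demand, spot) counts; rounding up here is an assumption."""
    base = min(desired, on_demand_base)
    above_base = desired - base
    on_demand_above = math.ceil(above_base * on_demand_pct_above_base / 100)
    return base + on_demand_above, above_base - on_demand_above


policy = mixed_instances_policy(
    "web-template", ["m5.large", "m6i.large", "m5a.large", "m6a.large"], 2, 0
)
split = purchase_split(10, on_demand_base=2, on_demand_pct_above_base=0)
```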

Lifecycle hooks

Custom logic at launch and termination:

EC2_INSTANCE_LAUNCHING — pause before marking InService (e.g. bootstrap config, register with service discovery)
EC2_INSTANCE_TERMINATING — pause before termination (e.g. deregister, drain connections, snapshot logs)

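Hooks hold instances in a Pending:Wait or Terminating:Wait state until the hook is completed or its heartbeat timeout expires (one hour by default; recording heartbeats can extend the wait up to a maximum of 48 hours). A Lambda function or SQS consumer performs the work, then completes the hook.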

Without lifecycle hooks, termination is abrupt: an instance selected for termination is shut down straight away, with no chance to run custom cleanup.
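A sketch of the two calls involved in a termination hook: defining it, and completing it once cleanup is done. Both functions only build keyword arguments, shaped like boto3's `put_lifecycle_hook` and `complete_lifecycle_action`; the hook and group names are placeholders, and the sample message contains only the fields the sketch reads.

```python
def terminating_hook(asg_name, hook_name, timeout_seconds=300):
    """Keyword arguments shaped like boto3's put_lifecycle_hook."""
    return {
        "AutoScalingGroupName": asg_name,
        "LifecycleHookName": hook_name,
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_TERMINATING",
        "HeartbeatTimeout": timeout_seconds,
        # What happens if nobody completes the hook before the timeout
        "DefaultResult": "CONTINUE",
    }


def completion_from_message(message, result="CONTINUE"):
    """Build complete_lifecycle_action arguments from a lifecycle notification."""
    return {
        "AutoScalingGroupName": message["AutoScalingGroupName"],
        "LifecycleHookName": message["LifecycleHookName"],
        "LifecycleActionToken": message["LifecycleActionToken"],
        "InstanceId": message["EC2InstanceId"],
        "LifecycleActionResult": result,
    }


hook = terminating_hook("web-asg", "drain-before-terminate")
sample_message = {
    "AutoScalingGroupName": "web-asg",
    "LifecycleHookName": "drain-before-terminate",
    "LifecycleActionToken": "token-PLACEHOLDER",
    "EC2InstanceId": "i-PLACEHOLDER",
}
completion = completion_from_message(sample_message)
```

A consumer would drain connections or ship logs between receiving the message and sending the completion.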

Termination policy — who gets killed?

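When scaling in, the ASG picks which instance to terminate. With the default policy it first picks the AZ with the most instances (to keep the group balanced), then narrows the candidates in that AZ in this order:

Instances on the oldest launch template or launch configuration (helps “rolling” an ASG onto a new version)
Instance closest to the next billing hour (a cost optimisation from when billing was hourly; less relevant with per-second billing)
A random pick among whatever remains

Groups with a mixed instances policy first prefer instances that bring the On-Demand/Spot split back in line with the allocation strategy.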

You can override with specific policies: NewestInstance, OldestInstance, ClosestToNextInstanceHour, Default, OldestLaunchConfiguration, OldestLaunchTemplate, AllocationStrategy.

Instance Scale-In Protection prevents a specific instance from being chosen for scale-in — useful for a “leader” or a long-running job.
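A simplified model of a default-style selection, including scale-in protection. It assumes the ordering "busiest AZ, then oldest launch template version, then closest to the next billing hour" and leaves out other steps such as allocation-strategy alignment and the final random pick; the `Instance` class and its fields are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    az: str
    template_version: int
    seconds_to_next_billing_hour: int
    protected: bool = False


def pick_for_scale_in(instances):
    """Choose one instance to terminate, or None if all are protected."""
    candidates = [i for i in instances if not i.protected]
    if not candidates:
        return None
    # Prefer the AZ holding the most instances, among AZs with a candidate
    az_counts = Counter(i.az for i in instances)
    busiest = max(sorted({c.az for c in candidates}), key=lambda az: az_counts[az])
    pool = [c for c in candidates if c.az == busiest]
    # Oldest template version first, then closest to the next billing hour
    return min(pool, key=lambda i: (i.template_version, i.seconds_to_next_billing_hour))


fleet = [
    Instance("i-1", "az-a", 1, 1200, protected=True),
    Instance("i-2", "az-a", 2, 600),
    Instance("i-3", "az-a", 2, 300),
    Instance("i-4", "az-b", 1, 100),
]
victim = pick_for_scale_in(fleet)
```

Here `i-1` would otherwise be the first choice (oldest version in the busiest AZ), but protection removes it from consideration.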

Instance refresh — rolling updates

Instance Refresh is the built-in way to roll a fleet after a launch-template change. You set a minimum healthy percentage; ASG replaces instances in batches without breaching HA. Optionally:

Warm Pool: pre-initialised instances (stopped by default) that start up fast, trading EBS costs for scale-out speed. It is configured on the ASG rather than on the refresh itself.
Checkpoints: pause mid-refresh for validation
Skip matching: skip instances already on the desired version

This is how you deploy a new AMI across hundreds of instances without downtime.
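A sketch of a refresh request, shaped like the arguments to boto3's `start_instance_refresh`, together with a rough estimate of how many instances can be out of service at once. The batch-size arithmetic is an approximation for intuition, not the service's exact rounding, and the group name and delay are placeholders.

```python
import math


def instance_refresh_request(asg_name, min_healthy_pct, checkpoints=None, skip_matching=True):
    """Keyword arguments shaped like boto3's start_instance_refresh."""
    preferences = {"MinHealthyPercentage": min_healthy_pct, "SkipMatching": skip_matching}
    if checkpoints:
        # Pause once these percentages of the group have been replaced
        preferences["CheckpointPercentages"] = list(checkpoints)
        preferences["CheckpointDelay"] = 600  # seconds to wait at each checkpoint
    return {"AutoScalingGroupName": asg_name, "Strategy": "Rolling", "Preferences": preferences}


def max_batch_size(group_size, min_healthy_pct):
    """Approximate number of instances that can be replaced at the same time."""
    must_stay_healthy = math.ceil(group_size * min_healthy_pct / 100)
    # At least one instance is replaced at a time, even at 100% healthy
    return max(1, group_size - must_stay_healthy)


refresh = instance_refresh_request("web-asg", 90, checkpoints=[20, 100])
small_batches = max_batch_size(10, 90)
large_batches = max_batch_size(100, 80)
```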

Interaction with Elastic Load Balancing

Attach the ASG to one or more target groups (ALB/NLB). As instances are launched, the ASG auto-registers them with the target group; terminations trigger deregistration with the target group’s deregistration delay (connection draining).

The two health views combine into one loop: the target group marks an instance unhealthy, the ASG (with the ELB health check type enabled) terminates it and launches a replacement, and the replacement registers with the target group, passes its health check, and begins receiving traffic. End-to-end self-healing.

Cooldowns — preventing flapping

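A cooldown period after a scaling activity prevents another scaling action from firing immediately. Cooldowns apply to simple scaling (default 300 s), so set them deliberately there; target tracking and step scaling rely on instance warm-up time instead, which should match how long a new instance takes to start serving.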

Poor cooldowns → scaling oscillation. Good cooldowns → stable equilibrium.
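A minimal model of the cooldown rule shows how it suppresses repeated triggers. It treats scaling activities as instantaneous, which is a simplification; times are in seconds and the function name is made up.

```python
def apply_with_cooldown(alarm_times, cooldown_seconds=300):
    """Return the alarm times that actually result in a scaling action."""
    fired = []
    last_action = None
    for t in alarm_times:
        # An alarm is honoured only once the cooldown since the last action has passed
        if last_action is None or t - last_action >= cooldown_seconds:
            fired.append(t)
            last_action = t
    return fired


# Alarms at 60 s, 120 s, and 400 s fall inside a cooldown window and are ignored
honoured = apply_with_cooldown([0, 60, 120, 310, 400])
# With no cooldown, every alarm triggers a scaling action
no_cooldown = apply_with_cooldown([0, 60, 120, 310, 400], cooldown_seconds=0)
```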

Cost

No charge for the ASG itself — you pay only for the instances it launches (plus the LB it’s attached to).
Savings Plans / Reserved Instances still apply: they are billing discounts on matching usage, whether or not an ASG launched the instance.
Spot via mixed instances policy is the biggest lever for cost-sensitive stateless tiers.

Common pitfalls

Single-AZ ASG — defeats the primary purpose. Always span 2+ AZs.
Grace period too short — new instances get killed mid-bootstrap because ELB health checks fail before the app is up.
Desired pinned at Max. Scaling policies cannot raise Desired past Max, so scale-out stalls without an obvious error. Check Max first when the group is unexpectedly “not scaling.”
User-data failures are invisible without inspection. Log to CloudWatch via the agent; surface failures.
State on instances — ASGs terminate freely; don’t store anything durable on instance volumes. Persist to S3, EFS, RDS, or DynamoDB.
IAM instance profile missing. Instance boots without expected access; SSM doesn’t work; CloudWatch Agent can’t write.
Slow bootstrapping. A 4-minute user-data script means 4 minutes of load on surviving instances while the new one spins up. Bake more into the AMI (golden image pattern) or use a warm pool.
IMDSv1 still enabled. Set metadata options in the launch template to enforce IMDSv2.
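Several of these pitfalls are mechanical enough to check before deploying. The sketch below runs such checks against a plain dictionary; the configuration keys and finding labels are hypothetical, not an AWS schema, and would need mapping from whatever tool describes the group.

```python
def lint_asg(config):
    """Return a list of pitfall labels found in a simplified ASG description."""
    findings = []
    if len(set(config.get("availability_zones", []))) < 2:
        findings.append("single-az")
    # With ELB checks, the grace period must cover user-data plus app startup
    if (config.get("health_check_type") == "ELB"
            and config.get("grace_period", 0) < config.get("bootstrap_seconds", 0)):
        findings.append("grace-period-too-short")
    if not config.get("instance_profile"):
        findings.append("missing-instance-profile")
    if config.get("http_tokens") != "required":
        findings.append("imdsv1-enabled")
    return findings


risky = lint_asg({
    "availability_zones": ["az-a"],
    "health_check_type": "ELB",
    "grace_period": 60,
    "bootstrap_seconds": 240,
    "http_tokens": "optional",
})
clean = lint_asg({
    "availability_zones": ["az-a", "az-b"],
    "health_check_type": "ELB",
    "grace_period": 300,
    "bootstrap_seconds": 240,
    "instance_profile": "web-instance-profile",
    "http_tokens": "required",
})
```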

Mental model

ASG = a control loop. Desired count is the setpoint; scaling policies adjust the setpoint; health checks + termination policy keep the actuals aligned.
Launch template = the “shape” of an instance. Versioned, declarative.
Target groups + ELB = the traffic layer that the ASG feeds.
Lifecycle hooks = the escape hatch for custom behaviour at transitions.
Instance Refresh = the “rolling deploy” primitive.

Used together (ASG + LT + ELB + health checks), you have a self-healing, elastic, multi-AZ compute tier that survives most failure modes without operator intervention — the AWS answer to “stateless horizontal scaling.”
