AWS VPC to ECS – Day 5: ECS Service with Smart Auto-Scaling

This content originally appeared on DEV Community and was authored by Utkarsh Rastogi

Hey everyone! Today I'm diving deep into ECS services and auto-scaling. After setting up the load balancer on Day 4, it's time to deploy my FastAPI application with intelligent scaling that responds to real traffic patterns.

What We're Building Today

ECS Service that keeps containers running and healthy
Smart Auto-Scaling that maintains optimal performance (1-5 containers)
FastAPI Application with multiple endpoints for testing
CloudWatch Monitoring with email alerts
Load Testing Endpoints to validate scaling behavior

The Complete ECS Service with Auto-Scaling

Note: We already created the ECS cluster in our previous setup, so we'll focus on the service configuration.

Here's the full ECS service configuration with intelligent auto-scaling:

# infra/ecs/ecs_service.yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Creating ECS Service for Learning Purpose'

Parameters:
  TeamNameValue:
    Type: String
    Description: TeamName Tag Value
    Default: "awslearner"
  EnvironmentValue:
    Type: String
    Description: Environment Tag Value
    Default: "dev"
  ServiceName:
    Type: String
    Description: Name of the ECS service
    Default: "learner-svc"
  ExecutionRoleName:
    Type: String
    Description: Name of the service execution role
    Default: "learner-ecs-role"
  TaskRoleName:
    Type: String
    Description: Name of the task role
    Default: "learner-ecs-task-exc-role"
  ImageARN:
    Type: AWS::SSM::Parameter::Value<String>
    Description: Image ARN
    Default: "/learner/imagearn/value"
  ECSCluster:
    Type: String
    Description: ECS Cluster Name
    Default: "learner-cluster"
  PublicSubnetIds:
    Type: AWS::SSM::Parameter::Value<String>
    Description: Subnet ID
    Default: "/learner/public/subnetids"
  SecurityGroup:
    Type: AWS::SSM::Parameter::Value<String>
    Description: Security Group
    Default: "/learner/public/sgid"
  TargetGroupArn:
    Type: AWS::SSM::Parameter::Value<String>
    Description: Target Group ARN
    Default: "/learner/target/value"
  AlertEmail:
    Type: String
    Description: Email address for alerts
    Default: <Provide Email Address>

Resources:
  # Storage for your container logs
  ECSLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Sub /ecs/${ServiceName}-logs
      RetentionInDays: 14
      Tags:
        - Key: Name
          Value: !Sub /ecs/${ServiceName}-logs
        - Key: TeamName
          Value: !Ref TeamNameValue
        - Key: Environment
          Value: !Ref EnvironmentValue

  # Blueprint that tells ECS how to run your containers
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: !Sub ${ServiceName}-task
      Cpu: 256                   
      Memory: 512       
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      ExecutionRoleArn: !Sub arn:aws:iam::${AWS::AccountId}:role/${ExecutionRoleName}
      TaskRoleArn: !Sub arn:aws:iam::${AWS::AccountId}:role/${TaskRoleName}
      ContainerDefinitions:
        - Name: !Sub ${ServiceName}-container
          Image: !Ref ImageARN
          PortMappings:
            - ContainerPort: 80
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref ECSLogGroup
              awslogs-region: !Ref AWS::Region
              awslogs-stream-prefix: ecs
      Tags:
        - Key: Name
          Value: !Sub ${ServiceName}-task
        - Key: TeamName
          Value: !Ref TeamNameValue
        - Key: Environment
          Value: !Ref EnvironmentValue

  # Service that keeps your containers running and healthy
  ECSService:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref ECSCluster
      ServiceName: !Sub ${ServiceName}-service
      TaskDefinition: !Ref TaskDefinition
      LaunchType: FARGATE
      DesiredCount: 1
      PropagateTags: SERVICE
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED   
          Subnets: !Split 
            - ","
            - !Ref PublicSubnetIds
          SecurityGroups:
            - !Ref SecurityGroup
      LoadBalancers:
        - ContainerName: !Sub ${ServiceName}-container 
          ContainerPort: 80
          TargetGroupArn: !Ref TargetGroupArn
      Tags:
        - Key: Name
          Value: !Sub ${ServiceName}-service
        - Key: TeamName
          Value: !Ref TeamNameValue
        - Key: Environment
          Value: !Ref EnvironmentValue

  # Defines scaling limits for your containers (1-5 tasks)
  AutoScalingTarget:
    Type: AWS::ApplicationAutoScaling::ScalableTarget
    DependsOn: ECSService
    Properties:
      ServiceNamespace: ecs
      ResourceId: !Sub "service/${ECSCluster}/${ServiceName}-service"
      ScalableDimension: ecs:service:DesiredCount
      MinCapacity: 1          
      MaxCapacity: 5

  # Target tracking scaling policy - maintains CPU around 40%
  AutoScalingPolicy:
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    Properties:
      PolicyName: !Sub ${EnvironmentValue}-${ServiceName}-target-tracking
      PolicyType: TargetTrackingScaling
      ScalingTargetId: !Ref AutoScalingTarget
      TargetTrackingScalingPolicyConfiguration:
        TargetValue: 40.0                    # Target 40% CPU utilization
        PredefinedMetricSpecification:
          PredefinedMetricType: ECSServiceAverageCPUUtilization
        ScaleOutCooldown: 120               # Wait 2 minutes after a scale-out before scaling out again
        ScaleInCooldown: 300                # Wait 5 minutes after a scale-in before scaling in again
        DisableScaleIn: false

  # Email notification system for alerts
  AlertTopic:
    Type: AWS::SNS::Topic
    Properties:
      TopicName: !Sub ${EnvironmentValue}-${ServiceName}-alerts
      DisplayName: !Sub "${ServiceName} ECS Alerts"
      Tags:
        - Key: Name
          Value: !Sub ${EnvironmentValue}-${ServiceName}-alerts
        - Key: TeamName
          Value: !Ref TeamNameValue
        - Key: Environment
          Value: !Ref EnvironmentValue

  # Connects your email to the alert system
  AlertSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      Protocol: email
      TopicArn: !Ref AlertTopic
      Endpoint: !Ref AlertEmail

  # Alert when average CPU stays above 70% for 2 minutes
  # (the alarm watches CPU only; it does not itself check the task count)
  CriticalCPUAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: !Sub ${EnvironmentValue}-${ServiceName}-CriticalCPU-AtMaxCapacity
      AlarmDescription: !Sub "CRITICAL: ${ServiceName} CPU >70% at max capacity (5 tasks)"
      MetricName: CPUUtilization
      Namespace: AWS/ECS
      Statistic: Average
      Period: 60
      EvaluationPeriods: 2
      Threshold: 70
      ComparisonOperator: GreaterThanThreshold
      AlarmActions:
        - !Ref AlertTopic
      Dimensions:
        - Name: ServiceName
          Value: !Sub ${ServiceName}-service
        - Name: ClusterName
          Value: !Ref ECSCluster

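  # Alert when you reach maximum number of containers
  # Note: RunningTaskCount is a Container Insights metric (namespace
  # ECS/ContainerInsights), so Container Insights must be enabled on the cluster
  MaxCapacityAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: !Sub ${EnvironmentValue}-${ServiceName}-MaxCapacity-Reached
      AlarmDescription: !Sub "WARNING: ${ServiceName} reached maximum capacity (5 tasks)"
      MetricName: RunningTaskCount
      Namespace: ECS/ContainerInsights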
      Statistic: Maximum
      Period: 60
      EvaluationPeriods: 1
      Threshold: 5
      ComparisonOperator: GreaterThanOrEqualToThreshold
      AlarmActions:
        - !Ref AlertTopic
      Dimensions:
        - Name: ServiceName
          Value: !Sub ${ServiceName}-service
        - Name: ClusterName
          Value: !Ref ECSCluster
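
To make the alarm settings above easier to reason about, here is a minimal Python sketch of how `Threshold`, `EvaluationPeriods`, and `ComparisonOperator` combine. It is a simplified model, not CloudWatch's implementation: it assumes one datapoint per period and no missing data, and the function name `alarm_breaches` is made up for this illustration.

```python
from typing import Sequence


def alarm_breaches(datapoints: Sequence[float], threshold: float,
                   evaluation_periods: int, inclusive: bool = False) -> bool:
    """Return True when the most recent `evaluation_periods` datapoints all breach.

    Simplified model (assumption): one datapoint per period, no missing data,
    and every evaluated period must breach.
    """
    if len(datapoints) < evaluation_periods:
        return False
    recent = datapoints[-evaluation_periods:]
    if inclusive:  # models GreaterThanOrEqualToThreshold
        return all(value >= threshold for value in recent)
    return all(value > threshold for value in recent)  # models GreaterThanThreshold


# CriticalCPUAlarm: Threshold 70, EvaluationPeriods 2, GreaterThanThreshold
print(alarm_breaches([55.0, 72.0, 81.0], 70, 2))     # True
print(alarm_breaches([81.0, 72.0, 70.0], 70, 2))     # False: 70 is not > 70

# MaxCapacityAlarm: Threshold 5, EvaluationPeriods 1, GreaterThanOrEqualToThreshold
print(alarm_breaches([4, 5], 5, 1, inclusive=True))  # True
```

With `Period: 60`, two evaluation periods mean the CPU alarm needs roughly two consecutive minutes above 70% before it notifies the SNS topic, while the capacity alarm can fire on a single one-minute datapoint.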

FastAPI Application with Testing Endpoints

Here's my FastAPI application with endpoints designed to test different scenarios:

# source/app.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import logging
import time
import hashlib
import boto3
import threading
import multiprocessing

app = FastAPI()
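# Configure the root logger so INFO-level messages are emitted
# (without this, only WARNING and above reach the container logs)
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ecs_service")
# Placeholder counter: nothing in this sample updates it, so it always reports 0
active_requests = 0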

# Initialize ECS client for monitoring
try:
    ecs_client = boto3.client('ecs')
except Exception as e:
    logger.warning(f"Could not initialize ECS client: {e}")
    ecs_client = None

class SubmitData(BaseModel):
    name: str = "User"

@app.get("/")
def home():
    """Simple welcome message"""
    return "Hello from ECS Fargate Service!"

@app.get("/api/health")
def health():
    """Health check endpoint for load balancer"""
    return {"status": "healthy"}

@app.post("/api/submit")
def api_submit(data: SubmitData):
    """Accepts user data and returns personalized message"""
    logger.info(f"Data received: {data.model_dump()}")
    return {"message": f"Happy learning, {data.name}!", "data": data.model_dump()}

@app.get("/api/load")
def generate_load():
    """Generates heavy CPU load for 60 seconds to test auto-scaling"""
    def cpu_intensive_task():
        start_time = time.time()
        while time.time() - start_time < 60:
            for _ in range(10000):
                hashlib.sha256(str(time.time()).encode()).hexdigest()
                sum(range(1000))

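    # Spawn several threads to keep the CPU busy. Because of Python's GIL these
    # threads share roughly one core, which is still enough to saturate a 0.25 vCPU task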
    cpu_count = multiprocessing.cpu_count()
    threads = []
    for _ in range(cpu_count * 2):
        thread = threading.Thread(target=cpu_intensive_task)
        thread.start()
        threads.append(thread)

    for thread in threads:
        thread.join()

    logger.info("CPU load generation completed")
    return {"status": "load_generated", "duration": "60s", "threads": cpu_count * 2}

@app.get("/api/quickload")
def quick_load():
    """Generates 10-second CPU burst for quick scaling tests"""
    def burst_task():
        start_time = time.time()
        while time.time() - start_time < 10:
            for _ in range(50000):
                hashlib.sha256(str(time.time()).encode()).hexdigest()

    cpu_count = multiprocessing.cpu_count()
    threads = []
    for _ in range(cpu_count * 3):
        thread = threading.Thread(target=burst_task)
        thread.start()
        threads.append(thread)

    for thread in threads:
        thread.join()

    return {"status": "burst_completed", "duration": "10s"}

@app.get("/api/scalinginfo")
def get_scaling_info():
    """Returns current ECS service scaling status"""
    if not ecs_client:
        return {"error": "ECS client not available"}

    try:
        response = ecs_client.describe_services(
            cluster='learner-cluster',
            services=['learner-svc-service']
        )

        if response['services']:
            service = response['services'][0]
            return {
                "cluster": "learner-cluster",
                "service": "learner-svc-service",
                "desired_count": service['desiredCount'],
                "running_count": service['runningCount'],
                "pending_count": service['pendingCount'],
                "status": service['status'],
                "active_requests": active_requests
            }
        else:
            return {"error": "Service not found"}

    except Exception as e:
        logger.error(f"Error getting scaling info: {str(e)}")
        return {"error": "Unable to fetch scaling information"}

@app.get("/api/error")
def trigger_error():
    """Triggers 500 error for testing error handling"""
    logger.error("Intentional error triggered")
    raise HTTPException(status_code=500, detail="Internal server error")

@app.get("/api/notfound")
def not_found():
    """Triggers 404 error for testing not found responses"""
    logger.warning("Resource not found")
    raise HTTPException(status_code=404, detail="Resource not found")

Requirements File

# source/requirements.txt
fastapi==0.104.1
uvicorn==0.24.0
boto3==1.34.0
pydantic==2.5.0

Deployment Commands

Important: Before deploying the ECS service, we need to build and push our container image using the CodeBuild project we set up in Day 4.

Deploy in this order:

# 1. First, build and push your container image
# This uses the CodeBuild project from Day 4
aws codebuild start-build --project-name learner-project

# Wait for the build to complete (check in AWS Console or CLI)
# This typically takes 2-3 minutes

# 2. Deploy ECS service with auto-scaling
aws cloudformation deploy \
  --template-file infra/ecs/ecs_service.yaml \
  --stack-name AWSLearner-ECS-Stack \
  --capabilities CAPABILITY_NAMED_IAM
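
If you'd rather script the "wait for the build to complete" step than watch the console, the sketch below polls CodeBuild until the build finishes. It assumes a boto3 CodeBuild client is passed in. The helper name `wait_for_build`, the 15-second poll interval, and the 60-poll limit are illustrative choices, not part of the original setup.

```python
import time
from typing import Callable

# Build statuses after which CodeBuild will not change the build any further
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "FAULT", "STOPPED", "TIMED_OUT"}


def wait_for_build(client, build_id: str, poll_seconds: float = 15.0,
                   max_polls: int = 60,
                   sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll CodeBuild until the build reaches a terminal status and return it."""
    for _ in range(max_polls):
        builds = client.batch_get_builds(ids=[build_id])["builds"]
        status = builds[0]["buildStatus"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Build {build_id} still running after {max_polls} polls")


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   codebuild = boto3.client("codebuild")
#   build_id = codebuild.start_build(projectName="learner-project")["build"]["id"]
#   if wait_for_build(codebuild, build_id) != "SUCCEEDED":
#       raise SystemExit("Build did not succeed; skip the ECS deployment")
```

Passing the client and the `sleep` function in as parameters keeps the helper easy to dry-run with a fake client before pointing it at a real project.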

Auto-Scaling Scenarios

1. Check Current Status

curl http://your-alb-url/api/scalinginfo

Response:

{
  "cluster": "learner-cluster",
  "service": "learner-svc-service",
  "desired_count": 1,
  "running_count": 1,
  "pending_count": 0,
  "status": "ACTIVE",
  "active_requests": 0
}

2. Trigger Heavy Load

curl http://your-alb-url/api/load

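This creates 60 seconds of intense CPU load. Because the request itself runs for about 60 seconds, it may hit the ALB's default 60-second idle timeout and come back as a 504 even though the load keeps running inside the container. Watch CloudWatch metrics to see:

CPU utilization spike to 80-90%
Auto-scaling triggers once average CPU stays above the 40% target for a few consecutive minutes
New containers start (desired_count increases)
Load distributes across containers
CPU drops back toward the 40% target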

3. Quick Burst Test

curl http://your-alb-url/api/quickload

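Handy for a quick smoke test with a 10-second burst. On its own, a single burst is too short to move the one-minute average CPU for long, so repeat it or run several in parallel if you want to trigger scaling.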

Postman Testing

GET /api/health - Health check status

POST /api/submit - Data submission with JSON body

GET /api/scalinginfo - Current container status

GET /api/load - Load generation response

GET /api/quickload - Quick burst response

GET /api/error - Error handling test

GET /api/notfound - 404 error test

Note: To see auto-scaling in action, you'll need to hit the load endpoints multiple times or use multiple browser tabs/terminals simultaneously to generate enough traffic that triggers the CPU threshold.
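
Instead of juggling browser tabs, the parallel requests can be sent from a small script. This is a minimal sketch using only the Python standard library. The helper names `http_status` and `fire_concurrent` are made up for illustration, and `your-alb-url` is the same placeholder used in the curl examples.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List
from urllib.error import HTTPError
from urllib.request import urlopen


def http_status(url: str, timeout: float = 90.0) -> int:
    """Fetch a URL and return its HTTP status code (4xx/5xx included)."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # urlopen raises on error statuses; report the code instead of failing
        return err.code


def fire_concurrent(url: str, requests: int,
                    fetch: Callable[[str], int] = http_status) -> List[int]:
    """Send `requests` GET calls in parallel and collect the status codes."""
    with ThreadPoolExecutor(max_workers=requests) as pool:
        return list(pool.map(fetch, [url] * requests))


# Usage (replace the placeholder host with your ALB DNS name):
#   statuses = fire_concurrent("http://your-alb-url/api/quickload", 8)
#   print(statuses)
```

Running this a few times back to back keeps average CPU elevated across several one-minute datapoints, which is what the scaling policy reacts to.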

What Each Endpoint Does

| Endpoint | Purpose | Response Time |
|---|---|---|
| `/api/health` | Load balancer health check | Instant |
| `/api/submit` | Data processing test | Instant |
| `/api/load` | Sustained CPU load (60s) | 60 seconds |
| `/api/quickload` | CPU burst (10s) | 10 seconds |
| `/api/scalinginfo` | Current container status | Instant |
| `/api/error` | Error handling test | Instant |
| `/api/notfound` | 404 error test | Instant |

Auto-Scaling Behavior

How it works:

Target: Maintains 40% CPU utilization
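Scale Out: Adds containers when average CPU stays above 40% (2-minute cooldown between scale-out actions)
Scale In: Removes containers gradually when average CPU stays well below 40% (5-minute cooldown between scale-in actions)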
Limits: 1-5 containers
Alerts: Email notifications at critical thresholds
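
For a feel of what "maintains 40% CPU" means in task counts, here is a rough back-of-the-envelope model in Python. It is only an intuition aid, not the algorithm Application Auto Scaling actually runs. It assumes load spreads evenly across tasks, and the function name `estimate_desired_count` is invented for this sketch.

```python
import math


def estimate_desired_count(current_tasks: int, observed_cpu: float,
                           target_cpu: float = 40.0,
                           min_tasks: int = 1, max_tasks: int = 5) -> int:
    """Proportional estimate of the tasks needed to bring average CPU back to target.

    Assumption: total CPU work stays constant and divides evenly across tasks.
    """
    raw = math.ceil(current_tasks * observed_cpu / target_cpu)
    # Clamp to the MinCapacity / MaxCapacity limits of the scalable target
    return max(min_tasks, min(max_tasks, raw))


print(estimate_desired_count(1, 85.0))  # 3  (ceil(1 * 85 / 40) = ceil(2.125))
print(estimate_desired_count(3, 90.0))  # 5  (ceil(6.75) = 7, capped at 5)
print(estimate_desired_count(4, 10.0))  # 1  (ceil(1.0) = 1)
```

The second example is the situation the alarms are meant to catch: the estimate exceeds the 5-task ceiling, so CPU stays high even after scaling out as far as allowed.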

Scaling Timeline (approximate; actual timing depends on alarm evaluation and task startup):

0-2 minutes: High CPU detected, evaluation period
2-4 minutes: New container launching
4-6 minutes: Container healthy, receiving traffic
6+ minutes: Load distributed, CPU normalizes
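
To follow this timeline from a terminal, you could poll the `/api/scalinginfo` endpoint until the running count rises. The sketch below uses only the standard library. The helper names, the 30-second interval, and the 20-poll limit are assumptions chosen for illustration.

```python
import json
import time
from typing import Callable, Dict
from urllib.request import urlopen


def fetch_scaling_info(url: str) -> Dict:
    """GET the scaling-info endpoint and decode its JSON body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def wait_for_running_count(get_info: Callable[[], Dict], target_running: int,
                           poll_seconds: float = 30.0, max_polls: int = 20,
                           sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll until running_count reaches the target; return the number of polls used."""
    for attempt in range(1, max_polls + 1):
        info = get_info()
        running = info.get("running_count", 0)
        print(f"poll {attempt}: desired={info.get('desired_count')} running={running}")
        if running >= target_running:
            return attempt
        sleep(poll_seconds)
    raise TimeoutError(f"running_count never reached {target_running}")


# Usage (replace the placeholder host with your ALB DNS name):
#   url = "http://your-alb-url/api/scalinginfo"
#   wait_for_running_count(lambda: fetch_scaling_info(url), target_running=2)
```

Seeing `desired` climb before `running` does corresponds to the "new container launching" phase in the timeline.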

Key Learnings

Target tracking scaling works out how many tasks to add or remove to hold the metric near its target, so there are no step thresholds to tune by hand
Cooldown periods prevent rapid scaling that could cause instability
Email alerts provide peace of mind without constant monitoring
Load testing endpoints are essential for validating your setup
ECS Fargate eliminates server management completely

What's Next in This Series?

In this comprehensive series, we've learned how to deploy a complete containerized application from VPC to ECS service using Fargate. We covered:

VPC Setup with multi-AZ networking
Security Groups and IAM roles
ECR Repository for container images
CodeBuild Pipeline for CI/CD
Application Load Balancer for traffic distribution
ECS Service with intelligent auto-scaling

The auto-scaling system feels really robust now. I can throw traffic at it, watch it scale intelligently, and get notified if anything needs attention. Perfect foundation for a production workload!

Complete Day 5 Learning Summary

What we accomplished in Day 5:

Infrastructure Built

ECS Service with Fargate launch type (256 CPU units = 0.25 vCPU, 512 MB memory)
Task Definition with proper IAM roles and logging
Auto-Scaling Target (1-5 containers) with target tracking policy
CloudWatch Alarms for critical CPU and max capacity alerts
SNS Email Notifications for real-time monitoring

Application Features

8 FastAPI Endpoints (the root route plus seven under /api) for comprehensive testing
Load Testing Capabilities (60s sustained + 10s burst)
Real-time Monitoring with ECS service status
Error Handling and health checks
Application Logging to CloudWatch via the awslogs driver

Auto-Scaling Intelligence

Target Tracking: Maintains 40% CPU utilization
Smart Cooldowns: 2min scale-out, 5min scale-in
Proportional Scaling: Responds to load intensity
Email Alerts: Critical thresholds and capacity warnings

Key Takeaway: We now have a fully automated, scalable containerized application that can handle real-world traffic patterns while maintaining cost efficiency and operational visibility.

💻 About Me

Hi! I'm Utkarsh, a Cloud Specialist & AWS Community Builder who loves turning complex AWS topics into fun chai-time stories ☕


APA

Utkarsh Rastogi | Sciencx (2025-08-23T13:13:56+00:00) AWS VPC to ECS – Day 5: ECS Service with Smart Auto-Scaling. Retrieved from https://www.scien.cx/2025/08/23/aws-vpc-to-ecs-day-5-ecs-service-with-smart-auto-scaling/
