This content originally appeared on DEV Community and was authored by bikash119
This is Part 2 of a 3-part series on deploying Docling to AWS ECS. In Part 1, we covered the foundational networking and IAM setup required for this deployment.
Setting up Amazon ECS (Elastic Container Service) with EC2 instances can be complex, especially when you need GPU support for compute-intensive workloads. In this comprehensive guide, we'll walk through creating a robust, scalable ECS infrastructure using Auto Scaling Groups (ASG) and Launch Templates, specifically configured for GPU workloads.
Why Use Auto Scaling Groups with ECS?
Auto Scaling Groups provide several key benefits for ECS deployments:
- Automatic scaling based on demand and health checks
- High availability across multiple availability zones
- Cost optimization by scaling down during low usage periods
- Consistent tagging and configuration through Launch Templates
- Integration with ECS Capacity Providers for seamless container orchestration
Prerequisites
Before we begin, ensure you have:
- AWS CLI configured with appropriate permissions
- A VPC with public subnets already created - If you haven't set this up yet, please refer to Part 1 of this series where we cover the complete VPC setup including subnets, internet gateways, and route tables
- IAM roles properly configured - The ec2_instance_role-profile referenced in our Launch Template was created in Part 1. If you skipped Part 1, you'll need to set up the necessary IAM roles and instance profiles
- Basic understanding of AWS ECS, EC2, and Auto Scaling concepts
💡 Note: This guide assumes you have the VPC ID ($VPC_ID) and public subnet ID ($PUBLIC_SUBNET) from Part 1. If you need to retrieve these values, refer to the VPC setup section in Part 1.
Step 1: Create Security Infrastructure
Generate SSH Key Pair
First, let's create a key pair for secure access to our EC2 instances:
aws ec2 create-key-pair \
    --key-name ECS_Instance_Key \
    --tag-specifications 'ResourceType=key-pair,Tags=[{Key=Name,Value=ECS_Instance_Key}]' \
    --query 'KeyMaterial' \
    --output text > ECSInstanceKey.pem
# Secure the key file
chmod 400 ECSInstanceKey.pem
Create Security Group
Next, we'll create a security group to control network access. This security group will be associated with the VPC we created in Part 1:
ECS_SG_ID=$(aws ec2 create-security-group \
    --tag-specifications 'ResourceType=security-group,Tags=[{Key=Name,Value=ECS_Instance_SG}]' \
    --vpc-id $VPC_ID \
    --group-name ECS_Instance_SG \
    --description "SG for ECS Instance" \
    --query "GroupId" \
    --output text)
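# Add tags to the security group. tags.json is not shown in this guide; it is a CLI input file of the form {"Tags": [{"Key": "...", "Value": "..."}]}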
aws ec2 create-tags --resources $ECS_SG_ID --cli-input-json file://tags.json
📝 Reminder: The $VPC_ID variable should contain the VPC ID from Part 1. If you need to find your VPC ID, you can use the command provided in Part 1 or run:
aws ec2 describe-vpcs --filters "Name=tag:Name,Values=your-vpc-name"
Configure Security Group Rules
⚠️ Security Notice: The following rule allows SSH access from anywhere on the internet. For production environments, restrict this to your specific IP range or use AWS Systems Manager Session Manager for more secure access.
aws ec2 authorize-security-group-ingress \
    --group-id $ECS_SG_ID \
    --protocol tcp \
    --port 22 \
    --cidr 0.0.0.0/0
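As a sketch of the tighter rule suggested in the security notice, the helper below opens port 22 to a single CIDR block and refuses the open-to-the-world range. The function name authorize_ssh_from is hypothetical and not part of the original setup; it only wraps the same authorize-security-group-ingress call shown above.

```shell
# Hypothetical helper: allow SSH only from one CIDR block instead of 0.0.0.0/0.
# Arguments: <security-group-id> <cidr>
authorize_ssh_from() {
    local sg_id="$1" cidr="$2"
    # Refuse the unrestricted range so it cannot be applied by accident.
    if [ "$cidr" = "0.0.0.0/0" ]; then
        echo "refusing to open SSH to 0.0.0.0/0" >&2
        return 1
    fi
    aws ec2 authorize-security-group-ingress \
        --group-id "$sg_id" \
        --protocol tcp \
        --port 22 \
        --cidr "$cidr"
}
```

For example, authorize_ssh_from "$ECS_SG_ID" "203.0.113.10/32" would admit a single address (203.0.113.10 is a documentation-only address; substitute your own).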
Step 2: Prepare User Data Script
Create a user data script that configures the ECS agent with GPU support:
#!/bin/bash
echo ECS_CLUSTER=docling-ecs-cluster >> /etc/ecs/ecs.config
echo ECS_BACKEND_HOST=https://ecs.us-east-1.amazonaws.com >> /etc/ecs/ecs.config
echo ECS_ENABLE_GPU_SUPPORT=true >> /etc/ecs/ecs.config
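Save this as user-data.sh and encode it in base64. The launch template needs the encoded value as a single line, and GNU base64 wraps its output at 76 characters by default, so strip the newlines:
base64 < user-data.sh | tr -d '\n'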
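The encoded value can also be captured in a shell variable and checked with a round trip before it goes into the launch template. This is a minimal sketch, assuming base64, tr, and cmp are available; USER_DATA_B64 is a variable name introduced here for illustration.

```shell
# Write the user data script (the same three settings shown above).
cat > user-data.sh <<'EOF'
#!/bin/bash
echo ECS_CLUSTER=docling-ecs-cluster >> /etc/ecs/ecs.config
echo ECS_BACKEND_HOST=https://ecs.us-east-1.amazonaws.com >> /etc/ecs/ecs.config
echo ECS_ENABLE_GPU_SUPPORT=true >> /etc/ecs/ecs.config
EOF

# Encode it and strip newlines so the value fits in a single JSON string.
USER_DATA_B64=$(base64 < user-data.sh | tr -d '\n')

# Round-trip check: decoding must reproduce the original file byte for byte.
printf '%s' "$USER_DATA_B64" | base64 --decode | cmp -s - user-data.sh \
    && echo "user data encoded OK"
```

Reading from stdin and stripping newlines with tr avoids depending on a wrap flag, which differs between base64 implementations.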
Step 3: Get the Optimal GPU AMI
AWS provides optimized AMIs for ECS with GPU support. Let's fetch the latest recommended AMI ID:
aws ssm get-parameters \
    --names /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended \
    --region $(aws configure get region)
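The command above returns a JSON document describing the AMI. When only the AMI ID is needed, the image_id sub-parameter can be queried directly. The sketch below assumes that approach; get_ecs_gpu_ami_id is a hypothetical helper name.

```shell
# Hypothetical helper: print only the recommended ECS GPU AMI ID.
# Argument: <region> (optional; defaults to the region configured for the CLI)
get_ecs_gpu_ami_id() {
    local region="${1:-$(aws configure get region)}"
    aws ssm get-parameters \
        --names /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id \
        --region "$region" \
        --query "Parameters[0].Value" \
        --output text
}
```

A call such as ECS_GPU_AMI_ID=$(get_ecs_gpu_ami_id us-east-1) keeps the ID in a variable for the launch template step (ECS_GPU_AMI_ID is likewise an illustrative name).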
Step 4: Create Launch Template
Launch Templates provide a way to store launch parameters so that you don't have to specify them every time you launch an instance. Create a JSON file called ec2-launch-template.json:
{
  "ImageId": "ami-0372b2cc554a36da2",
  "InstanceType": "g4dn.xlarge",
  "KeyName": "ECS_Instance_Key",
  "IamInstanceProfile": {
    "Name": "ec2_instance_role-profile"
  },
  "NetworkInterfaces": [
    {
      "AssociatePublicIpAddress": true,
      "DeleteOnTermination": true,
      "DeviceIndex": 0,
      "SubnetId": "<subnet id of public subnet>",
      "Groups": ["<value of $ECS_SG_ID>"]
    }
  ],
  "UserData": "<replace_with_base64_encoded_user-data.sh>",
  "BlockDeviceMappings": [
    {
      "DeviceName": "/dev/xvda",
      "Ebs": {
        "VolumeSize": 30,
        "VolumeType": "gp3",
        "DeleteOnTermination": true,
        "Encrypted": true
      }
    }
  ],
  "TagSpecifications": [
    {
      "ResourceType": "instance",
      "Tags": [
        {
          "Key": "Name",
          "Value": "ECS-Instance"
        }
      ]
    },
    {
      "ResourceType": "volume",
      "Tags": [
        {
          "Key": "Name",
          "Value": "ECS-Instance-Volume"
        }
      ]
    }
  ],
  "Monitoring": {
    "Enabled": true
  },
  "MetadataOptions": {
    "HttpTokens": "required",
    "HttpPutResponseHopLimit": 2,
    "HttpEndpoint": "enabled"
  }
}
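⚠️ Important Configuration Notes:
- Replace the ImageId with the AMI ID returned in Step 3; AMI IDs are region-specific, so the value shown here may not exist in your region
- Replace the SubnetId with your actual public subnet ID from Part 1
- Replace the Groups array with your actual security group ID (the $ECS_SG_ID we just created)
- Replace the UserData value with the single-line base64 output from Step 2
- The IamInstanceProfile name (ec2_instance_role-profile) was created in Part 1 - make sure this matches your actual IAM instance profile name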
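Editing the placeholders by hand is error-prone, so the substitutions can be scripted. The sketch below uses sed to match on the JSON keys rather than on the placeholder text; fill_launch_template is a hypothetical helper, and it assumes each key sits on its own line as in the file above.

```shell
# Hypothetical helper: print the launch template JSON with the
# environment-specific values substituted in.
# Arguments: <template-file> <ami-id> <subnet-id> <security-group-id> <base64-user-data>
fill_launch_template() {
    local template="$1" ami_id="$2" subnet_id="$3" sg_id="$4" user_data_b64="$5"
    # '|' is the sed delimiter because base64 text can contain '/'.
    sed \
        -e "s|\"ImageId\": \"[^\"]*\"|\"ImageId\": \"$ami_id\"|" \
        -e "s|\"SubnetId\": \"[^\"]*\"|\"SubnetId\": \"$subnet_id\"|" \
        -e "s|\"Groups\": \\[[^]]*]|\"Groups\": [\"$sg_id\"]|" \
        -e "s|\"UserData\": \"[^\"]*\"|\"UserData\": \"$user_data_b64\"|" \
        "$template"
}
```

Redirect the output to a new file, for example fill_launch_template ec2-launch-template.json "$AMI_ID" "$PUBLIC_SUBNET" "$ECS_SG_ID" "$USER_DATA" > ec2-launch-template.filled.json, and point --launch-template-data at that file. AMI_ID and USER_DATA stand for wherever you stored the Step 3 AMI ID and the Step 2 base64 string.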
Key Configuration Highlights
- GPU Instance Type: g4dn.xlarge provides one NVIDIA T4 GPU
- EBS Encryption: Enabled for data security
- Detailed Monitoring: Enabled, so CloudWatch receives instance metrics at 1-minute intervals
- IMDSv2: Enforced for improved instance metadata security
- gp3 Storage: Latest-generation general purpose SSD volume for better price/performance
Now create the launch template:
EC2_LAUNCH_TEMPLATE_ID=$(aws ec2 create-launch-template \
    --launch-template-name DoclingLaunchTemplate \
    --tag-specifications 'ResourceType=launch-template,Tags=[{Key=Name,Value=ECS_EC2_Launch_Template}]' \
    --launch-template-data file://ec2-launch-template.json \
    --query "LaunchTemplate.LaunchTemplateId" \
    --output text)
# Add additional tags (reuses the same tags.json CLI input file)
aws ec2 create-tags --resources $EC2_LAUNCH_TEMPLATE_ID --cli-input-json file://tags.json
Step 5: Create ECS Cluster
Important: Create the ECS cluster before launching EC2 instances. The ECS agent on the instances needs an existing cluster to register with.
aws ecs create-cluster \
    --cluster-name docling-ecs-cluster \
    --tags key=Name,value=doclingECSCluster
# Get cluster ARN for tagging
DOCLING_CLUSTER_ARN=$(aws ecs describe-clusters \
    --clusters docling-ecs-cluster \
    --query "clusters[].clusterArn" \
    --output text)
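# Add additional tags. cluster-tags.json is not shown in this guide; ECS expects a list of the form [{"key": "...", "value": "..."}]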
aws ecs tag-resource \
    --resource-arn $DOCLING_CLUSTER_ARN \
    --tags file://cluster-tags.json
Step 6: Create Auto Scaling Group
Create the Auto Scaling Group with zero desired capacity initially. This ASG will use the public subnet we configured in Part 1:
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name ECS_Asg \
    --launch-template LaunchTemplateId=$EC2_LAUNCH_TEMPLATE_ID,Version='$Latest' \
    --min-size 0 \
    --max-size 0 \
    --desired-capacity 0 \
    --vpc-zone-identifier $PUBLIC_SUBNET \
    --tags Key=Name,Value=ECS_AutoScaling
📋 Note: The $PUBLIC_SUBNET variable should contain the subnet ID from Part 1. If you need to retrieve your subnet ID, refer to the VPC setup section in Part 1.
Configure Auto Scaling Group Tags
Create an asg-tags.json file for propagating tags to launched instances:
[
   {
        "ResourceId": "ECS_Asg",
        "ResourceType": "auto-scaling-group",
        "Key": "Purpose",
        "Value": "DoclingSetup",
        "PropagateAtLaunch": true
    },
    {
        "ResourceId": "ECS_Asg",
        "ResourceType": "auto-scaling-group",
        "Key": "Environment",
        "Value": "Dev",
        "PropagateAtLaunch": true
    }
]
Apply the tags:
aws autoscaling create-or-update-tags --tags file://asg-tags.json
Step 7: Testing and Verification
Launch an Instance
Scale up the Auto Scaling Group to launch an instance:
aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name ECS_Asg \
    --min-size 1 \
    --max-size 1 \
    --desired-capacity 1
Monitor Scaling Activities
Check the status of the launch activity:
aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name ECS_Asg \
    --query 'Activities[?StatusCode==`InProgress`]'
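Rather than re-running the activity query by hand, the wait can be scripted. The sketch below polls the Auto Scaling Group until it reports at least one InService instance; wait_for_asg_instance and its default of 30 attempts at 10-second intervals are assumptions chosen for illustration.

```shell
# Hypothetical helper: poll until the ASG has at least one InService instance.
# Arguments: <asg-name> [attempts] [delay-seconds]
wait_for_asg_instance() {
    local asg_name="$1" attempts="${2:-30}" delay="${3:-10}" count i=0
    while [ "$i" -lt "$attempts" ]; do
        count=$(aws autoscaling describe-auto-scaling-groups \
            --auto-scaling-group-names "$asg_name" \
            --query "length(AutoScalingGroups[0].Instances[?LifecycleState=='InService'])" \
            --output text)
        # A non-numeric or empty result is treated as "not ready yet".
        if [ "${count:-0}" -ge 1 ] 2>/dev/null; then
            echo "$asg_name has $count InService instance(s)"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out waiting for $asg_name" >&2
    return 1
}
```

Calling wait_for_asg_instance ECS_Asg blocks until the instance is in service or the attempts run out, which makes the later SSH and registration checks safe to run in a script.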
Verify Tag Propagation
Confirm that tags from the ASG were propagated to the EC2 instance:
aws ec2 describe-instances --filters "Name=tag-key,Values=Purpose"
Get Instance IP Address
Find the IP address of the launched EC2 instance for SSH access:
aws ec2 describe-instances \
    --filters "Name=instance-state-name,Values=running" "Name=tag-key,Values=Purpose" \
    --query "Reservations[*].Instances[*].[InstanceId,InstanceType,PrivateIpAddress,PublicIpAddress]" \
    --output table
This command will display a table showing the Instance ID, Instance Type, Private IP Address, and Public IP Address of all running instances tagged with the "Purpose" key. Make note of the Public IP Address as you'll need it for SSH access in the next step.
Verify ECS Agent
SSH into the launched instance and check the ECS agent status:
# SSH into the instance using the generated key
ssh -i ECSInstanceKey.pem ec2-user@<INSTANCE_PUBLIC_IP>
# Check ECS agent status
sudo systemctl status ecs
# Verify ECS agent container is running
sudo docker ps
Verify Cluster Registration
Check if the instance successfully registered with the ECS cluster:
aws ecs list-container-instances --cluster docling-ecs-cluster
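The list call only returns ARNs. To see whether the agent is actually connected, the ARNs can be passed to describe-container-instances. A minimal sketch, with show_container_instances as a hypothetical helper name:

```shell
# Hypothetical helper: print EC2 instance ID, status, and agent connection
# state for every container instance registered in a cluster.
# Argument: <cluster-name>
show_container_instances() {
    local cluster="$1" arns
    arns=$(aws ecs list-container-instances \
        --cluster "$cluster" \
        --query "containerInstanceArns" \
        --output text)
    if [ -z "$arns" ] || [ "$arns" = "None" ]; then
        echo "no container instances registered in $cluster" >&2
        return 1
    fi
    # $arns is intentionally unquoted so each ARN becomes its own argument.
    aws ecs describe-container-instances \
        --cluster "$cluster" \
        --container-instances $arns \
        --query "containerInstances[].[ec2InstanceId,status,agentConnected]" \
        --output text
}
```

Running show_container_instances docling-ecs-cluster should print one line per instance; a status of ACTIVE with the agent connected indicates the instance is ready to accept tasks.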
Best Practices and Production Considerations
Security Enhancements
- Restrict SSH Access: Replace 0.0.0.0/0 with your specific IP range
- Use AWS Systems Manager: Consider Session Manager for secure shell access
- Enable VPC Flow Logs: Monitor network traffic for security analysis
- Use Secrets Manager: Store sensitive configuration data securely
Monitoring and Logging
- CloudWatch Container Insights: Enable for comprehensive ECS monitoring
- Custom Metrics: Set up custom CloudWatch metrics for your applications
- Log Aggregation: Use CloudWatch Logs or a centralized logging solution
Cost Optimization
- Spot Instances: Consider using Spot Instances for cost savings
- Mixed Instance Types: Use multiple instance types in your ASG
- Scheduled Scaling: Implement time-based scaling policies
- ECS Capacity Providers: Use for automatic scaling based on resource utilization
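To illustrate the capacity provider suggestion, the sketch below links the Auto Scaling Group from Step 6 to the cluster with managed scaling enabled. This goes beyond what the guide configures: attach_capacity_provider is a hypothetical helper, and the targetCapacity of 100 and disabled termination protection are assumptions to adjust for your workload. The ASG's max size also needs to be raised above 1 before managed scaling can add instances.

```shell
# Hypothetical helper: create a capacity provider backed by an ASG and make it
# the cluster's default.
# Arguments: <cluster-name> <asg-name> <capacity-provider-name>
attach_capacity_provider() {
    local cluster="$1" asg_name="$2" provider_name="$3" asg_arn
    # Capacity providers reference the ASG by ARN, not by name.
    asg_arn=$(aws autoscaling describe-auto-scaling-groups \
        --auto-scaling-group-names "$asg_name" \
        --query "AutoScalingGroups[0].AutoScalingGroupARN" \
        --output text) || return 1
    aws ecs create-capacity-provider \
        --name "$provider_name" \
        --auto-scaling-group-provider \
        "autoScalingGroupArn=$asg_arn,managedScaling={status=ENABLED,targetCapacity=100},managedTerminationProtection=DISABLED" \
        || return 1
    aws ecs put-cluster-capacity-providers \
        --cluster "$cluster" \
        --capacity-providers "$provider_name" \
        --default-capacity-provider-strategy "capacityProvider=$provider_name,weight=1"
}
```

An example call is attach_capacity_provider docling-ecs-cluster ECS_Asg docling-cp, where docling-cp is an illustrative provider name.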
Conclusion
You've successfully created a scalable, GPU-enabled ECS infrastructure using Auto Scaling Groups and Launch Templates. This setup builds upon the foundational networking and IAM infrastructure we established in Part 1 and provides:
- Automated scaling based on demand
- Consistent instance configuration through Launch Templates
- Proper tagging for resource management and cost allocation
- GPU support for compute-intensive workloads
- High availability and fault tolerance
- Integration with the VPC and security architecture from Part 1
The infrastructure is now ready to host containerized applications that require GPU processing power, with the flexibility to scale automatically based on your workload demands.
What's Next?
In Part 3 of this series, we'll cover:
- Creating and deploying ECS Task Definitions
- Setting up ECS Services for your applications
- Implementing Application Load Balancer for traffic distribution
- Advanced ECS configurations and monitoring
Stay tuned for the final part where we'll bring everything together with actual application deployment!
Series Navigation
- Part 1: Foundation - Networking & IAM - VPC setup, subnets, security groups, and IAM roles
- Part 2: ECS EC2 with Auto Scaling (Current) - Launch templates, Auto Scaling Groups, and ECS cluster setup
- Part 3: Application Deployment (Coming Soon) - Task definitions, services, and load balancers
Next Steps
Consider implementing:
- ECS Services and Task Definitions for your applications
- Application Load Balancer for distributing traffic
This foundation, combined with the networking setup from Part 1, provides a solid starting point for deploying Docling as a containerized application on AWS ECS with GPU support.
			bikash119 | Sciencx (2025-09-08T04:34:54+00:00) Deploying GPU-Enabled ECS EC2 Instances with Auto Scaling Groups and Launch Templates. Retrieved from https://www.scien.cx/2025/09/08/deploying-gpu-enabled-ecs-ec2-instances-with-auto-scaling-groups-and-launch-templates/
