CloudWatch Logs captures output from your ECS containers.

View Logs

Tail logs in real time:
aws logs tail /ecs/{infra_name}-prd --follow
Search recent logs:
aws logs filter-log-events \
  --log-group-name /ecs/{infra_name}-prd \
  --filter-pattern "ERROR" \
  --start-time $(date -d '1 hour ago' +%s)000
Replace {infra_name} with your infra_name from settings.py (e.g., agentos-aws-template).
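The `date -d '1 hour ago'` form in the search command is specific to GNU date and may fail on macOS or BSD systems. The sketch below is one possible workaround that computes the millisecond timestamp with shell arithmetic instead. The variable `INFRA_NAME` and the helper names `ms_ago` and `search_recent_logs` are illustrative assumptions, not part of the template.

```shell
# Illustrative helpers; names are assumptions, not part of the template.
INFRA_NAME="agentos-aws-template"      # replace with infra_name from settings.py
LOG_GROUP="/ecs/${INFRA_NAME}-prd"

# Print the epoch time N seconds ago in milliseconds,
# the unit filter-log-events expects for --start-time.
ms_ago() {
  echo $(( ( $(date +%s) - $1 ) * 1000 ))
}

# Search the last hour of logs for a pattern (defaults to ERROR).
search_recent_logs() {
  aws logs filter-log-events \
    --log-group-name "$LOG_GROUP" \
    --filter-pattern "${1:-ERROR}" \
    --start-time "$(ms_ago 3600)"
}
```

Calling `search_recent_logs "Traceback"` would then search the same log group for a different pattern.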

ECS Service Status

View service status and recent events:
aws ecs describe-services \
  --cluster {infra_name}-prd \
  --services {infra_name}-prd-service \
  --query 'services[0].{status:status,running:runningCount,desired:desiredCount,events:events[:5]}'
List running tasks:
aws ecs list-tasks --cluster {infra_name}-prd
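When the running count stays below the desired count, the stop reason of recently stopped tasks is often the quickest clue. A minimal sketch, assuming the same cluster naming as above; the function name `stopped_task_reasons` and the `INFRA_NAME` variable are hypothetical.

```shell
INFRA_NAME="agentos-aws-template"      # replace with infra_name from settings.py
CLUSTER="${INFRA_NAME}-prd"

# Print the stop reason and first container exit code of recently stopped tasks.
stopped_task_reasons() {
  local task_arns
  task_arns=$(aws ecs list-tasks --cluster "$CLUSTER" \
    --desired-status STOPPED --query 'taskArns' --output text)
  # Nothing to describe if ECS no longer lists any stopped tasks.
  if [ -z "$task_arns" ] || [ "$task_arns" = "None" ]; then
    echo "no stopped tasks found"
    return 0
  fi
  # Word splitting on $task_arns is intentional: one argument per task ARN.
  aws ecs describe-tasks --cluster "$CLUSTER" --tasks $task_arns \
    --query 'tasks[].{task:taskArn,reason:stoppedReason,exit:containers[0].exitCode}'
}
```

ECS only keeps stopped tasks visible for a limited time, so an empty result does not prove that no task has failed.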

What Success Looks Like

After a successful deployment, logs show:
INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000
Health check passing:
INFO:     192.168.x.x - "GET /health HTTP/1.1" 200 OK
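The success markers above can be checked mechanically, for example in a deploy script. This is a rough sketch that reads log lines on stdin; the function names `startup_ok` and `count_health_ok` are illustrative, and the patterns assume the log format shown above.

```shell
# Succeeds only if the log lines on stdin contain the startup-complete marker.
startup_ok() {
  grep -q "Application startup complete"
}

# Prints how many successful /health requests appear in the log lines on stdin.
count_health_ok() {
  grep -c '"GET /health HTTP/1.1" 200'
}
```

One possible use is `aws logs tail /ecs/{infra_name}-prd --since 10m | startup_ok && echo "startup complete"`.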

Warning Signs

Log Pattern                 Meaning                    Action
database is locked          DuckDB concurrency issue   Reduce workers to 1
connection refused          Can’t reach RDS            Check security group
OOMKilled                   Out of memory              Increase task memory
CannotPullContainerError    ECR auth expired           Re-run auth_ecr.sh
SIGTERM then restart loop   Health check failing       Check app logs for errors
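The table above can be turned into a small triage helper. The sketch below maps a log line to the suggested action using plain substring matching; the function name `suggest_action` is hypothetical, and real log lines may phrase these errors differently.

```shell
# Print the suggested action for a log line, or return 1 if no pattern matches.
suggest_action() {
  case "$1" in
    *"database is locked"*)      echo "Reduce workers to 1" ;;
    *"connection refused"*)      echo "Check security group" ;;
    *OOMKilled*)                 echo "Increase task memory" ;;
    *CannotPullContainerError*)  echo "Re-run auth_ecr.sh" ;;
    *SIGTERM*)                   echo "Check app logs for errors" ;;
    *)                           return 1 ;;
  esac
}
```

For example, `suggest_action "duckdb error: database is locked"` prints `Reduce workers to 1`.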

Health Checks

The load balancer checks /health every 30 seconds.
Target Status   Meaning
healthy         Task passing health checks
unhealthy       Health check failing
draining        Task being replaced
If unhealthy, check:
  1. Container logs for startup errors
  2. Security group allows port 8000 from ALB
  3. Database connectivity (DB_HOST, DB_PASS)
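Target status can also be read from the CLI. The sketch below assumes you already know the target group ARN, which this page does not provide; the function name `all_healthy` is illustrative.

```shell
# Succeeds only if the target group has targets and every one reports "healthy".
# usage: all_healthy <target-group-arn>
all_healthy() {
  local states s
  states=$(aws elbv2 describe-target-health --target-group-arn "$1" \
    --query 'TargetHealthDescriptions[].TargetHealth.State' --output text)
  # No registered targets counts as not healthy.
  [ -n "$states" ] || return 1
  # Text output is whitespace-separated; word splitting is intentional.
  for s in $states; do
    [ "$s" = "healthy" ] || return 1
  done
}
```

A call such as `all_healthy "$TARGET_GROUP_ARN" || echo "check container logs"` could gate the manual checks listed above, where `TARGET_GROUP_ARN` is a placeholder you set yourself.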

Log Retention

CloudWatch retains logs indefinitely by default. Set a retention policy to control costs:
aws logs put-retention-policy \
  --log-group-name /ecs/{infra_name}-prd \
  --retention-in-days 30
Retention   Monthly Cost (10 GB/day)
7 days      ~$3
30 days     ~$15
90 days     ~$45
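The estimates scale with the volume of logs held at any one time: daily volume multiplied by retention days. The sketch below reproduces that arithmetic for other volumes. The rate of 5 cents per GB-month is an assumption back-calculated from the 30-day and 90-day rows, not an AWS price quote, and the function name `storage_cost` is illustrative.

```shell
RATE_CENTS_PER_GB=5   # assumption: implied by the table above, not AWS pricing

# Print the estimated monthly storage cost in dollars.
# usage: storage_cost <gb_per_day> <retention_days>
storage_cost() {
  local stored_gb=$(( $1 * $2 ))                    # GB retained at steady state
  local cents=$(( stored_gb * RATE_CENTS_PER_GB ))
  printf '%d.%02d\n' $(( cents / 100 )) $(( cents % 100 ))
}
```

Under this assumed rate, `storage_cost 10 30` prints `15.00` and `storage_cost 10 90` prints `45.00`. Check current CloudWatch pricing for your region before relying on these figures.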

Alerts (Optional)

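Create a CloudWatch alarm that fires when the service has no running tasks. The AWS/ECS namespace does not publish task failure counts, so this alarm uses the RunningTaskCount metric from Container Insights, which must be enabled on the cluster:
aws cloudwatch put-metric-alarm \
  --alarm-name "{infra_name}-no-running-tasks" \
  --metric-name "RunningTaskCount" \
  --namespace "ECS/ContainerInsights" \
  --statistic Minimum \
  --period 300 \
  --threshold 1 \
  --comparison-operator LessThanThreshold \
  --dimensions Name=ClusterName,Value={infra_name}-prd Name=ServiceName,Value={infra_name}-prd-service \
  --evaluation-periods 1 \
  --treat-missing-data breaching \
  --alarm-actions [YOUR_SNS_TOPIC_ARN]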
See AWS SNS documentation to create a notification topic.
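As a starting point, a topic with an email subscription can be created from the CLI. This is a minimal sketch, not the template's own tooling; the function name `create_alert_topic`, the topic name, and the email address are placeholders.

```shell
# Create an SNS topic, subscribe an email address, and print the topic ARN.
# usage: create_alert_topic <topic-name> <email>
create_alert_topic() {
  local topic_arn
  topic_arn=$(aws sns create-topic --name "$1" \
    --query TopicArn --output text) || return 1
  aws sns subscribe --topic-arn "$topic_arn" --protocol email \
    --notification-endpoint "$2" >/dev/null || return 1
  echo "$topic_arn"
}
```

For example, `create_alert_topic my-alerts ops@example.com` prints an ARN that can be passed to `--alarm-actions`. Email subscriptions stay pending until the recipient confirms them.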