Fireworks AI on Amazon ECS
What is Amazon ECS?
Amazon Elastic Container Service (ECS) is AWS’s fully managed container orchestration service that enables you to deploy, manage, and scale containerized applications. ECS integrates deeply with the AWS ecosystem, providing native support for GPU-accelerated workloads, Auto Scaling, Application Load Balancers, and CloudWatch monitoring.
Why Fireworks AI on ECS?
Enterprise Security & Compliance
Deploy Fireworks inference entirely within your own Virtual Private Cloud with VPC-native architecture, internal-only API endpoints, and data that never leaves your AWS environment. Meet strict regulatory requirements for healthcare, financial services, and government workloads while leveraging existing AWS Enterprise Discount Programs and reserved instances.
Performance at Scale
Access Fireworks’ full optimization stack, including FireOptimizer, adaptive caching, and speculative decoding. Deploy on the latest GPU instance types (H100, H200, B200, etc.) with automatic scaling to handle high-throughput, low-latency workloads within the same VPC.
Main Deployment Steps
Deploying Fireworks AI on ECS is a three-phase process that balances automation with control:
Phase 1: Foundation Infrastructure (Automated)
- Run shell script to create S3 bucket, security groups, and IAM roles via Terraform
- Establish secure foundation for your deployment
- Upload your model to the S3 bucket
- Push Fireworks container image to Amazon ECR
- Store metering key in AWS Secrets Manager
- Run shell script to deploy ECS cluster, Application Load Balancer, and Auto Scaling Group
- Results in production-ready internal inference endpoint
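The manual steps in the middle of the list (model upload, image push, metering key storage) could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not the Fireworks tooling: every value in `Phase2Config` (bucket, repository, secret name, file paths, the `model/` prefix, the `latest` tag) is a hypothetical placeholder, and the script only prints standard AWS CLI and Docker commands for review instead of running them.

```python
import shlex
from dataclasses import dataclass


@dataclass
class Phase2Config:
    """Hypothetical settings; none of these names come from Fireworks."""
    region: str        # AWS region of the deployment
    account_id: str    # 12-digit AWS account ID that owns the ECR registry
    bucket: str        # S3 bucket created by the foundation step
    model_dir: str     # local directory holding the model files
    ecr_repo: str      # ECR repository name for the container image
    local_image: str   # tag of the image already loaded in local Docker
    secret_name: str   # Secrets Manager name for the metering key
    key_file: str      # local file containing the metering key

    @property
    def registry(self) -> str:
        # Standard hostname format for a private ECR registry.
        return f"{self.account_id}.dkr.ecr.{self.region}.amazonaws.com"


def phase2_commands(cfg: Phase2Config) -> list[str]:
    """Return the shell commands for the manual steps, in order."""
    remote_image = f"{cfg.registry}/{cfg.ecr_repo}:latest"
    return [
        # 1. Upload the model directory to the S3 bucket.
        shlex.join(["aws", "s3", "cp", cfg.model_dir,
                    f"s3://{cfg.bucket}/model/", "--recursive"]),
        # 2. Authenticate Docker against ECR (uses a shell pipe).
        f"aws ecr get-login-password --region {shlex.quote(cfg.region)}"
        f" | docker login --username AWS --password-stdin {cfg.registry}",
        # 3. Tag and push the container image to ECR.
        shlex.join(["docker", "tag", cfg.local_image, remote_image]),
        shlex.join(["docker", "push", remote_image]),
        # 4. Store the metering key; file:// makes the CLI read the value
        #    from disk so the key does not appear on the command line.
        shlex.join(["aws", "secretsmanager", "create-secret",
                    "--region", cfg.region,
                    "--name", cfg.secret_name,
                    "--secret-string", f"file://{cfg.key_file}"]),
    ]


if __name__ == "__main__":
    example = Phase2Config(
        region="us-east-1",
        account_id="123456789012",
        bucket="my-fireworks-bucket",
        model_dir="./model",
        ecr_repo="fireworks-inference",
        local_image="fireworks-inference:local",
        secret_name="fireworks/metering-key",
        key_file="metering_key.txt",
    )
    # Dry run: print the commands for review rather than executing them.
    for command in phase2_commands(example):
        print(command)
```

Printing the commands first keeps the manual phase auditable; once the values match the resources created by the foundation script, each line can be pasted into a shell or passed to `subprocess.run(..., shell=True)`.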