asyncio library. It also includes retry logic for handling 429 errors that Fireworks returns when the server is overloaded.
General optimization recommendations
Based on our benchmarks, we recommend the following:
- Use a client library optimized for high concurrency, such as httpx in Python or http.Agent in Node.js.
- Use the AsyncFireworks client for high-concurrency workloads.
- Increase concurrency until performance stops improving or you observe too many 429 errors.
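The last recommendation amounts to a simple search over concurrency levels. The sketch below is one possible way to express it, not part of the Fireworks SDK: `pick_concurrency`, the `measure` callback, and the `min_gain` and `max_429_rate` thresholds are all hypothetical names and values chosen for illustration. `measure(level)` is assumed to run your own benchmark at that concurrency and return the observed throughput and the fraction of requests that received a 429.

```python
from typing import Callable, Iterable, Tuple


def pick_concurrency(
    measure: Callable[[int], Tuple[float, float]],
    levels: Iterable[int],
    min_gain: float = 0.05,      # assumed: require at least 5% more throughput per step
    max_429_rate: float = 0.02,  # assumed: tolerate at most 2% of requests returning 429
) -> int:
    """Return the highest tested level before gains flatten or 429s become too frequent."""
    best_level = None
    best_throughput = 0.0
    for level in levels:
        throughput, rate_429 = measure(level)
        if rate_429 > max_429_rate:
            break  # too many 429 errors at this level
        if best_level is not None and throughput < best_throughput * (1 + min_gain):
            break  # performance has stopped improving
        best_level, best_throughput = level, throughput
    if best_level is None:
        raise ValueError("even the lowest concurrency level exceeded the 429 threshold")
    return best_level


if __name__ == "__main__":
    # Made-up measurements (requests/sec, 429 rate) standing in for a real benchmark.
    fake = {1: (10.0, 0.0), 2: (19.0, 0.0), 4: (36.0, 0.0), 8: (37.0, 0.0), 16: (37.0, 0.2)}
    print(pick_concurrency(lambda level: fake[level], [1, 2, 4, 8, 16]))
```

In practice the thresholds depend on the workload, so treat the defaults here as placeholders to tune.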
Code example: Optimal concurrent requests (Python)
Install the Fireworks Python SDK:
The SDK is currently in alpha. Use the --pre flag when installing to get the latest version.
The example uses asyncio and the AsyncFireworks client:
main.py
- Uses AsyncFireworks for non-blocking async requests with optimized connection pooling
- Uses asyncio.Semaphore to control concurrency to avoid overwhelming the server
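The semaphore-and-retry pattern described here can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the SDK's own code: `RateLimitError` and `fake_request` are hypothetical stand-ins for the 429 error and the real API call, and the retry count and backoff delays are arbitrary values. In a real program the call inside the semaphore would be a request made through the AsyncFireworks client.

```python
import asyncio
import random


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a client raises on HTTP 429."""


async def fake_request(prompt: str, attempts: dict) -> str:
    # Hypothetical stand-in for an API call: the first attempt for each
    # prompt fails with a 429-style error, later attempts succeed.
    attempts[prompt] = attempts.get(prompt, 0) + 1
    await asyncio.sleep(0.01)
    if attempts[prompt] == 1:
        raise RateLimitError("429 Too Many Requests")
    return f"response to {prompt}"


async def call_with_retry(prompt, semaphore, attempts, max_retries=5, base_delay=0.01):
    for attempt in range(max_retries):
        try:
            # The semaphore caps how many requests are in flight at once.
            async with semaphore:
                return await fake_request(prompt, attempts)
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter; the semaphore slot is already
            # released here, so waiting does not block other requests.
            await asyncio.sleep(base_delay * 2**attempt * (1 + random.random()))


async def run_all(prompts, max_concurrency=4):
    semaphore = asyncio.Semaphore(max_concurrency)
    attempts = {}
    tasks = (call_with_retry(p, semaphore, attempts) for p in prompts)
    results = await asyncio.gather(*tasks)
    return results, attempts


if __name__ == "__main__":
    results, attempts = asyncio.run(run_all([f"prompt-{i}" for i in range(10)]))
    print(len(results), "responses;", sum(attempts.values()), "requests sent")
```

Sleeping outside the `async with` block is a deliberate choice: a request that is backing off should not hold a concurrency slot that another request could use.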