Fireworks offers a Priority tier for workloads that require higher reliability, as well as a Fast mode for workloads that require higher speeds.Documentation Index
Fetch the complete documentation index at: https://fireworks.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
Priority tier
Priority tier is for workloads that require higher reliability during peak traffic periods, at a higher price point. Priority tier is prioritized above Standard traffic and is less likely to be rate limited. To use priority tier, setservice_tier to "priority" (OpenAI-compatible chat completions only):
Fast mode
Fast mode is a high speed configuration, useful for interactive applications that require fast response speeds, at a higher price point. It is not a different model and the quality of the model remains the same. Fast mode is available for select models. To use Fast mode, change themodel id as listed below.
| Model | model id |
|---|---|
| Kimi K2.6 Turbo | accounts/fireworks/routers/kimi-k2p6-turbo |
| GLM 5.1 Fast | accounts/fireworks/routers/glm-5p1-fast |
Related
- Serverless quickstart
- Text models
- Anthropic compatibility — Priority uses OpenAI-compatible chat completions; the Anthropic
messagesAPI does not supportservice_tier.