Serverless Priority and Fast

Priority tier and Fast mode are in Preview. Features, pricing, and availability may change. We welcome your feedback!

Fireworks offers a Priority tier for workloads that require higher reliability, as well as a Fast mode for workloads that require higher speeds.

Priority tier

Priority tier is for workloads that need higher reliability during peak traffic periods, at a higher price point. Priority requests are prioritized above Standard traffic and are less likely to be rate limited. To use Priority tier, set `service_tier` to `"priority"` in the request body (OpenAI-compatible chat completions only):

curl https://api.fireworks.ai/inference/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -d '{
    "model": "accounts/fireworks/models/kimi-k2p5",
    "service_tier": "priority",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Priority tier is available on select models. Models and pricing are listed on the Pricing page.
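
For callers who prefer Python over curl, the same request can be sketched with only the standard library. This is a minimal illustration, not an official client: the helper names `build_chat_request` and `send` are hypothetical, and the sketch assumes that leaving out `service_tier` results in Standard handling.

```python
import json
import os
import urllib.request

API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"


def build_chat_request(model, user_message, service_tier=None, api_key=None):
    """Build (but do not send) a chat completions request.

    Hypothetical helper for illustration. `service_tier` is only added to
    the body when provided; omitting it is ASSUMED to mean Standard traffic.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    if service_tier is not None:
        payload["service_tier"] = service_tier

    # Fall back to the same environment variable the curl example uses.
    key = api_key or os.environ.get("FIREWORKS_API_KEY", "")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With `FIREWORKS_API_KEY` set, `send(build_chat_request("accounts/fireworks/models/kimi-k2p5", "Hello", service_tier="priority"))` would issue the same call as the curl example above. Keeping request construction separate from sending makes it easy to inspect the body before any traffic is billed at the Priority rate.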

Fast mode

Fast mode is a high-speed configuration for interactive applications that need fast responses, at a higher price point. It is not a different model, and model quality remains the same. Fast mode is available for select models. To use Fast mode, set `model` to one of the ids listed below.

Model	`model` id
Kimi K2.6 Turbo	`accounts/fireworks/routers/kimi-k2p6-turbo`
GLM 5.1 Fast	`accounts/fireworks/routers/glm-5p1-fast`

curl https://api.fireworks.ai/inference/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -d '{
    "model": "accounts/fireworks/routers/kimi-k2p6-turbo",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
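
Because Fast mode is selected purely through the `model` id, an application can keep the ids from the table in one place and fail early on a name that has no Fast mode variant. The sketch below is illustrative only: `FAST_MODE_MODEL_IDS` and `fast_mode_payload` are hypothetical names, and the mapping simply copies the table rather than querying any API.

```python
import json

# Copied from the table above; update by hand if the list of
# Fast mode models changes. This mapping is not fetched from the API.
FAST_MODE_MODEL_IDS = {
    "Kimi K2.6 Turbo": "accounts/fireworks/routers/kimi-k2p6-turbo",
    "GLM 5.1 Fast": "accounts/fireworks/routers/glm-5p1-fast",
}


def fast_mode_payload(model_name, user_message):
    """Return a chat completions body that targets a Fast mode model id.

    Hypothetical helper: raises ValueError for names not in the table.
    """
    try:
        model_id = FAST_MODE_MODEL_IDS[model_name]
    except KeyError:
        raise ValueError(f"No Fast mode id known for {model_name!r}") from None
    return {
        "model": model_id,
        "messages": [{"role": "user", "content": user_message}],
    }


# Serialize the body exactly as it would be passed to curl's -d flag.
body = json.dumps(fast_mode_payload("Kimi K2.6 Turbo", "Hello"))
```

No `service_tier` field is set here, since Fast mode is described as a change of model id only.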

Pricing is listed on the Pricing page.

Related

Serverless quickstart
Text models
Anthropic compatibility: Priority uses OpenAI-compatible chat completions; the Anthropic messages API does not support service_tier.
