Overview
The Responses API is designed for building conversational applications and complex workflows. It allows you to:- Continue conversations: Maintain context across multiple turns without resending the entire history.
- Use external tools: Integrate with external services and data sources through MCP/SSE tools (server-executed) or function tools (client-executed).
- Stream responses: Receive results as they are generated, enabling real-time applications.
- Control tool usage: Set limits on tool calls with
max_tool_callsparameter. - Manage data retention: Choose whether to store conversations (default) or opt-out with
store=false.
Basic Usage
You can interact with the Response API using the Fireworks Python SDK or by making direct HTTP requests.Creating a Response
To start a new conversation, you use theclient.responses.create method. For a complete example, see the getting started notebook.
OpenAI SDK
Using Function Tools
Function tools follow the OpenAI-compatible format and are returned to the client for execution.OpenAI SDK
Continuing a Conversation with previous_response_id
To continue a conversation, you can use the previous_response_id parameter. This tells the API to use the context from a previous response, so you don’t have to send the entire conversation history again. For a complete example, see the previous response ID notebook.
OpenAI SDK
Streaming Responses
For real-time applications, you can stream the response as it’s being generated. For a complete example, see the streaming example notebook.OpenAI SDK
Cookbook Examples
For more in-depth examples, check out the following notebooks:- General MCP Examples
- Using
previous_response_id - Streaming Responses
- Using
store=False - MCP with Streaming
Storing Responses
By default, responses are stored and can be referenced by their ID. You can disable this by settingstore=False. If you do this, you will not be able to use the previous_response_id to continue the conversation. For a complete example, see the store=False notebook.
OpenAI SDK
Deleting Stored Responses
When responses are stored (the default behavior withstore=True), you can immediately delete them from storage using the DELETE endpoint. This permanently removes the conversation data.
Response Structure
All response objects include the following fields:id: Unique identifier for the response (e.g.,resp_abc123...)created_at: Unix timestamp when the response was createdstatus: Status of the response (typically"completed")model: The model used to generate the responseoutput: Array of message objects, tool calls, and tool outputsusage: Token usage information:prompt_tokens: Number of tokens in the promptcompletion_tokens: Number of tokens in the completiontotal_tokens: Total tokens usedprompt_tokens_details: Details about prompt tokens:cached_tokens: Number of prompt tokens served from cache
previous_response_id: ID of the previous response in the conversation (if any)store: Whether the response was stored (boolean)max_tool_calls: Maximum number of tool calls allowed (if set)