Overview
FiretitanServiceClient is the recommended direct SDK entry point. In the managed path, it creates or reattaches the FireTitan trainer, optional reference trainer, and optional inference deployment, then returns Tinker-compatible training and sampling clients.
For most direct SDK code, create it with FiretitanServiceClient.from_firetitan_config(...). The bare constructor is still useful when you already have a trainer endpoint URL, but that is a lower-level compatibility path.
FiretitanServiceClient
from_firetitan_config(...)
Create a lazy SDK-managed service. The trainer and deployment are provisioned on the first client call, usually create_training_client(...):
| Field | Type | Default | Description |
|---|---|---|---|
| api_key | str \| None | FIREWORKS_API_KEY | Fireworks API key. |
| base_url | str \| None | https://api.fireworks.ai | Control-plane URL. |
| inference_url | str \| None | None | Optional inference gateway URL. |
| base_model | str | required | Fireworks base model resource name. |
| tokenizer_model | str \| None | None | Hugging Face tokenizer name used by get_tokenizer() and sampler setup. |
| lora_rank | int | 0 | 0 for full-parameter training; positive value for LoRA. |
| training_shape_id | str \| None | None | User-facing training shape ID. The SDK resolves the pinned version. |
| reference_training_shape_id | str \| None | None | Optional separate forward-only reference trainer shape. |
| trainer_job_id | str \| None | None | Reattach to an existing trainer instead of creating one. |
| reference_trainer_job_id | str \| None | None | Reattach to an existing reference trainer. |
| create_deployment | bool | True | Whether to create or reattach an inference deployment. Set False for trainer-only SFT/DPO-style loops. |
| deployment_id | str \| None | None | Create or reattach an inference deployment for sampling and weight sync. |
| deployment_shape | str \| None | Linked shape | Optional deployment shape override. Usually inherited from the training shape. |
| trainer_replica_count | int \| None | None | Data-parallel HSDP replicas for the trainer. |
| replica_count | int | 1 | Inference deployment replicas. |
| cleanup_trainer_on_close | bool | False | Delete the SDK-managed policy trainer when service.close() runs. |
| cleanup_reference_trainer_on_close | bool | True | Delete SDK-managed separate reference trainers when released/closed. |
| cleanup_deployment_on_close | "scale_to_zero" \| "delete" \| None | None | Optional deployment cleanup action on close. |
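To make the defaults in the table concrete, the sketch below mirrors a subset of the fields in a local dataclass. `ServiceConfigSketch` is a hypothetical stand-in written for illustration, not an SDK type, and the model name is a placeholder. How the real fields are passed to from_firetitan_config(...) is not shown in this section, so treat this only as a way to read the defaults.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceConfigSketch:
    """Hypothetical stand-in mirroring part of the table above; not an SDK class."""

    base_model: str                                   # required, no default
    lora_rank: int = 0                                # 0 means full-parameter training
    create_deployment: bool = True                    # set False for trainer-only loops
    deployment_id: Optional[str] = None
    replica_count: int = 1
    cleanup_trainer_on_close: bool = False
    cleanup_reference_trainer_on_close: bool = True
    cleanup_deployment_on_close: Optional[str] = None  # "scale_to_zero", "delete", or None

    def uses_lora(self) -> bool:
        # A positive rank selects LoRA; 0 selects full-parameter training.
        return self.lora_rank > 0


# Trainer-only SFT-style setup: no inference deployment is requested.
sft = ServiceConfigSketch(base_model="<BASE_MODEL_RESOURCE_NAME>", create_deployment=False)

# LoRA setup that keeps the default deployment and scales it to zero on close.
rl = ServiceConfigSketch(
    base_model="<BASE_MODEL_RESOURCE_NAME>",
    lora_rank=16,
    cleanup_deployment_on_close="scale_to_zero",
)
```

Only the fields that differ from the table's defaults need to be set; everything else keeps the documented default.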
Bare constructor
base_url is the trainer endpoint URL from TrainerServiceEndpoint.base_url. Use this only when you intentionally manage trainer lifecycle yourself. New user code should use from_firetitan_config(...).
create_training_client(base_model, lora_rank, user_metadata)
Creates a FiretitanTrainingClient for training operations:
| Parameter | Type | Default | Description |
|---|---|---|---|
| base_model | str | required | Must match the trainer job’s base_model |
| lora_rank | int | 0 | Must match trainer creation config (0 for full-parameter) |
| user_metadata | dict[str, str] \| None | None | Optional run metadata |
Connecting to an existing trainer
If you already have a running trainer (e.g. from a previous session), connect directly by URL:
create_base_training_client(base_model, user_metadata=None)
Creates a base-only client on the same trainer session. Use this as a frozen reference for LoRA KL/reference logprobs without launching a separate forward-only trainer:
Do not call forward_backward, forward_backward_custom, or optim_step on this client; it is for reference forward passes only.
create_reference_client(base_model, lora_rank=0, user_metadata=None)
Create a frozen reference client for KL/DPO baseline logprobs:
Services configured with reference_training_shape_id or an explicit reference_trainer_job_id use a separate forward-only reference trainer owned by the service.
create_sampling_client(model_path=None, ...)
Return a Tinker-shaped sampling client backed by the SDK-managed deployment. When model_path is provided, the SDK first syncs that sampler snapshot to the deployment:
create_deployment_sampler(model_path=None, tokenizer=None, concurrency_controller=None)
Return the FireTitan-native DeploymentSampler directly. Use this when you need tokenized completions, inference logprobs, routing matrices, or adaptive concurrency:
hotload_sampler_snapshot(model_path)
Low-level method for syncing a previously saved sampler snapshot into the SDK-managed deployment without constructing a sampler:
FiretitanTrainingClient
The training client returned by create_training_client(). Core training RPCs like forward(...), forward_backward_custom(...), optim_step(...), save_state(...), and load_state_with_optimizer(...) return futures. Fireworks convenience helpers like save_weights_for_sampler_ext(...), list_checkpoints(), and resolve_checkpoint_path(...) return concrete values directly.
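The split between future-returning RPCs and direct-return helpers can be pictured with the standard library alone. In this sketch `_StubTrainingClient` is a hypothetical stand-in, not the SDK client, and its return payloads are placeholders. It also assumes the SDK's futures expose a blocking `.result()` in the style of `concurrent.futures.Future`, which this section does not confirm.

```python
from concurrent.futures import Future


class _StubTrainingClient:
    """Hypothetical stand-in for illustration only; not FiretitanTrainingClient."""

    def optim_step(self, adam_params):
        # Core-RPC style: hand back a future that resolves once the work is done.
        fut = Future()
        fut.set_result({"step_done": True})  # placeholder payload
        return fut

    def list_checkpoints(self):
        # Convenience-helper style: return the concrete value directly.
        return ["step-100", "step-200"]  # placeholder checkpoint names


client = _StubTrainingClient()

# Future-returning call: block on .result() before relying on the outcome.
step = client.optim_step(adam_params=None).result()

# Direct-return call: the value is usable immediately.
names = client.list_checkpoints()
```

The practical consequence is that a training loop has to resolve each future before it depends on that call's effect, while helper results can be used as ordinary values.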
forward(datums, loss_type)
Forward-only pass (no gradient computation). Useful for computing reference logprobs in GRPO/DPO:
Built-in loss types like "cross_entropy" require datums with target_tokens in loss_fn_inputs. Datums built with datum_from_model_input_weights will fail. Use the target-token tinker.Datum example in Loss Functions for built-in losses, or use forward_backward_custom with the weight-based format in Building datums and the custom-loss pattern in Example: simple cross-entropy.
forward_backward_custom(datums, loss_fn)
Forward + backward with your custom loss function. See Loss Functions for details:
For embedding outputs, set output="embedding" and choose pooling="mean" or "last"; your loss function then receives pooled embedding tensors instead of logprobs:
optim_step(adam_params, grad_accumulation_normalization=None)
Apply optimizer update after accumulating gradients:
Use grad_accumulation_normalization to control how accumulated gradients are normalized. The default None leaves gradients unchanged. Pass GradAccNormalization.NUM_LOSS_TOKENS, GradAccNormalization.NUM_SEQUENCES, or GradAccNormalization.NONE rather than raw strings. See the cookbook skill reference for operational guidance.
save_weights_for_sampler(name, ttl_seconds=None, checkpoint_type=None)
Save serving-compatible sampler weights and return a future. This is the normal Tinker-shaped API:
path is a public snapshot identity, not a raw storage URI.
save_weights_for_sampler_ext(name, checkpoint_type, ttl_seconds)
Fireworks-specific extension that returns a concrete SaveSamplerResult instead of a future:
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | required | Checkpoint name (auto-suffixed with session ID) |
| checkpoint_type | str \| None | None | "base" for full weights, "delta" for incremental |
| ttl_seconds | int \| None | None | Auto-delete checkpoint after this many seconds |
On full-parameter training, only checkpoint_type="base" produces a promotable blob; "delta" cannot be promoted. LoRA is always promotable. See Checkpoint kinds for the full promotability matrix.
save_weights_for_sampler_ext saves the snapshot only; it does not mutate a deployment. To serve the snapshot, pass result.snapshot_name to the managed service weight-sync path, or use create_sampling_client(model_path=...) / create_deployment_sampler(model_path=...), which sync and return a sampler.
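The promotability rule stated above can be written down as a small check. `is_promotable` is a hypothetical helper for illustration, not part of the SDK, and it only covers the two explicit checkpoint types named in the table; how a checkpoint_type of None is resolved is not specified here, so the sketch rejects it.

```python
def is_promotable(lora_rank: int, checkpoint_type: str) -> bool:
    """Encode the rule described above; hypothetical helper, not an SDK function."""
    if checkpoint_type not in ("base", "delta"):
        raise ValueError(f"unknown checkpoint_type: {checkpoint_type!r}")
    if lora_rank > 0:
        # LoRA snapshots are always promotable.
        return True
    # Full-parameter training: only "base" snapshots can be promoted.
    return checkpoint_type == "base"


full_base = is_promotable(lora_rank=0, checkpoint_type="base")
full_delta = is_promotable(lora_rank=0, checkpoint_type="delta")
lora_delta = is_promotable(lora_rank=8, checkpoint_type="delta")
```

A check like this can guard a promotion step so that a full-parameter "delta" snapshot is caught before it is submitted.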
save_state(name, ttl_seconds=None, timeout=None)
Save full train state (weights + optimizer) for resume:
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | required | Checkpoint name |
| ttl_seconds | int \| None | None | Auto-delete checkpoint after this many seconds |
| timeout | float \| None | None | If set, block until the save completes or the timeout expires |
load_state_with_optimizer(name)
Restore full train state (weights + optimizer) from a checkpoint:
load_state(name)
Load model weights from a checkpoint without restoring optimizer state. The optimizer is reset so the next optim_step starts fresh:
load_adapter(adapter_ref)
Load a Hugging Face PEFT adapter model into the current LoRA session. Pass a Fireworks model resource name for a promoted adapter, such as accounts/<ACCOUNT_ID>/models/<ADAPTER_MODEL_ID>. This is a weights-only warm start; it does not restore optimizer state, scheduler state, or data cursor.
list_checkpoints()
List available DCP checkpoints from the trainer. Returns a list[str]:
resolve_checkpoint_path(checkpoint_name, source_job_id)
Resolve a checkpoint path for cross-job resume:
SaveSamplerResult
Returned by save_weights_for_sampler_ext:
| Field | Type | Description |
|---|---|---|
| path | str | Snapshot name from trainer |
| snapshot_name | str | Session-qualified name for weight sync operations |
GradAccNormalization
Enum for the advanced optim_step grad_accumulation_normalization parameter:
| Enum | Wire value | Description |
|---|---|---|
| GradAccNormalization.NUM_LOSS_TOKENS | "num_loss_tokens" | Normalize by total loss tokens across accumulated micro-batches |
| GradAccNormalization.NUM_SEQUENCES | "num_sequences" | Normalize by total sequences across accumulated micro-batches |
| GradAccNormalization.NONE | "none" | Explicit no normalization (raw gradient sum) |
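To illustrate how the three modes differ numerically, the sketch below applies each one to a toy gradient sum. `GradAccNormalizationSketch` is a local stand-in that copies the wire values from the table, not the SDK enum, and `normalize` is a hypothetical helper. It assumes that "normalize by" means dividing the accumulated sum by the named count, which is how the descriptions read; the trainer's actual arithmetic is not shown in this section.

```python
from enum import Enum


class GradAccNormalizationSketch(Enum):
    """Local stand-in mirroring the wire values in the table; not the SDK enum."""

    NUM_LOSS_TOKENS = "num_loss_tokens"
    NUM_SEQUENCES = "num_sequences"
    NONE = "none"


def normalize(grad_sum, mode, micro_batches):
    """Scale a summed gradient (list of floats) by the chosen denominator.

    micro_batches is a list of (num_sequences, num_loss_tokens) pairs.
    """
    if mode is None or mode is GradAccNormalizationSketch.NONE:
        return list(grad_sum)  # raw gradient sum, unchanged
    if mode is GradAccNormalizationSketch.NUM_LOSS_TOKENS:
        denom = sum(tokens for _, tokens in micro_batches)
    else:
        denom = sum(seqs for seqs, _ in micro_batches)
    return [g / denom for g in grad_sum]


# Two accumulated micro-batches: 4 sequences and 40 loss tokens in total.
batches = [(2, 30), (2, 10)]
grad_sum = [8.0, -4.0]

by_tokens = normalize(grad_sum, GradAccNormalizationSketch.NUM_LOSS_TOKENS, batches)
by_seqs = normalize(grad_sum, GradAccNormalizationSketch.NUM_SEQUENCES, batches)
raw = normalize(grad_sum, None, batches)
```

Under this reading, token normalization weights every loss token equally across micro-batches, while sequence normalization weights every sequence equally regardless of its length.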
Related guides
- Training and Sampling — managed service training + sampler refresh walkthrough
- Loss Functions — built-in and custom loss functions
- Saving and Loading — checkpoint details