Prerequisites
Before launching an RFT job, ensure you have:
- Dataset prepared and uploaded
- Evaluator created
- Fireworks API key configured, stored in a `.env` file in your project directory
- Base model selected:
  - `accounts/fireworks/models/llama-v3p1-8b-instruct` - Good balance of quality and speed
  - `accounts/fireworks/models/qwen3-0p6b` - Fast training for experimentation
  - `accounts/fireworks/models/llama-v3p1-70b-instruct` - Best quality, slower training
Option A: CLI with Eval Protocol (Recommended)
The Eval Protocol CLI provides the fastest, most reproducible way to launch RFT jobs.
Quick start
From your evaluator directory, run the create-RFT command (see the sketch after this list). It:
- Uploads your evaluator code (if not already uploaded)
- Uploads your dataset (if `dataset.jsonl` exists)
- Creates and launches the RFT job
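A minimal sketch of that command, assuming the package installs an `ep` entry point and exposes the `create rft` subcommand referenced later in this guide; run `ep --help` to confirm the exact name and flags:

```bash
cd my-evaluator/   # directory containing your evaluator code and dataset.jsonl
ep create rft      # uploads evaluator + dataset, then creates and launches the RFT job
```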
Step-by-step walkthrough
Install Eval Protocol CLI
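Assuming the CLI ships with the `eval-protocol` Python package:

```bash
pip install eval-protocol
```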
Set up authentication
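Add your Fireworks API key to a `.env` file in your project directory. A minimal sketch; `FIREWORKS_API_KEY` is the variable name Fireworks tooling reads, and the value shown is a placeholder:

```bash
# .env
FIREWORKS_API_KEY=fw_your_api_key_here
```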
Test your evaluator locally
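Eval Protocol evaluators are typically exercised as pytest tests, so a quick local check might look like this; the test filename is hypothetical:

```bash
# Run the evaluator's test file against a few sample rows before launching a job
pytest test_my_evaluator.py -v
```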
Create the RFT job
From your evaluator directory, run the create-RFT command (sketched in the quick start above). It will:
- Upload evaluator code (if changed)
- Upload dataset (if changed)
- Create the RFT job
- Display dashboard links for monitoring
Monitor training
Common CLI options
Customize your RFT job with these flags:
Model and output:
Examples
Fast experimentation (small model, 1 epoch):
Option B: Web UI
The Fireworks dashboard provides a visual interface for creating RFT jobs with guided parameter selection.
Navigate to Fine-Tuning
- Go to Fireworks Dashboard
- Click Fine-Tuning in the left sidebar
- Click Fine-tune a Model

Select Reinforcement Fine-Tuning
- Choose Reinforcement as the tuning method
- Select your base model from the dropdown
We recommend `llama-v3p1-8b-instruct` for a good balance of quality and speed.
Configure Dataset
- Upload new dataset or select existing from your account
- Preview dataset entries to verify format
- The UI validates your JSONL format automatically

Each row must contain a `messages` array (see the example under Common errors and fixes below).
Select Evaluator
- Choose from your uploaded evaluators
- Preview evaluator code and test results
- View recent evaluation metrics
Set Training Parameters
- Output model name: Custom name for your fine-tuned model
- Epochs: Number of passes through the dataset (start with 1)
- Learning rate: How fast the model updates (use default 1e-4)
- LoRA rank: Model capacity (8-16 for most tasks)
- Batch size: Training throughput (use default 32k tokens)
Configure Rollout Parameters
- Temperature: Sampling randomness (0.7 for balanced exploration)
- Top-p: Probability mass cutoff (0.9-1.0)
- Top-k: Token candidate limit (40 is standard)
- Number of rollouts (n): Responses per prompt (4-8 recommended)
- Max tokens: Maximum response length (2048 default)
Review and Launch
- Review all settings in the summary panel
- See estimated training time and cost
- Click Start Fine-Tuning to launch
UI vs CLI comparison
| Feature | CLI (eval-protocol) | Web UI |
|---|---|---|
| Speed | Fast - single command | Slower - multiple steps |
| Automation | Easy to script and reproduce | Manual process |
| Parameter discovery | Need to know flag names | Guided with tooltips |
| Batch operations | Easy to launch multiple jobs | One at a time |
| Reproducibility | Excellent - save commands | Manual tracking needed |
| Best for | Experienced users, automation | First-time users, exploration |
Using firectl CLI (Alternative)
For users already familiar with Fireworks firectl, you can create RFT jobs directly:
Key differences from `eval-protocol`:
- Requires fully qualified resource names (accounts/…)
- Must manually upload evaluators and datasets first (see the dataset upload sketch after this list)
- More verbose but offers finer control
- Same underlying API as `eval-protocol`
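For example, a dataset can be uploaded ahead of time like this; the exact `firectl` syntax may vary by version, and `my-rft-dataset` is a placeholder name:

```bash
# Upload the JSONL dataset so the RFT job can reference it by ID
firectl create dataset my-rft-dataset dataset.jsonl
```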
Job validation
Before starting training, Fireworks validates your configuration:
Dataset format validation
- ✅ Valid JSONL format
- ✅ Each line has a `messages` array
- ✅ Messages have `role` and `content` fields
- ✅ File size within limits
- ❌ Missing fields → error with specific line numbers
- ❌ Invalid JSON → syntax error details
Evaluator validation
- ✅ Evaluator code syntax is valid
- ✅ Required dependencies are available
- ✅ Entry point function exists
- ✅ Test runs completed successfully
- ❌ Import errors → missing dependencies
- ❌ Syntax errors → code issues
Resource availability
- ✅ Sufficient GPU quota
- ✅ Base model supports fine-tuning
- ✅ Account has RFT permissions
- ❌ Insufficient quota → request increase
- ❌ Invalid model → choose different base model
Parameter validation
- ✅ Parameters within valid ranges
- ✅ Compatible parameter combinations
- ❌ Invalid ranges → error with allowed values
- ❌ Conflicting options → resolution guidance
Common errors and fixes
Invalid dataset format
Dataset validation failed: invalid JSON on line 42
Fix:
- Open your JSONL file
- Check line 42 for JSON syntax errors
- Common issues: missing quotes, trailing commas, unescaped characters
- Validate JSON at jsonlint.com
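To find the offending line locally before re-uploading, one quick check uses `jq`, which reports the line number of the first parse error:

```bash
# Exits non-zero and prints the failing line number if any row is malformed JSON
jq -c . dataset.jsonl > /dev/null && echo "dataset.jsonl is valid JSONL"
```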
Missing required field 'messages'
Fix: Each dataset row must have a `messages` array.
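A minimal sketch of one valid row (the field values are illustrative):

```json
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is 2 + 2?"}]}
```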
Evaluator not found
Evaluator 'my-evaluator' not found in account
Fix:
- Upload your evaluator first:
- Or specify evaluator ID if using UI:
- Check Evaluators dashboard
- Copy exact evaluator ID
Insufficient quota
Insufficient GPU quota for this job
Fix:
- Check your current quota at Account Settings
- Request a quota increase through the dashboard
- Or choose a smaller base model to reduce GPU requirements
Parameter out of range
Learning rate 1e-2 outside valid range [1e-5, 5e-4]
Fix: Adjust the parameter to be within the allowed range.
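For example, when launching from the CLI, pass a learning rate inside the allowed range; the flag name below is hypothetical, so check your CLI's help output for the real one:

```bash
ep create rft --learning-rate 1e-4   # hypothetical flag name; value must be within [1e-5, 5e-4]
```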
Evaluator build timeout
Evaluator build timed out after 10 minutes
Fix:
- Check build logs in Evaluators dashboard
- Common issues:
  - Large dependencies taking too long to install
  - Network issues downloading packages
  - Syntax errors in requirements.txt
- Wait for build to complete, then run `create rft` again
- Consider splitting large dependencies or using lighter alternatives
What happens after launching
Once your job is created, here's what happens:
Job queued
Status: PENDING
Dataset validation
Status: VALIDATING
Training starts
- Rollout generation and evaluation
- Reward curves updating in real-time
- Training loss decreasing
Status: RUNNING
Monitor progress
Status: RUNNING → COMPLETED
Job completes
Status: COMPLETED
Next: Deploy your model for inference.
Advanced configuration
Weights & Biases integration
Set `WANDB_API_KEY` in your environment first, for example:
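A minimal sketch; the key value is a placeholder:

```bash
# Make your Weights & Biases API key available to the environment that launches the job
export WANDB_API_KEY=your_wandb_api_key
```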
Custom checkpoint frequency
Available via `firectl` only.
Multi-GPU acceleration
Custom timeout