CLI Reference · serve · train · adapters

Run Kiln from the terminal.

Use the CLI when you want a headless server, repeatable scripts, or quick diagnostics against a running Kiln instance. These examples assume a running Kiln server backed by Qwen/Qwen3.5-4B; see Quickstart for setup and model download details. If you want visual status, adapter controls, or training monitoring, open http://127.0.0.1:8420/ui instead.

Command chooser

If you want to...

Start the server

Point Kiln at your model directory, then serve the OpenAI-compatible API.

export KILN_MODEL_PATH=./Qwen3.5-4B
kiln serve

Verify readiness

Use the readable tree for humans or JSON for scripts and CI probes.

kiln health
kiln health --json

Submit training

Send SFT corrections, GRPO scored completions, or check job progress.

kiln train sft --file corrections.jsonl --adapter support-bot
kiln train grpo --file grpo-batch.json --adapter support-bot
kiln train status

Manage adapters

List adapters, load a saved LoRA, or unload the active adapter.

kiln adapters list
kiln adapters load support-bot
kiln adapters unload

Validate config

Check a TOML config before using it with kiln serve --config.

kiln config --file kiln.toml

Server

Start serving Qwen3.5-4B

Point KILN_MODEL_PATH at the local model directory, then start the OpenAI-compatible server. Running kiln with no subcommand starts the server just like kiln serve.

export KILN_MODEL_PATH=./Qwen3.5-4B
kiln serve

The default server listens on 127.0.0.1:8420; open /ui there for the dashboard.

Configuration

Use config files and model IDs

--config (or -c) loads a TOML config file. --served-model-id changes the model name returned by /v1/models and accepted by OpenAI-compatible clients.

kiln config
kiln config --file kiln.toml
kiln serve -c kiln.toml
kiln serve --config kiln.toml
kiln serve --served-model-id qwen3.5-4b-local

Use kiln config to validate built-in defaults plus KILN_* environment overrides, or kiln config --file / kiln config -f to validate a TOML file before starting the server.

Logging

Tune startup output

Kiln uses the configured log level by default. Add global -v / --verbose for debug startup detail, repeat it as -vv for trace-level kernel and scheduler detail, or use -q / --quiet when you only want warnings and errors. Add --help when you want the exact flags from the installed binary; the examples below stay focused on copy-paste startup and health commands.

kiln -v serve
kiln -vv serve
kiln -q health

Put verbosity flags before or after the subcommand; they are global CLI options and are mutually exclusive between verbose and quiet modes.

Health

Check server readiness

kiln health prints a readable tree with model, adapter, scheduler, and training status.

kiln health
kiln health --url http://localhost:8420

Point --url at a remote, Tailscale, or reverse-proxied server when Kiln runs on another machine.

Use --json in scripts or CI probes when you want the raw health payload.

kiln health --json \
  | python3 -m json.tool

Training

Submit SFT and GRPO jobs

Training commands talk to an already-running server. SFT reads JSONL with one chat correction example per line, each with a messages array. GRPO reads one JSON request/batch with groups; each group has prompt messages plus candidate completions containing text and reward scores.

SFT corrections

kiln train sft \
  --file corrections.jsonl \
  --adapter support-bot
kiln train sft --file corrections.jsonl --adapter support-bot --url http://gpu-box:8420

Each JSONL line is one chat correction with a messages array. Add --epochs, --lr, or --lora-rank when you need to override defaults.

GRPO rewards

kiln train grpo \
  --file grpo-batch.json \
  --adapter support-bot

Use /v1/completions/batch or another generator to create prompts and candidate completions first, score them, then submit the scored groups to kiln train grpo. See the GRPO Guide for reward-loop examples.

Queue status

kiln train status
kiln train status --job-id train_123
kiln train status --url http://gpu-box:8420

Use the dashboard when you want visual progress and recent job history. The same --url flag works for SFT, GRPO, and status commands.

Adapters

Manage LoRA adapters

Adapter commands call the running server's adapter API. Use them in scripts; use /ui when you want upload, download, merge, or safer visual confirmation before deleting.

List, load, unload

kiln adapters list
kiln adapters list --url http://gpu-box:8420
kiln adapters load support-bot
kiln adapters unload
kiln adapters unload support-bot

The named unload form is accepted for backwards compatibility; the server unloads the active adapter. Add --url to target a remote server.

Delete

kiln adapters delete support-bot

Delete removes an adapter through the server. Prefer the UI for one-off manual cleanup.