Quickstart · 5-minute path · Qwen3.5-4B

Run Kiln, open the UI, and send one chat request.

Start with the Desktop App if you want the shortest path. Use a server binary or Docker when you want a terminal-first setup. Every path serves the same target model: Qwen/Qwen3.5-4B.

Reader map · Know where to stop

Basic path · Get Kiln running first
Follow the prerequisites, install, model download, server start, /health, and /ui steps in order. If chat works, the first-run path is complete.

Optional learning · Train after verification
Use SFT, GRPO, and adapter workflows only after the server is healthy and you have sent one chat request.

Advanced reference · Return when integrating
Use the API Reference and CLI Reference for tools, batch generation, adapter import/export, merge, composition, and webhooks.

Step 1 · Prerequisites + choose one path · Install Kiln

Desktop App · recommended

Download Kiln Desktop v0.2.15, then choose or download the Qwen3.5-4B model in the app and start the server from the GUI.

Platform              Installer                                    Size
macOS Apple Silicon   Kiln.Desktop_0.2.15_aarch64.dmg              8.5 MB
Windows               Kiln.Desktop_0.2.15_x64-setup.exe (NSIS)     4.5 MB
Windows               Kiln.Desktop_0.2.15_x64_en-US.msi (MSI)      6.8 MB
Linux                 Kiln.Desktop_0.2.15_amd64.deb                8.8 MB
Linux                 Kiln.Desktop_0.2.15_amd64.AppImage           85.7 MB

Desktop and server release lines intentionally differ: desktop-v0.2.15 is the latest Desktop app release, and the app downloads and verifies the latest kiln-v* server binary for you.

Linux x86_64 · CUDA 12.4

# Extract the version number from the latest kiln-v* release tag
KILN_VERSION=$(curl -fsSL https://api.github.com/repos/ericflo/kiln/releases/latest | sed -n 's/.*"tag_name": "kiln-v\([^"]*\)".*/\1/p')
curl -L -o kiln-linux-cuda.tar.gz \
  "https://github.com/ericflo/kiln/releases/download/kiln-v${KILN_VERSION}/kiln-${KILN_VERSION}-x86_64-unknown-linux-gnu-cuda124.tar.gz"
tar -xzf kiln-linux-cuda.tar.gz
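
If you are unsure the CUDA 12.4 build matches your machine, nvidia-smi is a quick preflight check before downloading:

# The header line reports the driver version and the maximum CUDA
# version the installed driver supports
nvidia-smi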

Linux x86_64 · Vulkan 1.2

KILN_VERSION=$(curl -fsSL https://api.github.com/repos/ericflo/kiln/releases/latest | sed -n 's/.*"tag_name": "kiln-v\([^"]*\)".*/\1/p')
curl -L -o kiln-linux-vulkan.tar.gz \
  "https://github.com/ericflo/kiln/releases/download/kiln-v${KILN_VERSION}/kiln-${KILN_VERSION}-x86_64-unknown-linux-gnu-vulkan.tar.gz"
tar -xzf kiln-linux-vulkan.tar.gz

Use this on AMD/Intel Linux systems where vulkaninfo --summary lists the GPU.
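
A quick way to run that check and surface only the device names:

# Each Vulkan-visible GPU shows up as a deviceName entry in the summary
vulkaninfo --summary | grep -i deviceName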

macOS Apple Silicon · Metal

KILN_VERSION=$(curl -fsSL https://api.github.com/repos/ericflo/kiln/releases/latest | sed -n 's/.*"tag_name": "kiln-v\([^"]*\)".*/\1/p')
curl -L -o kiln-macos.tar.gz \
  "https://github.com/ericflo/kiln/releases/download/kiln-v${KILN_VERSION}/kiln-${KILN_VERSION}-aarch64-apple-darwin-metal.tar.gz"
tar -xzf kiln-macos.tar.gz

Windows x86_64 · CUDA 12.4

$KilnVersion = ((Invoke-RestMethod https://api.github.com/repos/ericflo/kiln/releases/latest).tag_name -replace '^kiln-v', '')
curl.exe -L -o kiln-windows.zip `
  "https://github.com/ericflo/kiln/releases/download/kiln-v$KilnVersion/kiln-$KilnVersion-x86_64-pc-windows-msvc-cuda124.zip"
Expand-Archive .\kiln-windows.zip -DestinationPath .\kiln

Step 2 · Model path · Download Qwen3.5-4B

Point KILN_MODEL_PATH at a local checkout of Qwen/Qwen3.5-4B.

pip install huggingface-hub
huggingface-cli download Qwen/Qwen3.5-4B --local-dir ./Qwen3.5-4B
export KILN_MODEL_PATH=./Qwen3.5-4B
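
Before moving on, it is worth confirming the checkout is complete; a Hugging Face model directory normally contains config.json, tokenizer files, and one or more *.safetensors weight shards (exact filenames vary by model):

# A complete download should list config.json, tokenizer files, and
# the *.safetensors weight shards
ls ./Qwen3.5-4B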

Step 3 · Start · Run the server

Server binaries bind to 127.0.0.1:8420 by default.

KILN_MODEL_PATH=./Qwen3.5-4B ./kiln serve

Docker (Linux + NVIDIA Container Toolkit)

docker run --gpus all -p 8420:8420 \
  -e KILN_MODEL_PATH=/models/Qwen3.5-4B \
  -v "$PWD/Qwen3.5-4B:/models/Qwen3.5-4B:ro" \
  ghcr.io/ericflo/kiln-server:latest serve
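
Loading a 4B model can take a while after the process starts. A minimal wait loop, assuming the default bind address, that blocks until /health answers:

# Poll /health until the server responds; -f makes curl treat
# non-2xx responses as failures
until curl -fsS http://127.0.0.1:8420/health >/dev/null; do
  sleep 2
done
echo "kiln is up"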

Step 4 · Verify · Open the UI, check health, send chat

Open the dashboard

Visit http://127.0.0.1:8420/ui to inspect status, adapters, and training jobs, and to run quick inference from the dashboard.

Check /health

# CLI check
kiln health
# Or query the endpoint directly
curl -s http://127.0.0.1:8420/health \
  | python3 -m json.tool

Send chat

curl -s http://127.0.0.1:8420/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "What is 2+2?"}],
    "max_tokens": 64,
    "temperature": 0.7
  }' | python3 -m json.tool
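
To pull out just the assistant text instead of pretty-printing the whole response, assuming the reply follows the usual OpenAI-style chat-completions shape (choices[0].message.content):

# Same request, but extract only the assistant message from the JSON
curl -s http://127.0.0.1:8420/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What is 2+2?"}], "max_tokens": 64}' \
  | python3 -c 'import json,sys; print(json.load(sys.stdin)["choices"][0]["message"]["content"])'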

This is the first inference checkpoint: get one response before moving on to SFT or GRPO.

[Screenshot: Kiln dashboard showing server status, adapters, training, and quick inference controls.]
Use the dashboard as your status checkpoint before starting adapter or training workflows.

If /ui, /health, or the chat request fails, use Troubleshooting to check the model path, binary download, CUDA/Vulkan/Metal setup, Docker, and health checks before retrying.

Prefer terminal-first checks? See the CLI Reference for kiln health, training, and adapter commands.