Server status, metrics, UI, and config
- /health: server health and diagnostics.
- /v1/health: /v1 compatibility alias for the same health and diagnostics response.
- /metrics: Prometheus metrics for latency, throughput, memory, and training progress.
- /ui: embedded dashboard for status, adapters, training, and chat.
- /v1/stats/decode: live decode tokens/sec and inter-token latency stats used by the dashboard.
- /v1/stats/recent-requests: bounded recent chat-completion history for the dashboard's request panel.
- /v1/models: list the served model.
- /v1/config: return the current server configuration.

Check runtime configuration
curl -s http://localhost:8420/v1/config | python3 -m json.tool
The response reports detected VRAM, KV-cache sizing and FP8 state, checkpointing settings, and the memory budget.
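
The same check can be scripted when the configuration needs to be read from a program rather than a terminal. The sketch below is a minimal example using only the Python standard library. It assumes the server listens on localhost:8420 (the port from the curl command) and that /v1/config returns JSON, which the json.tool pipe above implies. The helper names endpoint_url and fetch_json are hypothetical and are not part of the server or any client library, and no field names in the response are assumed.

```python
import json
import urllib.request

# Assumption: same host and port as the curl example above.
BASE_URL = "http://localhost:8420"


def endpoint_url(path, base_url=BASE_URL):
    """Join the base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def fetch_json(path, base_url=BASE_URL, timeout=5.0):
    """GET an endpoint and decode the body as JSON (hypothetical helper)."""
    with urllib.request.urlopen(endpoint_url(path, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    try:
        config = fetch_json("/v1/config")
    except (OSError, ValueError) as exc:
        # OSError covers connection failures and timeouts;
        # ValueError covers a body that is not valid JSON.
        print(f"could not read /v1/config: {exc}")
    else:
        # Pretty-print whatever the server returned, without assuming field names.
        print(json.dumps(config, indent=4, sort_keys=True))
```

Printing the whole document instead of picking out individual keys keeps the script valid even if the configuration schema changes between server versions.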