# Commands

Complete reference for all blxbench commands.

This reference assumes the `blxbench` command is available (installed globally as `@bitslix/blxbench`).
## Interactive TUI

For the visual TUI interface, see the TUI Guide.
### Running Benchmarks

```shell
# Start interactive TUI
blxbench
```

### From TUI

The TUI is command-driven. Type `/help` to list available commands and use Tab to complete suggestions.
| Command | Description |
|---|---|
| `/show` | Show current run configuration |
| `/set provider` or `/provider` | Pick a provider interactively |
| `/set models [list \| id,id]` or `/models` | Select the model(s) to benchmark |
| `/set categories [* \| a,b]` | Filter test categories (`*` for all, or a comma-separated list) |
| `/set levels easy,medium,hard` | Filter difficulty levels |
| `/set limit N` | Limit tests per category |
| `/set ratelimit RPM-or-off` | Throttle provider calls |
| `/set fail-fast [on \| off]` | Stop on first failure |
| `/set report html \| json` | Choose the report format |
| `/set output-dir PATH` | Write reports somewhere other than `~/.blxbench/reports` |
| `/report clear` | Clear generated reports in the default report directory |
| `/report submit on \| off` | Toggle automatic upload of the report after a run |
| `/resume` | Open a past run (recent `report.json` under the active results directory) to review or upload again |
| `/auth login`, `/auth logout`, `/auth whoami` | Manage web account credentials |
| `/playwright status/install/uninstall` | Manage Playwright Chromium |
| `/run` | Start the benchmark |
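Putting the commands above together, an illustrative TUI session might look like the following (the model ID is a placeholder):

```
/set provider
/set models openai/gpt-5.4-mini
/set levels easy,medium
/set limit 5
/run
```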
### After a Run (TUI)

When a benchmark finishes, the TUI shows a summary and log until you leave the run screen. While you are still on that screen:

- `q` or `Esc`: Return to the command shell
- `s` or `r`: Manually upload the generated `report.json` to the public leaderboard (independent of `/report submit on|off`). Requires sign-in and a Bencher, Founder, or Admin role for public submit.

If auto-upload was off or failed, use `s` / `r` to try again. A duplicate `run_id` is rejected by the server (HTTP 409); you need a new run to create a new public entry.

The `/resume` command lists recent reports under the same directory as the runner: the default `~/.blxbench/reports/` (or the path from `/set output-dir`). Select one to open a read-only "replay" view; `s` / `r` upload that file, and `q` / `Esc` return to the shell.

See the TUI Guide for the full walkthrough.
## Headless Mode

Run benchmarks without the TUI when stdout is not a TTY, or force it with `--headless`. Pass options directly; there is no `run` subcommand:

```shell
blxbench --headless --provider <alias> --models <model-id> [more-model-ids...]
```

See Headless Mode for CI/CD integration.
### Options

| Flag | Description | Default |
|---|---|---|
| `--provider` | Provider alias | `opr` (OpenRouter) |
| `--models` | Model ID(s) | (required) |
| `--api-key` | Sets `BLXBENCH_API_KEY` for this process | — |
| `--tests-dir` | Path to tests directory | Built-in tests |
| `--category` | Filter categories | All |
| `--level` | Filter difficulty | All |
| `--limit` | Max tests per category | All |
| `--save-json` | Output JSON path | Auto |
| `--fail-fast` | Stop on first failure | `false` |
| `--ratelimit` | Requests per minute | 7 (when the flag is given without a value) |
| `--dotenv-path` | Custom `.env` file | `.env` |
| `--clear` | Clear the default report directory | `false` |
| `--install-chromium` | Install Playwright Chromium | `false` |
| `--skip-render-validation` | Skip UI render stage for `coding_ui` | `false` |
| `--submit` | Upload report after run | `false` |
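As a sketch, several of the flags above can be combined in one headless invocation (the model ID, JSON path, and rate limit below are illustrative, not required values):

```shell
blxbench --headless \
  --provider opr \
  --models openai/gpt-5.4-mini \
  --category speed reasoning \
  --limit 10 \
  --ratelimit 10 \
  --save-json ./reports/run.json \
  --fail-fast
```

With `--fail-fast`, the run stops on the first failure, which keeps CI feedback loops short at the cost of an incomplete report.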
## Utility Commands

### Version

```shell
blxbench --version
blxbench -V
```

Prints `blxbench <semver>`. The TUI footer shows the same version as `v<semver>`.
### Clear Results

```shell
blxbench --headless --clear
```

Removes generated artifacts while preserving ranking files.

By default, reports live in `~/.blxbench/reports/` on Linux/macOS and `%USERPROFILE%\.blxbench\reports\` on Windows.
### Install Chromium

```shell
blxbench --headless --install-chromium
```

Downloads Playwright Chromium for UI rendering tests.

In the TUI, the same setup is available as `/playwright install`. Use `/playwright status` to check whether Chromium is already detected.
## Environment Variables

| Variable | Description |
|---|---|
| `OPENROUTER_API_KEY` | OpenRouter (`opr`) |
| `OPENAI_API_KEY` | OpenAI adapter (`oai`) |
| `HF_TOKEN` | Hugging Face (`hgf`) |
| `TOGETHER_API_KEY` | Together (`tgr`) |
| `PORTKEY_API_KEY` | Portkey (`ptk`) |
| `CLOUDFLARE_API_TOKEN` | Cloudflare (`cfr`) |
| `BLXBENCH_API_KEY` | BLXBench API key for headless submit |
| `BLXBENCH_SUBMIT` | Set to `1` or `true` to upload after a headless run |
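A minimal sketch of a headless run driven entirely by these environment variables, e.g. in a CI step (the key values and model ID are placeholders):

```shell
export OPENROUTER_API_KEY="<your-openrouter-key>"
export BLXBENCH_API_KEY="<your-blxbench-key>"
export BLXBENCH_SUBMIT=1

# With BLXBENCH_SUBMIT set, the report is uploaded after the run completes
blxbench --headless --provider opr --models openai/gpt-5.4-mini
```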
## Examples

Run all tests (OpenRouter):

```shell
blxbench --headless --provider opr --models openai/gpt-5.4-mini
```

Run specific categories:

```shell
blxbench --headless --provider opr --models openai/gpt-5.4-mini --category speed reasoning
```

Limit test count:

```shell
blxbench --headless --provider opr --models openai/gpt-5.4-mini --limit 5
```

Upload results:

```shell
blxbench --headless --provider opr --models openai/gpt-5.4-mini --submit
```