BLXBench Docs

Commands

Complete reference for all blxbench commands.

This reference assumes the blxbench command is available on your PATH, for example after installing the @bitslix/blxbench package globally.

Interactive TUI

For the visual interface, see the TUI Guide.

Running Benchmarks

# Start interactive TUI
blxbench

From TUI

The TUI is command-driven. Type /help to list available commands and use Tab to complete suggestions.

Command                                   Description
/show                                     Show current run configuration
/set provider or /provider                Pick a provider interactively
/set models [list|id,id] or /models       Select model IDs
/set categories [*|a,b]                   Filter categories
/set levels easy,medium,hard              Filter difficulty levels
/set limit N                              Limit tests per category
/set ratelimit RPM-or-off                 Throttle provider calls
/set fail-fast [on|off]                   Stop on first failure
/set report html|json                     Choose the report format
/set output-dir PATH                      Write reports somewhere other than ~/.blxbench/reports
/report clear                             Clear generated reports in the default report directory
/report submit on|off                     Turn automatic upload after a run on or off
/resume                                   Open a past run (recent report.json under the active results directory) to review or upload again
/auth login, /auth logout, /auth whoami   Manage web account credentials
/playwright status/install/uninstall      Manage Playwright Chromium
/run                                      Start the benchmark

After a run (TUI)

When a benchmark finishes, the TUI shows a summary and log until you leave the run screen. While you are still on that screen:

  • q or Esc: return to the command shell
  • s or r: manually upload the generated report.json to the public leaderboard, independent of /report submit on|off. Public submission requires sign-in and a Bencher, Founder, or Admin role.

If auto-upload was off or failed, press s or r to try again. The server rejects a duplicate run_id with HTTP 409, so creating a new public entry requires a new run.

The /resume command lists recent reports in the directory the runner writes to: the default ~/.blxbench/reports/, or the path from /set output-dir. Select one to open a read-only “replay” view; there, s / r upload that file and q / Esc return to the shell.

See TUI for the full walkthrough.

Headless Mode

blxbench runs benchmarks without the TUI when stdout is not a TTY; pass --headless to force this mode. Options go directly on the blxbench command, because there is no run subcommand:

blxbench --headless --provider <alias> --models <model-id> [more-model-ids...]

See Headless Mode for CI/CD integration.
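The "stdout is not a TTY" rule can be checked from a script before calling blxbench. The following is a minimal sketch using the standard `test -t 1` check; `mode_for_stdout` is a hypothetical helper name, and it is an assumption that blxbench's own TTY detection behaves the same way.

```shell
#!/bin/sh
# Sketch: predict which mode a plain `blxbench` call would pick in this context.
# Assumption: blxbench detects a TTY the same way `test -t 1` does.
mode_for_stdout() {
  if [ -t 1 ]; then
    echo "tui"        # stdout is a terminal
  else
    echo "headless"   # stdout is a pipe or a file
  fi
}

mode_for_stdout          # prints "tui" when run directly in a terminal
mode_for_stdout | cat    # stdout is a pipe here, so this prints "headless"
```

In CI logs and pipelines stdout is normally not a terminal, so passing --headless explicitly mainly matters when a job allocates a pseudo-terminal.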

Options

Flag                      Description                             Default
--provider                Provider alias                          opr (OpenRouter)
--models                  Model ID(s)                             (required)
--api-key                 Sets BLXBENCH_API_KEY for this process  (none)
--tests-dir               Path to tests directory                 Built-in tests
--category                Filter categories                       All
--level                   Filter difficulty                       All
--limit                   Max tests per category                  All
--save-json               Output JSON path                        Auto
--fail-fast               Stop on first failure                   false
--ratelimit               Requests per minute                     7 (when flag has no value)
--dotenv-path             Custom .env file                        .env
--clear                   Clear the default report directory      false
--install-chromium        Install Playwright                      false
--skip-render-validation  Skip UI render stage for coding_ui      false
--submit                  Upload report after run                 false
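Several of the flags in the table above can be combined in one invocation. The sketch below assembles such a command step by step and prints it instead of running it. Two things are assumptions rather than documented behaviour: that --level accepts the same names as the TUI (easy, medium, hard), and that --ratelimit takes a plain requests-per-minute number.

```shell
#!/bin/sh
# Sketch: build a filtered headless run from the documented flags.
# Assumptions: `--level easy` and `--ratelimit 30` follow the value shapes
# suggested by the TUI commands; check `blxbench` itself before relying on them.
set -- blxbench --headless --provider opr --models openai/gpt-5.4-mini
set -- "$@" --category speed reasoning   # only these categories
set -- "$@" --level easy                 # only this difficulty
set -- "$@" --limit 5                    # at most 5 tests per category
set -- "$@" --ratelimit 30               # throttle provider calls
set -- "$@" --fail-fast                  # stop on the first failure

# Print the assembled command; use "$@" on its own line to execute it instead.
echo "$@"
```

Building the argument list with `set --` keeps each flag on its own commented line without relying on shell-specific arrays.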

Utility Commands

Version

blxbench --version
blxbench -V

Prints blxbench <semver>. The TUI footer shows the same version as v<semver>.
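A script that needs the bare version can strip the prefix from that output. This is a small sketch; `to_footer_version` is a hypothetical helper name and the version string in the example is made up.

```shell
#!/bin/sh
# Sketch: convert the `blxbench <semver>` line into the `v<semver>` footer form.
to_footer_version() {
  # Remove the leading "blxbench " prefix, then prepend "v".
  printf 'v%s\n' "${1#blxbench }"
}

# Made-up input; in practice pass "$(blxbench --version)".
to_footer_version "blxbench 1.2.3"   # prints v1.2.3
```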

Clear Results

blxbench --headless --clear

Removes generated artifacts while preserving ranking files.

By default, reports live in ~/.blxbench/reports/ on Linux/macOS and %USERPROFILE%\.blxbench\reports\ on Windows.
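To inspect what a clear would affect, the default location can be resolved and listed from a script. This is a minimal sketch for Linux/macOS; `BLX_REPORT_DIR` is a hypothetical override used only in this example (blxbench does not read it), and the layout below the directory is assumed to contain files named report.json.

```shell
#!/bin/sh
# Sketch: resolve the default report directory and list saved reports.
# BLX_REPORT_DIR is a hypothetical variable for this example only.
report_dir() {
  printf '%s\n' "${BLX_REPORT_DIR:-$HOME/.blxbench/reports}"
}

# Print every report.json below the directory; print nothing if it is missing.
list_reports() {
  dir="$(report_dir)"
  [ -d "$dir" ] && find "$dir" -type f -name report.json
  return 0
}

list_reports
```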

Install Chromium

blxbench --headless --install-chromium

Downloads Playwright Chromium for UI rendering tests.

In the TUI, the same setup is available as /playwright install. Use /playwright status to check whether Chromium is already detected.

Environment Variables

Variable              Description
OPENROUTER_API_KEY    OpenRouter (opr)
OPENAI_API_KEY        OpenAI adapter (oai)
HF_TOKEN              Hugging Face (hgf)
TOGETHER_API_KEY      Together (tgr)
PORTKEY_API_KEY       Portkey (ptk)
CLOUDFLARE_API_TOKEN  Cloudflare (cfr)
BLXBENCH_API_KEY      BLXBench API key for headless submit
BLXBENCH_SUBMIT       Set to 1 or true to upload after a headless run
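A CI job can verify that the key for the chosen provider is present before starting a run. The sketch below encodes the alias-to-variable mapping from the table; `key_var_for` and `has_key` are hypothetical helper names, not part of blxbench.

```shell
#!/bin/sh
# Sketch: preflight check for the provider key before a headless run.

# Map a provider alias to the environment variable listed in the table above.
key_var_for() {
  case "$1" in
    opr) echo OPENROUTER_API_KEY ;;
    oai) echo OPENAI_API_KEY ;;
    hgf) echo HF_TOKEN ;;
    tgr) echo TOGETHER_API_KEY ;;
    ptk) echo PORTKEY_API_KEY ;;
    cfr) echo CLOUDFLARE_API_TOKEN ;;
    *)   return 1 ;;   # unknown alias
  esac
}

# Succeed only if the variable for the given alias is set and non-empty.
has_key() {
  var="$(key_var_for "$1")" || return 1
  eval "val=\${$var:-}"
  [ -n "$val" ]
}

if has_key opr; then
  echo "OPENROUTER_API_KEY found"
else
  echo "set OPENROUTER_API_KEY before running with --provider opr" >&2
fi
```

This inspects the process environment only; a key supplied through a .env file (see --dotenv-path) would not be visible to it.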

Examples

Run all tests (OpenRouter):

blxbench --headless --provider opr --models openai/gpt-5.4-mini

Run specific categories:

blxbench --headless --provider opr --models openai/gpt-5.4-mini --category speed reasoning

Limit test count:

blxbench --headless --provider opr --models openai/gpt-5.4-mini --limit 5

Upload results:

blxbench --headless --provider opr --models openai/gpt-5.4-mini --submit
