BLXBench Docs

Quick Start

Run your first benchmark in 5 minutes.

You need the blxbench command on your PATH. Install the published package @bitslix/blxbench first (see Installation).

Step 1: Create Your .env

Create a .env file in the directory where you run blxbench. OpenRouter is the default provider alias (opr):

OPENROUTER_API_KEY=

SUMMARY_PROVIDER=openrouter
SUMMARY_MODEL=qwen/qwen3-235b-a22b-2507

VALIDATION_MODEL=openai/gpt-5.4-mini

Set OPENROUTER_API_KEY to your OpenRouter key. SUMMARY_PROVIDER and SUMMARY_MODEL enable AI-generated run summaries, while VALIDATION_MODEL is used to validate coding_ui fixtures. Without VALIDATION_MODEL, those validation checks are skipped.

Provider keys are different from your BLXBench web key:

  • OPENROUTER_API_KEY, OPENAI_API_KEY, and similar keys authenticate (and pay for) calls to the model provider.
  • BLXBENCH_API_KEY is only for --submit uploads to the leaderboard.
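Before the first run, it can help to confirm the provider key actually made it into your environment. A minimal sketch, assuming a POSIX shell; `check_key` is a hypothetical helper for illustration, not part of blxbench:

```shell
# check_key is a hypothetical helper (not a blxbench feature): it warns
# when the named environment variable is unset or empty.
check_key() {
  eval "_val=\${$1:-}"
  if [ -z "$_val" ]; then
    echo "warning: $1 is empty -- add it to .env" >&2
    return 1
  fi
}

# Example: verify the provider key before launching a run.
check_key OPENROUTER_API_KEY || echo "set OPENROUTER_API_KEY first"
```

Note that blxbench reads the .env itself; this check only guards against a key that was never filled in.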

Step 2: Run a Benchmark (headless)

Outside an interactive terminal, or when you pass --headless, blxbench runs without the TUI. There is no run subcommand — pass flags directly:

blxbench --headless --provider opr --models openai/gpt-5.4-mini

(--provider defaults to opr if omitted.) This runs the suite against the given model ID(s).

Step 3: Filter Tests (Optional)

Run only specific categories or levels:

# Only speed tests
blxbench --headless --provider opr --models openai/gpt-5.4-mini --category speed

# Only easy difficulty
blxbench --headless --provider opr --models openai/gpt-5.4-mini --level easy

Step 4: View Results

After the run completes, results are saved under ~/.blxbench/reports/:

# List saved report files (report.json and index.html)
find ~/.blxbench/reports -name report.json -o -name index.html

On Windows, the same directory is %USERPROFILE%\.blxbench\reports\. Use /set output-dir PATH in the TUI if you want reports somewhere else for a single run.
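To jump straight to the most recent run, a small sketch, assuming the default ~/.blxbench/reports layout on Linux/macOS (adjust the path on Windows as noted above):

```shell
# Sketch: print the most recently modified report.json under the
# default report directory.
reports_dir="$HOME/.blxbench/reports"
latest=$(find "$reports_dir" -name report.json -exec ls -t {} + 2>/dev/null | head -n 1)
echo "latest report: ${latest:-none found}"
```

If no reports exist yet, the snippet prints "none found" rather than erroring.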

TUI: interactive run, manual upload, resume

If you start blxbench in an interactive terminal (no --headless), you get the full TUI. After a run:

  • Use /report submit on|off to control automatic upload when the report is written, or leave it off and press s or r on the result screen to manually upload the same report.json.
  • Use /resume to pick a previous report.json (under your report directory) to review or upload — useful after a failed upload or if you only want to push results later.

Details: TUI and Commands — After a run.

Step 5: Upload to Leaderboard (Optional)

To submit a report after a headless run:

export BLXBENCH_API_KEY=your-key
blxbench --headless --provider opr --models openai/gpt-5.4-mini --submit

Uploads require an account, a BLXBench API key, and a paid pass tier that includes submission quota (see Account and Pass / pricing).

Common Options

Flag          Description
--headless    Force non-TUI mode (optional when stdout is not a TTY)
--provider    Provider alias: opr, oai, hgf, tgr, ptk, cfr (see docs)
--models      One or more model IDs for that provider
--category    Filter by fixture category (e.g. speed, security, coding_ui)
--level       Filter by difficulty (easy, medium, hard)
--limit       Max tests per category
--save-json   Custom output path for JSON results
--fail-fast   Stop on first failure
--submit      POST report.json after the run (needs API key + quota)
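Several of these flags can be combined in one invocation. A sketch of a focused run, guarded so the snippet degrades to a dry-run message on machines where blxbench is not installed (the specific flag values are only examples):

```shell
# Sketch: a focused headless run combining flags from the table above.
cmd="blxbench --headless --provider opr \
  --models openai/gpt-5.4-mini \
  --category security --level hard \
  --limit 5 --fail-fast --save-json ./security-hard.json"

if command -v blxbench >/dev/null 2>&1; then
  eval "$cmd"
else
  echo "blxbench not found; would run: $cmd"
fi
```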

Next Steps

  • blxbench Reference — All available commands
  • Understanding Results — How to read the results
