Configuration
Configure blxbench via files, environment variables, and flags.
The CLI is distributed as the npm package @bitslix/blxbench; configuration below applies to the blxbench command in your shell.
Configuration Files
.env File
BLXBench loads configuration from a .env file in the current directory or the path specified with --dotenv-path:
```
# Example keys (use the env vars required by your chosen adapter)
OPENROUTER_API_KEY=sk-or-...
OPENAI_API_KEY=sk-...
BLXBENCH_API_KEY=your-blxbench-key
```

Provider API keys are used locally to call model APIs. BLXBENCH_API_KEY is a separate web-app key used only when uploading reports with --submit or BLXBENCH_SUBMIT=1.
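As a sketch of how a .env file like the one above maps onto environment values, here is a deliberately minimal parser. This is illustrative only, not the loader blxbench actually uses (a real dotenv implementation also handles quoting, export prefixes, and value expansion); it does follow the common convention that real environment variables are not overwritten by .env values.

```python
import os

def load_dotenv_minimal(text, environ=os.environ):
    """Tiny .env parser: KEY=VALUE lines; blank lines and '#' comments skipped.
    Existing entries in `environ` are NOT overwritten, matching the usual
    precedence where real environment variables beat .env values."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        environ.setdefault(key.strip(), value.strip())

env = {"OPENAI_API_KEY": "sk-already-set"}
load_dotenv_minimal(
    "# comment\nOPENROUTER_API_KEY=sk-or-example\nOPENAI_API_KEY=sk-from-dotenv\n",
    environ=env,
)
# OPENAI_API_KEY keeps its pre-existing value; OPENROUTER_API_KEY is added.
```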
Results Directory
By default, generated reports are saved to ~/.blxbench/reports/ (%USERPROFILE%\.blxbench\reports on Windows). This path is used by both the installed native binary and local development builds, so reports do not depend on the current working directory being writable.
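The per-OS default above can be derived from the user's home directory. The following sketch shows one plausible resolution; the CLI's internal logic may differ in details:

```python
import os
import sys

def default_reports_dir():
    """Default report location: ~/.blxbench/reports on POSIX,
    %USERPROFILE%\\.blxbench\\reports on Windows."""
    if sys.platform == "win32":
        home = os.environ.get("USERPROFILE", os.path.expanduser("~"))
    else:
        home = os.path.expanduser("~")
    return os.path.join(home, ".blxbench", "reports")

print(default_reports_dir())
```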
In headless mode, --save-json writes an additional JSON export to a custom path:
```
blxbench --headless --save-json ./custom-results.json
```

In the TUI, use /set output-dir PATH to change the report directory for the interactive run.
Provider Configuration
The installed CLI includes the official provider adapters in the native bundle. In the source repo they live under packages/benchmark-core/adapters/. Each adapter exposes a provider alias (argument in meta.json):
| Alias | Adapter | Typical env var |
|---|---|---|
| opr | OpenRouter | OPENROUTER_API_KEY |
| oai | OpenAI | OPENAI_API_KEY |
| hgf | Hugging Face | HF_TOKEN |
| tgr | Together | TOGETHER_API_KEY |
| ptk | Portkey | PORTKEY_API_KEY |
| cfr | Cloudflare | CLOUDFLARE_API_TOKEN |
The default --provider is opr. Model ids are passed through unchanged, so use whatever the selected endpoint accepts (e.g. OpenRouter-style vendor/model-name).
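The alias table above can be encoded as a simple lookup to check credentials before a run. This is an illustrative helper, not part of the CLI:

```python
import os

# Mirrors the adapter table: provider alias -> typical API-key env var.
PROVIDER_ENV = {
    "opr": "OPENROUTER_API_KEY",
    "oai": "OPENAI_API_KEY",
    "hgf": "HF_TOKEN",
    "tgr": "TOGETHER_API_KEY",
    "ptk": "PORTKEY_API_KEY",
    "cfr": "CLOUDFLARE_API_TOKEN",
}

def missing_key(alias, environ=os.environ):
    """Return the env var name required by `alias` if it is unset, else None."""
    var = PROVIDER_ENV[alias]
    return None if environ.get(var) else var
```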
Test Filters
Categories
Filter tests by category (folder names under packages/benchmark-core/tests/):
| Category | Role |
|---|---|
| speed | Latency-sensitive correctness |
| security | Safe outputs and vulnerability awareness |
| reasoning | Structured / numeric reasoning |
| debugging | Small patches and bug fixes |
| refactoring | Behavior-preserving edits |
| hallucination | Grounding under tricky prompts |
| coding_ui | HTML artifacts + optional Playwright render |
Difficulty Levels
Filter by difficulty:
| Level | Description |
|---|---|
| easy | Lighter fixtures |
| medium | Representative difficulty |
| hard | Stricter / longer tasks |
Legacy German labels are normalized to these ids.
Advanced Options
Playwright Configuration
For coding_ui (and other HTML render checks), Playwright Chromium should be installed:
```
blxbench --headless --install-chromium
```

In the TUI, use:
```
/playwright install
/playwright status
```

Playwright stores Chromium in its normal per-user browser cache:
| OS | Default cache |
|---|---|
| Linux | ~/.cache/ms-playwright |
| macOS | ~/Library/Caches/ms-playwright |
| Windows | %LOCALAPPDATA%\ms-playwright |
Skip render validation if Chromium is missing:
```
blxbench --headless --skip-render-validation
```

Rate Limiting
| Value | Behavior |
|---|---|
| Unset | No rate limiting |
| --ratelimit | Default RPM |
| --ratelimit 30 | Custom RPM |
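An RPM cap like --ratelimit 30 is equivalent to enforcing a minimum interval between request starts. The sketch below shows that math under the assumption that the limiter spaces requests evenly; the CLI's actual scheduler may behave differently (e.g. bursting within a window):

```python
import time

def min_interval_seconds(rpm):
    """An RPM cap means at most one request every 60/rpm seconds."""
    return 60.0 / rpm

class RpmLimiter:
    """Evenly spaces request starts; call wait() before each request."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.interval = min_interval_seconds(rpm)
        self.clock, self.sleep = clock, sleep
        self.next_at = 0.0  # earliest time the next request may start

    def wait(self):
        now = self.clock()
        if now < self.next_at:
            self.sleep(self.next_at - now)
            now = self.next_at
        self.next_at = now + self.interval
```

With `rpm=30` the minimum spacing works out to 2 seconds per request.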
Fail Fast
Stop on first test failure:
```
blxbench --headless --fail-fast
```

Configuration Priority
Settings are resolved in increasing precedence: built-in defaults, then values from .env, then environment variables, then flags (flags win when both apply).
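That ordering amounts to a layered merge in which later layers win. A minimal sketch (the layer contents here are illustrative, not real blxbench settings):

```python
def resolve_config(defaults, dotenv, environ, flags):
    """Merge layers in increasing precedence: flags beat env vars,
    which beat .env values, which beat built-in defaults."""
    merged = {}
    for layer in (defaults, dotenv, environ, flags):
        merged.update(layer)
    return merged

cfg = resolve_config(
    defaults={"provider": "opr", "ratelimit": None},
    dotenv={"provider": "oai"},   # .env overrides the default provider
    environ={},
    flags={"ratelimit": 30},      # the flag overrides the default rate limit
)
```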
Custom Tests Directory
Use a custom test tree:
```
blxbench --headless --tests-dir ./my-tests --provider opr --models openai/gpt-5.4-mini
```

Your directory should mirror the fixture layout expected by benchmark-core. See Our Tests for how catalog entries map to files.