About
What BLXBench is and why it exists.
Mission
BLXBench provides independent, reproducible benchmarks for AI models. We believe in:
- Transparency — All tests are open source
- Reproducibility — Anyone can run the same benchmarks
- Fairness — No provider pays for placement
What We Do
BLXBench evaluates AI models across focused fixture categories:
- Speed — How fast does the model respond?
- Security — Does it refuse harmful requests appropriately?
- Reasoning — Can it handle complex logical tasks?
- Debugging — Can it identify and fix bugs?
- Refactoring — Can it improve code while preserving behavior?
- Hallucination — Does it stay grounded under tricky prompts?
- Coding UI — Can it generate and validate UI artifacts?
How It Works
Running Benchmarks
Anyone can run benchmarks using blxbench (install from npm as @bitslix/blxbench — see Installation):
blxbench --headless --provider opr --models openai/gpt-5.4-mini
Results can be submitted to appear on the public leaderboard.
Scoring
Models are scored on:
- Earned score vs max score per test, rolled up to categories and overall
- Rankings use that aggregate; categories with more tests contribute proportionally more to the headline percentage
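The rollup above can be sketched as follows. This is a minimal illustration of the stated rule (overall = total earned / total max, so categories with more tests weigh more); the `TestResult` shape and function names are assumptions for the example, not BLXBench's actual schema.

```typescript
// Illustrative result shape; field names are hypothetical.
interface TestResult { category: string; earned: number; max: number; }

// Overall percentage = total earned / total max. Because totals are summed
// across all tests (not averaged per category), categories with more tests
// contribute proportionally more to the headline number.
function overallScore(results: TestResult[]): number {
  const earned = results.reduce((sum, r) => sum + r.earned, 0);
  const max = results.reduce((sum, r) => sum + r.max, 0);
  return max === 0 ? 0 : (earned / max) * 100;
}

// Per-category percentage: same earned/max ratio, restricted to one category.
function categoryScores(results: TestResult[]): Map<string, number> {
  const totals = new Map<string, { earned: number; max: number }>();
  for (const r of results) {
    const t = totals.get(r.category) ?? { earned: 0, max: 0 };
    t.earned += r.earned;
    t.max += r.max;
    totals.set(r.category, t);
  }
  return new Map(
    [...totals].map(([cat, t]) => [cat, (t.earned / t.max) * 100])
  );
}

const results: TestResult[] = [
  { category: "Speed", earned: 8, max: 10 },
  { category: "Speed", earned: 9, max: 10 },
  { category: "Reasoning", earned: 5, max: 10 },
];
console.log(overallScore(results)); // 22/30 as a percentage, ≈ 73.33
```

Note that with this scheme, Speed (two tests) moves the overall number twice as much as Reasoning (one test), which is the proportional weighting described above.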
No Paid Placements
BLXBench does not accept payment for leaderboard placement. Results are based purely on benchmark performance.
Team
BLXBench is operated by bitslix.com.
Contact
See Support for help.