AI Model Benchmark Leaderboard
BLXBench
Category-aware model rankings from local BLXBench runs, broken down by task domain, difficulty level, pass rate, and latency.
Top score
Executed tests
Est. API spend
Top decode
Categories
Levels
Rank | Detail | Model | Pass | Score | Latency | tok/s | Cost