
Updated: Apr 25, 10:23 PM | 6 models | 372 fixtures

AI Model Benchmark Leaderboard

BLXBench

Category-aware model rankings from local BLXBench runs, grouped by task domain, difficulty level, pass rate, and latency.

Run         By  When              Tests  Cost
run_fa781e  B   Apr 25, 10:23 PM      7  $0.02
run_8e2200  B   Apr 25, 10:23 PM    366  $0.12
run_5434c2  B   Apr 25, 10:23 PM      7  $0.00
run_f274e6  B   Apr 25, 10:23 PM     14  $0.00
run_c83c0d  B   Apr 25, 10:23 PM      7  $0.00

Showing 5 of 10 runs.

Top score: Ling 2.6 1t, 81.6
Executed tests: 436 (372 available fixtures)
Est. API spend: $0.14 (sum of per-model costs from overall_ranking)
Top decode: Deepseek V4 Pro, 3067.2 tok/s
Categories: 7 (Coding Ui / Debugging / Hallucination / Reasoning / Refactoring / Security / Speed)
Levels: 3 (easy / medium / hard)
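
The "Est. API spend" figure is described as the sum of per-model costs from overall_ranking. A minimal sketch of that arithmetic follows, using the per-model costs as they are displayed in the Overall table. The dictionary name `per_model_cost` is hypothetical, and the displayed costs are already rounded, so this only approximates whatever BLXBench computes internally.

```python
from decimal import Decimal

# Per-model costs in USD, copied from the Cost column of the Overall table.
# The variable name is illustrative and not part of BLXBench.
per_model_cost = {
    "inclusionai/ling-2.6-1t:free": Decimal("0.00"),
    "minimax/minimax-m2.5:free": Decimal("0.00"),
    "baidu/qianfan-ocr-fast:free": Decimal("0.00"),
    "deepseek/deepseek-v4-flash": Decimal("0.0019"),
    "deepseek/deepseek-v4-pro": Decimal("0.14"),
    "tencent/hy3-preview:free": Decimal("0.00"),
}

# Sum with Decimal to avoid binary floating-point drift on currency values.
total = sum(per_model_cost.values(), Decimal("0"))

# Format to two decimal places, as the dashboard displays it.
print(f"${total:.2f}")
```

Summing the displayed values gives 0.1419, which rounds to the $0.14 shown on the page.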

Benchmark

Overall, all levels

Rank  Model              Model ID                      Pass     Score  Latency  tok/s   Cost
1     Ling 2.6 1t        inclusionai/ling-2.6-1t:free  6/7      81.6   3.31s    106.7   $0.00
2     Minimax M2.5       minimax/minimax-m2.5:free     5/7      73.6   34.81s   767.0   $0.00
3     Qianfan Ocr Fast   baidu/qianfan-ocr-fast:free   4/7      59.3   3.29s    1476.6  $0.00
4     Deepseek V4 Flash  deepseek/deepseek-v4-flash    2/7      28.6   17.75s   563.2   $0.0019
5     Deepseek V4 Pro    deepseek/deepseek-v4-pro      103/394  26.1   7.53s    3067.2  $0.14
6     Hy3 Preview        tencent/hy3-preview:free      1/14     6.3    14.80s   46.5    $0.00
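
The Pass column is a passed/total fraction, while the ranking is ordered by Score. As a rough illustration of how the two relate, the sketch below converts each Pass value from the table into a percentage. The helper `pass_rate` and the `rows` list are hypothetical names, not BLXBench code, and the page does not document how Score is derived.

```python
def pass_rate(pass_field: str) -> float:
    """Convert a 'passed/total' string into a percentage with one decimal."""
    passed, total = (int(part) for part in pass_field.split("/"))
    return round(100 * passed / total, 1)

# (model, Pass, Score) triples copied from the Overall table.
rows = [
    ("Ling 2.6 1t", "6/7", 81.6),
    ("Minimax M2.5", "5/7", 73.6),
    ("Qianfan Ocr Fast", "4/7", 59.3),
    ("Deepseek V4 Flash", "2/7", 28.6),
    ("Deepseek V4 Pro", "103/394", 26.1),
    ("Hy3 Preview", "1/14", 6.3),
]

for model, pass_field, score in rows:
    print(f"{model:<18} pass rate {pass_rate(pass_field):>5.1f}  score {score:>5.1f}")
```

For the two Deepseek rows the computed pass rate equals the listed Score, but for the other four models the two values differ, so Score evidently reflects more than the raw pass fraction.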

Selected model

Ling 2.6 1t

inclusionai/ling-2.6-1t:free

Score: 81.6
Pass rate: 85.7
Tests: 6/7
Avg latency: 3.31s
TTFT: 971 ms
Decode: 106.7 tok/s
Slice cost: $0.00
Runs: 1

Coding Ui: 71.0
Debugging: 0.0
Hallucination: 100.0
Reasoning: 100.0
Refactoring: 100.0
Security: 100.0
Speed: 100.0

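
For this model, the overall Score of 81.6 is consistent with an unweighted mean of the seven category scores. The sketch below checks that arithmetic. It is an observation about this one model's numbers, not a documented BLXBench scoring formula, and the name `category_scores` is hypothetical.

```python
from statistics import mean

# Category scores for Ling 2.6 1t, copied from the selected-model panel.
category_scores = {
    "Coding Ui": 71.0,
    "Debugging": 0.0,
    "Hallucination": 100.0,
    "Reasoning": 100.0,
    "Refactoring": 100.0,
    "Security": 100.0,
    "Speed": 100.0,
}

# Unweighted mean across categories, rounded to one decimal like the page.
overall = round(mean(list(category_scores.values())), 1)
print(overall)
```

The seven values sum to 571, and 571 / 7 is about 81.57, which rounds to the 81.6 shown as Score.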
Run context

Aggregated from 1 run with last observation Apr 25, 10:23 PM.


BLXBench

Community driven leaderboard

Public benchmark runner: run in your environment, share results with the community.

© 2026 BLXBench by bitslix.com

Provenance: Aggregated from user runs
Scope: 6 models / 7 categories / 372 fixtures
Latest: run_fa781e / 7 tests / $0.02