Test fixture
Roblox OpenGameEval tasks. Special category: visible in breakdowns, excluded from Overall.
The model receives the prompt (and optional system message). The run uses scorer roblox_open_eval with the JSON configuration below. Pass/fail and partial credit are determined entirely by that scorer against the model output; no human grading.
[Roblox OpenGameEval] I tried to make the cars more slippery so they could drift, but now they feel even more stuck to the road and difficult to turn. It feels like the friction went up instead of down. What's wrong?
{
"input_script": "[embedded in CLI runner]",
"upstream_path": "DebugEvals/004_reduce_car_friction_enable_sliding_bug_2.lua",
"upstream_sha256": "3a12fa12b2d98eb68e84c27ffc4a9f2fd1093c3ece66c4a70755e57e5e0879c5",
"scenario_name": "004_reduce_car_friction_enable_sliding_bug_2",
"place": "racing.rbxl",
"eval_kind": "debug"
}
temperature: 0
max_tokens: 1
timeout (s): 900
type: scored
file: 004_reduce_car_friction_enable_sliding_bug_2.json
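
The configuration above pins the upstream script with an `upstream_sha256` digest. As a minimal sketch, assuming you have a local copy of the upstream Lua file as bytes, the pin could be checked like this. The helper name `matches_pinned_hash` is hypothetical, and whether the `roblox_open_eval` scorer performs any such check is not stated in this fixture.

```python
import hashlib
import json

# Configuration copied verbatim from the fixture.
CONFIG_JSON = """
{
  "input_script": "[embedded in CLI runner]",
  "upstream_path": "DebugEvals/004_reduce_car_friction_enable_sliding_bug_2.lua",
  "upstream_sha256": "3a12fa12b2d98eb68e84c27ffc4a9f2fd1093c3ece66c4a70755e57e5e0879c5",
  "scenario_name": "004_reduce_car_friction_enable_sliding_bug_2",
  "place": "racing.rbxl",
  "eval_kind": "debug"
}
"""


def matches_pinned_hash(script_bytes: bytes, config: dict) -> bool:
    """Hypothetical helper: compare the SHA-256 of the script bytes
    with the digest pinned in config["upstream_sha256"]."""
    digest = hashlib.sha256(script_bytes).hexdigest()
    return digest == config["upstream_sha256"].lower()


config = json.loads(CONFIG_JSON)
```

To use it, read the upstream file in binary mode and pass the bytes in, for example `matches_pinned_hash(open(path, "rb").read(), config)`, where `path` points at your copy of the file named in `upstream_path`.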