Test fixture
Code transformation while preserving behavior and intent.
The model receives the prompt (and optional system message). The run uses scorer rubric_json_metrics with the JSON configuration below. Pass/fail and partial credit are determined entirely by that scorer against the model output; no human grading.
Return JSON only with keys plan, refactor, tests. Search code builds filter objects through repeated mutation branches for status, owner, tags, and date range. Refactor into composable filter builders.
{
"metrics": {
"intent": {
"checks": [
{
"contains": [
"filter builders"
]
},
{
"contains": [
"repeated mutation"
]
},
{
"contains": [
"composable"
]
}
]
},
"visible": {
"checks": [
{
"contains": [
"status"
]
},
{
"contains": [
"owner"
]
},
{
"contains": [
"tags"
]
},
{
"contains": [
"date range"
]
}
]
},
"hidden": {
"checks": [
{
"contains": [
"same query"
]
},
{
"contains": [
"empty filters"
]
},
{
"contains": [
"tests"
]
}
]
}
}
}temperature
0
max_tokens
500
timeout (s)
120
type
scored
file
refactor-search-filter-builder.json