Test fixture

Debug-Regex-No-Reset-V2

Debuggingv2 — Resilienceeasyscorer: rubric_json_metrics

Bug fixes, edge conditions, and minimal patch accuracy.

How it is scored

The model receives the prompt (and optional system message). The run uses scorer rubric_json_metrics with the JSON configuration below. Pass/fail and partial credit are determined entirely by that scorer against the model output; no human grading.

User prompt

Return JSON only with keys diagnosis, fix, tests. A regex literal compiled with the g flag is stored in a module-level variable and reused across multiple test calls; the lastIndex is not reset between calls so alternating invocations produce wrong match results. Identify the stateful regex bug and fix it.

Scorer config

{
  "metrics": {
    "repro": {
      "checks": [
        {
          "contains": [
            "lastIndex"
          ]
        },
        {
          "contains": [
            "g flag"
          ]
        },
        {
          "contains": [
            "reused"
          ]
        },
        {
          "contains": [
            "alternating"
          ]
        }
      ]
    },
    "hidden": {
      "checks": [
        {
          "contains": [
            "lastIndex = 0"
          ]
        },
        {
          "contains": [
            "reset"
          ]
        },
        {
          "contains": [
            "new RegExp"
          ]
        }
      ]
    },
    "diagnose": {
      "checks": [
        {
          "contains": [
            "stateful regex"
          ]
        },
        {
          "contains": [
            "global flag"
          ]
        },
        {
          "contains": [
            "lastIndex not reset"
          ]
        }
      ]
    }
  }
}

Run parameters

temperature

max_tokens

420

timeout (s)

120

type

scored

file

debug-regex-no-reset-v2.json

← PreviousDebug-Regex-Global-State-V2

Next →Debug-Retry-Thundering-Herd-V2