Test fixture
Latency-sensitive tasks where concise correct output matters.
The model receives the prompt (and optional system message). The run uses scorer contains_all with the JSON configuration below. Pass/fail and partial credit are determined entirely by that scorer against the model output; no human grading.
Write an in-depth technical explanation of a least-recently-used (LRU) cache suitable for senior engineer interview prep. Requirements: - Use exactly these headings in order: ## Motivation ## Structure ## Operations ## Complexity - Include step-by-step Python-like pseudocode for get and put - Discuss why combining a hash map with a doubly linked list achieves typical O(1) operations - Write multiple paragraphs per section; be exhaustive, not terse. Write in English.
{
"expected_contains": [
"## Motivation",
"## Complexity",
"LRU",
"O(1)",
"hash"
]
}temperature
0
max_tokens
2048
timeout (s)
300
type
throughput
file
speed_perf_throughput_data_structures.json