Runner
Source files: src/runner/types.ts, src/runner/adapters/, src/runner/models.ts, src/runner/execute.ts
The runner invokes LLM CLIs as child processes. Each LLM has a different CLI interface — Claude takes prompts as positional arguments (claude --print "prompt"), while Codex, Gemini, and OpenCode accept them via stdin. The adapter pattern abstracts these differences behind a common RunnerAdapter interface, so the execution loop doesn’t know or care which LLM it’s talking to. It just calls adapter.buildCommand(prompt, model) and gets back a uniform CommandSpec that the process spawner can execute.
Tests run sequentially because LLM CLIs are rate-limited — parallel invocation would trigger throttling or errors. Each test goes through the full cycle (build prompt, spawn process, capture output, parse results) before the next one starts.
CommandSpec
Describes how to invoke an LLM CLI:
```ts
interface CommandSpec {
  command: string;  // CLI binary name (e.g. "claude")
  args: string[];   // Command-line arguments
  stdin?: string;   // Optional stdin input
}
```
RunnerAdapter
Interface every adapter implements:
```ts
interface RunnerAdapter {
  buildCommand(prompt: string, model?: string): CommandSpec;
  parseRawOutput(raw: string): string;
}
```
Adapter pattern
Each supported LLM CLI has a dedicated adapter in src/runner/adapters/:
| Adapter | CLI | Prompt delivery | parseRawOutput behaviour |
|---|---|---|---|
| claude.ts | claude | Positional arg (--print <prompt>) | Extracts .result or .text from JSON wrapper, falls back to raw |
| codex.ts | codex | stdin | Pass-through |
| gemini.ts | gemini | stdin | Pass-through |
| opencode.ts | opencode | stdin | Pass-through |
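To make the two prompt-delivery styles in the table concrete, here is a minimal sketch of one positional-argument adapter and one stdin adapter. The interfaces are repeated from this page so the block stands alone. How the model is passed to each CLI is not described here, so the `--model` flag is an assumption of this sketch, as is the exact JSON handling in `parseRawOutput`.

```typescript
// Shapes repeated from the CommandSpec and RunnerAdapter sections.
interface CommandSpec {
  command: string;
  args: string[];
  stdin?: string;
}

interface RunnerAdapter {
  buildCommand(prompt: string, model?: string): CommandSpec;
  parseRawOutput(raw: string): string;
}

// Claude-style adapter: the prompt is a positional argument after --print.
// ASSUMPTION: the model is selected with a "--model" flag.
const claudeAdapter: RunnerAdapter = {
  buildCommand(prompt, model) {
    const args = model
      ? ["--model", model, "--print", prompt]
      : ["--print", prompt];
    return { command: "claude", args };
  },
  parseRawOutput(raw) {
    try {
      const parsed: unknown = JSON.parse(raw);
      if (typeof parsed === "object" && parsed !== null) {
        const wrapper = parsed as Record<string, unknown>;
        if (typeof wrapper.result === "string") return wrapper.result;
        if (typeof wrapper.text === "string") return wrapper.text;
      }
    } catch {
      // Not JSON: fall through and return the raw output.
    }
    return raw;
  },
};

// Codex-style adapter: the prompt goes to stdin and output is passed through.
// ASSUMPTION: the model is selected with a "--model" flag.
const codexAdapter: RunnerAdapter = {
  buildCommand(prompt, model) {
    const args: string[] = model ? ["--model", model] : [];
    return { command: "codex", args, stdin: prompt };
  },
  parseRawOutput: (raw) => raw,
};
```

Because both objects satisfy `RunnerAdapter`, a caller can build a command without checking which CLI it is targeting.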
Adapter registry
src/runner/adapters/index.ts maps runner names to adapter instances:
```ts
const adapterRegistry: Record<string, RunnerAdapter> = {
  claude: claudeAdapter,
  codex: codexAdapter,
  gemini: geminiAdapter,
  opencode: opencodeAdapter,
};
```
```ts
function resolveAdapter(runner: string): RunnerAdapter
```
Throws a clear error listing the supported runners if the name is unknown.
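One way the lookup and its error could be written is sketched below. The body of `resolveAdapter` and the error wording are assumptions, and `stubAdapter` is a hypothetical placeholder standing in for the real adapters in src/runner/adapters/.

```typescript
interface CommandSpec {
  command: string;
  args: string[];
  stdin?: string;
}

interface RunnerAdapter {
  buildCommand(prompt: string, model?: string): CommandSpec;
  parseRawOutput(raw: string): string;
}

// Hypothetical placeholder so the sketch runs without the real adapter modules.
const stubAdapter = (command: string): RunnerAdapter => ({
  buildCommand: (prompt) => ({ command, args: [], stdin: prompt }),
  parseRawOutput: (raw) => raw,
});

const adapterRegistry: Record<string, RunnerAdapter> = {
  claude: stubAdapter("claude"),
  codex: stubAdapter("codex"),
  gemini: stubAdapter("gemini"),
  opencode: stubAdapter("opencode"),
};

function resolveAdapter(runner: string): RunnerAdapter {
  // Own-property check so names such as "toString" are not resolved
  // through the object prototype.
  const adapter = Object.prototype.hasOwnProperty.call(adapterRegistry, runner)
    ? adapterRegistry[runner]
    : undefined;
  if (!adapter) {
    const supported = Object.keys(adapterRegistry).join(", ");
    throw new Error(`Unknown runner "${runner}". Supported runners: ${supported}`);
  }
  return adapter;
}
```

Deriving the list of supported runners from the registry keys keeps the error message in sync whenever an adapter is added or removed.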
Model matrix
src/runner/models.ts maps each (runner, capability) pair to a concrete model ID:
| Runner | high | balanced | fast |
|---|---|---|---|
| claude | claude-opus-4-6 | claude-sonnet-4-6 | claude-haiku-4-5-20251001 |
| codex | o3 | o4-mini | gpt-4.1-mini |
| gemini | gemini-2.5-pro | gemini-2.5-flash | gemini-2.5-flash-lite |
| opencode | o3 | o4-mini | gpt-4.1-mini |
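The matrix lends itself to a nested record keyed by runner and then by capability. The sketch below uses the model IDs from the table; the names `Runner`, `Capability`, `modelMatrix`, and `resolveModel` are illustrative and may not match src/runner/models.ts.

```typescript
type Runner = "claude" | "codex" | "gemini" | "opencode";
type Capability = "high" | "balanced" | "fast";

// Model IDs copied from the table above.
const modelMatrix: Record<Runner, Record<Capability, string>> = {
  claude: {
    high: "claude-opus-4-6",
    balanced: "claude-sonnet-4-6",
    fast: "claude-haiku-4-5-20251001",
  },
  codex: { high: "o3", balanced: "o4-mini", fast: "gpt-4.1-mini" },
  gemini: {
    high: "gemini-2.5-pro",
    balanced: "gemini-2.5-flash",
    fast: "gemini-2.5-flash-lite",
  },
  opencode: { high: "o3", balanced: "o4-mini", fast: "gpt-4.1-mini" },
};

// Hypothetical helper: look up the concrete model ID for a pair.
function resolveModel(runner: Runner, capability: Capability): string {
  return modelMatrix[runner][capability];
}
```

Typing both keys as unions lets the compiler reject a missing runner row or capability column.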
Execution loop
executeTests() in src/runner/execute.ts is the core orchestrator:
- Resolves the adapter and model from config
- Iterates over tests sequentially (LLM CLIs are rate-limited)
- For each test:
  - Builds the prompt
  - Builds the CLI command via the adapter
  - Spawns the process (src/utils/process.ts)
  - Parses the response
  - Retries up to 3 times on empty responses
  - Fires progress callbacks for terminal output
- Aggregates all results into a RunResult
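The steps above can be sketched as a sequential loop with a bounded retry. This is not the actual `executeTests()` implementation: `runSequentially`, `SpawnFn`, `SketchTest`, and `SketchOutcome` are hypothetical names, the spawner is injected rather than imported from src/utils/process.ts, prompt building is skipped, and "retries up to 3 times" is read as one initial attempt plus at most three retries.

```typescript
interface CommandSpec {
  command: string;
  args: string[];
  stdin?: string;
}

interface RunnerAdapter {
  buildCommand(prompt: string, model?: string): CommandSpec;
  parseRawOutput(raw: string): string;
}

// Hypothetical stand-in for the process spawner: resolves with raw stdout.
type SpawnFn = (spec: CommandSpec) => Promise<string>;

interface SketchTest {
  id: string;
  prompt: string; // assumed to be already built
}

interface SketchOutcome {
  id: string;
  response: string;
  attempts: number;
}

const MAX_RETRIES = 3;

async function runSequentially(
  tests: SketchTest[],
  adapter: RunnerAdapter,
  spawn: SpawnFn,
  model?: string,
  onRetry?: (id: string, attempt: number) => void,
): Promise<SketchOutcome[]> {
  const outcomes: SketchOutcome[] = [];
  // Awaiting inside a for...of loop keeps exactly one CLI process in flight.
  for (const test of tests) {
    const spec = adapter.buildCommand(test.prompt, model);
    let response = "";
    let attempts = 0;
    while (attempts <= MAX_RETRIES) {
      attempts += 1;
      response = adapter.parseRawOutput(await spawn(spec)).trim();
      if (response !== "") break;
      // Only announce a retry if another attempt will actually happen.
      if (attempts <= MAX_RETRIES) onRetry?.(test.id, attempts);
    }
    outcomes.push({ id: test.id, response, attempts });
  }
  return outcomes;
}
```

Injecting the spawner keeps the loop testable without launching a real CLI, at the cost of diverging from however the real module wires in its process helper.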
RunResult
```ts
interface RunResult {
  tests: TestRunResult[];
  summary: {
    total: number;
    passed: number;
    failed: number;
    errored: number;
    invalid: number;
    skipped: number;
  };
  status: "pass" | "fail" | "error";
  timestamp: string;
}
```
Status precedence
The overall run status follows: error > fail > pass. Invalid and skipped results don’t affect the overall status.
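The precedence rule can be expressed as a small reducer over per-test statuses. The per-test status names below are assumptions inferred from the `summary` fields, and `overallStatus` is a hypothetical helper. One consequence of the rule as stated is that a run containing only invalid or skipped tests resolves to pass in this sketch.

```typescript
// ASSUMPTION: per-test status values, inferred from the RunResult summary fields.
type TestStatus = "pass" | "fail" | "error" | "invalid" | "skipped";
type RunStatus = "pass" | "fail" | "error";

function overallStatus(statuses: TestStatus[]): RunStatus {
  // error outranks fail, which outranks pass.
  if (statuses.some((s) => s === "error")) return "error";
  if (statuses.some((s) => s === "fail")) return "fail";
  // "invalid" and "skipped" are ignored, so anything left counts as pass.
  return "pass";
}
```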
Progress callbacks
The execution loop accepts optional callbacks for live terminal feedback:
| Callback | Fires when |
|---|---|
| onTestStart | A test begins execution |
| onTestRetry | An empty response triggers a retry |
| onTestComplete | A test finishes (success or failure) |
| onDebugOutput | Raw stdout/stderr is captured (debug mode only) |
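As an illustration of how a caller might supply these callbacks, the sketch below collects one line per event. The page names the callbacks but not their signatures, so every parameter list here is an assumption, and `ProgressCallbacks` and `createLineLogger` are hypothetical names.

```typescript
// ASSUMPTION: callback signatures are invented for this sketch.
interface ProgressCallbacks {
  onTestStart?: (testId: string) => void;
  onTestRetry?: (testId: string, attempt: number) => void;
  onTestComplete?: (testId: string, status: string) => void;
  onDebugOutput?: (testId: string, raw: string) => void;
}

// Builds callbacks that append one line per event to the given array.
function createLineLogger(lines: string[], debug = false): ProgressCallbacks {
  const callbacks: ProgressCallbacks = {
    onTestStart: (id) => { lines.push(`start ${id}`); },
    onTestRetry: (id, attempt) => { lines.push(`retry ${id} (attempt ${attempt})`); },
    onTestComplete: (id, status) => { lines.push(`done ${id}: ${status}`); },
  };
  // Debug output is only wired up in debug mode.
  if (debug) {
    callbacks.onDebugOutput = (id, raw) => { lines.push(`debug ${id}: ${raw}`); };
  }
  return callbacks;
}

// Usage: optional chaining makes each unset callback a no-op.
const lines: string[] = [];
const callbacks = createLineLogger(lines);
callbacks.onTestStart?.("t1");
callbacks.onTestRetry?.("t1", 1);
callbacks.onTestComplete?.("t1", "pass");
callbacks.onDebugOutput?.("t1", "raw stdout"); // skipped: debug mode is off
```

Keeping every callback optional means the loop can run silently when no terminal feedback is needed.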