Runner

Source files: src/runner/types.ts, src/runner/adapters/, src/runner/models.ts, src/runner/execute.ts

The runner invokes LLM CLIs as child processes. Each LLM has a different CLI interface — Claude takes prompts as positional arguments (claude --print "prompt"), while Codex, Gemini, and OpenCode accept them via stdin. The adapter pattern abstracts these differences behind a common RunnerAdapter interface, so the execution loop doesn’t know or care which LLM it’s talking to. It just calls adapter.buildCommand(prompt, model) and gets back a uniform CommandSpec that the process spawner can execute.

Tests run sequentially because LLM CLIs are rate-limited — parallel invocation would trigger throttling or errors. Each test goes through the full cycle (build prompt, spawn process, capture output, parse results) before the next one starts.

CommandSpec describes how to invoke an LLM CLI:

interface CommandSpec {
  command: string;  // CLI binary name (e.g. "claude")
  args: string[];   // Command-line arguments
  stdin?: string;   // Optional stdin input
}

RunnerAdapter is the interface every adapter implements:

interface RunnerAdapter {
  buildCommand(prompt: string, model?: string): CommandSpec;
  parseRawOutput(raw: string): string;
}
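The two prompt-delivery styles described above can be sketched as follows. This is a minimal illustration, not the project's adapter code: positionalAdapter and makeStdinAdapter are hypothetical names, and model flags are left out because the source does not show how each CLI receives its model.

```typescript
interface CommandSpec {
  command: string;
  args: string[];
  stdin?: string;
}

interface RunnerAdapter {
  buildCommand(prompt: string, model?: string): CommandSpec;
  parseRawOutput(raw: string): string;
}

// Hypothetical adapter: the prompt travels as a positional argument,
// mirroring `claude --print "prompt"`. Model handling is omitted here.
const positionalAdapter: RunnerAdapter = {
  buildCommand(prompt) {
    return { command: "claude", args: ["--print", prompt] };
  },
  parseRawOutput: (raw) => raw,
};

// Hypothetical factory for CLIs that read the prompt from stdin.
// Any extra arguments a real CLI needs are an open assumption, so args is empty.
function makeStdinAdapter(command: string): RunnerAdapter {
  return {
    buildCommand(prompt) {
      return { command, args: [], stdin: prompt };
    },
    parseRawOutput: (raw) => raw,
  };
}
```

Either way the caller receives the same CommandSpec shape, so the process spawner only has to check whether stdin is set.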

Each supported LLM CLI has a dedicated adapter in src/runner/adapters/:

| Adapter | CLI | Prompt delivery | parseRawOutput behaviour |
| --- | --- | --- | --- |
| claude.ts | claude | Positional arg (--print <prompt>) | Extracts .result or .text from JSON wrapper, falls back to raw |
| codex.ts | codex | stdin | Pass-through |
| gemini.ts | gemini | stdin | Pass-through |
| opencode.ts | opencode | stdin | Pass-through |
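The Claude adapter's parsing behaviour (extract .result or .text from a JSON wrapper, otherwise return the raw text) could look roughly like this. The function name parseClaudeOutput and the exact precedence of .result over .text are assumptions made for illustration.

```typescript
// Hypothetical sketch of the "JSON wrapper, else raw" fallback.
function parseClaudeOutput(raw: string): string {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const wrapper = parsed as { result?: unknown; text?: unknown };
      // Assumed order: prefer .result, then .text.
      if (typeof wrapper.result === "string") return wrapper.result;
      if (typeof wrapper.text === "string") return wrapper.text;
    }
  } catch {
    // Output was not JSON; fall through and return it unchanged.
  }
  return raw;
}
```

A pass-through adapter, by contrast, would simply return raw unchanged.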

src/runner/adapters/index.ts maps runner names to adapter instances:

const adapterRegistry: Record<string, RunnerAdapter> = {
  claude: claudeAdapter,
  codex: codexAdapter,
  gemini: geminiAdapter,
  opencode: opencodeAdapter,
};

function resolveAdapter(runner: string): RunnerAdapter

resolveAdapter throws an error that lists the supported runners if the name is unknown.
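A minimal sketch of such a lookup, assuming stub adapters in place of the real ones. The error wording and the stubAdapter helper are hypothetical; only the registry shape and the resolveAdapter signature come from the text above.

```typescript
interface CommandSpec {
  command: string;
  args: string[];
  stdin?: string;
}

interface RunnerAdapter {
  buildCommand(prompt: string, model?: string): CommandSpec;
  parseRawOutput(raw: string): string;
}

// Hypothetical stand-in for the real adapters, just enough to populate the registry.
const stubAdapter = (command: string): RunnerAdapter => ({
  buildCommand: (prompt) => ({ command, args: [], stdin: prompt }),
  parseRawOutput: (raw) => raw,
});

const adapterRegistry: Record<string, RunnerAdapter> = {
  claude: stubAdapter("claude"),
  codex: stubAdapter("codex"),
  gemini: stubAdapter("gemini"),
  opencode: stubAdapter("opencode"),
};

function resolveAdapter(runner: string): RunnerAdapter {
  // Own-property check so inherited keys such as "toString" are not treated as runners.
  const known = Object.prototype.hasOwnProperty.call(adapterRegistry, runner);
  const adapter = known ? adapterRegistry[runner] : undefined;
  if (!adapter) {
    const supported = Object.keys(adapterRegistry).join(", ");
    // The message format is an assumption; the source only says it lists supported runners.
    throw new Error(`Unknown runner "${runner}". Supported runners: ${supported}`);
  }
  return adapter;
}
```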

src/runner/models.ts maps each (runner, capability) pair to a concrete model ID:

| Runner | high | balanced | fast |
| --- | --- | --- | --- |
| claude | claude-opus-4-6 | claude-sonnet-4-6 | claude-haiku-4-5-20251001 |
| codex | o3 | o4-mini | gpt-4.1-mini |
| gemini | gemini-2.5-pro | gemini-2.5-flash | gemini-2.5-flash-lite |
| opencode | o3 | o4-mini | gpt-4.1-mini |
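One way to express this table in code is a nested record keyed by runner and capability. The type names, the MODEL_TABLE constant, and the resolveModel function are hypothetical; the model IDs are copied from the table above.

```typescript
type Runner = "claude" | "codex" | "gemini" | "opencode";
type Capability = "high" | "balanced" | "fast";

// Hypothetical representation of the (runner, capability) -> model ID mapping.
const MODEL_TABLE: Record<Runner, Record<Capability, string>> = {
  claude: {
    high: "claude-opus-4-6",
    balanced: "claude-sonnet-4-6",
    fast: "claude-haiku-4-5-20251001",
  },
  codex: { high: "o3", balanced: "o4-mini", fast: "gpt-4.1-mini" },
  gemini: {
    high: "gemini-2.5-pro",
    balanced: "gemini-2.5-flash",
    fast: "gemini-2.5-flash-lite",
  },
  opencode: { high: "o3", balanced: "o4-mini", fast: "gpt-4.1-mini" },
};

function resolveModel(runner: Runner, capability: Capability): string {
  return MODEL_TABLE[runner][capability];
}
```

Typing both keys as unions means the compiler flags a missing runner or capability, which keeps the table and the adapter list in step.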

executeTests() in src/runner/execute.ts is the core orchestrator:

  1. Resolves the adapter and model from config
  2. Iterates over tests sequentially (LLM CLIs are rate-limited)
  3. For each test:
    • Builds the prompt
    • Builds the CLI command via the adapter
    • Spawns the process (src/utils/process.ts)
    • Parses the response
    • Retries up to 3 times on empty responses
    • Fires progress callbacks for terminal output
  4. Aggregates all results into a RunResult

interface RunResult {
  tests: TestRunResult[];
  summary: {
    total: number;
    passed: number;
    failed: number;
    errored: number;
    invalid: number;
    skipped: number;
  };
  status: "pass" | "fail" | "error";
  timestamp: string;
}

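The overall run status follows the precedence error > fail > pass: any errored test makes the run an error, otherwise any failed test makes it a fail, otherwise it passes. Invalid and skipped results don’t affect the overall status.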
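The precedence rule can be sketched as below. TestRunResult is not defined in this section, so the sketch assumes each test carries one of five status strings; TestStatus, summarize, and overallStatus are hypothetical names.

```typescript
// Assumed per-test statuses, inferred from the summary counters.
type TestStatus = "pass" | "fail" | "error" | "invalid" | "skipped";

interface Summary {
  total: number;
  passed: number;
  failed: number;
  errored: number;
  invalid: number;
  skipped: number;
}

function summarize(statuses: TestStatus[]): Summary {
  const summary: Summary = {
    total: statuses.length,
    passed: 0,
    failed: 0,
    errored: 0,
    invalid: 0,
    skipped: 0,
  };
  for (const status of statuses) {
    switch (status) {
      case "pass": summary.passed++; break;
      case "fail": summary.failed++; break;
      case "error": summary.errored++; break;
      case "invalid": summary.invalid++; break;
      case "skipped": summary.skipped++; break;
    }
  }
  return summary;
}

// error > fail > pass; invalid and skipped counts are deliberately ignored.
function overallStatus(summary: Summary): "pass" | "fail" | "error" {
  if (summary.errored > 0) return "error";
  if (summary.failed > 0) return "fail";
  return "pass";
}
```

Under this reading, a run containing only passed, invalid, and skipped tests still reports pass.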

The execution loop accepts optional callbacks for live terminal feedback:

| Callback | Fires when |
| --- | --- |
| onTestStart | A test begins execution |
| onTestRetry | An empty response triggers a retry |
| onTestComplete | A test finishes (success or failure) |
| onDebugOutput | Raw stdout/stderr is captured (debug mode only) |
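A rough sketch of how the retry-on-empty behaviour and the callbacks might fit together for a single test. Several things here are assumptions: the callback signatures (the source gives only their names), the reading of "up to 3 times" as three retries after the first attempt, and the synchronous runOnce function, which stands in for spawning the CLI and is kept synchronous only for brevity.

```typescript
// Hypothetical callback shapes; only the callback names come from the table above.
interface ProgressCallbacks {
  onTestStart?: (name: string) => void;
  onTestRetry?: (name: string, retry: number) => void;
  onTestComplete?: (name: string, output: string) => void;
}

// Assumption: three retries after the initial attempt.
const MAX_RETRIES = 3;

function runWithRetry(
  name: string,
  runOnce: () => string, // stand-in for "spawn the CLI and parse its output"
  callbacks: ProgressCallbacks = {},
): string {
  callbacks.onTestStart?.(name);
  let output = runOnce();
  // Retry only while the response is empty and the retry budget remains.
  for (let retry = 1; retry <= MAX_RETRIES && output.trim() === ""; retry++) {
    callbacks.onTestRetry?.(name, retry);
    output = runOnce();
  }
  callbacks.onTestComplete?.(name, output);
  return output;
}
```

Calling this for each test in a plain for loop, one after another, would give the sequential behaviour described earlier.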