Parser

Source file: src/parser/result.ts

The parser solves a fundamental problem with LLM output: it is unpredictable. Even when the prompt explicitly says “return raw JSON, no markdown fencing, no extra text,” models routinely wrap responses in ```json code fences, add conversational preamble like “Here are the results:”, or return a single object instead of the expected array. The parser uses a fallback pipeline: it tries the most likely format first (raw JSON) and, if that fails, falls back to progressively looser extraction strategies (fenced code blocks, then bracket matching). The tool therefore handles messy LLM output gracefully instead of failing on format deviations that are, in practice, completely normal.

`TestResult` type

type TestStatus = "pass" | "fail" | "error" | "invalid" | "skip";

interface TestResult {
  status: TestStatus;
  id?: string;
  expectation?: string;  // fail only
  observed?: string;     // fail only
  location?: string;     // fail only
  resolution?: string;   // fail only
  error?: string;        // error only
}
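As an illustration, the values below conform to the type above. The type is repeated so the snippet stands alone; the IDs, messages, and file path are invented for the example and do not come from the project.

```typescript
type TestStatus = "pass" | "fail" | "error" | "invalid" | "skip";

interface TestResult {
  status: TestStatus;
  id?: string;
  expectation?: string;
  observed?: string;
  location?: string;
  resolution?: string;
  error?: string;
}

// A passing result carries only its status and, optionally, an id.
const passed: TestResult = { id: "login-redirect", status: "pass" };

// A failing result may carry the four fail-only fields (values are made up).
const failed: TestResult = {
  id: "login-redirect",
  status: "fail",
  expectation: "Unauthenticated users are redirected to /login",
  observed: "Request returned 200 with the dashboard page",
  location: "src/routes/dashboard.ts",
  resolution: "Add the auth guard to the dashboard route",
};

// An error result describes what went wrong in the error field.
const errored: TestResult = {
  status: "error",
  error: "Could not parse LLM response as JSON",
};
```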

Parsing pipeline

The pipeline tries the extraction strategies below in order and stops at the first one that succeeds.

`parseResults(rawOutput)` — primary entry point

Returns an array of TestResult objects. The strategies are tried in this order:

Step	Strategy	Example input it handles
1	Raw JSON array	`[{"id":"a","status":"pass"}]`
2	Raw JSON object (wrap in array)	`{"id":"a","status":"pass"}`
3	Fenced JSON (array or object)	```json\n[...]\n```
4	Bracket-matched JSON array	`Here are results: [...] Hope that helps!`
5	Bracket-matched JSON object	`Here is the result: {...}`
6	Error fallback	Anything else; yields an error result: `"Could not parse LLM response as JSON"`
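The extraction order in the table can be sketched roughly as follows. This is a minimal sketch, not the code in src/parser/result.ts: `extractJsonItems`, `tryParse`, `isObject`, and the fence pattern are hypothetical names and choices made for illustration, and step 6 (building the error result) is left to the caller.

```typescript
// Build the three-backtick fence marker without writing it literally.
const TICK = "`";
const FENCE = TICK + TICK + TICK;

// Returns the parsed value, or undefined if the text is not valid JSON.
function tryParse(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return undefined;
  }
}

// True for plain JSON objects (not arrays, not null).
function isObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Arrays pass through, single objects are wrapped, anything else is rejected.
function asItems(value: unknown): unknown[] | null {
  if (Array.isArray(value)) return value;
  if (isObject(value)) return [value];
  return null;
}

// Hypothetical helper: returns the raw JSON items found in the LLM output,
// or null when every strategy fails (step 6 is then up to the caller).
function extractJsonItems(rawOutput: string): unknown[] | null {
  const text = rawOutput.trim();

  // Steps 1-2: the whole output is a JSON array or a single JSON object.
  const direct = asItems(tryParse(text));
  if (direct) return direct;

  // Step 3: JSON inside a fenced block, with or without a "json" tag.
  const fencePattern = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE);
  const body = text.match(fencePattern)?.[1];
  if (body !== undefined) {
    const fenced = asItems(tryParse(body.trim()));
    if (fenced) return fenced;
  }

  // Steps 4-5: slice from the first opening bracket to the last closing one,
  // trying array brackets before object braces.
  for (const [open, close] of [["[", "]"], ["{", "}"]] as const) {
    const start = text.indexOf(open);
    const end = text.lastIndexOf(close);
    if (start !== -1 && end > start) {
      const sliced = asItems(tryParse(text.slice(start, end + 1)));
      if (sliced) return sliced;
    }
  }

  return null;
}
```

In this sketch a `null` return corresponds to step 6, where the caller would produce the error result.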

`parseResult(rawOutput)` — single result

Same pipeline but returns a single TestResult instead of an array.

`normalizeResult(parsed)`

Once raw JSON is extracted, each object is normalized:

Status validation — must be one of pass, fail, error, invalid, skip. An unknown status produces an error result.
ID extraction — id field is preserved if present.
Fail-specific fields — expectation, observed, location, resolution are only kept for fail status.
Type coercion — all preserved fields must be strings; non-string values are silently dropped.
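The four rules above might look like this in code. It is a sketch under assumptions, not the actual implementation: the function name `normalizeResultSketch`, the error message wording, and the handling of non-object input are all invented here, and keeping `error` only for error status is inferred from the "error only" comment on the type.

```typescript
type TestStatus = "pass" | "fail" | "error" | "invalid" | "skip";

interface TestResult {
  status: TestStatus;
  id?: string;
  expectation?: string;
  observed?: string;
  location?: string;
  resolution?: string;
  error?: string;
}

// Type guard for the five allowed status strings.
function isTestStatus(value: unknown): value is TestStatus {
  return (
    value === "pass" || value === "fail" || value === "error" ||
    value === "invalid" || value === "skip"
  );
}

// Hypothetical normalizer following the rules listed above.
function normalizeResultSketch(parsed: unknown): TestResult {
  // Assumption: anything that is not a plain object becomes an error result.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { status: "error", error: "Result is not a JSON object" };
  }
  const raw = parsed as Record<string, unknown>;
  const status = raw["status"];
  const id = raw["id"];

  // Status validation: unknown statuses become an error result.
  // The message text is made up for this sketch.
  if (!isTestStatus(status)) {
    const invalid: TestResult = {
      status: "error",
      error: `Unknown status: ${String(status)}`,
    };
    if (typeof id === "string") invalid.id = id;
    return invalid;
  }

  const result: TestResult = { status };

  // ID extraction: keep id only if it is a string.
  if (typeof id === "string") result.id = id;

  // Fail-specific fields: kept only for fail status, and only when strings.
  if (status === "fail") {
    for (const key of ["expectation", "observed", "location", "resolution"] as const) {
      const value = raw[key];
      if (typeof value === "string") result[key] = value;
    }
  }

  // Assumption: the error message is kept only for error status.
  const message = raw["error"];
  if (status === "error" && typeof message === "string") result.error = message;

  return result;
}
```

For example, `normalizeResultSketch({ status: "fail", expectation: "e", observed: 42 })` keeps `expectation` and drops the non-string `observed`.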

Design rationale

Why multiple strategies? LLMs frequently ignore instructions to return raw JSON. They wrap output in markdown fences, add explanatory text, or return a single object instead of an array. The fallback pipeline handles all these cases without requiring the user to retry.
Why not just regex? JSON can contain nested braces, which a simple regex cannot balance. Bracket matching (slicing from indexOf("{") to lastIndexOf("}")) extracts the outermost object from mixed content more reliably.
Empty response handling is done at the execution layer (retry logic), not in the parser. The parser returns an error result for empty input, and the executor decides whether to retry.
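To make the regex point concrete, the snippet below compares one naive regex with the bracket slice on output that contains a nested object. The sample string and the non-greedy pattern are assumptions chosen for illustration; a different regex could behave differently.

```typescript
// Made-up LLM output with preamble, trailing text, and a nested object.
const llmOutput =
  'Here is the result: {"id":"a","status":"fail","meta":{"attempt":2}} Hope that helps!';

// Returns true when the text parses as JSON.
function isValidJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// A naive non-greedy regex stops at the first closing brace,
// so the outer object is cut off before its own closing brace.
const regexSlice = llmOutput.match(/\{[\s\S]*?\}/)?.[0] ?? "";

// Bracket matching takes everything from the first "{" to the last "}".
const bracketSlice = llmOutput.slice(
  llmOutput.indexOf("{"),
  llmOutput.lastIndexOf("}") + 1,
);

console.log(isValidJson(regexSlice));   // false
console.log(isValidJson(bracketSlice)); // true
```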
