Parser

Source file: src/parser/result.ts

The parser solves a fundamental problem with LLM output — it’s unpredictable. Even when the prompt explicitly says “return raw JSON, no markdown fencing, no extra text,” models routinely wrap responses in ```json code fences, add conversational preamble like “Here are the results:”, or return a single object instead of the expected array. The parser uses a fallback pipeline: try the most likely format first (raw JSON), and if it doesn’t work, try progressively looser extraction strategies (fenced code blocks, bracket matching). This means the tool handles messy LLM output gracefully instead of failing on format deviations that are, in practice, completely normal.

type TestStatus = "pass" | "fail" | "error" | "invalid" | "skip";

interface TestResult {
  status: TestStatus;
  id?: string;
  expectation?: string; // fail only
  observed?: string;    // fail only
  location?: string;    // fail only
  resolution?: string;  // fail only
  error?: string;       // error only
}

LLM output is unpredictable. The parser uses a fallback pipeline that tries multiple extraction strategies in order, stopping at the first success.

parseResults(rawOutput) — primary entry point

Returns an array of TestResult. Tries in order:

| Step | Strategy | Example input it handles |
|------|----------|--------------------------|
| 1 | Raw JSON array | [{"id":"a","status":"pass"}] |
| 2 | Raw JSON object (wrap in array) | {"id":"a","status":"pass"} |
| 3 | Fenced JSON (array or object) | ```json\n[...]\n``` |
| 4 | Bracket-matched JSON array | Here are results: [...] Hope that helps! |
| 5 | Bracket-matched JSON object | Here is the result: {...} |
| 6 | Error fallback | "Could not parse LLM response as JSON" |
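
The order of strategies in the table can be sketched roughly as follows. This is a minimal illustration, not the code in src/parser/result.ts: the helper names (tryParse, stripFence, sliceBetween, extractJson) are hypothetical, rejecting bare JSON primitives is an assumption, and per-object normalization is left out.

```typescript
// Hypothetical sketch of the fallback pipeline; all names are illustrative.

// Build the fence marker from single backticks so this snippet stays fence-safe.
const FENCE = "`" + "`" + "`";
const FENCE_PATTERN = new RegExp(
  FENCE + "(?:json)?\\s*\\n([\\s\\S]*?)\\n\\s*" + FENCE
);

// Parse text as JSON, accepting only arrays and objects (assumption:
// bare strings or numbers are not useful results).
function tryParse(text: string): object | undefined {
  try {
    const value: unknown = JSON.parse(text);
    return typeof value === "object" && value !== null ? value : undefined;
  } catch {
    return undefined;
  }
}

// Return the body of the first fenced block, if there is one.
function stripFence(text: string): string | undefined {
  const match = FENCE_PATTERN.exec(text);
  return match?.[1];
}

// Slice from the first `open` to the last `close`, inclusive.
function sliceBetween(text: string, open: string, close: string): string | undefined {
  const start = text.indexOf(open);
  const end = text.lastIndexOf(close);
  return start !== -1 && end > start ? text.slice(start, end + 1) : undefined;
}

// Try each strategy in table order and stop at the first one that parses.
function extractJson(rawOutput: string): unknown[] | undefined {
  const trimmed = rawOutput.trim();
  const candidates: Array<string | undefined> = [
    trimmed,                         // steps 1 and 2: raw array or object
    stripFence(trimmed),             // step 3: fenced JSON
    sliceBetween(trimmed, "[", "]"), // step 4: bracket-matched array
    sliceBetween(trimmed, "{", "}"), // step 5: bracket-matched object
  ];
  for (const candidate of candidates) {
    if (candidate === undefined) continue;
    const parsed = tryParse(candidate);
    if (parsed !== undefined) return Array.isArray(parsed) ? parsed : [parsed];
  }
  return undefined; // step 6: the caller builds the error result
}
```

In this sketch, an undefined return corresponds to step 6, where the caller would produce an error result carrying the message shown in the table.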

The single-result variant runs the same pipeline but returns a single TestResult instead of an array.

Once raw JSON is extracted, each object is normalized:

  1. Status validation: must be one of pass, fail, error, invalid, skip. Unknown statuses become an error.
  2. ID extraction: the id field is preserved if present.
  3. Fail-specific fields: expectation, observed, location, resolution are only kept for fail status.
  4. Type coercion: all preserved fields must be strings; non-string values are silently dropped.
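
The four normalization rules above could look something like the sketch below. The function name normalize, the wording of the error message, and keeping the error field only for error status (inferred from the "error only" comment on the interface) are assumptions rather than details confirmed by the source.

```typescript
type TestStatus = "pass" | "fail" | "error" | "invalid" | "skip";

interface TestResult {
  status: TestStatus;
  id?: string;
  expectation?: string;
  observed?: string;
  location?: string;
  resolution?: string;
  error?: string;
}

const STATUSES: readonly string[] = ["pass", "fail", "error", "invalid", "skip"];
const FAIL_FIELDS = ["expectation", "observed", "location", "resolution"] as const;

// Rule 1: only the five known statuses are accepted.
function isStatus(value: unknown): value is TestStatus {
  return typeof value === "string" && STATUSES.indexOf(value) !== -1;
}

// Hypothetical normalizer for one extracted object.
function normalize(raw: unknown): TestResult {
  const obj = (typeof raw === "object" && raw !== null ? raw : {}) as Record<string, unknown>;

  const status = obj["status"];
  if (!isStatus(status)) {
    // Message wording is an assumption.
    return { status: "error", error: "Unknown status: " + String(status) };
  }

  const result: TestResult = { status };

  // Rules 2 and 4: keep id only when it is a string.
  const id = obj["id"];
  if (typeof id === "string") result.id = id;

  // Rules 3 and 4: fail-specific fields are kept only for fail, and only as strings.
  if (status === "fail") {
    for (const key of FAIL_FIELDS) {
      const value = obj[key];
      if (typeof value === "string") result[key] = value;
    }
  }

  // Assumption: the error message is kept only for error status.
  const error = obj["error"];
  if (status === "error" && typeof error === "string") result.error = error;

  return result;
}
```

With this sketch, normalize({ status: "pass", observed: "x" }) drops observed because the status is not fail, and normalize({ status: "fail", location: 3 }) drops location because it is not a string.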
  • Why multiple strategies? LLMs frequently ignore instructions to return raw JSON. They wrap output in markdown fences, add explanatory text, or return a single object instead of an array. The fallback pipeline handles all these cases without requiring the user to retry.
  • Why not just regex? Bracket matching (indexOf("{") to lastIndexOf("}")) is more reliable than regex for extracting JSON from mixed content, since JSON can contain nested braces.
  • Empty response handling is done at the execution layer (retry logic), not in the parser. The parser returns an error result for empty input, and the executor decides whether to retry.
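
To make the regex point concrete, the sketch below compares a naive non-greedy regex with the indexOf/lastIndexOf slice on a reply that contains a nested object. The function names and the sample reply are made up for illustration, and the regex shown is just one simple pattern, not a claim about what the parser ever used.

```typescript
// Naive non-greedy regex: stops at the first closing brace it sees.
function extractWithRegex(text: string): string | undefined {
  const match = /\{[\s\S]*?\}/.exec(text);
  return match ? match[0] : undefined;
}

// Bracket matching: slice from the first "{" to the last "}".
function extractWithBrackets(text: string): string | undefined {
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  return start !== -1 && end > start ? text.slice(start, end + 1) : undefined;
}

// True when the text is valid JSON.
function parses(text: string | undefined): boolean {
  if (text === undefined) return false;
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// Illustrative reply with a nested object and surrounding chatter.
const reply =
  'Here is the result: {"id":"a","status":"fail","meta":{"attempt":1}} Done.';

console.log(parses(extractWithRegex(reply)));    // false: cut off at the inner "}"
console.log(parses(extractWithBrackets(reply))); // true: outer braces captured
```

The slice is still a heuristic: if the trailing text itself contains a closing brace, the slice will include it and parsing will fail, at which point the pipeline falls through to the error result.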