semtest-runner (Internal)

Architecture and contributor documentation

semtest-runner is a semantic testing tool that evaluates codebases against Markdown specifications using LLMs. Instead of writing unit tests in code, you write expectations in plain-language Markdown. The tool sends those expectations to an LLM CLI (such as Claude, Codex, or Gemini) along with access to your codebase, and the LLM evaluates whether the code meets each specification. Results come back as structured JSON, which is turned into reports and CI signals.

Traditional tests verify code paths — function inputs and outputs, branching logic, integration behaviour. Semantic tests verify intent. They describe what the codebase should look like, what patterns should be followed, what conventions should hold — things that are hard to express as unit tests. A semantic test might say “all API routes should use the auth middleware” or “the config schema should validate runner names.” The LLM reads the actual code and judges compliance.
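For illustration, a semantic test file might look like the sketch below. The heading-as-test-ID convention and the wording are assumptions for this example, not the tool's actual file format:

```markdown
# auth-middleware-coverage

All API routes should use the auth middleware. Routes under
`/public` are exempt. Flag any route handler that reads auth
headers directly instead of going through the middleware.
```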

When you run `semtest run`, data flows through a linear pipeline:

Config → Discover → Prompt → Execute → Parse → Validate → Report
  1. Config — Load and validate semtest.config.ts
  2. Discover — Find .md test files in the configured directory
  3. Prompt — Wrap each test file’s content in structured LLM instructions
  4. Execute — Invoke the LLM CLI as a child process, one test at a time
  5. Parse — Extract structured JSON from the LLM’s response
  6. Validate — Check for suite-level issues (duplicate IDs, missing IDs)
  7. Report — Generate Markdown and JSON reports, print terminal summary
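The Parse step is worth a closer look, since LLM output rarely arrives as bare JSON — models tend to wrap the verdict in prose or a fenced code block. A minimal sketch of that extraction is below; the `TestResult` field names and the `extractJson` helper are illustrative assumptions, not the tool's real schema or API:

```typescript
// Assumed shape of one test's verdict; the real schema may differ.
interface TestResult {
  id: string;
  pass: boolean;
  reason: string;
}

// Pull a JSON object out of an LLM response: prefer an explicit
// ```json fence, otherwise fall back to the outermost braces.
function extractJson(response: string): TestResult {
  const fenced = response.match(/```json\s*([\s\S]*?)```/);
  const candidate = fenced
    ? fenced[1]
    : response.slice(response.indexOf("{"), response.lastIndexOf("}") + 1);
  return JSON.parse(candidate) as TestResult;
}

const raw =
  'Here is my verdict:\n```json\n' +
  '{"id": "auth-middleware-coverage", "pass": true, "reason": "All routes wrapped."}\n' +
  "```";
console.log(extractJson(raw).pass); // true
```

Centralizing this fallback logic in one place keeps the rest of the pipeline free to assume well-formed `TestResult` objects, so Validate and Report only ever see parsed data.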