semtest-runner (Internal)

Architecture and contributor documentation

semtest-runner is a semantic testing tool that evaluates codebases against Markdown specifications using LLMs. Instead of writing unit tests in code, you write expectations in plain-language Markdown. The tool sends those expectations to an LLM CLI (such as Claude, Codex, or Gemini) along with access to your codebase, and the LLM evaluates whether the code meets each specification. Results come back as structured JSON, which is turned into reports and CI signals.

Traditional tests verify code paths — function inputs and outputs, branching logic, integration behaviour. Semantic tests verify intent. They describe what the codebase should look like, what patterns should be followed, what conventions should hold — things that are hard to express as unit tests. A semantic test might say “all API routes should use the auth middleware” or “the config schema should validate runner names.” The LLM reads the actual code and judges compliance.
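For illustration, a semantic test file might look like the sketch below. The heading-as-test-ID convention and the wording are assumptions for this example, not the tool's actual file format:

```markdown
# auth-middleware-coverage

All API routes should use the auth middleware. Routes under
`/public` are exempt. Flag any route handler that reads auth
headers directly instead of going through the middleware.
```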

When you run `semtest run`, data flows through a linear pipeline:

Config → Discover → Prompt → Execute → Parse → Validate → Report
  1. Config — Load and validate semtest.config.ts
  2. Discover — Find .md test files in the configured directory
  3. Prompt — Wrap each test file’s content in structured LLM instructions
  4. Execute — Invoke the LLM CLI as a child process, one test at a time
  5. Parse — Extract structured JSON from the LLM’s response
  6. Validate — Check for suite-level issues (duplicate IDs, missing IDs)
  7. Report — Generate Markdown and JSON reports, print terminal summary
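The Parse step is worth a closer look, since LLM output rarely arrives as bare JSON — models tend to wrap the verdict in prose or a fenced code block. A minimal sketch of that extraction is below; the `TestResult` field names and the `extractJson` helper are illustrative assumptions, not the tool's real schema or API:

```typescript
// Assumed shape of one test's verdict; the real schema may differ.
interface TestResult {
  id: string;
  pass: boolean;
  reason: string;
}

// Pull a JSON object out of an LLM response: prefer an explicit
// ```json fence, otherwise fall back to the outermost braces.
function extractJson(response: string): TestResult {
  const fenced = response.match(/```json\s*([\s\S]*?)```/);
  const candidate = fenced
    ? fenced[1]
    : response.slice(response.indexOf("{"), response.lastIndexOf("}") + 1);
  return JSON.parse(candidate) as TestResult;
}

const raw =
  'Here is my verdict:\n```json\n' +
  '{"id": "auth-middleware-coverage", "pass": true, "reason": "All routes wrapped."}\n' +
  "```";
console.log(extractJson(raw).pass); // true
```

Centralizing this fallback logic in one place keeps the rest of the pipeline free to assume well-formed `TestResult` objects, so Validate and Report only ever see parsed data.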