Technical Overview

semtest-runner is a pipeline-based CLI tool. Data flows forward through discrete stages, and each stage transforms it into the input the next stage needs. The mental model is: load config, find test files, construct prompts, invoke LLMs, parse responses, validate results, generate reports, print a summary. Each stage is a separate module with clear inputs and outputs: the config module produces a SemtestConfig, discovery produces SemanticTest[], the runner produces a RunResult, and so on. This makes the codebase easy to navigate. To understand how prompts work, read src/prompt/builder.ts; to understand how LLM output is parsed, read src/parser/result.ts.
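The stage-to-stage handoff described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the field names on the types, the `Stage` and `pipe` helpers, the stub stage bodies, and the `tests` directory name are hypothetical and not taken from the codebase. The real stages also do I/O and are presumably asynchronous; the sketch is synchronous to keep the typing visible.

```typescript
// Simplified, hypothetical shapes. The real types live in the modules named above.
interface SemtestConfig { testDir: string }
interface SemanticTest { name: string; path: string; content: string }
interface RunResult { status: "pass" | "fail" | "error"; total: number }

// A stage is a function from one stage's output to the next stage's input.
type Stage<In, Out> = (input: In) => Out;

// Stub stages standing in for the config, discovery, and runner modules.
const loadConfig: Stage<string, SemtestConfig> = (cwd) => ({
  testDir: `${cwd}/tests`, // hypothetical default directory
});

const discover: Stage<SemtestConfig, SemanticTest[]> = (config) => [
  { name: "login", path: `${config.testDir}/login.md`, content: "# Login scenarios" },
];

const run: Stage<SemanticTest[], RunResult> = (tests) => ({
  status: tests.length > 0 ? "pass" : "error",
  total: tests.length,
});

// Compose two stages; the compiler rejects stages whose types do not line up.
function pipe<A, B, C>(first: Stage<A, B>, second: Stage<B, C>): Stage<A, C> {
  return (input) => second(first(input));
}

const pipeline: Stage<string, RunResult> = pipe(pipe(loadConfig, discover), run);
const pipelineResult = pipeline("/repo");
console.log(pipelineResult.status, pipelineResult.total);
```

Because each stage only sees the previous stage's output type, swapping one stage for another with the same signature does not affect the rest of the chain.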

Execution flow

When a user runs semtest run, the following pipeline executes:

1. CLI parses flags and arguments via Commander
2. Config Loader finds and validates semtest.config.ts using jiti + Zod
3. Discovery scans the test directory for .md files (or resolves specific file paths)
4. Execute Loop iterates over each test sequentially:
   - Prompt Builder constructs the full LLM prompt from the test file content
   - Adapter Registry resolves the correct CLI adapter (claude, codex, gemini, opencode)
   - Model Resolver maps the capability level (high/balanced/fast) to a concrete model ID
   - Process Spawner invokes the LLM CLI as a child process
   - Parser extracts JSON results from the raw LLM output (with fallback strategies)
   - Retries up to 3 times on empty responses
5. Validation checks for duplicate IDs, missing IDs, and invalid results
6. Reports are generated (Markdown + JSON) and written to the output directory
7. Terminal Output prints a summary with colour-coded results
8. Exit code is determined: 0 = pass, 1 = fail, 2 = error
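Two rules from the steps above, the retry on empty responses and the exit code mapping, can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the function names `invokeWithRetry` and `exitCodeFor` are hypothetical, "retries up to 3 times" is read here as up to three additional attempts after the first, and the invocation is modelled as a synchronous callback rather than a real child process.

```typescript
type RunStatus = "pass" | "fail" | "error";

// Exit codes as listed in the pipeline: 0 = pass, 1 = fail, 2 = error.
function exitCodeFor(status: RunStatus): 0 | 1 | 2 {
  switch (status) {
    case "pass":
      return 0;
    case "fail":
      return 1;
    case "error":
      return 2;
  }
}

// Assumption: three retries means at most four invocations in total.
const MAX_RETRIES = 3;

// Calls `invoke` until it returns non-empty output or the retries run out.
// `invoke` stands in for spawning the LLM CLI and collecting its stdout.
function invokeWithRetry(
  invoke: () => string,
  maxRetries: number = MAX_RETRIES,
): { output: string; attempts: number } {
  let attempts = 0;
  let output = "";
  do {
    attempts += 1;
    output = invoke();
  } while (output.trim() === "" && attempts <= maxRetries);
  return { output, attempts };
}

// Usage: a stub that returns an empty response once, then a JSON payload.
let stubCalls = 0;
const stubInvoke = (): string => {
  stubCalls += 1;
  return stubCalls === 1 ? "" : '{"id":"T1","result":"pass"}';
};
const retried = invokeWithRetry(stubInvoke);
console.log(retried.attempts, exitCodeFor("pass"));
```

Returning the attempt count alongside the output makes it possible to surface retries in debug output without changing the caller's control flow.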

Module diagram

The diagram below shows how modules connect. Dashed lines indicate optional or feedback paths.

Legend

Colour	Category	Modules
Blue	Entry point	CLI
Amber	Core pipeline	Config, Discovery, Execute Loop, Prompt Builder, Adapter Registry, Model Resolver, Process Spawner, Parser, Validation
Green	Output layer	MD Report, JSON Report, Terminal Output, Debug Output

Key types

The following types carry the most important data through the pipeline:

Type	Module	Purpose
`SemtestConfig`	config/schema	Validated configuration object
`SemanticTest`	discovery/tests	Discovered test file with name, path, and content
`CommandSpec`	runner/types	Command + args + optional stdin for a CLI invocation
`RunnerAdapter`	runner/types	Interface each LLM adapter implements
`TestResult`	parser/result	Parsed result for a single test scenario
`TestRunResult`	runner/execute	Result tied to its source file
`RunResult`	runner/execute	Full run output with summary and status
`CIResult`	report/json	JSON report shape consumed by CI
`ValidationResult`	validation/results	Validation issues found post-run
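To make the relationship between `CommandSpec` and `RunnerAdapter` concrete, here is a rough sketch of how an adapter might turn a prompt into a command. Only the "command + args + optional stdin" shape comes from the table; the `buildCommand` method, the `name` field, the `resolveAdapter` helper, and the `example-llm` command with its `--model` flag are hypothetical placeholders, not the real adapters or any real CLI's flags.

```typescript
// Shape taken from the table: command + args + optional stdin.
interface CommandSpec {
  command: string;
  args: string[];
  stdin?: string;
}

// Hypothetical adapter interface; the real one in runner/types may differ.
interface RunnerAdapter {
  name: string;
  buildCommand(prompt: string, model: string): CommandSpec;
}

// Placeholder adapter for an imaginary CLI that reads the prompt from stdin.
const exampleAdapter: RunnerAdapter = {
  name: "example",
  buildCommand(prompt, model) {
    return { command: "example-llm", args: ["--model", model], stdin: prompt };
  },
};

// Minimal registry: look an adapter up by name, fail loudly if it is unknown.
const adapterRegistry = new Map<string, RunnerAdapter>([
  [exampleAdapter.name, exampleAdapter],
]);

function resolveAdapter(adapterName: string): RunnerAdapter {
  const adapter = adapterRegistry.get(adapterName);
  if (adapter === undefined) {
    throw new Error(`Unknown adapter: ${adapterName}`);
  }
  return adapter;
}

const commandSpec = resolveAdapter("example").buildCommand("Evaluate this test", "model-x");
console.log(commandSpec.command, commandSpec.args.join(" "));
```

Keeping the adapter's output as plain data means the process spawner needs no knowledge of any particular LLM CLI.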