Writing Tests

Semantic tests are files that describe expected behaviour of a codebase in natural language. They can be any file type — .md, .txt, .pdf, .json, .ts, or anything else an LLM can parse. Instead of asserting on specific function outputs, they describe what should be true about the code, and an LLM evaluates whether the codebase satisfies those expectations.

Test files live in the directory set by the tests option in your config (default: semantic-tests/). The runner discovers all files in this directory. For example, with tests set to semtests/:

semtests/
├── adapter-pattern.md
├── cli-interface.txt
├── config-schema.md
├── esm-compliance.md
└── project-structure.txt
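
The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the runner's actual code: the `SemanticTestConfig` shape and the `resolveTestsDir` and `discoverTestFiles` helpers are hypothetical names, and recursing into subdirectories is an assumption. Only the tests option and its default come from the text.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical config shape: only the `tests` option is taken from the docs.
interface SemanticTestConfig {
  tests?: string;
}

// Default documented above.
const DEFAULT_TESTS_DIR = "semantic-tests/";

// Resolve the tests directory, falling back to the documented default.
function resolveTestsDir(config: SemanticTestConfig, cwd: string): string {
  return path.resolve(cwd, config.tests ?? DEFAULT_TESTS_DIR);
}

// Collect every regular file under `dir`, whatever its extension.
// Descending into subdirectories is an assumption of this sketch.
function discoverTestFiles(dir: string): string[] {
  const found: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      found.push(...discoverTestFiles(fullPath));
    } else if (entry.isFile()) {
      found.push(fullPath);
    }
  }
  return found.sort();
}
```

Calling `discoverTestFiles(resolveTestsDir({ tests: "semtests/" }, process.cwd()))` would then return the five files in the tree above, with no filtering by extension.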

A semantic test file should have:

  1. A heading with an ID — used by the LLM to identify the test scenario
  2. An expectation section — what the code should do
  3. A behaviour section — specific details to verify

For example, a test describing the project layout:

# Project Structure - id: project-structure
## Expectation
The project should follow a well-organized directory layout with all
TypeScript source code under `src/`, organized into logical subdirectories
for each concern: `config`, `discovery`, `prompt`, `parser`, `runner`,
`report`, `output`, `validation`, and `utils`.
## Behaviour
The `src/` directory should contain subdirectories for each module:
`config/` (Zod schema and config loader), `discovery/` (test file discovery),
`prompt/` (LLM prompt construction), `parser/` (JSON result parsing),
`runner/` (LLM CLI adapters and execution), `report/` (Markdown and JSON
report generation), `output/` (terminal progress and summary),
`validation/` (result validation), and `utils/` (process and filesystem
helpers).

Another example, covering the adapter pattern:

# Adapter Pattern - id: adapter-pattern
## Expectation
The runner module should implement an adapter pattern where each supported
LLM CLI (claude, codex, gemini, opencode) has a dedicated adapter
conforming to a shared `RunnerAdapter` interface, with a registry that
resolves the correct adapter by runner name.
## Behaviour
Four adapter files should exist under `src/runner/adapters/`: `claude.ts`,
`codex.ts`, `gemini.ts`, and `opencode.ts`. Each adapter should implement
the `RunnerAdapter` interface, which defines `buildCommand()` and
`parseRawOutput()`. The Claude adapter should pass the prompt as a
positional argument (not via stdin), while the other adapters should pass
the prompt via stdin.
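
Since the tests are evaluated by an LLM, nothing in the text says the runner enforces this three-part layout. If you want a quick local check before running the suite, a sketch along these lines may help. The `checkTestFile` helper and its regular expressions are illustrative assumptions based on the heading format used in the examples above.

```typescript
// Result of a structural check on one semantic test file.
interface TestFileCheck {
  id: string | null;        // value after "id:" in a first-level heading
  hasExpectation: boolean;  // a "## Expectation" heading is present
  hasBehaviour: boolean;    // a "## Behaviour" heading is present
}

// Matches a heading such as "# Adapter Pattern - id: adapter-pattern".
// The allowed ID characters are an assumption of this sketch.
const HEADING_WITH_ID = /^#[ \t]+.*\bid:[ \t]*([A-Za-z0-9_-]+)[ \t]*$/m;

// Check for the three parts listed above; matching is case-sensitive.
function checkTestFile(content: string): TestFileCheck {
  const match = HEADING_WITH_ID.exec(content);
  return {
    id: match?.[1] ?? null,
    hasExpectation: /^##[ \t]+Expectation[ \t]*$/m.test(content),
    hasBehaviour: /^##[ \t]+Behaviour[ \t]*$/m.test(content),
  };
}
```

Run against either example above, this reports the ID from the heading and finds both sections. A file missing its ID or one of the sections shows up as `null` or `false` in the result.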

Tips for writing effective tests:

  • Be specific: mention exact file paths, function names, and expected types when possible
  • Use IDs: include id: my-id in headings so the LLM extracts consistent identifiers
  • One concern per file: each file should test one aspect of the codebase
  • Describe observable facts: focus on what can be verified by reading the code (file existence, exports, types, patterns)
  • Avoid implementation details: describe what the code does, not how it does it internally (unless that’s what you’re testing)
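
The ID and one-concern-per-file guidelines lend themselves to a small lint step. The sketch below is one possible approach, not part of the tool: `extractIds` and `lintIds` are hypothetical helpers. Treating several IDs in one file, or the same ID in two files, as a warning is an assumption drawn from the guidelines rather than a documented rule.

```typescript
// Matches one heading line such as "# CLI Interface - id: cli-interface".
const ID_HEADING_LINE = /^#[ \t]+.*\bid:[ \t]*([A-Za-z0-9_-]+)[ \t]*$/;

// Extract every heading ID found in one file's content.
function extractIds(content: string): string[] {
  const ids: string[] = [];
  for (const line of content.split(/\r?\n/)) {
    const match = ID_HEADING_LINE.exec(line);
    if (match && match[1] !== undefined) ids.push(match[1]);
  }
  return ids;
}

// Report files with no ID, several IDs, or an ID already used elsewhere.
// `files` maps a file name to its text content.
function lintIds(files: Record<string, string>): string[] {
  const problems: string[] = [];
  const firstUse = new Map<string, string>(); // id -> first file that used it
  for (const name of Object.keys(files).sort()) {
    const ids = extractIds(files[name] ?? "");
    if (ids.length === 0) problems.push(`${name}: no heading with an id`);
    if (ids.length > 1) problems.push(`${name}: ${ids.length} ids, consider splitting the file`);
    for (const id of ids) {
      const owner = firstUse.get(id);
      if (owner !== undefined && owner !== name) {
        problems.push(`${name}: id "${id}" already used in ${owner}`);
      } else {
        firstUse.set(id, name);
      }
    }
  }
  return problems;
}
```

An empty result means every file carries exactly one ID and no ID is reused. The check only makes sense for text-based test files, since it reads headings line by line.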