Running Tests
Basic usage
```bash
# Run all tests in the configured directory
semtest run

# Run specific test files
semtest run auth-middleware.md api-routes.txt

# Run with full paths
semtest run semtests/auth-middleware.md
```

When file arguments are provided, they're resolved against `cwd` first, then against the configured tests directory.
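The lookup order for file arguments can be pictured with a small sketch. This is not semtest's actual implementation: `resolveTestFile` and its parameters are hypothetical names, and the sketch assumes the tests directory is given relative to the working directory.

```typescript
import { existsSync } from "node:fs";
import { resolve } from "node:path";

// Hypothetical helper, not part of semtest's API.
// Returns the first existing candidate path, or undefined if none exists.
function resolveTestFile(
  fileArg: string,
  cwd: string,
  testsDir: string,
  // Injectable existence check so the lookup order can be exercised without real files.
  exists: (path: string) => boolean = existsSync,
): string | undefined {
  const candidates = [
    resolve(cwd, fileArg), // 1. relative to the current working directory
    resolve(cwd, testsDir, fileArg), // 2. relative to the configured tests directory
  ];
  return candidates.find((candidate) => exists(candidate));
}
```

Under these assumptions, `resolveTestFile("auth-middleware.md", process.cwd(), "semtests/")` would only fall back to the tests directory when no file of that name exists in the working directory.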
CLI flags
| Flag | Type | Default | Description |
|---|---|---|---|
| `--timestamp` | boolean | `false` | Generate a timestamped copy of the Markdown report |
| `--include-passing` | boolean | `false` | Include passing tests in the Markdown report |
| `--strict` | boolean | `false` | Exit code 2 if validation issues are found |
| `--skip-validation` | boolean | `false` | Skip post-run validation entirely |
| `--extensions <exts>` | string | (all files) | Comma-separated file extensions (e.g. `.md,.txt`) |
| `--debug` | boolean | `false` | Log raw LLM output to `{output}/debug/` |
Config file options
All CLI flags can also be set in `semtest.config.ts`:

```ts
import { defineConfig } from "@thulanek/semtest-runner";

export default defineConfig({
  tests: "semtests/",
  output: "semantic-test-results/",
  llm: {
    runner: "claude",
    capability: "balanced",
  },
  strict: true,
  debug: true,
  timestamp: true,
  includePassing: false,
  extensions: [".md", ".txt"],
});
```

Flag precedence
CLI flags always override config file values:

```
CLI flag > config file > schema default
```

For example, if the config has `strict: true` but you run `semtest run` without `--strict`, strict mode is still enabled. But if you explicitly pass a flag, it wins.
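The precedence chain can be sketched as a small merge function. This is an illustration only: `BoolOptions`, `schemaDefaults`, and `resolveOptions` are hypothetical names, and the sketch covers just the boolean options that appear in both the flag table and the config example.

```typescript
// Hypothetical option shape; the real schema may contain more fields.
type BoolOptions = {
  strict: boolean;
  debug: boolean;
  timestamp: boolean;
  includePassing: boolean;
};

// Defaults taken from the "Default" column of the CLI flags table.
const schemaDefaults: BoolOptions = {
  strict: false,
  debug: false,
  timestamp: false,
  includePassing: false,
};

// A flag that was not passed is left undefined, so it falls through
// to the config value and then to the schema default.
function resolveOptions(
  cli: Partial<BoolOptions>,
  config: Partial<BoolOptions>,
): BoolOptions {
  const resolved: BoolOptions = { ...schemaDefaults };
  for (const key of Object.keys(schemaDefaults) as (keyof BoolOptions)[]) {
    resolved[key] = cli[key] ?? config[key] ?? schemaDefaults[key];
  }
  return resolved;
}

// No --strict on the command line, strict: true in the config file.
const options = resolveOptions({}, { strict: true });
// options.strict is true: the config value applies because no CLI flag was passed
```

The sketch uses `??` rather than object spread because spreading an explicitly `undefined` CLI value over the config would overwrite it; `??` only replaces values that are actually missing.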
Exit codes
| Code | Meaning | When |
|---|---|---|
| 0 | Pass | All tests passed |
| 1 | Fail | At least one test failed (but no errors) |
| 2 | Error | LLM subprocess error, parse error, or `--strict` with validation issues |
Precedence: error (2) > fail (1) > pass (0)
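As a rough sketch of how this precedence plays out across a run, the function below folds per-test outcomes into one exit code. The `Outcome` type and `exitCodeFor` are hypothetical names for illustration, not semtest's internals; `strictValidationIssues` stands in for the case where `--strict` is set and validation found issues.

```typescript
// Hypothetical per-test outcome; "error" covers LLM subprocess and parse errors.
type Outcome = "pass" | "fail" | "error";

function exitCodeFor(
  outcomes: Outcome[],
  strictValidationIssues = false,
): 0 | 1 | 2 {
  // Error outranks everything else.
  if (strictValidationIssues || outcomes.some((o) => o === "error")) return 2;
  // Fail outranks pass.
  if (outcomes.some((o) => o === "fail")) return 1;
  return 0;
}
```

With this sketch, a run containing both a failed test and an errored test yields 2, since the error takes precedence over the failure.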
Debug mode
When `--debug` is enabled:
- A `debug/` directory is created inside the output directory
- For each test file, a JSON file is written containing all retry attempts
- Each attempt includes the raw `stdout`, `stderr`, and `exitCode` from the LLM CLI

```bash
semtest run --debug
```

This is useful for diagnosing unexpected LLM responses or retry behaviour.