Define the invariants specification in TOML
Intent: Establish a declarative, version-controlled file where architectural rules are expressed as structured records. Each invariant carries an ID, area tag, kind (must / allowed), a natural-language statement, optional file-glob scope, and an optional reviewer hint.
Affected files: invariants/invariants.toml
Evidence
@@ -0,0 +1,34 @@
+version = 1
+name = "pika-project-invariants"
+
+[[invariant]]
+id = "PIKACI-001"
+area = "pikaci"
+kind = "allowed"
+statement = "pikaci may depend on pika-cloud."
+scope = [
+ "crates/pikaci/**",
+ "crates/pika-news/**",
+]
+
+[[invariant]]
+id = "PIKACI-002"
+...
+kind = "must"
+statement = "The reusable pikaci execution layer does not hardcode Pika-specific CI lanes, targets, or package-specific test commands."
+
+[[invariant]]
+id = "PIKACI-003"
+...
+statement = "Pika-specific path filters and lane catalogs live outside the reusable pikaci execution layer."
The file invariants/invariants.toml is the single source of truth for every architectural rule the project wants to enforce via LLM review.
Format highlights
| Field | Purpose |
| --- | --- |
| version | Schema version (must be 1). |
| id | Unique, human-readable identifier (e.g. PIKACI-001). |
| kind | "must" = required property; "allowed" = permitted dependency / coupling. |
| scope | Array of file globs the reviewer should focus on. |
| hint | Free-text guidance steering the LLM toward non-obvious checks. |
Three invariants are shipped initially, all in the pikaci area:
- PIKACI-001 (allowed) — pikaci may depend on pika-cloud.
- PIKACI-002 (must) — The reusable execution layer must not hard-code Pika-specific lanes or test commands.
- PIKACI-003 (must) — Pika-specific path filters and lane catalogs must live outside the execution layer.
Because the hint field appears only on PIKACI-002 and PIKACI-003, reviewers (human or LLM) get extra direction only where the check is subtle.
Implement the invariant review script
Intent: Provide the end-to-end orchestration that loads the TOML spec, constructs a structured prompt, invokes the Codex CLI with a JSON output schema, and prints a human-readable pass/fail report.
Affected files: scripts/check_invariants.py
Evidence
@@ -0,0 +1,275 @@
+ROOT = Path(__file__).resolve().parents[1]
+DEFAULT_SPEC_PATH = ROOT / "invariants" / "invariants.toml"
@@ ... @@
+def load_spec(path: Path) -> dict[str, Any]:
+    with path.open("rb") as handle:
+        spec = tomllib.load(handle)
+    if spec.get("version") != 1:
+        raise SystemExit(f"unsupported invariants spec version in {path}")
@@ ... @@
+def run_codex_review(
+    prompt: str,
+    schema: dict[str, Any],
+    model: str | None,
+    verbose: bool = False,
+) -> tuple[dict[str, Any], str]:
+    ...
+    cmd = [
+        "codex",
+        "-a", "never",
+        "exec",
+        "--sandbox", "workspace-write",
+        "--ephemeral",
+        ...
@@ ... @@
+def print_report(spec: dict[str, Any], report: dict[str, Any]) -> int:
+    results = validate_report(spec, report)
+    ...
+    return 1 if failures else 0
scripts/check_invariants.py is the heart of the feature. It is structured as five composable stages:
1. Argument parsing (parse_args)
Accepts --spec (path to TOML), --model (Codex model override, also via PIKA_INVARIANTS_CODEX_MODEL), --json-out (persist raw report), and --verbose.
2. Spec loading and validation (load_spec)
Uses tomllib (Python 3.11+) to parse the TOML. Validates:
- version == 1
- At least one [[invariant]] entry
- No duplicate IDs
- kind is "must" or "allowed"
- statement is a non-empty string
- scope is a string array when present
Every validation failure raises SystemExit with a descriptive message, so a failing CI run explains why it exited with a non-zero status.
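The checks listed above could be written along the following lines. This is a sketch, not the script's code: validate_spec_sketch is a hypothetical name, the error wording is invented except for the duplicate-ID and version messages that appear in the evidence, and the non-empty-ID check is an extra assumption.

```python
from typing import Any

ALLOWED_KINDS = {"must", "allowed"}


def validate_spec_sketch(spec: dict[str, Any], source: str = "<spec>") -> dict[str, Any]:
    if spec.get("version") != 1:
        raise SystemExit(f"unsupported invariants spec version in {source}")
    invariants = spec.get("invariant")
    if not isinstance(invariants, list) or not invariants:
        raise SystemExit(f"{source} must define at least one [[invariant]] entry")
    seen: set[str] = set()
    for inv in invariants:
        inv_id = inv.get("id")
        # Assumption: not listed in the walkthrough, but needed to compare IDs safely.
        if not isinstance(inv_id, str) or not inv_id:
            raise SystemExit(f"invariant without a string id in {source}")
        if inv_id in seen:
            raise SystemExit(f"duplicate invariant id {inv_id}")
        seen.add(inv_id)
        if inv.get("kind") not in ALLOWED_KINDS:
            raise SystemExit(f"{inv_id}: kind must be 'must' or 'allowed'")
        statement = inv.get("statement")
        if not isinstance(statement, str) or not statement.strip():
            raise SystemExit(f"{inv_id}: statement must be a non-empty string")
        scope = inv.get("scope")
        if scope is not None and not (
            isinstance(scope, list) and all(isinstance(g, str) for g in scope)
        ):
            raise SystemExit(f"{inv_id}: scope must be an array of strings")
    return spec
```

Passing a string to SystemExit prints it to stderr and exits with status 1, which is why no separate logging call is needed.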
3. Prompt construction (build_prompt)
Assembles a structured natural-language prompt containing:
- The repo root path for context
- Each invariant rendered with id, area, kind, statement, scope, and optional hint
- Explicit instructions to grade as pass/fail, default to fail on ambiguity, and return JSON
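As a rough illustration of this stage, the sketch below renders a spec into a prompt string. The exact wording and layout of the real prompt are not shown in the diff, so the instruction text, the function name build_prompt_sketch, and the line format are assumptions; only the presence of scope: and hint: lines is suggested by the unit tests.

```python
from typing import Any


def build_prompt_sketch(repo_root: str, spec: dict[str, Any]) -> str:
    lines = [
        f"Repository root: {repo_root}",
        "Grade each invariant below as pass or fail.",
        "If the evidence is ambiguous, grade the invariant as fail.",
        "Return the results as JSON.",
        "",
    ]
    for inv in spec["invariant"]:
        lines.append(f"- id: {inv['id']}")
        lines.append(f"  area: {inv.get('area', '')}")
        lines.append(f"  kind: {inv['kind']}")
        lines.append(f"  statement: {inv['statement']}")
        # scope and hint are optional, so they are emitted only when present.
        if inv.get("scope"):
            lines.append(f"  scope: {', '.join(inv['scope'])}")
        if inv.get("hint"):
            lines.append(f"  hint: {inv['hint']}")
    return "\n".join(lines)


example = {
    "invariant": [
        {"id": "ONE", "area": "demo", "kind": "must", "statement": "A holds.",
         "scope": ["crates/a/**"], "hint": "Look at the build files."},
        {"id": "TWO", "area": "demo", "kind": "allowed", "statement": "B may use C."},
    ]
}
print(build_prompt_sketch("/path/to/repo", example))
```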
4. Schema generation (output_schema)
Builds a JSON Schema (draft-07) that constrains Codex output to exactly the expected invariant IDs with grades, rationales, and 1–3 evidence file paths. This is passed to Codex via --output-schema to guarantee structured output.
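A schema of this kind could be generated as in the sketch below. The nesting of properties.results.items.properties.id.enum mirrors the unit test in the evidence; the field names grade, rationale, and evidence, and the minItems/maxItems bounds on results, are assumptions for illustration.

```python
import json
from typing import Any


def output_schema_sketch(spec: dict[str, Any]) -> dict[str, Any]:
    ids = [inv["id"] for inv in spec["invariant"]]
    return {
        "type": "object",
        "additionalProperties": False,
        "required": ["results"],
        "properties": {
            "results": {
                "type": "array",
                # Assumed bounds: one result per invariant.
                "minItems": len(ids),
                "maxItems": len(ids),
                "items": {
                    "type": "object",
                    "additionalProperties": False,
                    "required": ["id", "grade", "rationale", "evidence"],
                    "properties": {
                        "id": {"type": "string", "enum": ids},
                        "grade": {"type": "string", "enum": ["pass", "fail"]},
                        "rationale": {"type": "string"},
                        "evidence": {
                            "type": "array",
                            "items": {"type": "string"},
                            "minItems": 1,
                            "maxItems": 3,
                        },
                    },
                },
            }
        },
    }


schema = output_schema_sketch({"invariant": [{"id": "ONE"}, {"id": "TWO"}]})
print(json.dumps(schema, indent=2))
```

An enum on id restricts which IDs may appear but does not by itself force each one to appear exactly once, which is presumably why the script also validates the returned report.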
5. Codex invocation and report rendering
run_codex_review shells out to the codex CLI with --sandbox workspace-write, --ephemeral, and -a never, so the run never pauses to ask for approval. The JSON report is written to a temp file and read back.
validate_report checks completeness and uniqueness of the returned IDs, then re-sorts results to match the spec order.
print_report outputs a human-friendly summary and returns exit code 1 if any invariant failed — making it CI-friendly.
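The validation and exit-code logic can be sketched without a Codex call. The names validate_report_sketch and exit_code_sketch, the error messages, and the grade field are assumptions; the behaviour shown (reject unknown, duplicate, or missing IDs, then return results in spec order) follows the description above.

```python
from typing import Any


def validate_report_sketch(spec: dict[str, Any], report: dict[str, Any]) -> list[dict[str, Any]]:
    expected = [inv["id"] for inv in spec["invariant"]]
    by_id: dict[str, dict[str, Any]] = {}
    for entry in report.get("results", []):
        entry_id = entry.get("id")
        if entry_id not in expected:
            raise SystemExit(f"unexpected invariant id in report: {entry_id}")
        if entry_id in by_id:
            raise SystemExit(f"duplicate invariant id in report: {entry_id}")
        by_id[entry_id] = entry
    missing = [inv_id for inv_id in expected if inv_id not in by_id]
    if missing:
        raise SystemExit(f"report is missing invariant ids: {', '.join(missing)}")
    # Re-sort so the output always follows the spec's declaration order.
    return [by_id[inv_id] for inv_id in expected]


def exit_code_sketch(results: list[dict[str, Any]]) -> int:
    failures = [entry for entry in results if entry.get("grade") != "pass"]
    return 1 if failures else 0


spec = {"invariant": [{"id": "ONE"}, {"id": "TWO"}]}
report = {"results": [{"id": "TWO", "grade": "fail"}, {"id": "ONE", "grade": "pass"}]}
ordered = validate_report_sketch(spec, report)
print([entry["id"] for entry in ordered], exit_code_sketch(ordered))
```

Treating anything other than an explicit pass as a failure matches the prompt's default-to-fail stance on ambiguity.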
Wire the check into the Just task system
Intent: Make the invariant review discoverable and runnable through the project's existing Just-based developer workflow.
Affected files: just/checks.just, justfile
Evidence
@@ -34,6 +34,10 @@ pre-commit-full: pre-commit
+# Run the Codex-backed architecture invariant review.
+invariants:
+ python3 ./scripts/check_invariants.py
@@ -160,6 +162,10 @@ qa:
+# Run the Codex-backed architecture invariant review.
+invariants:
+ @just checks::invariants
@@ -28,6 +28,8 @@ info:
+ @echo " Architecture invariants:"
+ @echo " just invariants"
Two Just targets are added:
| Target | Location | Purpose |
| --- | --- | --- |
| checks::invariants | just/checks.just:37 | Runs python3 ./scripts/check_invariants.py directly. |
| invariants | justfile:165 | Top-level alias forwarding to checks::invariants. |
The info recipe in the root justfile is also updated to advertise just invariants under a new "Architecture invariants" heading, keeping the built-in help text current.
This follows the project's convention: implementation recipes live in just/checks.just, top-level convenience aliases live in the root justfile.
Add unit tests for the review script
Intent: Verify the pure-logic functions (spec loading, prompt building, report validation, schema generation) without requiring a live Codex backend, enabling fast CI feedback.
Affected files: scripts/test_check_invariants.py
Evidence
@@ -0,0 +1,119 @@
+def load_script_module():
+    spec = importlib.util.spec_from_file_location("check_invariants", SCRIPT)
+    ...
+    spec.loader.exec_module(module)
+    return module
@@ ... @@
+    def test_load_spec_rejects_duplicate_ids(self) -> None:
+        ...
+        with self.assertRaises(SystemExit) as ctx:
+            module.load_spec(spec_path)
+        self.assertIn("duplicate invariant id DUP-001", str(ctx.exception))
@@ ... @@
+    def test_validate_report_preserves_spec_order(self) -> None:
+        ...
+        self.assertEqual([entry["id"] for entry in normalized], ["ONE", "TWO"])
@@ ... @@
+    def test_output_schema_matches_invariant_ids(self) -> None:
+        ...
+        self.assertEqual(
+            schema["properties"]["results"]["items"]["properties"]["id"]["enum"],
+            ["ONE", "TWO"],
+        )
scripts/test_check_invariants.py uses importlib.util to dynamically import check_invariants.py as a module (avoiding package installation), then exercises four scenarios:
Tests
- test_load_spec_rejects_duplicate_ids: Writes a TOML file with two DUP-001 entries to a temp directory and asserts load_spec raises SystemExit mentioning the duplicate.
- test_build_prompt_includes_scope_and_hint: Feeds a synthetic spec with scope globs and a hint, then checks the generated prompt string contains the expected scope: and hint: lines.
- test_validate_report_preserves_spec_order: Supplies a report where results arrive in reverse order (TWO before ONE) and verifies validate_report re-sorts them to match the spec's declaration order.
- test_output_schema_matches_invariant_ids: Confirms the JSON Schema's enum constraint lists exactly the IDs from the spec, ensuring Codex is constrained to valid invariant IDs.
All tests are offline (no Codex call) and run with python3 -m unittest scripts/test_check_invariants.py.
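The dynamic-import technique is the standard importlib recipe for loading a standalone script by path. The sketch below applies it to a throwaway file so it runs anywhere; load_module_from_path and standalone_script.py are hypothetical names, and the test file's own helper elides its middle lines, so it may differ in detail.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(name: str, path: Path) -> ModuleType:
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    # Executing the module runs its top-level code and defines its functions.
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a script that lives outside any installed package.
    script = Path(tmp) / "standalone_script.py"
    script.write_text("def answer() -> int:\n    return 42\n", encoding="utf-8")
    module = load_module_from_path("standalone_script", script)
    result = module.answer()

print(result)
```

Because the module is loaded from an explicit path, the tests do not need the scripts directory to be on sys.path or installed as a package.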