New /self-check skill: each machine probes its own ClaudeTools harness wiring (identity.json paths, required tooling, settings.json hooks, skill/command/script set, vault decrypt, coord/Gitea connectivity, Ollama capability tier) and grades RED/AMBER/GREEN against a checked-in provisional baseline manifest. - Capability-tier model: architectural/OS/hardware differences (e.g. no local Ollama) select a fallback ruleset instead of failing. - Duplicate detection: flags command/skill names that diverge between the repo and ~/.claude (the "same /cmd, different behaviour" cross-machine bug); CRLF-only diffs ignored. - Memory check: index + orphan detection, plus a model-driven semantic pass for memories that contradict identity/settings. - V1 is a census tool: --publish writes a per-machine census to coord (component selfcheck_<host>); fanout requests the fleet to self-check + self-remediate + re-publish; aggregate derives the proposed baseline. No machine ever fixes another. Reviewed twice by the Code Review Agent; three CRITICAL coord-API bugs and the CRLF false-WARN found and fixed, verified live against the coord API. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
9.1 KiB
name, description
| name | description |
|---|---|
| self-check | Self-diagnose a ClaudeTools session's machine: verify the harness is wired the same way as every other instance while allowing for architectural / OS / hardware differences. Checks that identity.json exists and is correct (the map of WHERE things live on this box), required tooling is installed, env/paths resolve, hooks are wired, the skill/command/script set matches the baseline, the vault decrypts, coord/Gitea are reachable, and the machine's capability tier (e.g. no local Ollama) resolves to the right fallback ruleset. Grades RED/AMBER/GREEN and can publish a census to the coord API so the fleet baseline can be built/refined. Invoke for: "self check", "self diagnosis", "self test", "doctor", "health check", "am I configured right", "is my machine set up correctly", "harness conformance", "fleet conformance", "check my environment", "is everything wired up". |
Self-Check — ClaudeTools Harness Self-Diagnosis
A top-to-bottom evaluation of how this machine's ClaudeTools harness is wired, graded against a checked-in baseline manifest so every machine behaves the same way — while explicitly allowing for architecture, OS, and hardware differences via a capability tier model.
This is the skill the user asked for when a session needs to "make sure everything is as it should be."
The model in one paragraph
identity.json is the foundational, per-machine map of where things live and
what this box can do (vault path, repo root, platform, arch, python command,
Ollama endpoints). The baseline manifest
(baseline/manifest.json) declares what every machine must have — required
tools, identity fields, scripts, hook files, the wired settings.json hooks, the
canonical skill/command set, and the capability rules that say what to do when
a capability is absent (e.g. no local Ollama → use the remote endpoint, or if that
is also down, route Tier-0 work to haiku instead of blocking). The probe compares
the live machine against the manifest, resolves the machine's capability tier, and
grades RED/AMBER/GREEN. Required things missing = RED. Advisory drift = AMBER.
Capability differences are never failures — they select a ruleset.
V1 is a CENSUS tool (read this)
There is no ratified fleet baseline yet. baseline/manifest.json is provisional,
generated from a single known-good machine (GURU-5070). So V1's job is to gather
ground truth from every machine and help Mike build the real baseline:
- Probe — each machine runs the check and produces a structured census.
- Publish —
--publishPUTs the census to coord as componentselfcheck_<host>(state = grade, notes = full JSON). One row per machine = a live fleet conformance view. - Fan out —
fanoutbroadcasts a request toALL_SESSIONSso every active instance reports. - Aggregate —
aggregatereads all censuses back and proposes a baseline (tools/skills/commands present on all machines = "required everywhere"; present on some = "capability-gated"), and lists machines with FAILs.
Mike reviews the aggregate and ratifies manifest.json. From then on the same
probe enforces conformance. V1 does not auto-fix anything — it reports the
exact fix command for each finding (per the decision on record).
Running it
The probe is scripts/self-check.sh (bash; runs on Git Bash/Windows, macOS,
Linux; deps: jq + curl). Always pass a real UTC timestamp:
SELFCHECK_TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
bash .claude/skills/self-check/scripts/self-check.sh <mode>
| Mode | Purpose |
|---|---|
report (default) |
Human RED/AMBER/GREEN report. Exit 0/1/2 = GREEN/AMBER/RED. |
--json |
Structured census JSON to stdout (for piping). |
--publish |
Run + publish census to coord (component selfcheck_<host>). Softfails to .claude/coord-queue.jsonl. |
fanout |
Broadcast a census request to ALL_SESSIONS. |
aggregate |
Fleet table + proposed-baseline summary from published censuses. |
/self-check is the slash-command runner for the same script.
What it checks
| Category | Checks |
|---|---|
| identity | identity.json exists + valid JSON; all required fields present; claudetools_root exists and equals the running repo; vault_path exists; machine == hostname; git user.name/email match identity. |
| tooling | required everywhere: bash, git, jq, curl, sops, age, ssh, a python. Missing = FAIL. |
| capability | ollama, cargo, node, gh, docker, op — presence is INFO, never a failure. Resolves the Ollama tier (local / remote / none) and prints the effective Tier-0 ruleset. |
| files | required scripts + hook files present and executable. |
| hooks | the three settings.json hooks are wired (block-backslash PreToolUse, check-messages UserPromptSubmit, sync-memory SessionStart); current-mode present. |
| git | origin points at ACG Gitea (internal IP preferred); main-repo post-commit hook installed (AMBER if not). |
| skills/commands | every skill dir and command file in the baseline is present; extras are reported as census candidates. |
| duplicates | command/skill names present in BOTH the repo and ~/.claude. Divergent content = WARN (the "same /cmd, different behaviour on the Mac" bug); identical = INFO (redundant, will drift). CRLF-only differences are ignored. |
| memory | MEMORY.md index exists; no orphaned memory files; manifest-declared contradiction patterns (see semantic pass below). Never FAILs the grade. |
| vault | vault repo exists; sops+age present; vault.sh list succeeds (decrypt wired). |
| connectivity | coord API (required), main API + internal Gitea (advisory; off-network is OK). |
Rogue-memory contradiction — semantic pass (do this when asked, or on a full check)
The engine's memory check is deterministic and conservative (index + orphans + declared patterns) so it never produces false alarms. A true contradiction check — "does any memory directly contradict what this machine's settings say?" — is a judgment task, so the model does it (route the prose/classification to Ollama Tier-0 per the house rules; Claude reviews the result):
- Read
identity.json(where things live + this box's capabilities),settings.json(wired hooks/permissions), andbaseline/manifest.json. - Read the memory index
.claude/memory/MEMORY.md, then open any memory whose one-line hook touches: paths/roots, python launcher, endpoints/IPs, OS/arch assumptions, tool choices, or model routing. - Flag memories that directly contradict this machine's reality, e.g.:
- prescribes
python3/pythonwhenidentity.python.commandispy(or vice-versa), - hardcodes a repo/vault path that isn't this machine's
claudetools_root/vault_path, - names an endpoint/IP that conflicts with
identity.coord_apior the manifest, - assumes a capability (local Ollama) this machine's tier says is absent.
- prescribes
- Report each as: memory file, the contradicting claim, the setting it violates, and a suggested correction. Do not edit memories — surface for the operator (deletions/rewrites go through the human, mirroring memory-dream's posture).
Genuinely machine-specific guidance in a shared memory is the usual culprit — the fix is to scope it ("on Windows…") or split it, not to globally flip it.
Fleet self-remediation loop (machines fix themselves)
We never fix a remote machine. The flow is:
fanout— broadcast asks every instance to self-check + self-fix + re-publish.- Each operator runs
/self-checklocally, applies the printed fix commands on their own box, re-runs to confirm GREEN, then/self-check --publish. aggregate— shows who is still RED/AMBER and prints each machine's own fix list. Relay it to that operator; do not run it for them.- Repeat until the fleet is consistently GREEN, then ratify the manifest.
How to interpret a run
After running, summarize for the user:
- The grade and the PASS/WARN/FAIL/INFO tallies.
- Each FAIL and WARN with its exact fix command. Do not auto-apply.
- The capability tier line — confirm the machine knows its Tier-0 fallback.
- If publishing/aggregating, note how many machines have reported and which are RED.
Capability differences (no Ollama, no gh, ARM vs amd64, macOS vs Windows) are expected and must never be reported as broken — they are the whole point of the tier model.
Files
.claude/skills/self-check/
SKILL.md this file
scripts/self-check.sh the probe engine (report / --json / --publish / fanout / aggregate)
baseline/manifest.json the provisional fleet baseline (single source of truth)
baseline/README.md the baseline model + how to refine/ratify it
.claude/commands/self-check.md the /self-check runner
Extending the baseline
When a new tool/skill/command/hook becomes mandatory fleet-wide, edit
baseline/manifest.json, commit, and /sync. Every machine's next self-check
enforces it. Capability-only tools go in capability_tools with a matching entry
in capability_rules describing the fallback. See baseline/README.md.