feat(harness): P1+P2+P3 harness optimization complete (VERSION 1.4.0)

Task 5  one-line registry descriptions on the 8 biggest skills (remediation-tool,
        gc-audit, packetdial, memory-dream, human-flow, self-check, impeccable,
        mailprotector); skill-description injection ~3320 -> ~2123 tokens (~36%),
        keyword triggers preserved, frontmatter valid.
Task 7  thinned /save + /sync bodies to point at sync.sh (single source) instead of
        re-documenting internals; Phase 0 save-vs-sync, cross-user notes, exit-75
        reporting kept verbatim; mechanical sync never depends on an LLM step.
Task 10 session-logs/YYYY-MM/ forward convention for new logs (scoped-grep recall,
        no monolithic index); existing flat logs untouched (grep covers both).
Bash    now-phoenix.sh helper (fixed UTC-7 epoch math; replaces unreliable
        TZ=America/Phoenix date that silently returns UTC on Git-Bash).

P0 (1.2.0) + Task 6 CLAUDE split + Task 9 delegation (1.3.0) already shipped.
Spec: specs/claudetools-harness-optimization/plan.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-08 08:10:58 -07:00
parent 6671a7a400
commit 68ad1dbd40
14 changed files with 1740 additions and 1710 deletions

View File

@@ -26,11 +26,20 @@ Claude writes all sections directly. Be concise, factual, technical. No filler p
### Location ### Location
New logs go in a **`YYYY-MM/` month folder** under the relevant `session-logs/` dir (keeps the
flat dir from growing unbounded; recall is scoped grep over the month folders — no monolithic
index). `mkdir -p` the month folder before writing.
| Work scope | Path | | Work scope | Path |
|---|---| |---|---|
| Single project | `projects/<project>/session-logs/YYYY-MM-DD-session.md` | | Single project | `projects/<project>/session-logs/YYYY-MM/YYYY-MM-DD-<user>-<topic>.md` |
| Client | `clients/<slug>/session-logs/YYYY-MM-DD-session.md` | | Client | `clients/<slug>/session-logs/YYYY-MM/YYYY-MM-DD-<user>-<topic>.md` |
| Multi-project / general | `session-logs/YYYY-MM-DD-session.md` | | Multi-project / general | `session-logs/YYYY-MM/YYYY-MM-DD-<user>-<topic>.md` |
> Existing flat logs (`session-logs/*.md`) stay where they are — recall grep covers both `*/*.md`
> (month folders) and `*.md` (legacy flat), so no mass migration. The month folder is added
> *after* `session-logs/`, so wiki slug derivation (`<project>`/`<slug>` captured before
> `session-logs/`) is unaffected. Use `bash .claude/scripts/now-phoenix.sh --date` for the date.
### Filename + append behavior ### Filename + append behavior
@@ -97,9 +106,11 @@ not on every save.
bash .claude/scripts/sync.sh bash .claude/scripts/sync.sh
``` ```
`sync.sh` is **serialized by a per-machine lock** (`.git/claudetools-sync.lock`) so concurrent sessions or the scheduled-task sync cannot interleave commits/rebases; if another sync is mid-flight it waits up to ~120s, then **exits 75 (deferred)** rather than racing — the next sync catches up. On a 75, do NOT print a success summary; report "**sync deferred — another sync is running; your session log is written locally and will sync on the next run**". Otherwise it: reconciles this machine's `git config user.name/email` to `.claude/identity.json` (so commit authorship can't drift), stages all changes with `git add -A` (after purging garbled Windows path-as-filename cruft), auto-commits, fetch + rebase, push, then the same flow for the vault repo, then surfaces cross-user `## Note for <user>` blocks. Same driver as `/sync` — see that command for the full semantics. The two load-bearing
points for reporting: **exit 75 = deferred** (another sync is running; report "sync deferred
> Note: `git add -A` is still the catch-all sweep, so a save run will also pick up any *other* dirty files in the shared tree. The lock prevents two syncs from racing, and per-session-unique log filenames prevent log overwrites — but the bare-`add -A` capture means full per-session commit isolation is a later step (see the isolation plan: drop blind `add -A` in favour of explicit per-session staging). For now, avoid running `/save` from two sessions at the exact same moment. — your session log is written locally and will sync on the next run", NOT a success summary);
and `git add -A` is a catch-all sweep, so avoid running `/save` from two sessions at the exact
same moment (per-session-unique log filenames prevent log overwrites, the lock prevents racing).
After sync, emit a **Post-commit Summary**: After sync, emit a **Post-commit Summary**:

View File

@@ -39,18 +39,15 @@ The intent: a `/sync` that finds unsaved work should default toward `/save`. Aut
## What this does ## What this does
Invokes `bash .claude/scripts/sync.sh`, which: Run it — the script is the single source of truth for all git ops (both `/sync` and `/save` invoke it):
1. Detects local changes (including untracked-only files) via `git status --porcelain`; stages with `git add -A` and auto-commits with `sync: auto-sync from <hostname> at <timestamp>` ```bash
2. Fetches from origin, rebases local commits onto remote bash .claude/scripts/sync.sh
3. Pushes to origin ```
4. Copies `.claude/commands/*.md``~/.claude/commands/` so the global Claude CLI commands stay current without a manual copy
5. Repeats steps 1-3 for the **vault** repo (path read from `.claude/identity.json` `vault_path` field)
6. Surfaces any `## Note for <user>` / `## Message for <user>` blocks from incoming session logs
The script is the single source of truth for git operations. Both `/sync` and `/save` invoke it. It stages (`git add -A`, submodule gitlinks unstaged unless `--with-submodules`), auto-commits, fetch+rebase+push for this repo then the vault repo, deploys `.claude/commands/*.md` + skills to `~/.claude/`, and surfaces incoming `## Note for <user>` blocks. Full internals: `.claude/CLAUDE_EXTENDED.md` / the script header.
**Concurrency:** the run is serialized by a per-machine lock (`.git/claudetools-sync.lock`) so two syncs (e.g. interactive + the scheduled-task sync, or two Claude sessions) can't interleave staging/commit/rebase/push. If another sync is already running, this run waits up to ~120s then **exits 75 (EX_TEMPFAIL = deferred, not a failure)** report it as deferred, not synced; the next run catches up. Stale locks (owner process dead, or older than 10 min) are auto-reclaimed. **Exit 75 = deferred, not a failure.** The run is serialized by a per-machine lock (`.git/claudetools-sync.lock`); if another sync is mid-flight it waits ~120s then exits 75. On a 75, report "sync deferred — another sync is running; it will catch up next run", NOT a success summary. Stale locks (dead owner, or >10 min) auto-reclaim.
--- ---

View File

@@ -30,3 +30,19 @@ or old harness during a heterogeneous rollout. See
(full manual, on-demand). Saves ~3.7k tokens per CLAUDE.md injection; nothing lost. (full manual, on-demand). Saves ~3.7k tokens per CLAUDE.md injection; nothing lost.
- Task 9 (P2): delegation re-tuned in CORE — act directly by default; delegate only for - Task 9 (P2): delegation re-tuned in CORE — act directly by default; delegate only for
high-volume output, blast radius >3 files/layers, domain shift, or parallel work. high-volume output, blast radius >3 files/layers, domain shift, or parallel work.
## 1.4.0 — 2026-06-08 (P1+P2+P3 complete)
- Task 5: one-line registry descriptions on the 8 biggest skills (remediation-tool, gc-audit,
packetdial, memory-dream, human-flow, self-check, impeccable, mailprotector). Skill-description
injection ~3320 -> ~2123 tokens (~36% cut); keyword triggers preserved; frontmatter valid.
- Task 7: thinned `/save` + `/sync` bodies — they point to `sync.sh` as the single source instead
of re-documenting its internals; load-bearing LLM-judgment parts (Phase 0 save-vs-sync, cross-user
note display, exit-75 reporting) kept verbatim. The mechanical sync never depends on an LLM step.
- Task 10 (P3): `session-logs/YYYY-MM/` adopted as a FORWARD convention for new logs (recall = scoped
grep over month folders, no monolithic index); existing flat logs untouched (grep covers both).
Recall order (wiki -> CONTEXT/log -> coord) already lives in CORE.
- Deterministic Bash fix: `now-phoenix.sh` helper added — fixed UTC-7 epoch math, replaces the
unreliable `TZ=America/Phoenix date` (silently returns UTC on Git-Bash). `--iso/--date/--datetime/
--fmt` formats. `post-bot-alert.sh` already uses `jq -nc --arg` (verified, no change needed).
- Deferred (unchanged): full Python port = separate spec; Task 8 shard command bodies; promote
guard to FATAL after a clean warn window; schedule memory-dream --apply-safe per-machine.

View File

@@ -1 +1 @@
1.3.0 1.4.0

View File

@@ -0,0 +1,50 @@
#!/usr/bin/env bash
# now-phoenix.sh — emit the current America/Phoenix timestamp, deterministically.
#
# WHY: `TZ=America/Phoenix date` is unreliable on Git-for-Windows bash (the MSYS
# tz database is often absent, so it silently returns UTC). Arizona does NOT
# observe DST — it is fixed UTC-7 (MST) year-round — so we compute Phoenix time
# as (UTC epoch - 7h) and format it. No tz database, no DST edge cases, identical
# result on Windows / macOS / Linux.
#
# Usage:
# bash now-phoenix.sh -> 2026-06-08 14:32 PT (default, human log line)
# bash now-phoenix.sh --iso -> 2026-06-08T14:32:07-07:00
# bash now-phoenix.sh --date -> 2026-06-08
# bash now-phoenix.sh --datetime -> 2026-06-08 14:32:07
# bash now-phoenix.sh --epoch -> 1749422327 (raw UTC epoch, for arithmetic)
# bash now-phoenix.sh --fmt '+%H:%M' -> 14:32 (custom strftime, applied to Phoenix time)
#
# All output is on stdout, no trailing prose. Soft, dependency-free (coreutils date only).
set -euo pipefail
OFFSET=$((7 * 3600)) # Phoenix is UTC-7, fixed
EPOCH_UTC="$(date -u +%s)"
EPOCH_PHX=$((EPOCH_UTC - OFFSET))
# Portable "format an epoch as if it were UTC" (so the wall-clock we print is Phoenix local).
fmt_epoch() {
local e="$1" f="$2"
if date -u -d "@${e}" "$f" >/dev/null 2>&1; then
date -u -d "@${e}" "$f" # GNU/Git-Bash
else
date -u -r "${e}" "$f" # BSD/macOS
fi
}
case "${1:-}" in
--iso) printf '%s-07:00\n' "$(fmt_epoch "$EPOCH_PHX" '+%Y-%m-%dT%H:%M:%S')" ;;
--date) fmt_epoch "$EPOCH_PHX" '+%Y-%m-%d' ;;
--datetime) fmt_epoch "$EPOCH_PHX" '+%Y-%m-%d %H:%M:%S' ;;
--epoch) printf '%s\n' "$EPOCH_UTC" ;;
--fmt) fmt_epoch "$EPOCH_PHX" "${2:?--fmt needs a strftime arg, e.g. --fmt '+%H:%M'}" ;;
''|--pt) printf '%s PT\n' "$(fmt_epoch "$EPOCH_PHX" '+%Y-%m-%d %H:%M')" ;;
-h|--help)
grep -E '^#( |$)' "$0" | sed 's/^# \{0,1\}//'
;;
*)
echo "[ERROR] now-phoenix: unknown arg '$1' (try --help)" >&2
exit 64
;;
esac

File diff suppressed because it is too large Load Diff

View File

@@ -1,192 +1,185 @@
--- ---
name: human-flow name: human-flow
description: > description: "UI/UX scanner for mouse+keyboard interaction friction: Fitts's Law/target sizing, discoverability/affordances, keyboard parity, feedback loops, task efficiency, forgiving interactions. Produces reports with code locations + fixes. Use when reviewing/building interactive UI (dashboards, lists, forms, complex workflows)."
A UI/UX scanner that specializes in detecting interaction patterns unintuitive or inefficient for humans using a mouse and keyboard. user-invocable: true
Expands on frontend-design and impeccable by focusing on real human workflow friction: motor control (Fitts's Law, target sizing, precision), argument-hint: "[scan|audit|report] [target path or component]"
discoverability (affordances, hover vs always-visible), keyboard parity (full navigation and activation without mouse), ---
feedback loops (immediate state changes, error recovery), task efficiency (click/keystroke count, context switches),
and forgiving interaction models. It produces structured reports with code locations, "why this feels bad for a human" explanations, # human-flow — Human Mouse + Keyboard Workflow Scanner
and specific, actionable recommendations to make mouse+keyboard workflows smoother, faster, and more intuitive.
Use when reviewing or building any interactive UI, especially data-heavy tools, dashboards, lists, forms, and complex workflows. This skill is a specialized scanner for **human intuition and ergonomics** in pointer + keyboard interfaces. It goes beyond visual polish, code quality, and general UX heuristics (covered by `frontend-design` and `impeccable`) to focus on what actually feels *clunky, hidden, or frustrating* when a real person is driving with a mouse and keyboard.
user-invocable: true
argument-hint: "[scan|audit|report] [target path or component]" **Core Philosophy**
--- - Humans have limited precision, attention, and patience.
- Mouse users hate tiny targets, hidden controls, and precision clicking.
# human-flow — Human Mouse + Keyboard Workflow Scanner - Keyboard users hate missing focus, no activation keys, and mouse-only gestures.
- Good workflow design makes the *anticipated next action* obvious and low-effort with either input method.
This skill is a specialized scanner for **human intuition and ergonomics** in pointer + keyboard interfaces. It goes beyond visual polish, code quality, and general UX heuristics (covered by `frontend-design` and `impeccable`) to focus on what actually feels *clunky, hidden, or frustrating* when a real person is driving with a mouse and keyboard. - The best interfaces feel "just right" — large enough targets, immediate feedback, discoverable without hunting, and consistent models.
**Core Philosophy** It is **mandatory** to consider human-flow when any mouse or keyboard interaction is involved.
- Humans have limited precision, attention, and patience.
- Mouse users hate tiny targets, hidden controls, and precision clicking. ## When to Invoke
- Keyboard users hate missing focus, no activation keys, and mouse-only gestures.
- Good workflow design makes the *anticipated next action* obvious and low-effort with either input method. - Before or after implementing interactive features (buttons, tables, lists, modals, forms, drag, selection).
- The best interfaces feel "just right" — large enough targets, immediate feedback, discoverable without hunting, and consistent models. - When reviewing a dashboard, admin tool, or data-heavy UI (e.g. session lists, machine management).
- To audit an existing interface for workflow friction.
It is **mandatory** to consider human-flow when any mouse or keyboard interaction is involved. - As a complement to `impeccable critique` / `audit` and `frontend-design` validation.
## When to Invoke Run via natural language ("human-flow scan the sessions table", "run human-flow audit on the dashboard components") or explicitly.
- Before or after implementing interactive features (buttons, tables, lists, modals, forms, drag, selection). ## Commands
- When reviewing a dashboard, admin tool, or data-heavy UI (e.g. session lists, machine management).
- To audit an existing interface for workflow friction. | Command | Description |
- As a complement to `impeccable critique` / `audit` and `frontend-design` validation. |---------------------|-------------|
| `scan [target]` | AST-powered scan of files/directories for workflow friction. Produces a 0-10 Friction Index report. |
Run via natural language ("human-flow scan the sessions table", "run human-flow audit on the dashboard components") or explicitly. | `audit [target]` | Deeper pass: combines AST analysis, component review, and state-flow audit. |
| `elevate [target]` | **Polish & redesign pass.** Goes beyond friction to make a UI top-notch: information hierarchy, signature moment, action gravity, lonely states, density, rhythm, type, tokens, depth/finish, motion — and flags when a screen should be **redesigned, not patched**. Produces an Elevation Index + prioritized tiers (Quick Wins / Elevations / Redesign Candidates). Add `--redesign` to emphasize structural restructuring. See `references/polish-and-redesign.md`. |
## Commands | `fix [target]` | **DISABLED (advisory only for now).** Auto-apply is off — the AST code generator reprints whole files and produces noisy diffs. Use the scan/report output and have an agent apply the fixes surgically. Will be revisited with a surgical (string-splice) editor. |
| `fancy [target]` | **"Fancy as fuck" mode** — elegance pass with a calibrated Restraint-o-Meter. |
| Command | Description | | `report [target]` | Generate a formatted markdown report with the Friction Index rubric. |
|---------------------|-------------|
| `scan [target]` | AST-powered scan of files/directories for workflow friction. Produces a 0-10 Friction Index report. | If no command, defaults to `scan` on the provided target.
| `audit [target]` | Deeper pass: combines AST analysis, component review, and state-flow audit. |
| `elevate [target]` | **Polish & redesign pass.** Goes beyond friction to make a UI top-notch: information hierarchy, signature moment, action gravity, lonely states, density, rhythm, type, tokens, depth/finish, motion — and flags when a screen should be **redesigned, not patched**. Produces an Elevation Index + prioritized tiers (Quick Wins / Elevations / Redesign Candidates). Add `--redesign` to emphasize structural restructuring. See `references/polish-and-redesign.md`. | ## Friction Index (0-10)
| `fix [target]` | **DISABLED (advisory only for now).** Auto-apply is off — the AST code generator reprints whole files and produces noisy diffs. Use the scan/report output and have an agent apply the fixes surgically. Will be revisited with a surgical (string-splice) editor. |
| `fancy [target]` | **"Fancy as fuck" mode** — elegance pass with a calibrated Restraint-o-Meter. | The scan produces an objective score based on weighted deductions:
| `report [target]` | Generate a formatted markdown report with the Friction Index rubric. | - **Motor (3.0)**: Target size, precision, Fitts's Law.
- **Cognitive (2.5)**: Discoverability, affordance, consistency.
If no command, defaults to `scan` on the provided target. - **Keyboard (2.5)**: Accessibility, focus flow, parity.
- **Feedback (2.0)**: Visual response, state transitions.
## Friction Index (0-10)
Score = 10 - Σ(IssueSeverity * DimensionWeight)
The scan produces an objective score based on weighted deductions:
- **Motor (3.0)**: Target size, precision, Fitts's Law. You can combine: e.g. run `scan` first for friction, then `fancy` for delight opportunities.
- **Cognitive (2.5)**: Discoverability, affordance, consistency.
- **Keyboard (2.5)**: Accessibility, focus flow, parity. ## Usage in Practice (for the Agent)
- **Feedback (2.0)**: Visual response, state transitions.
1. Resolve the target (file, component, page, or directory of .tsx/.jsx/.css/.ts).
Score = 10 - Σ(IssueSeverity * DimensionWeight) 2. Load the heuristics from `references/`.
3. Use code tools (read_file, grep, list_dir) + the scanner script if helpful to find candidate patterns.
You can combine: e.g. run `scan` first for friction, then `fancy` for delight opportunities. 4. For each finding:
- Cite exact file:line or component.
## Usage in Practice (for the Agent) - Explain *why it is unintuitive for a human with mouse and/or keyboard*.
- Give a concrete "better for humans" recommendation (with example diff or pattern when useful).
1. Resolve the target (file, component, page, or directory of .tsx/.jsx/.css/.ts). 5. Prioritize by impact on common workflows (high-frequency actions first).
2. Load the heuristics from `references/`. 6. End with an overall "Human Workflow Score" (0-10) and top 3-5 recommended changes.
3. Use code tools (read_file, grep, list_dir) + the scanner script if helpful to find candidate patterns.
4. For each finding: Always separate **mouse friction** and **keyboard friction** in reports, then **combined workflow** issues.
- Cite exact file:line or component.
- Explain *why it is unintuitive for a human with mouse and/or keyboard*. ## Heuristics (Core Categories)
- Give a concrete "better for humans" recommendation (with example diff or pattern when useful).
5. Prioritize by impact on common workflows (high-frequency actions first). The full detailed list with examples and detection guidance lives in `references/mouse-keyboard-heuristics.md`.
6. End with an overall "Human Workflow Score" (0-10) and top 3-5 recommended changes.
High-level categories the scanner always checks:
Always separate **mouse friction** and **keyboard friction** in reports, then **combined workflow** issues.
- **Target Size & Motor Precision** (Fitts's Law)
## Heuristics (Core Categories) - **Discoverability & Affordance** (does it look clickable/tappable? do secondary actions hide?)
- **Hover vs Always-Visible** (actions that require mouse hover to even see options)
The full detailed list with examples and detection guidance lives in `references/mouse-keyboard-heuristics.md`. - **Keyboard Parity & Activation** (can you do everything the mouse can, with reasonable keys?)
- **Focus & Navigation Flow** (tab order, focus trapping, visible focus, escape hatches)
High-level categories the scanner always checks: - **Feedback & State Transitions** (does the UI react immediately and clearly to every mouse/keyboard action?)
- **Selection & Multi-Action Models** (row click vs checkbox, drag vs buttons — are they consistent and forgiving?)
- **Target Size & Motor Precision** (Fitts's Law) - **Workflow Efficiency** (number of steps, precision required, dead space, context loss for common tasks)
- **Discoverability & Affordance** (does it look clickable/tappable? do secondary actions hide?) - **Error Prevention & Recovery** (destructive actions, undo, clear "I didn't mean that" paths)
- **Hover vs Always-Visible** (actions that require mouse hover to even see options) - **Density vs Clarity** (too much crammed into small areas forcing careful mousing)
- **Keyboard Parity & Activation** (can you do everything the mouse can, with reasonable keys?)
- **Focus & Navigation Flow** (tab order, focus trapping, visible focus, escape hatches) The scanner is **opinionated toward making the happy path for a human operator faster and less error-prone**, even if it means slightly more visual weight or always-visible controls.
- **Feedback & State Transitions** (does the UI react immediately and clearly to every mouse/keyboard action?)
- **Selection & Multi-Action Models** (row click vs checkbox, drag vs buttons — are they consistent and forgiving?) ## Scripts & Tooling
- **Workflow Efficiency** (number of steps, precision required, dead space, context loss for common tasks)
- **Error Prevention & Recovery** (destructive actions, undo, clear "I didn't mean that" paths) - `scripts/scan.mjs` — runnable Node scanner.
- **Density vs Clarity** (too much crammed into small areas forcing careful mousing) - `node scripts/scan.mjs --path <target>` → friction mode (default)
- `node scripts/scan.mjs --path <target> --fancy` → fancy mode (collects signals + prompts for the qualitative beauty pass)
The scanner is **opinionated toward making the happy path for a human operator faster and less error-prone**, even if it means slightly more visual weight or always-visible controls. - The agent is expected to supplement with semantic understanding (reading full components, understanding the user task flow in the app) and the rich references. The fancy pass is intentionally more qualitative than the friction scanner.
## Scripts & Tooling ## Integration with Other Skills
- `scripts/scan.mjs` — runnable Node scanner. - Run **after** `frontend-design` visual validation and **alongside** `impeccable critique/audit`.
- `node scripts/scan.mjs --path <target>` → friction mode (default) - Use `stop-slop` thinking when generating any example fixes or new component code.
- `node scripts/scan.mjs --path <target> --fancy` → fancy mode (collects signals + prompts for the qualitative beauty pass) - Can feed findings into `impeccable polish` or `harden` passes.
- The agent is expected to supplement with semantic understanding (reading full components, understanding the user task flow in the app) and the rich references. The fancy pass is intentionally more qualitative than the friction scanner.
## Output Format (Preferred)
## Integration with Other Skills
```markdown
- Run **after** `frontend-design` visual validation and **alongside** `impeccable critique/audit`. ## Human-Flow Scan: <target>
- Use `stop-slop` thinking when generating any example fixes or new component code.
- Can feed findings into `impeccable polish` or `harden` passes. **Overall Human Workflow Score:** 6.5/10 (Mouse: 7/10, Keyboard: 6/10)
## Output Format (Preferred) ### High Friction (P0)
- **File:** ...:123 — Hover-only row actions
```markdown Why unintuitive: Secondary actions (End, Control) are at 0.5 opacity until hover. A keyboard user or rushed mouse user scanning the list will miss or struggle to target them.
## Human-Flow Scan: <target> Human impact: Common task (ending a session) requires extra precision and discovery step.
Recommendation: ...
**Overall Human Workflow Score:** 6.5/10 (Mouse: 7/10, Keyboard: 6/10) ```
### High Friction (P0) See `references/report-template.md` for the full structure.
- **File:** ...:123 — Hover-only row actions
Why unintuitive: Secondary actions (End, Control) are at 0.5 opacity until hover. A keyboard user or rushed mouse user scanning the list will miss or struggle to target them. ## "Elevate" Mode (`elevate`) — Polish & Redesign
Human impact: Common task (ending a session) requires extra precision and discovery step.
Recommendation: ... Where `scan` finds what *hurts*, `elevate` finds what's *missing to be excellent* — and
``` decides when a screen is beyond polishing and should be **restructured**. It exists because
the maintainer is not a designer: after an `elevate` pass, the UI should feel/look/act as if
See `references/report-template.md` for the full structure. a senior product designer + UI expert + UX team planned it.
## "Elevate" Mode (`elevate`) — Polish & Redesign It is primarily an **agent judgment pass** seeded by static signals — read the component,
understand the user's task, score each dimension 15, then prescribe the concrete better
Where `scan` finds what *hurts*, `elevate` finds what's *missing to be excellent* — and version (a tweak, or a sketched redesign). The 12 heuristics, the scoring model, and the
decides when a screen is beyond polishing and should be **restructured**. It exists because output shape live in `references/polish-and-redesign.md`. In brief:
the maintainer is not a designer: after an `elevate` pass, the UI should feel/look/act as if
a senior product designer + UI expert + UX team planned it. - **12 heuristics:** Hierarchy & Visual Anchors · Signature Moment · Action Gravity ·
Narrative Coherence · Lonely States (empty/error/loading/success) · Progressive
It is primarily an **agent judgment pass** seeded by static signals — read the component, Disclosure & Density · Spacing Rhythm · Typographic Scale · Token Fidelity · Surface/
understand the user's task, score each dimension 15, then prescribe the concrete better Depth/Finish · Intentional Motion · Redesign Triggers.
version (a tweak, or a sketched redesign). The 12 heuristics, the scoring model, and the - **Elevation Index (010):** weighted score, with Hierarchy / Signature / Action Gravity /
output shape live in `references/polish-and-redesign.md`. In brief: Narrative weighted heaviest.
- **Redesign Urgency (05):** if ≥ 4, lead with a Structural Audit ("restructure, don't
- **12 heuristics:** Hierarchy & Visual Anchors · Signature Moment · Action Gravity · polish") and a sketched alternative layout/component tree.
Narrative Coherence · Lonely States (empty/error/loading/success) · Progressive - **Prioritized, not dumped:** `Opportunity = ImpactWeight × (5 score)`; present the top
Disclosure & Density · Spacing Rhythm · Typographic Scale · Token Fidelity · Surface/ 57 as **Quick Wins / Elevations / Redesign Candidates**, each citing file + signal +
Depth/Finish · Intentional Motion · Redesign Triggers. exact replacement.
- **Elevation Index (010):** weighted score, with Hierarchy / Signature / Action Gravity /
Narrative weighted heaviest. Recommended sequence: `scan` (kill friction) → `elevate` (reach top-notch / decide redesign)
- **Redesign Urgency (05):** if ≥ 4, lead with a Structural Audit ("restructure, don't `fancy` (calibrated delight on top).
polish") and a sketched alternative layout/component tree.
- **Prioritized, not dumped:** `Opportunity = ImpactWeight × (5 score)`; present the top ## "Fancy as Fuck" Mode (`fancy`)
57 as **Quick Wins / Elevations / Redesign Candidates**, each citing file + signal +
exact replacement. This is a deliberate second (or standalone) pass focused on **beauty, refinement, and elegant interaction**.
Recommended sequence: `scan` (kill friction) → `elevate` (reach top-notch / decide redesign) ### Philosophy
`fancy` (calibrated delight on top). - Not every interface needs (or should have) fancy elements. The first question is always: *"Does this beauty make the experience more useful?"*
- **Core principle**: Don't be pretty just to be pretty. In the course of being as useful as possible, do it with panache.
## "Fancy as Fuck" Mode (`fancy`) - "Useful decoration" is explicitly welcomed on surfaces where beauty can amplify comprehension, guidance, emotional reassurance, decision speed, or long-term connection to the product.
- On dense internal tools (operator consoles, admin dashboards): favor *restrained luxury* and precision. Think "high-end instrument." Only add panache where it clearly helps the operator.
This is a deliberate second (or standalone) pass focused on **beauty, refinement, and elegant interaction**. - On other surfaces (onboarding, public tools, marketing moments, creative experiences): more room for expressive, delightful "useful decoration."
- Fancy must **serve the human workflow**. It should increase perceived quality, clarity, emotional satisfaction, or effectiveness without adding cognitive load, slowing tasks, or hurting performance/accessibility.
### Philosophy - "Fancy" includes (but is not limited to): transitions & easing, micro-interactions, hover/focus states, page/view transitions (View Transitions API), loading & skeleton experiences, selection/confirmation moments, empty/success states, scroll reveals, depth/layering, typography shifts, cursor feedback, and tasteful celebration moments.
- Not every interface needs (or should have) fancy elements. The first question is always: *"Does this beauty make the experience more useful?"*
- **Core principle**: Don't be pretty just to be pretty. In the course of being as useful as possible, do it with panache. ### How the Fancy Pass Works
- "Useful decoration" is explicitly welcomed on surfaces where beauty can amplify comprehension, guidance, emotional reassurance, decision speed, or long-term connection to the product. 1. Assess appropriateness for the surface and user context.
- On dense internal tools (operator consoles, admin dashboards): favor *restrained luxury* and precision. Think "high-end instrument." Only add panache where it clearly helps the operator. 2. Look for **opportunities** where a small amount of elegant motion or feedback would make interactions feel more premium and alive.
- On other surfaces (onboarding, public tools, marketing moments, creative experiences): more room for expressive, delightful "useful decoration." 3. Critique existing attempts that are half-baked, janky, overused, or performance-negative.
- Fancy must **serve the human workflow**. It should increase perceived quality, clarity, emotional satisfaction, or effectiveness without adding cognitive load, slowing tasks, or hurting performance/accessibility. 4. Suggest specific, high-craft refinements and polish.
- "Fancy" includes (but is not limited to): transitions & easing, micro-interactions, hover/focus states, page/view transitions (View Transitions API), loading & skeleton experiences, selection/confirmation moments, empty/success states, scroll reveals, depth/layering, typography shifts, cursor feedback, and tasteful celebration moments. 5. Always respect `prefers-reduced-motion` and provide graceful fallbacks.
### How the Fancy Pass Works See `references/fancy-as-fuck.md` for the full set of beauty/elegance heuristics, appropriateness guidelines, and examples.
1. Assess appropriateness for the surface and user context.
2. Look for **opportunities** where a small amount of elegant motion or feedback would make interactions feel more premium and alive. ### Recommended Invocation Pattern
3. Critique existing attempts that are half-baked, janky, overused, or performance-negative. ```bash
4. Suggest specific, high-craft refinements and polish. # Friction first
5. Always respect `prefers-reduced-motion` and provide graceful fallbacks. human-flow scan the dashboard
See `references/fancy-as-fuck.md` for the full set of beauty/elegance heuristics, appropriateness guidelines, and examples. # Then delight
human-flow fancy the dashboard
### Recommended Invocation Pattern ```
```bash
# Friction first The output of a `fancy` pass should live in its own section of the report (or a dedicated delight report) and feed nicely into `impeccable polish` or `delight` work.
human-flow scan the dashboard
## Creating / Extending
# Then delight
human-flow fancy the dashboard - Add new heuristics to `references/mouse-keyboard-heuristics.md` (with detection hints and "better human workflow" examples).
``` - Add fancy/delights ideas to `references/fancy-as-fuck.md`.
- Add polish/redesign heuristics to `references/polish-and-redesign.md` (the `elevate` layer).
The output of a `fancy` pass should live in its own section of the report (or a dedicated delight report) and feed nicely into `impeccable polish` or `delight` work. - Update the scanner script for new static patterns (fancy detection is intentionally more qualitative).
- The skill is designed to be extended — new categories of mouse/keyboard friction **and** opportunities for tasteful elegance are welcome.
## Creating / Extending
**Remember:** The goal is not "perfect accessibility" in isolation or "pretty UI". It is **making the actual anticipated physical and cognitive workflow of a human with a mouse and a keyboard feel natural, fast, and low-friction** — and, when appropriate, doing so with panache and high craft. Beauty that serves usefulness is excellent. Beauty for its own sake is noise.
- Add new heuristics to `references/mouse-keyboard-heuristics.md` (with detection hints and "better human workflow" examples).
- Add fancy/delights ideas to `references/fancy-as-fuck.md`.
- Add polish/redesign heuristics to `references/polish-and-redesign.md` (the `elevate` layer).
- Update the scanner script for new static patterns (fancy detection is intentionally more qualitative).
- The skill is designed to be extended — new categories of mouse/keyboard friction **and** opportunities for tasteful elegance are welcome.
**Remember:** The goal is not "perfect accessibility" in isolation or "pretty UI". It is **making the actual anticipated physical and cognitive workflow of a human with a mouse and a keyboard feel natural, fast, and low-friction** — and, when appropriate, doing so with panache and high craft. Beauty that serves usefulness is excellent. Beauty for its own sake is noise.

View File

@@ -1,168 +1,168 @@
--- ---
name: impeccable name: impeccable
description: "Use when the user wants to design, redesign, shape, critique, audit, polish, clarify, distill, harden, optimize, adapt, animate, colorize, extract, or otherwise improve a frontend interface. Covers websites, landing pages, dashboards, product UI, app shells, components, forms, settings, onboarding, and empty states. Handles UX review, visual hierarchy, information architecture, cognitive load, accessibility, performance, responsive behavior, theming, anti-patterns, typography, fonts, spacing, layout, alignment, color, motion, micro-interactions, UX copy, error states, edge cases, i18n, and reusable design systems or tokens. Also use for bland designs that need to become bolder or more delightful, loud designs that should become quieter, live browser iteration on UI elements, or ambitious visual effects that should feel technically extraordinary. Not for backend-only or non-UI tasks." description: "Design, redesign, critique, audit, or polish a frontend interface (sites, landing pages, dashboards, app UI, components, forms, onboarding, empty states). Covers UX review, visual hierarchy, IA, accessibility, performance, responsive, theming, typography, spacing, color, motion, copy, design systems/tokens. Not for backend/non-UI."
argument-hint: "[{{command_hint}}] [target]" argument-hint: "[{{command_hint}}] [target]"
user-invocable: true user-invocable: true
allowed-tools: allowed-tools:
- Bash(npx impeccable *) - Bash(npx impeccable *)
license: Apache 2.0. Based on Anthropic's frontend-design skill. See NOTICE.md for attribution. license: Apache 2.0. Based on Anthropic's frontend-design skill. See NOTICE.md for attribution.
--- ---
Designs and iterates production-grade frontend interfaces. Real working code, committed design choices, exceptional craft. Designs and iterates production-grade frontend interfaces. Real working code, committed design choices, exceptional craft.
## Setup ## Setup
Before any design work or file edits: Before any design work or file edits:
1. Load context (PRODUCT.md / DESIGN.md) via the loader script. 1. Load context (PRODUCT.md / DESIGN.md) via the loader script.
2. Identify the register and load the matching register reference (brand.md or product.md). 2. Identify the register and load the matching register reference (brand.md or product.md).
3. **If the user invoked a sub-command (e.g. `craft`, `shape`, `audit`), load its reference file too.** This is non-negotiable: `craft` without `craft.md` loaded means you'll skip the shape-and-confirm step the user expects. 3. **If the user invoked a sub-command (e.g. `craft`, `shape`, `audit`), load its reference file too.** This is non-negotiable: `craft` without `craft.md` loaded means you'll skip the shape-and-confirm step the user expects.
Skipping these produces generic output that ignores the project. Skipping these produces generic output that ignores the project.
### 1. Context gathering ### 1. Context gathering
Two files, case-insensitive. The loader looks at the project root by default and falls back to `.agents/context/` and `docs/` if the root is clean. Override with `IMPECCABLE_CONTEXT_DIR=path/to/dir` (absolute or relative to cwd). Two files, case-insensitive. The loader looks at the project root by default and falls back to `.agents/context/` and `docs/` if the root is clean. Override with `IMPECCABLE_CONTEXT_DIR=path/to/dir` (absolute or relative to cwd).
- **PRODUCT.md**: required. Users, brand, tone, anti-references, strategic principles. - **PRODUCT.md**: required. Users, brand, tone, anti-references, strategic principles.
- **DESIGN.md**: optional, strongly recommended. Colors, typography, elevation, components. - **DESIGN.md**: optional, strongly recommended. Colors, typography, elevation, components.
Load both in one call: Load both in one call:
```bash ```bash
node {{scripts_path}}/load-context.mjs node {{scripts_path}}/load-context.mjs
``` ```
Consume the full JSON output. Never pipe through `head`, `tail`, `grep`, or `jq`. The output's `contextDir` field tells you where the files were resolved from. Consume the full JSON output. Never pipe through `head`, `tail`, `grep`, or `jq`. The output's `contextDir` field tells you where the files were resolved from.
If the output is already in this session's conversation history, don't re-run. Exceptions requiring a fresh load: you just ran `{{command_prefix}}impeccable teach` or `{{command_prefix}}impeccable document` (they rewrite the files), or the user manually edited one. If the output is already in this session's conversation history, don't re-run. Exceptions requiring a fresh load: you just ran `{{command_prefix}}impeccable teach` or `{{command_prefix}}impeccable document` (they rewrite the files), or the user manually edited one.
`{{command_prefix}}impeccable live` already warms context via `live.mjs`. If you've run `live.mjs`, don't also run `load-context.mjs` this session. `{{command_prefix}}impeccable live` already warms context via `live.mjs`. If you've run `live.mjs`, don't also run `load-context.mjs` this session.
If PRODUCT.md is missing, empty, or placeholder (`[TODO]` markers, <200 chars): run `{{command_prefix}}impeccable teach`, then resume the user's original task with the fresh context. If the original task was `{{command_prefix}}impeccable craft`, resume into `{{command_prefix}}impeccable shape` before any implementation work. If PRODUCT.md is missing, empty, or placeholder (`[TODO]` markers, <200 chars): run `{{command_prefix}}impeccable teach`, then resume the user's original task with the fresh context. If the original task was `{{command_prefix}}impeccable craft`, resume into `{{command_prefix}}impeccable shape` before any implementation work.
If DESIGN.md is missing: nudge once per session (*"Run `{{command_prefix}}impeccable document` for more on-brand output"*), then proceed. If DESIGN.md is missing: nudge once per session (*"Run `{{command_prefix}}impeccable document` for more on-brand output"*), then proceed.
### 2. Register ### 2. Register
Every design task is **brand** (marketing, landing, campaign, long-form content, portfolio: design IS the product) or **product** (app UI, admin, dashboard, tool: design SERVES the product). Every design task is **brand** (marketing, landing, campaign, long-form content, portfolio: design IS the product) or **product** (app UI, admin, dashboard, tool: design SERVES the product).
Identify before designing. Priority: (1) cue in the task itself ("landing page" vs "dashboard"); (2) the surface in focus (the page, file, or route being worked on); (3) `register` field in PRODUCT.md. First match wins. Identify before designing. Priority: (1) cue in the task itself ("landing page" vs "dashboard"); (2) the surface in focus (the page, file, or route being worked on); (3) `register` field in PRODUCT.md. First match wins.
If PRODUCT.md lacks the `register` field (legacy), infer it once from its "Users" and "Product Purpose" sections, then cache the inferred value for the session. Suggest the user run `{{command_prefix}}impeccable teach` to add the field explicitly. If PRODUCT.md lacks the `register` field (legacy), infer it once from its "Users" and "Product Purpose" sections, then cache the inferred value for the session. Suggest the user run `{{command_prefix}}impeccable teach` to add the field explicitly.
Load the matching reference: [reference/brand.md](reference/brand.md) or [reference/product.md](reference/product.md). The shared design laws below apply to both. Load the matching reference: [reference/brand.md](reference/brand.md) or [reference/product.md](reference/product.md). The shared design laws below apply to both.
## Shared design laws ## Shared design laws
Apply to every design, both registers. Match implementation complexity to the aesthetic vision: maximalism needs elaborate code, minimalism needs precision. Interpret creatively. Vary across projects; never converge on the same choices. {{model}} is capable of extraordinary work. Don't hold back. Apply to every design, both registers. Match implementation complexity to the aesthetic vision: maximalism needs elaborate code, minimalism needs precision. Interpret creatively. Vary across projects; never converge on the same choices. {{model}} is capable of extraordinary work. Don't hold back.
### Color ### Color
- Use OKLCH. Reduce chroma as lightness approaches 0 or 100; high chroma at extremes looks garish. - Use OKLCH. Reduce chroma as lightness approaches 0 or 100; high chroma at extremes looks garish.
- Never use `#000` or `#fff`. Tint every neutral toward the brand hue (chroma 0.0050.01 is enough). - Never use `#000` or `#fff`. Tint every neutral toward the brand hue (chroma 0.0050.01 is enough).
- Pick a **color strategy** before picking colors. Four steps on the commitment axis: - Pick a **color strategy** before picking colors. Four steps on the commitment axis:
- **Restrained**: tinted neutrals + one accent ≤10%. Product default; brand minimalism. - **Restrained**: tinted neutrals + one accent ≤10%. Product default; brand minimalism.
- **Committed**: one saturated color carries 3060% of the surface. Brand default for identity-driven pages. - **Committed**: one saturated color carries 3060% of the surface. Brand default for identity-driven pages.
- **Full palette**: 34 named roles, each used deliberately. Brand campaigns; product data viz. - **Full palette**: 34 named roles, each used deliberately. Brand campaigns; product data viz.
- **Drenched**: the surface IS the color. Brand heroes, campaign pages. - **Drenched**: the surface IS the color. Brand heroes, campaign pages.
- The "one accent ≤10%" rule is Restrained only. Committed / Full palette / Drenched exceed it on purpose. Don't collapse every design to Restrained by reflex. - The "one accent ≤10%" rule is Restrained only. Committed / Full palette / Drenched exceed it on purpose. Don't collapse every design to Restrained by reflex.
### Theme ### Theme
Dark vs. light is never a default. Not dark "because tools look cool dark." Not light "to be safe." Dark vs. light is never a default. Not dark "because tools look cool dark." Not light "to be safe."
Before choosing, write one sentence of physical scene: who uses this, where, under what ambient light, in what mood. If the sentence doesn't force the answer, it's not concrete enough. Add detail until it does. Before choosing, write one sentence of physical scene: who uses this, where, under what ambient light, in what mood. If the sentence doesn't force the answer, it's not concrete enough. Add detail until it does.
"Observability dashboard" does not force an answer. "SRE glancing at incident severity on a 27-inch monitor at 2am in a dim room" does. Run the sentence, not the category. "Observability dashboard" does not force an answer. "SRE glancing at incident severity on a 27-inch monitor at 2am in a dim room" does. Run the sentence, not the category.
### Typography ### Typography
- Cap body line length at 6575ch. - Cap body line length at 6575ch.
- Hierarchy through scale + weight contrast (≥1.25 ratio between steps). Avoid flat scales. - Hierarchy through scale + weight contrast (≥1.25 ratio between steps). Avoid flat scales.
### Layout ### Layout
- Vary spacing for rhythm. Same padding everywhere is monotony. - Vary spacing for rhythm. Same padding everywhere is monotony.
- Cards are the lazy answer. Use them only when they're truly the best affordance. Nested cards are always wrong. - Cards are the lazy answer. Use them only when they're truly the best affordance. Nested cards are always wrong.
- Don't wrap everything in a container. Most things don't need one. - Don't wrap everything in a container. Most things don't need one.
### Motion ### Motion
- Don't animate CSS layout properties. - Don't animate CSS layout properties.
- Ease out with exponential curves (ease-out-quart / quint / expo). No bounce, no elastic. - Ease out with exponential curves (ease-out-quart / quint / expo). No bounce, no elastic.
### Absolute bans ### Absolute bans
Match-and-refuse. If you're about to write any of these, rewrite the element with different structure. Match-and-refuse. If you're about to write any of these, rewrite the element with different structure.
- **Side-stripe borders.** `border-left` or `border-right` greater than 1px as a colored accent on cards, list items, callouts, or alerts. Never intentional. Rewrite with full borders, background tints, leading numbers/icons, or nothing. - **Side-stripe borders.** `border-left` or `border-right` greater than 1px as a colored accent on cards, list items, callouts, or alerts. Never intentional. Rewrite with full borders, background tints, leading numbers/icons, or nothing.
- **Gradient text.** `background-clip: text` combined with a gradient background. Decorative, never meaningful. Use a single solid color. Emphasis via weight or size. - **Gradient text.** `background-clip: text` combined with a gradient background. Decorative, never meaningful. Use a single solid color. Emphasis via weight or size.
- **Glassmorphism as default.** Blurs and glass cards used decoratively. Rare and purposeful, or nothing. - **Glassmorphism as default.** Blurs and glass cards used decoratively. Rare and purposeful, or nothing.
- **The hero-metric template.** Big number, small label, supporting stats, gradient accent. SaaS cliché. - **The hero-metric template.** Big number, small label, supporting stats, gradient accent. SaaS cliché.
- **Identical card grids.** Same-sized cards with icon + heading + text, repeated endlessly. - **Identical card grids.** Same-sized cards with icon + heading + text, repeated endlessly.
- **Modal as first thought.** Modals are usually laziness. Exhaust inline / progressive alternatives first. - **Modal as first thought.** Modals are usually laziness. Exhaust inline / progressive alternatives first.
### Copy ### Copy
- Every word earns its place. No restated headings, no intros that repeat the title. - Every word earns its place. No restated headings, no intros that repeat the title.
- **No em dashes.** Use commas, colons, semicolons, periods, or parentheses. Also not `--`. - **No em dashes.** Use commas, colons, semicolons, periods, or parentheses. Also not `--`.
### The AI slop test ### The AI slop test
If someone could look at this interface and say "AI made that" without doubt, it's failed. Cross-register failures are the absolute bans above. Register-specific failures live in each reference. If someone could look at this interface and say "AI made that" without doubt, it's failed. Cross-register failures are the absolute bans above. Register-specific failures live in each reference.
**Category-reflex check.** Run at two altitudes; the second one catches what the first one misses. **Category-reflex check.** Run at two altitudes; the second one catches what the first one misses.
- **First-order:** if someone could guess the theme + palette from the category alone ("observability → dark blue", "healthcare → white + teal", "finance → navy + gold", "crypto → neon on black"), it's the first training-data reflex. Rework the scene sentence and color strategy until the answer isn't obvious from the domain. - **First-order:** if someone could guess the theme + palette from the category alone ("observability → dark blue", "healthcare → white + teal", "finance → navy + gold", "crypto → neon on black"), it's the first training-data reflex. Rework the scene sentence and color strategy until the answer isn't obvious from the domain.
- **Second-order:** if someone could guess the aesthetic family from category-plus-anti-references ("AI workflow tool that's not SaaS-cream → editorial-typographic", "fintech that's not navy-and-gold → terminal-native dark mode"), it's the trap one tier deeper. The first reflex was avoided; the second wasn't. Rework until both answers are not obvious. The brand register's [reflex-reject aesthetic lanes](reference/brand.md) list catches the currently-saturated families. - **Second-order:** if someone could guess the aesthetic family from category-plus-anti-references ("AI workflow tool that's not SaaS-cream → editorial-typographic", "fintech that's not navy-and-gold → terminal-native dark mode"), it's the trap one tier deeper. The first reflex was avoided; the second wasn't. Rework until both answers are not obvious. The brand register's [reflex-reject aesthetic lanes](reference/brand.md) list catches the currently-saturated families.
## Commands ## Commands
| Command | Category | Description | Reference | | Command | Category | Description | Reference |
|---|---|---|---| |---|---|---|---|
| `craft [feature]` | Build | Shape, then build a feature end-to-end | [reference/craft.md](reference/craft.md) | | `craft [feature]` | Build | Shape, then build a feature end-to-end | [reference/craft.md](reference/craft.md) |
| `shape [feature]` | Build | Plan UX/UI before writing code | [reference/shape.md](reference/shape.md) | | `shape [feature]` | Build | Plan UX/UI before writing code | [reference/shape.md](reference/shape.md) |
| `teach` | Build | Set up PRODUCT.md and DESIGN.md context | [reference/teach.md](reference/teach.md) | | `teach` | Build | Set up PRODUCT.md and DESIGN.md context | [reference/teach.md](reference/teach.md) |
| `document` | Build | Generate DESIGN.md from existing project code | [reference/document.md](reference/document.md) | | `document` | Build | Generate DESIGN.md from existing project code | [reference/document.md](reference/document.md) |
| `extract [target]` | Build | Pull reusable tokens and components into design system | [reference/extract.md](reference/extract.md) | | `extract [target]` | Build | Pull reusable tokens and components into design system | [reference/extract.md](reference/extract.md) |
| `critique [target]` | Evaluate | UX design review with heuristic scoring | [reference/critique.md](reference/critique.md) | | `critique [target]` | Evaluate | UX design review with heuristic scoring | [reference/critique.md](reference/critique.md) |
| `audit [target]` | Evaluate | Technical quality checks (a11y, perf, responsive) | [reference/audit.md](reference/audit.md) | | `audit [target]` | Evaluate | Technical quality checks (a11y, perf, responsive) | [reference/audit.md](reference/audit.md) |
| `polish [target]` | Refine | Final quality pass before shipping | [reference/polish.md](reference/polish.md) | | `polish [target]` | Refine | Final quality pass before shipping | [reference/polish.md](reference/polish.md) |
| `bolder [target]` | Refine | Amplify safe or bland designs | [reference/bolder.md](reference/bolder.md) | | `bolder [target]` | Refine | Amplify safe or bland designs | [reference/bolder.md](reference/bolder.md) |
| `quieter [target]` | Refine | Tone down aggressive or overstimulating designs | [reference/quieter.md](reference/quieter.md) | | `quieter [target]` | Refine | Tone down aggressive or overstimulating designs | [reference/quieter.md](reference/quieter.md) |
| `distill [target]` | Refine | Strip to essence, remove complexity | [reference/distill.md](reference/distill.md) | | `distill [target]` | Refine | Strip to essence, remove complexity | [reference/distill.md](reference/distill.md) |
| `harden [target]` | Refine | Production-ready: errors, i18n, edge cases | [reference/harden.md](reference/harden.md) | | `harden [target]` | Refine | Production-ready: errors, i18n, edge cases | [reference/harden.md](reference/harden.md) |
| `onboard [target]` | Refine | Design first-run flows, empty states, activation | [reference/onboard.md](reference/onboard.md) | | `onboard [target]` | Refine | Design first-run flows, empty states, activation | [reference/onboard.md](reference/onboard.md) |
| `animate [target]` | Enhance | Add purposeful animations and motion | [reference/animate.md](reference/animate.md) | | `animate [target]` | Enhance | Add purposeful animations and motion | [reference/animate.md](reference/animate.md) |
| `colorize [target]` | Enhance | Add strategic color to monochromatic UIs | [reference/colorize.md](reference/colorize.md) | | `colorize [target]` | Enhance | Add strategic color to monochromatic UIs | [reference/colorize.md](reference/colorize.md) |
| `typeset [target]` | Enhance | Improve typography hierarchy and fonts | [reference/typeset.md](reference/typeset.md) | | `typeset [target]` | Enhance | Improve typography hierarchy and fonts | [reference/typeset.md](reference/typeset.md) |
| `layout [target]` | Enhance | Fix spacing, rhythm, and visual hierarchy | [reference/layout.md](reference/layout.md) | | `layout [target]` | Enhance | Fix spacing, rhythm, and visual hierarchy | [reference/layout.md](reference/layout.md) |
| `delight [target]` | Enhance | Add personality and memorable touches | [reference/delight.md](reference/delight.md) | | `delight [target]` | Enhance | Add personality and memorable touches | [reference/delight.md](reference/delight.md) |
| `overdrive [target]` | Enhance | Push past conventional limits | [reference/overdrive.md](reference/overdrive.md) | | `overdrive [target]` | Enhance | Push past conventional limits | [reference/overdrive.md](reference/overdrive.md) |
| `clarify [target]` | Fix | Improve UX copy, labels, and error messages | [reference/clarify.md](reference/clarify.md) | | `clarify [target]` | Fix | Improve UX copy, labels, and error messages | [reference/clarify.md](reference/clarify.md) |
| `adapt [target]` | Fix | Adapt for different devices and screen sizes | [reference/adapt.md](reference/adapt.md) | | `adapt [target]` | Fix | Adapt for different devices and screen sizes | [reference/adapt.md](reference/adapt.md) |
| `optimize [target]` | Fix | Diagnose and fix UI performance | [reference/optimize.md](reference/optimize.md) | | `optimize [target]` | Fix | Diagnose and fix UI performance | [reference/optimize.md](reference/optimize.md) |
| `live` | Iterate | Visual variant mode: pick elements in the browser, generate alternatives | [reference/live.md](reference/live.md) | | `live` | Iterate | Visual variant mode: pick elements in the browser, generate alternatives | [reference/live.md](reference/live.md) |
Plus two management commands: `pin <command>` and `unpin <command>`, detailed below. Plus two management commands: `pin <command>` and `unpin <command>`, detailed below.
### Routing rules ### Routing rules
1. **No argument**: render the table above as the user-facing command menu, grouped by category. Ask what they'd like to do. 1. **No argument**: render the table above as the user-facing command menu, grouped by category. Ask what they'd like to do.
2. **First word matches a command**: load its reference file and follow its instructions. Everything after the command name is the target. 2. **First word matches a command**: load its reference file and follow its instructions. Everything after the command name is the target.
3. **First word doesn't match**: general design invocation. Apply the setup steps, shared design laws, and the loaded register reference, using the full argument as context. 3. **First word doesn't match**: general design invocation. Apply the setup steps, shared design laws, and the loaded register reference, using the full argument as context.
Setup (context gathering, register) is already loaded by then; sub-commands don't re-invoke `{{command_prefix}}impeccable`. Setup (context gathering, register) is already loaded by then; sub-commands don't re-invoke `{{command_prefix}}impeccable`.
If the first word is `craft`, setup still runs first, but [reference/craft.md](reference/craft.md) owns the rest of the flow. If setup invokes `teach` as a blocker, finish teach, refresh context, then resume the original command and target. If the first word is `craft`, setup still runs first, but [reference/craft.md](reference/craft.md) owns the rest of the flow. If setup invokes `teach` as a blocker, finish teach, refresh context, then resume the original command and target.
## Pin / Unpin ## Pin / Unpin
**Pin** creates a standalone shortcut so `{{command_prefix}}<command>` invokes `{{command_prefix}}impeccable <command>` directly. **Unpin** removes it. The script writes to every harness directory present in the project. **Pin** creates a standalone shortcut so `{{command_prefix}}<command>` invokes `{{command_prefix}}impeccable <command>` directly. **Unpin** removes it. The script writes to every harness directory present in the project.
```bash ```bash
node {{scripts_path}}/pin.mjs <pin|unpin> <command> node {{scripts_path}}/pin.mjs <pin|unpin> <command>
``` ```
Valid `<command>` is any command from the table above. Report the script's result concisely. Confirm the new shortcut on success, relay stderr verbatim on error. Valid `<command>` is any command from the table above. Report the script's result concisely. Confirm the new shortcut on success, relay stderr verbatim on error.

View File

@@ -1,166 +1,156 @@
--- ---
name: mailprotector name: mailprotector
description: >- description: "Manage the ACG Mailprotector CloudFilter email-security gateway (emailservice.io). Search/release held/quarantined mail (in+outbound), pull mail-flow logs (why a message did/did not deliver), inspect + manage allow/block rules. Read-only default; releases/rule-changes gated --confirm. Triggers: mailprotector, cloudfilter, held/quarantined mail, release email, allow/block rule, INKY. Live production."
Manage the Arizona Computer Guru (ACG) Mailprotector CloudFilter email-security
gateway via the live CloudFilter REST API (emailservice.io). Search and release ---
held / quarantined mail (inbound and outbound), pull mail-flow logs to explain
why a message did or did not deliver, inspect entity configuration and # Mailprotector / CloudFilter Skill
allow/block rules, find a user or alias by email, and manage allow/block rules.
Read-only by default; every release / rule-add / config-change is gated behind Standalone CLI client for the **Mailprotector CloudFilter REST API**
--confirm. Invoke for: "mailprotector", "cloudfilter", "emailservice.io", "held (`emailservice.io`), the reseller email-security platform ACG layers on top of
mail", "quarantined email", "release email", "outbound quarantine", "why didn't client mail flow. Read-only by default; every write (release, rule add, config
my email arrive", "email security gateway", "INKY", "mail flow logs", "allow change) is gated behind `--confirm`.
block rule", "release spam". This skill talks to the LIVE production reseller
CloudFilter platform — treat releases conservatively. ## The two-layer context (important)
---
ACG's email security sits in front of client mailboxes as two cooperating layers:
# Mailprotector / CloudFilter Skill
| Layer | What it does |
Standalone CLI client for the **Mailprotector CloudFilter REST API** |---|---|
(`emailservice.io`), the reseller email-security platform ACG layers on top of | **Mailprotector CloudFilter** | The delivery / filtering gateway. Inbound and outbound mail passes through it; spam, virus, and policy hits are **held / quarantined** here. Releasing a held message re-injects it for delivery. This is the API this skill drives. |
client mail flow. Read-only by default; every write (release, rule add, config | **INKY** | Email annotation / phishing-banner layer. Adds the warning banners and protects against impersonation. Not part of this API surface. |
change) is gated behind `--confirm`.
Both sit **layered on top of the client's own Exchange / M365 mail flow** — so a
## The two-layer context (important) "missing email" investigation usually means: was it held at CloudFilter (check
`messages` / `logs`), or did it pass CloudFilter and stall in Exchange?
ACG's email security sits in front of client mailboxes as two cooperating layers:
## Connection
| Layer | What it does |
|---|---| | Item | Value |
| **Mailprotector CloudFilter** | The delivery / filtering gateway. Inbound and outbound mail passes through it; spam, virus, and policy hits are **held / quarantined** here. Releasing a held message re-injects it for delivery. This is the API this skill drives. | |---|---|
| **INKY** | Email annotation / phishing-banner layer. Adds the warning banners and protects against impersonation. Not part of this API surface. | | Base URL | `https://emailservice.io/api/v1` (override `MAILPROTECTOR_API_BASE_URL`) |
| Auth | `Authorization: Bearer <api_key>` |
Both sit **layered on top of the client's own Exchange / M365 mail flow** — so a | Vault entry | `msp-tools/mailprotector.sops.yaml`, field `credentials.api_key` |
"missing email" investigation usually means: was it held at CloudFilter (check | Env override | `MAILPROTECTOR_API_KEY` |
`messages` / `logs`), or did it pass CloudFilter and stall in Exchange?
Credential resolution order: `MAILPROTECTOR_API_KEY` env -> vault
## Connection `credentials.api_key`. The key is never hardcoded; a clear setup error is raised
if neither resolves.
| Item | Value |
|---|---| ### Scopes
| Base URL | `https://emailservice.io/api/v1` (override `MAILPROTECTOR_API_BASE_URL`) |
| Auth | `Authorization: Bearer <api_key>` | Five entity types carry `logs` / `messages` / `configuration` /
| Vault entry | `msp-tools/mailprotector.sops.yaml`, field `credentials.api_key` | `allow_block_rules` / `users` / `domains` sub-resources. Path form is
| Env override | `MAILPROTECTOR_API_KEY` | `/{scope}/{id}/...`:
Credential resolution order: `MAILPROTECTOR_API_KEY` env -> vault ```
`credentials.api_key`. The key is never hardcoded; a clear setup error is raised resellers, customers, domains, user_groups, users
if neither resolves. ```
### Scopes The CLI validates `scope` against this set.
Five entity types carry `logs` / `messages` / `configuration` / ## Running the CLI
`allow_block_rules` / `users` / `domains` sub-resources. Path form is
`/{scope}/{id}/...`: This machine's Python launcher is `py` (per identity.json); `python` / `python3`
also work. Run from the scripts dir so the two modules resolve.
```
resellers, customers, domains, user_groups, users ```bash
``` cd C:/claudetools/.claude/skills/mailprotector/scripts
The CLI validates `scope` against this set. py mp.py status # validate token (GET /domains, per_page=1)
py mp.py domains # list domains (global)
## Running the CLI py mp.py domains --scope customers --id <id>
py mp.py domain <domain_id>
This machine's Python launcher is `py` (per identity.json); `python` / `python3` py mp.py customers <reseller_id>
also work. Run from the scripts dir so the two modules resolve. py mp.py customer <customer_id>
py mp.py users <scope> <id>
```bash py mp.py user <user_id>
cd C:/claudetools/.claude/skills/mailprotector/scripts py mp.py find-user user@client.com # locate a user / alias by email (a READ)
py mp.py config <scope> <id> # shows permissions.messages.allow_spam_release
py mp.py status # validate token (GET /domains, per_page=1) py mp.py rules <scope> <id>
py mp.py domains # list domains (global) ```
py mp.py domains --scope customers --id <id>
py mp.py domain <domain_id> ### Mail-flow logs and held mail (the common investigation)
py mp.py customers <reseller_id>
py mp.py customer <customer_id> Both accept the same filters: `--sender --recipient --subject --decision
py mp.py users <scope> <id> --sort-field --sort-direction --page --page-size`.
py mp.py user <user_id>
py mp.py find-user user@client.com # locate a user / alias by email (a READ) ```bash
py mp.py config <scope> <id> # shows permissions.messages.allow_spam_release # Why didn't this arrive? Look at the decision in the flow logs.
py mp.py rules <scope> <id> py mp.py logs domains <domain_id> --recipient ceo@client.com --decision quarantine_spam
```
# Held / quarantined mail search.
### Mail-flow logs and held mail (the common investigation) py mp.py messages domains <domain_id> --sender boss@vendor.com
```
Both accept the same filters: `--sender --recipient --subject --decision
--sort-field --sort-direction --page --page-size`. `--decision` values: `default`, `deliver`, `quarantine_spam`,
`quarantine_virus`, `quarantine_policy`, `bounce`, `encrypt`, `delete`.
```bash `--sort-field` values: `@timestamp` (default), `prime.direction`,
# Why didn't this arrive? Look at the decision in the flow logs. `prime.from_header_raw`, `prime.recipient`, `prime.subject`, `prime.decision`,
py mp.py logs domains <domain_id> --recipient ceo@client.com --decision quarantine_spam `prime.score`.
# Held / quarantined mail search. ## Writes (gated)
py mp.py messages domains <domain_id> --sender boss@vendor.com
``` Every mutating command prints a `[DRY RUN]` line and exits non-zero unless you
pass `--confirm`.
`--decision` values: `default`, `deliver`, `quarantine_spam`,
`quarantine_virus`, `quarantine_policy`, `bounce`, `encrypt`, `delete`. ```bash
`--sort-field` values: `@timestamp` (default), `prime.direction`, py mp.py release <message_id> --confirm
`prime.from_header_raw`, `prime.recipient`, `prime.subject`, `prime.decision`, py mp.py release <message_id> --recipients alt@client.com --confirm
`prime.score`. py mp.py release-many <scope> <id> --ids 111,222,333 --confirm
py mp.py release-many <scope> <id> --all --confirm
## Writes (gated) py mp.py add-rule <scope> <id> --value vendor.com --type allow --confirm
py mp.py enable-release <scope> <id> --confirm
Every mutating command prints a `[DRY RUN]` line and exits non-zero unless you ```
pass `--confirm`.
## The `allow_spam_release` gotcha
```bash
py mp.py release <message_id> --confirm Releasing a held **spam** message fails if the owning entity does not have
py mp.py release <message_id> --recipients alt@client.com --confirm `permissions.messages.allow_spam_release = true`. Workflow:
py mp.py release-many <scope> <id> --ids 111,222,333 --confirm
py mp.py release-many <scope> <id> --all --confirm 1. `py mp.py config <scope> <id>` — check `allow_spam_release`.
py mp.py add-rule <scope> <id> --value vendor.com --type allow --confirm 2. If `false`: `py mp.py enable-release <scope> <id> --confirm`.
py mp.py enable-release <scope> <id> --confirm 3. Re-run the `release` / `release-many`.
```
Virus and policy quarantines are governed separately — only spam release is
## The `allow_spam_release` gotcha gated by this permission.
Releasing a held **spam** message fails if the owning entity does not have ## Example workflow: find a client's held outbound mail from a sender and release it
`permissions.messages.allow_spam_release = true`. Workflow:
```bash
1. `py mp.py config <scope> <id>` — check `allow_spam_release`. # 1. Find the client's domain.
2. If `false`: `py mp.py enable-release <scope> <id> --confirm`. py mp.py domains --scope customers --id <customer_id>
3. Re-run the `release` / `release-many`.
# 2. Search held messages from the sender (outbound = sender is the client user).
Virus and policy quarantines are governed separately — only spam release is py mp.py messages domains <domain_id> --sender user@client.com --decision quarantine_spam
gated by this permission.
# 3. If it's spam-held, make sure release is permitted on the domain.
## Example workflow: find a client's held outbound mail from a sender and release it py mp.py config domains <domain_id> # check allow_spam_release
py mp.py enable-release domains <domain_id> --confirm # only if needed
```bash
# 1. Find the client's domain. # 4. Release by message id (DRY RUN first — omit --confirm to preview).
py mp.py domains --scope customers --id <customer_id> py mp.py release <message_id> # [DRY RUN]
py mp.py release <message_id> --confirm # actually release
# 2. Search held messages from the sender (outbound = sender is the client user). ```
py mp.py messages domains <domain_id> --sender user@client.com --decision quarantine_spam
## Raw escape hatch
# 3. If it's spam-held, make sure release is permitted on the domain.
py mp.py config domains <domain_id> # check allow_spam_release The named commands cover the common surface; for anything else, hit the path
py mp.py enable-release domains <domain_id> --confirm # only if needed directly. Non-GET methods still require `--confirm`.
# 4. Release by message id (DRY RUN first — omit --confirm to preview). ```bash
py mp.py release <message_id> # [DRY RUN] py mp.py raw GET domains/<id>/logs
py mp.py release <message_id> --confirm # actually release py mp.py raw POST messages/<id>/deliver --body '{"include_original_recipients":1}' --confirm
``` ```
## Raw escape hatch ## Notes
The named commands cover the common surface; for anything else, hit the path - This is the **LIVE production reseller CloudFilter platform**. A release
directly. Non-GET methods still require `--confirm`. re-delivers real mail to real recipients, and an allow rule can let real spam
or phishing through — confirm the target entity with a read command before any
```bash write, and prefer releasing specific message ids over `--all`.
py mp.py raw GET domains/<id>/logs - Pagination: `page` (default 1) and `per_page` (default 25); reseller
py mp.py raw POST messages/<id>/deliver --body '{"include_original_recipients":1}' --confirm `messages` caps `per_page` at 50. The `X-Pagination` response header carries
``` the page/total metadata.
- Full endpoint catalog, filter tables, and the global `field[op]=value`
## Notes operators live in `references/api.md`.
- This is the **LIVE production reseller CloudFilter platform**. A release
re-delivers real mail to real recipients, and an allow rule can let real spam
or phishing through — confirm the target entity with a read command before any
write, and prefer releasing specific message ids over `--all`.
- Pagination: `page` (default 1) and `per_page` (default 25); reseller
`messages` caps `per_page` at 50. The `X-Pagination` response header carries
the page/total metadata.
- Full endpoint catalog, filter tables, and the global `field[op]=value`
operators live in `references/api.md`.

View File

@@ -1,142 +1,130 @@
--- ---
name: memory-dream name: memory-dream
description: >- description: "Lint + consolidate the ClaudeTools repo memory store (.claude/memory/): audits index, backlinks, file paths, duplicate clusters, stale facts. Read-only default; --apply-safe does low-risk fixes; merges/deletes surfaced as proposals. Triggers: memory dream, consolidate/lint/clean up/dedupe memory."
Memory lint + consolidation analyzer for the ClaudeTools REPO memory store
(.claude/memory/). Audits the index, backlinks, referenced file paths, ---
duplicate/overlap clusters, stale dated facts, and drift against the
machine-local harness profile memory store. Default run is read-only. # Memory Dream
--apply-safe performs the low-risk fixes (append missing index lines, copy
any profile-only files into the repo for indexing). Cluster merges, dedup A read-only-by-default analyzer that flags issues in the shared memory store.
deletes, and stale-fact removal are surfaced as PROPOSED actions for a Mutating ops are gated behind `--apply-safe` (for low-risk fixes) or the
human to apply -- they're judgment calls, not automation candidates. (Repo PROPOSED section (for judgment calls a human resolves by hand).
is the source of truth as of 2026-06-02; sync-memory.sh mirrors repo to
profile, so PROFILE-side cleanup is handled by that script, not here. See ## The two-store model (important)
feedback_memory_sync_destructive_ok.md.) Invoke for: "memory dream",
"consolidate memory", "memory lint", "clean up memory", "memory errors", There are TWO separate memory stores on every machine:
"dedupe memory".
--- - REPO store -- `.claude/memory/` (88+ `*.md` files + `MEMORY.md` index).
Tracked in git, syncs to all machines via Gitea. **This is the source of
# Memory Dream truth.** `CLAUDE.md` mandates writing here.
- HARNESS PROFILE store -- `$HOME/.claude/projects/<slug>/memory/`. Machine
A read-only-by-default analyzer that flags issues in the shared memory store. local, NOT in git, NOT synced. This is the store the Claude Code harness
Mutating ops are gated behind `--apply-safe` (for low-risk fixes) or the auto-injects into the system prompt at session start.
PROPOSED section (for judgment calls a human resolves by hand).
The two drift over time. `memory-dream` reports that drift in its report
## The two-store model (important) section. The companion script `.claude/scripts/sync-memory.sh` is what
actually reconciles them: it runs in **mirror mode** (since 2026-06-02) —
There are TWO separate memory stores on every machine: repo is authoritative, profile is synced to match (deletions propagate;
repo content wins on conflict). PROFILE-side hygiene lives in
- REPO store -- `.claude/memory/` (88+ `*.md` files + `MEMORY.md` index). `sync-memory.sh`, not here.
Tracked in git, syncs to all machines via Gitea. **This is the source of
truth.** `CLAUDE.md` mandates writing here. ## What it checks
- HARNESS PROFILE store -- `$HOME/.claude/projects/<slug>/memory/`. Machine
local, NOT in git, NOT synced. This is the store the Claude Code harness `scripts/memory_dream.py` runs six READ-ONLY analyses over the REPO store:
auto-injects into the system prompt at session start.
1. INDEX RECONCILE -- orphan files (no `MEMORY.md` line), index lines whose
The two drift over time. `memory-dream` reports that drift in its report target file is missing, and frontmatter `name:` vs filename signals.
section. The companion script `.claude/scripts/sync-memory.sh` is what 2. BACKLINKS -- `[[name]]` references in bodies whose target slug has no file.
actually reconciles them: it runs in **mirror mode** (since 2026-06-02) — 3. REFERENCED-ARTIFACT VALIDITY -- conservatively extracts repo-relative file
repo is authoritative, profile is synced to match (deletions propagate; paths / script names from each body (backtick-wrapped single tokens only)
repo content wins on conflict). PROFILE-side hygiene lives in and flags ones not found in the repo. Reported as **verify**, never delete
`sync-memory.sh`, not here. (many are legitimately server-side or in sibling repos).
4. DUPLICATE / OVERLAP CLUSTERS -- groups memories by type + token-overlap /
## What it checks shared slug-prefix and lists candidate mergeable clusters (e.g. the many
`feedback_syncro_*` files). **Proposes** merges; never performs them.
`scripts/memory_dream.py` runs six READ-ONLY analyses over the REPO store: 5. STALE DATED FACTS -- flags `project`-type memories with an "as of <date>"
style claim older than ~6 months for re-verification.
1. INDEX RECONCILE -- orphan files (no `MEMORY.md` line), index lines whose 6. DRIFT vs PROFILE STORE -- locates the harness profile memory dir for this
target file is missing, and frontmatter `name:` vs filename signals. project and reports profile-only files (candidates to migrate INTO the repo)
2. BACKLINKS -- `[[name]]` references in bodies whose target slug has no file. and repo-only files (candidates to push OUT to profile). Report only.
3. REFERENCED-ARTIFACT VALIDITY -- conservatively extracts repo-relative file
paths / script names from each body (backtick-wrapped single tokens only) The report ends with a `## PROPOSED (needs human approval)` section that is
and flags ones not found in the repo. Reported as **verify**, never delete NEVER auto-applied.
(many are legitimately server-side or in sibling repos).
4. DUPLICATE / OVERLAP CLUSTERS -- groups memories by type + token-overlap / ## Modes
shared slug-prefix and lists candidate mergeable clusters (e.g. the many
`feedback_syncro_*` files). **Proposes** merges; never performs them. - Default (no flag) -- **REPORT ONLY. Mutates nothing.** Writes a timestamped
5. STALE DATED FACTS -- flags `project`-type memories with an "as of <date>" report to `.claude/memory/_reports/YYYY-MM-DD-HHMM-dream.md` (created if
style claim older than ~6 months for re-verification. missing) and prints it to stdout.
6. DRIFT vs PROFILE STORE -- locates the harness profile memory dir for this - `--apply-safe` -- performs ONLY additive, non-destructive fixes and prints
project and reports profile-only files (candidates to migrate INTO the repo) each action:
and repo-only files (candidates to push OUT to profile). Report only. - (a) append missing index lines to `MEMORY.md` for orphan files, under the
correct `## <Type>` header, never reordering or removing existing lines;
The report ends with a `## PROPOSED (needs human approval)` section that is - (b) copy profile-only memory files INTO the repo store (additive
NEVER auto-applied. migration). If a same-named repo file already exists it is SKIPPED and the
conflict is reported -- it is never overwritten.
## Modes - `--no-file` -- print to stdout only; skip writing the `_reports/` file.
- `--report-file <path>` -- write the report to an explicit path.
- Default (no flag) -- **REPORT ONLY. Mutates nothing.** Writes a timestamped
report to `.claude/memory/_reports/YYYY-MM-DD-HHMM-dream.md` (created if ### What dream does NOT auto-do
missing) and prints it to stdout.
- `--apply-safe` -- performs ONLY additive, non-destructive fixes and prints `memory-dream` does NOT, even with `--apply-safe`:
each action:
- (a) append missing index lines to `MEMORY.md` for orphan files, under the - delete a repo memory file (cluster dedup is a judgment call — pick which file becomes canonical, fold the others' content, retire the originals deliberately);
correct `## <Type>` header, never reordering or removing existing lines; - remove or reorder index lines (index cleanups are also surfaced as proposals);
- (b) copy profile-only memory files INTO the repo store (additive - overwrite a file whose content differs;
migration). If a same-named repo file already exists it is SKIPPED and the - perform a proposed merge.
conflict is reported -- it is never overwritten.
- `--no-file` -- print to stdout only; skip writing the `_reports/` file. These stay in the report's `## PROPOSED` section. The rationale isn't "never delete" any more (the fleet-wide additive safety net was dropped 2026-06-02; see `feedback_memory_sync_destructive_ok.md`) — it's that merges and dedups require human judgment about which file is canonical and how to combine content. Profile-side deletion DOES happen automatically — but in `sync-memory.sh`, not here.
- `--report-file <path>` -- write the report to an explicit path.
## Running it
### What dream does NOT auto-do
This machine's Python launcher is `py` (per identity.json); the script also
`memory-dream` does NOT, even with `--apply-safe`: runs under `python` / `python3`. Stdlib only -- no pip deps.
- delete a repo memory file (cluster dedup is a judgment call — pick which file becomes canonical, fold the others' content, retire the originals deliberately); ```bash
- remove or reorder index lines (index cleanups are also surfaced as proposals); # REPORT ONLY (default) -- writes _reports/<stamp>-dream.md and prints it
- overwrite a file whose content differs; py "$CLAUDETOOLS_ROOT/.claude/skills/memory-dream/scripts/memory_dream.py"
- perform a proposed merge.
# report to stdout only, write nothing
These stay in the report's `## PROPOSED` section. The rationale isn't "never delete" any more (the fleet-wide additive safety net was dropped 2026-06-02; see `feedback_memory_sync_destructive_ok.md`) — it's that merges and dedups require human judgment about which file is canonical and how to combine content. Profile-side deletion DOES happen automatically — but in `sync-memory.sh`, not here. py "$CLAUDETOOLS_ROOT/.claude/skills/memory-dream/scripts/memory_dream.py" --no-file
## Running it # additive-only fixes (append orphan index lines, migrate profile-only files)
py "$CLAUDETOOLS_ROOT/.claude/skills/memory-dream/scripts/memory_dream.py" --apply-safe
This machine's Python launcher is `py` (per identity.json); the script also ```
runs under `python` / `python3`. Stdlib only -- no pip deps.
`CLAUDETOOLS_ROOT` resolves from the env var, else `claudetools_root` in
```bash `.claude/identity.json`, else the repo root derived from the script's own
# REPORT ONLY (default) -- writes _reports/<stamp>-dream.md and prints it location -- no hardcoded drive letters.
py "$CLAUDETOOLS_ROOT/.claude/skills/memory-dream/scripts/memory_dream.py"
## Cleanup / approve workflow
# report to stdout only, write nothing
py "$CLAUDETOOLS_ROOT/.claude/skills/memory-dream/scripts/memory_dream.py" --no-file 1. Run with no flag. Read the report (stdout or `_reports/<stamp>-dream.md`).
2. Run `--apply-safe` to take the safe additive wins: orphan index lines get
# additive-only fixes (append orphan index lines, migrate profile-only files) added, profile-only memories get migrated into the repo (conflicts skipped
py "$CLAUDETOOLS_ROOT/.claude/skills/memory-dream/scripts/memory_dream.py" --apply-safe and reported).
``` 3. Work the `## PROPOSED` section by hand:
- `[MERGE?]` -- decide whether to consolidate a cluster. If yes, author a new
`CLAUDETOOLS_ROOT` resolves from the env var, else `claudetools_root` in combined memory (or set of files for a rule/history split), retire the
`.claude/identity.json`, else the repo root derived from the script's own originals via `git rm`, update `MEMORY.md`. Deletions are now first-class
location -- no hardcoded drive letters. `sync-memory.sh` mirror mode will propagate them to every profile store
on the next run.
## Cleanup / approve workflow - `[REVERIFY?]` -- confirm the dated fact still holds; update the body and
its date if it changed.
1. Run with no flag. Read the report (stdout or `_reports/<stamp>-dream.md`). - `[STALE-REF?]` -- confirm the referenced path moved/renamed; repoint or
2. Run `--apply-safe` to take the safe additive wins: orphan index lines get annotate. Many are legitimately server-side (`.service` units, `/opt/...`).
added, profile-only memories get migrated into the repo (conflicts skipped - `[INDEX-CLEANUP?]` / `[DRIFT-RESOLVE?]` -- human picks the winner.
and reported). 4. Commit the repo store changes so they sync to the fleet via Gitea.
3. Work the `## PROPOSED` section by hand:
- `[MERGE?]` -- decide whether to consolidate a cluster. If yes, author a new ## Self-test
combined memory (or set of files for a rule/history split), retire the
originals via `git rm`, update `MEMORY.md`. Deletions are now first-class `scripts/selftest.py` runs the analyzer against a synthetic fixture memory
`sync-memory.sh` mirror mode will propagate them to every profile store store in a temp dir and asserts each detector fires (orphan, missing target,
on the next run. broken backlink, stale path, cluster, profile drift) and that `--apply-safe`
- `[REVERIFY?]` -- confirm the dated fact still holds; update the body and only touches the things it's supposed to (index appends + profile→repo copy
its date if it changed. of new files; no deletions, no merges, no overwrites of differing content).
- `[STALE-REF?]` -- confirm the referenced path moved/renamed; repoint or Run:
annotate. Many are legitimately server-side (`.service` units, `/opt/...`).
- `[INDEX-CLEANUP?]` / `[DRIFT-RESOLVE?]` -- human picks the winner. ```bash
4. Commit the repo store changes so they sync to the fleet via Gitea. py "$CLAUDETOOLS_ROOT/.claude/skills/memory-dream/scripts/selftest.py"
```
## Self-test
`scripts/selftest.py` runs the analyzer against a synthetic fixture memory
store in a temp dir and asserts each detector fires (orphan, missing target,
broken backlink, stale path, cluster, profile drift) and that `--apply-safe`
only touches the things it's supposed to (index appends + profile→repo copy
of new files; no deletions, no merges, no overwrites of differing content).
Run:
```bash
py "$CLAUDETOOLS_ROOT/.claude/skills/memory-dream/scripts/selftest.py"
```

View File

@@ -1,127 +1,115 @@
--- ---
name: packetdial name: packetdial
description: >- description: "Manage the ACG PacketDial/OITVOIP hosted VoIP via the NetSapiens API (pbx.packetdial.com). List/inspect domains, users, devices, DIDs, resellers; pull CDRs; provision domains/users/SIP/numbers (writes gated --confirm; read-only default). Triggers: packetdial, oitvoip, netsapiens, voip domain/user/extension, provision phone, add did, CDR. Live production PBX."
Manage the Arizona Computer Guru (ACG) PacketDial / OITVOIP hosted-VoIP
platform via the NetSapiens SNAPsolution API v2 (pbx.packetdial.com, ---
v44.4). List and inspect domains, users, devices/phones, DIDs (phone
numbers), resellers, and pull CDRs (call detail records). Provision new # PacketDial / NetSapiens (OITVOIP) Skill
customer domains, users, SIP devices, and phone numbers (all writes gated
behind --confirm). Read-only by default. Invoke for: "packetdial", Standalone CLI client for the NetSapiens SNAPsolution **API v2** that backs
"oitvoip", "oit voip", "netsapiens", "voip portal", "pbx portal", "voip ACG's hosted-VoIP offering through OITVOIP / PacketDial. Read-only by default;
domain", "voip user", "voip extension", "provision phone", "add did", every write (create / update / delete) is gated behind `--confirm`.
"phone number on voip", "call detail records", "cdr", "voip.packetdial",
"pbx.packetdial". NOTE: voip.packetdial.com is the customer-facing portal ## The two hostnames (important)
(the fax/UC dashboard, e.g. Cascades account 28598) and has no API — the
programmable surface is pbx.packetdial.com. This skill talks to the LIVE | Host | What it is | API? |
production reseller PBX; treat writes conservatively. |---|---|---|
--- | `voip.packetdial.com` | Customer-facing white-label portal / UC & fax dashboard (e.g. Cascades fax account **28598**). Login-gated UI. | **No** |
| `pbx.packetdial.com` | Reseller PBX platform — NetSapiens v44.4. | **Yes** — this skill targets it |
# PacketDial / NetSapiens (OITVOIP) Skill
- API base: `https://pbx.packetdial.com/ns-api/v2`
Standalone CLI client for the NetSapiens SNAPsolution **API v2** that backs - Token endpoint: `https://pbx.packetdial.com/ns-api/v2/tokens`
ACG's hosted-VoIP offering through OITVOIP / PacketDial. Read-only by default; - Live OpenAPI spec: `https://pbx.packetdial.com/ns-api/webroot/openapi/openapi.json`
every write (create / update / delete) is gated behind `--confirm`. - Live Swagger UI: `https://pbx.packetdial.com/ns-api/openapi`
- Vendor docs: https://docs.ns-api.com/ (login) and https://voipdocs.io/oitvoip-access-platform-apis
## The two hostnames (important)
## Credentials — ONE-TIME SETUP (not yet provisioned)
| Host | What it is | API? |
|---|---|---| As of this skill's creation **no API key exists yet** — the vault entry
| `voip.packetdial.com` | Customer-facing white-label portal / UC & fax dashboard (e.g. Cascades fax account **28598**). Login-gated UI. | **No** | `msp-tools/oitvoip.sops.yaml` is empty/absent, so every command will fail with a
| `pbx.packetdial.com` | Reseller PBX platform — NetSapiens v44.4. | **Yes** — this skill targets it | clear "No credentials found" error until you do this once:
- API base: `https://pbx.packetdial.com/ns-api/v2` 1. Log into `pbx.packetdial.com` -> **Admin > API Keys** and create a
- Token endpoint: `https://pbx.packetdial.com/ns-api/v2/tokens` reseller-scoped key (prefix `nsr_`). If self-service key creation is not
- Live OpenAPI spec: `https://pbx.packetdial.com/ns-api/webroot/openapi/openapi.json` available, reply to **Darwin Escaro (OITVOIP)** for reseller OAuth client
- Live Swagger UI: `https://pbx.packetdial.com/ns-api/openapi` credentials.
- Vendor docs: https://docs.ns-api.com/ (login) and https://voipdocs.io/oitvoip-access-platform-apis 2. Store it in the SOPS vault. Preferred (static bearer key):
```
## Credentials — ONE-TIME SETUP (not yet provisioned) # msp-tools/oitvoip.sops.yaml
credentials:
As of this skill's creation **no API key exists yet** — the vault entry api_key: nsr_xxxxxxxxxxxxxxxx
`msp-tools/oitvoip.sops.yaml` is empty/absent, so every command will fail with a ```
clear "No credentials found" error until you do this once: Or, for OAuth2 password-grant credentials:
```
1. Log into `pbx.packetdial.com` -> **Admin > API Keys** and create a credentials:
reseller-scoped key (prefix `nsr_`). If self-service key creation is not client_id: <client id>
available, reply to **Darwin Escaro (OITVOIP)** for reseller OAuth client client_secret: <client secret>
credentials. username: <portal user@domain>
2. Store it in the SOPS vault. Preferred (static bearer key): password: <portal password>
``` ```
# msp-tools/oitvoip.sops.yaml 3. That's it — the client auto-detects which shape is present.
credentials:
api_key: nsr_xxxxxxxxxxxxxxxx The client never hardcodes secrets. Resolution order: `PACKETDIAL_API_KEY` env
``` -> `PACKETDIAL_CLIENT_ID`+friends env -> vault `credentials.api_key` -> vault
Or, for OAuth2 password-grant credentials: OAuth fields. Env overrides exist for quick testing without touching the vault.
```
credentials: ## Running the CLI
client_id: <client id>
client_secret: <client secret> This machine's Python launcher is `py` (per identity.json); `python` / `python3`
username: <portal user@domain> also work. Run from the scripts dir so the two modules resolve.
password: <portal password>
``` ```bash
3. That's it — the client auto-detects which shape is present. cd C:/claudetools/.claude/skills/packetdial/scripts
The client never hardcodes secrets. Resolution order: `PACKETDIAL_API_KEY` env py ns.py status # API version + authenticated key identity
-> `PACKETDIAL_CLIENT_ID`+friends env -> vault `credentials.api_key` -> vault py ns.py domains # list all domains
OAuth fields. Env overrides exist for quick testing without touching the vault. py ns.py domain <domain> # one domain's config
py ns.py users <domain> # users / extensions in a domain
## Running the CLI py ns.py user <domain> <user>
py ns.py phones <domain> # SIP devices registered in a domain
This machine's Python launcher is `py` (per identity.json); `python` / `python3` py ns.py dids <domain> # phone numbers (DIDs) on a domain
also work. Run from the scripts dir so the two modules resolve. py ns.py devices <domain> <user>
py ns.py cdrs --domain <domain> --start 2026-06-01 --end 2026-06-02
```bash py ns.py resellers
cd C:/claudetools/.claude/skills/packetdial/scripts ```
py ns.py status # API version + authenticated key identity ## Writes (gated)
py ns.py domains # list all domains
py ns.py domain <domain> # one domain's config Every mutating command prints a `[DRY RUN]` line and exits non-zero unless you
py ns.py users <domain> # users / extensions in a domain pass `--confirm`. Bodies are raw JSON matching the NetSapiens v2 schema.
py ns.py user <domain> <user>
py ns.py phones <domain> # SIP devices registered in a domain ```bash
py ns.py dids <domain> # phone numbers (DIDs) on a domain py ns.py create-domain --body '{"domain":"acme","description":"Acme Inc"}' --confirm
py ns.py devices <domain> <user> py ns.py create-user acme --body '{"user":"101","name-first-name":"Jane"}' --confirm
py ns.py cdrs --domain <domain> --start 2026-06-01 --end 2026-06-02 py ns.py create-phone acme --body '{...}' --confirm
py ns.py resellers py ns.py create-did acme --body '{"phonenumber":"15205551234"}' --confirm
``` py ns.py update-user acme 101 --body '{"name-last-name":"Doe"}' --confirm
py ns.py delete-user acme 101 --confirm
## Writes (gated) ```
Every mutating command prints a `[DRY RUN]` line and exits non-zero unless you ## Raw escape hatch (any of the 239 v2 paths)
pass `--confirm`. Bodies are raw JSON matching the NetSapiens v2 schema.
The named commands cover the common surface; for anything else, hit the path
```bash directly. Non-GET methods still require `--confirm`.
py ns.py create-domain --body '{"domain":"acme","description":"Acme Inc"}' --confirm
py ns.py create-user acme --body '{"user":"101","name-first-name":"Jane"}' --confirm ```bash
py ns.py create-phone acme --body '{...}' --confirm py ns.py raw GET domains/acme/users/101/answerrules
py ns.py create-did acme --body '{"phonenumber":"15205551234"}' --confirm py ns.py raw POST domains/acme/users --body '{...}' --confirm
py ns.py update-user acme 101 --body '{"name-last-name":"Doe"}' --confirm ```
py ns.py delete-user acme 101 --confirm
``` ## Standard provisioning flow (new customer)
## Raw escape hatch (any of the 239 v2 paths) 1. `create-domain` -> dial plan auto-generates
2. `create-user` per extension
The named commands cover the common surface; for anything else, hit the path 3. `create-phone` per SIP device (MAC-provisioned)
directly. Non-GET methods still require `--confirm`. 4. `create-did` to attach DIDs and route them to users
5. Log the work back to the Syncro ticket
```bash
py ns.py raw GET domains/acme/users/101/answerrules ## Notes
py ns.py raw POST domains/acme/users --body '{...}' --confirm
``` - This is the LIVE production reseller PBX. A bad `create-domain` or
`delete-user` affects real customers — confirm the target domain first with a
## Standard provisioning flow (new customer) read command before any write.
- CDR queries can be large; always pass `--start`/`--end` and a `--limit`.
1. `create-domain` -> dial plan auto-generates - Reference detail (auth shapes, full endpoint inventory) lives in
2. `create-user` per extension `references/api.md`.
3. `create-phone` per SIP device (MAC-provisioned)
4. `create-did` to attach DIDs and route them to users
5. Log the work back to the Syncro ticket
## Notes
- This is the LIVE production reseller PBX. A bad `create-domain` or
`delete-user` affects real customers — confirm the target domain first with a
read command before any write.
- CDR queries can be large; always pass `--start`/`--end` and a `--limit`.
- Reference detail (auth shapes, full endpoint inventory) lives in
`references/api.md`.

View File

@@ -1,65 +1,61 @@
--- ---
name: remediation-tool name: remediation-tool
description: | description: "M365 tenant investigation + remediation via the ComputerGuru MSP app suite (Security Investigator/Exchange Operator/User Manager/Tenant Admin/Defender). Direct Graph+Exchange REST (not CIPP). Triggers: 365 remediation, breach/credential-stuffing check, check a mailbox, inbox rules, mailbox forwarding, delegate/SendAs audit, OAuth consent, sign-in/risky-user lookup, tenant sweep."
M365 tenant investigation and remediation using the ComputerGuru tiered MSP app suite (5 apps: Security Investigator, Exchange Operator, User Manager, Tenant Admin, Defender Add-on). Auto-invoke when the user says "remediation tool", "365 remediation", "check <user>'s mailbox/box", "credential stuffing" against an M365 user, "breach check" on an M365 tenant, or needs M365 admin API work that client-credentials Graph + Exchange REST can perform. NOT for CIPP — this is the direct Graph API app suite.
---
Also invoke when the user needs any of: inbox rule enumeration, mailbox forwarding check, delegate/SendAs audit, OAuth consent audit, sign-in log queries, risky user lookup, directory audit queries, B2B guest invite audit against M365.
# 365 Remediation Tool
Triggers: "365 remediation", "remediation tool", "check <user> box/mailbox/account for breach", "credential stuff*", "who's getting attacked", "foreign sign-in", "inbox rule", "mailbox forward*", "oauth consent" (in MSP context), "tenant sweep", "risky user", "hidden rule", Exchange Online admin API, "adminapi/beta/{tenant}/InvokeCommand".
--- Read-only by default. All remediation actions require explicit `YES` confirmation in chat (not a permission prompt).
# 365 Remediation Tool ## App Architecture (Tiered)
Read-only by default. All remediation actions require explicit `YES` confirmation in chat (not a permission prompt). Five multi-tenant apps cover distinct privilege tiers. Use only what the task requires.
## App Architecture (Tiered) | Tier | App display name | App ID | Vault file | Scope |
|---|---|---|---|---|
Five multi-tenant apps cover distinct privilege tiers. Use only what the task requires. | `investigator` | ComputerGuru Security Investigator | `bfbc12a4-f0dd-4e12-b06d-997e7271e10c` | `computerguru-security-investigator.sops.yaml` | Graph read-only |
| `investigator-exo` | ComputerGuru Security Investigator | `bfbc12a4-f0dd-4e12-b06d-997e7271e10c` | `computerguru-security-investigator.sops.yaml` | Exchange Online read |
| Tier | App display name | App ID | Vault file | Scope | | `exchange-op` | ComputerGuru Exchange Operator | `b43e7342-5b4b-492f-890f-bb5a4f7f40e9` | `computerguru-exchange-operator.sops.yaml` | Exchange Online write |
|---|---|---|---|---| | `user-manager` | ComputerGuru User Manager | `64fac46b-8b44-41ad-93ee-7da03927576c` | `computerguru-user-manager.sops.yaml` | Graph user/group write |
| `investigator` | ComputerGuru Security Investigator | `bfbc12a4-f0dd-4e12-b06d-997e7271e10c` | `computerguru-security-investigator.sops.yaml` | Graph read-only | | `tenant-admin` | ComputerGuru Tenant Admin | `709e6eed-0711-4875-9c44-2d3518c47063` | `computerguru-tenant-admin.sops.yaml` | Graph high-privilege |
| `investigator-exo` | ComputerGuru Security Investigator | `bfbc12a4-f0dd-4e12-b06d-997e7271e10c` | `computerguru-security-investigator.sops.yaml` | Exchange Online read | | `defender` | ComputerGuru Defender Add-on | `dbf8ad1a-54f4-4bb8-8a9e-ea5b9634635b` | `computerguru-defender-addon.sops.yaml` | Defender ATP (MDE only) |
| `exchange-op` | ComputerGuru Exchange Operator | `b43e7342-5b4b-492f-890f-bb5a4f7f40e9` | `computerguru-exchange-operator.sops.yaml` | Exchange Online write |
| `user-manager` | ComputerGuru User Manager | `64fac46b-8b44-41ad-93ee-7da03927576c` | `computerguru-user-manager.sops.yaml` | Graph user/group write | **Default for breach checks:** use `investigator` (Graph) + `investigator-exo` (Exchange read). Escalate to write tiers only when remediating.
| `tenant-admin` | ComputerGuru Tenant Admin | `709e6eed-0711-4875-9c44-2d3518c47063` | `computerguru-tenant-admin.sops.yaml` | Graph high-privilege |
| `defender` | ComputerGuru Defender Add-on | `dbf8ad1a-54f4-4bb8-8a9e-ea5b9634635b` | `computerguru-defender-addon.sops.yaml` | Defender ATP (MDE only) | ## Auto-Invocation Behavior
**Default for breach checks:** use `investigator` (Graph) + `investigator-exo` (Exchange read). Escalate to write tiers only when remediating. When triggered automatically (vs. via `/remediation-tool`), follow the same workflow in `.claude/commands/remediation-tool.md`:
## Auto-Invocation Behavior 1. Parse the user's intent into a subcommand (check/sweep/signins/consent-url/remediate).
2. Resolve tenant ID from domain.
When triggered automatically (vs. via `/remediation-tool`), follow the same workflow in `.claude/commands/remediation-tool.md`: 3. Acquire tokens via `get-token.sh <tenant> <tier>` — use lowest-privilege tier needed.
4. Run checks via scripts in `scripts/`.
1. Parse the user's intent into a subcommand (check/sweep/signins/consent-url/remediate). 5. Interpret findings using `references/checklist.md`.
2. Resolve tenant ID from domain. 6. Write report to `clients/{slug}/reports/YYYY-MM-DD-{action}.md` using `templates/breach-report.md`.
3. Acquire tokens via `get-token.sh <tenant> <tier>` — use lowest-privilege tier needed. 7. Chat summary + delegate commit to Gitea agent.
4. Run checks via scripts in `scripts/`.
5. Interpret findings using `references/checklist.md`. ## Before calling any script, verify
6. Write report to `clients/{slug}/reports/YYYY-MM-DD-{action}.md` using `templates/breach-report.md`.
7. Chat summary + delegate commit to Gitea agent. - The SOPS vault is accessible via `.claude/identity.json` `vault_path` field. The scripts auto-resolve the vault location from identity.json — no hardcoded paths.
- `jq`, `curl`, `bash` are available.
## Before calling any script, verify - For Exchange REST checks: confirm the target tenant has **Exchange Administrator** role assigned to the **Security Investigator** SP (for reads) or **Exchange Operator** SP (for writes). If any Exchange REST call returns 403, emit the tenant-scoped Entra Roles link from `references/gotchas.md`.
- For Identity Protection checks: `IdentityRiskyUser.Read.All` is in the Security Investigator manifest AND the tenant has consented to that app. If 403, emit the per-app consent URL from `references/gotchas.md`.
- The SOPS vault is accessible via `.claude/identity.json` `vault_path` field. The scripts auto-resolve the vault location from identity.json — no hardcoded paths. - For Defender checks: confirm tenant has Microsoft Defender for Endpoint (MDE) license before using `defender` tier — it returns AADSTS650052 otherwise.
- `jq`, `curl`, `bash` are available.
- For Exchange REST checks: confirm the target tenant has **Exchange Administrator** role assigned to the **Security Investigator** SP (for reads) or **Exchange Operator** SP (for writes). If any Exchange REST call returns 403, emit the tenant-scoped Entra Roles link from `references/gotchas.md`. ## Conventions
- For Identity Protection checks: `IdentityRiskyUser.Read.All` is in the Security Investigator manifest AND the tenant has consented to that app. If 403, emit the per-app consent URL from `references/gotchas.md`.
- For Defender checks: confirm tenant has Microsoft Defender for Endpoint (MDE) license before using `defender` tier — it returns AADSTS650052 otherwise. - **Target identifiers**: accept UPN, domain, or tenant GUID. Normalize to tenant GUID internally.
- **Token tiers**: minimum necessary privilege. Never use `tenant-admin` for a read-only check.
## Conventions - **Token cache**: `/tmp/remediation-tool/{tenant-id}/{tier}.jwt`. TTL 55 minutes. Check `-mmin -55` before reuse.
- **Raw JSON artifacts**: `/tmp/remediation-tool/{tenant-id}/{check}/` — keep so the user can re-analyze.
- **Target identifiers**: accept UPN, domain, or tenant GUID. Normalize to tenant GUID internally. - **Reports**: `clients/{slug}/reports/YYYY-MM-DD-{action}.md`. Derive slug from domain (strip TLD, hyphenate).
- **Token tiers**: minimum necessary privilege. Never use `tenant-admin` for a read-only check. - **UTC dates everywhere**.
- **Token cache**: `/tmp/remediation-tool/{tenant-id}/{tier}.jwt`. TTL 55 minutes. Check `-mmin -55` before reuse.
- **Raw JSON artifacts**: `/tmp/remediation-tool/{tenant-id}/{check}/` — keep so the user can re-analyze. ## Scope boundaries
- **Reports**: `clients/{slug}/reports/YYYY-MM-DD-{action}.md`. Derive slug from domain (strip TLD, hyphenate).
- **UTC dates everywhere**. - **Not a replacement for CIPP.** Use CIPP for bulk baseline configuration, templates, standards alerting. Use this tool for focused investigation and point-in-time remediation.
- **Entra app registrations stay manual in the portal** — don't create/modify the multi-tenant apps themselves via the tool.
## Scope boundaries - **Conditional Access policies CAN be managed programmatically** (Tenant Admin tier holds `Policy.ReadWrite.ConditionalAccess` + the Conditional Access Administrator role). MANDATORY discipline: (1) always create/modify in **report-only** (`state: enabledForReportingButNotEnforced`) first; (2) always **exclude the tenant's break-glass account** (`conditions.users.excludeUsers`); (3) verify impact in Entra sign-in logs before enforcing; (4) get explicit user confirmation before flipping any policy to `enabled` on a tenant with real users. (CA-manual boundary relaxed 2026-05-27 at Mike's direction — break-glass + report-only keep blast radius near zero.)
- **Not for Graph permissions the apps don't have.** If a call 403s and the scope isn't in the relevant app's manifest, stop and tell the user — don't try to work around it.
- **Not a replacement for CIPP.** Use CIPP for bulk baseline configuration, templates, standards alerting. Use this tool for focused investigation and point-in-time remediation. - **Defender tier requires MDE license.** If the tenant doesn't have MDE, the token request succeeds but API calls return AADSTS650052. Check before using.
- **Entra app registrations stay manual in the portal** — don't create/modify the multi-tenant apps themselves via the tool.
- **Conditional Access policies CAN be managed programmatically** (Tenant Admin tier holds `Policy.ReadWrite.ConditionalAccess` + the Conditional Access Administrator role). MANDATORY discipline: (1) always create/modify in **report-only** (`state: enabledForReportingButNotEnforced`) first; (2) always **exclude the tenant's break-glass account** (`conditions.users.excludeUsers`); (3) verify impact in Entra sign-in logs before enforcing; (4) get explicit user confirmation before flipping any policy to `enabled` on a tenant with real users. (CA-manual boundary relaxed 2026-05-27 at Mike's direction — break-glass + report-only keep blast radius near zero.)
- **Not for Graph permissions the apps don't have.** If a call 403s and the scope isn't in the relevant app's manifest, stop and tell the user — don't try to work around it.
- **Defender tier requires MDE license.** If the tenant doesn't have MDE, the token request succeeds but API calls return AADSTS650052. Check before using.

View File

@@ -1,161 +1,151 @@
--- ---
name: self-check name: self-check
description: >- description: "Self-diagnose this machine's harness conformance vs the fleet baseline: identity.json, tooling, env/paths, hooks, skill/command/script set, vault decrypt, coord/Gitea reachability, capability tier. Grades RED/AMBER/GREEN; can publish a census. Triggers: self check/test, doctor, health check, am I configured right, harness/fleet conformance."
Self-diagnose a ClaudeTools session's machine: verify the harness is wired the
same way as every other instance while allowing for architectural / OS / hardware ---
differences. Checks that identity.json exists and is correct (the map of WHERE
things live on this box), required tooling is installed, env/paths resolve, # Self-Check — ClaudeTools Harness Self-Diagnosis
hooks are wired, the skill/command/script set matches the baseline, the vault
decrypts, coord/Gitea are reachable, and the machine's capability tier (e.g. no A top-to-bottom evaluation of how *this* machine's ClaudeTools harness is wired,
local Ollama) resolves to the right fallback ruleset. Grades RED/AMBER/GREEN and graded against a checked-in **baseline manifest** so every machine behaves the
can publish a census to the coord API so the fleet baseline can be built/refined. same way — while explicitly allowing for architecture, OS, and hardware
Invoke for: "self check", "self diagnosis", "self test", "doctor", "health check", differences via a **capability tier** model.
"am I configured right", "is my machine set up correctly", "harness conformance",
"fleet conformance", "check my environment", "is everything wired up". This is the skill the user asked for when a session needs to "make sure
--- everything is as it should be."
# Self-Check — ClaudeTools Harness Self-Diagnosis ## The model in one paragraph
A top-to-bottom evaluation of how *this* machine's ClaudeTools harness is wired, `identity.json` is the foundational, per-machine map of **where things live and
graded against a checked-in **baseline manifest** so every machine behaves the what this box can do** (vault path, repo root, platform, arch, python command,
same way — while explicitly allowing for architecture, OS, and hardware Ollama endpoints). The **baseline manifest**
differences via a **capability tier** model. (`baseline/manifest.json`) declares what *every* machine must have — required
tools, identity fields, scripts, hook files, the wired `settings.json` hooks, the
This is the skill the user asked for when a session needs to "make sure canonical skill/command set, and the **capability rules** that say what to do when
everything is as it should be." a capability is absent (e.g. no local Ollama → use the remote endpoint, or if that
is also down, route Tier-0 work to haiku instead of blocking). The probe compares
## The model in one paragraph the live machine against the manifest, resolves the machine's capability tier, and
grades RED/AMBER/GREEN. Required things missing = RED. Advisory drift = AMBER.
`identity.json` is the foundational, per-machine map of **where things live and Capability differences are **never** failures — they select a ruleset.
what this box can do** (vault path, repo root, platform, arch, python command,
Ollama endpoints). The **baseline manifest** ## V1 is a CENSUS tool (read this)
(`baseline/manifest.json`) declares what *every* machine must have — required
tools, identity fields, scripts, hook files, the wired `settings.json` hooks, the There is no ratified fleet baseline yet. `baseline/manifest.json` is **provisional**,
canonical skill/command set, and the **capability rules** that say what to do when generated from a single known-good machine (GURU-5070). So V1's job is to gather
a capability is absent (e.g. no local Ollama → use the remote endpoint, or if that ground truth from every machine and help Mike build the real baseline:
is also down, route Tier-0 work to haiku instead of blocking). The probe compares
the live machine against the manifest, resolves the machine's capability tier, and 1. **Probe** — each machine runs the check and produces a structured census.
grades RED/AMBER/GREEN. Required things missing = RED. Advisory drift = AMBER. 2. **Publish**`--publish` PUTs the census to coord as component
Capability differences are **never** failures — they select a ruleset. `selfcheck_<host>` (state = grade, notes = full JSON). One row per machine =
a live fleet conformance view.
## V1 is a CENSUS tool (read this) 3. **Fan out**`fanout` broadcasts a request to `ALL_SESSIONS` so every active
instance reports.
There is no ratified fleet baseline yet. `baseline/manifest.json` is **provisional**, 4. **Aggregate**`aggregate` reads all censuses back and proposes a baseline
generated from a single known-good machine (GURU-5070). So V1's job is to gather (tools/skills/commands present on *all* machines = "required everywhere";
ground truth from every machine and help Mike build the real baseline: present on *some* = "capability-gated"), and lists machines with FAILs.
1. **Probe** — each machine runs the check and produces a structured census. Mike reviews the aggregate and ratifies `manifest.json`. From then on the same
2. **Publish**`--publish` PUTs the census to coord as component probe enforces conformance. **V1 does not auto-fix anything** — it reports the
`selfcheck_<host>` (state = grade, notes = full JSON). One row per machine = exact fix command for each finding (per the decision on record).
a live fleet conformance view.
3. **Fan out**`fanout` broadcasts a request to `ALL_SESSIONS` so every active ## Running it
instance reports.
4. **Aggregate**`aggregate` reads all censuses back and proposes a baseline The probe is `scripts/self-check.sh` (bash; runs on Git Bash/Windows, macOS,
(tools/skills/commands present on *all* machines = "required everywhere"; Linux; deps: jq + curl). Always pass a real UTC timestamp:
present on *some* = "capability-gated"), and lists machines with FAILs.
```bash
Mike reviews the aggregate and ratifies `manifest.json`. From then on the same SELFCHECK_TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
probe enforces conformance. **V1 does not auto-fix anything** — it reports the bash .claude/skills/self-check/scripts/self-check.sh <mode>
exact fix command for each finding (per the decision on record). ```
## Running it | Mode | Purpose |
|------|---------|
The probe is `scripts/self-check.sh` (bash; runs on Git Bash/Windows, macOS, | `report` (default) | Human RED/AMBER/GREEN report. Exit 0/1/2 = GREEN/AMBER/RED. |
Linux; deps: jq + curl). Always pass a real UTC timestamp: | `--json` | Structured census JSON to stdout (for piping). |
| `--publish` | Run + publish census to coord (component `selfcheck_<host>`). Softfails to `.claude/coord-queue.jsonl`. |
```bash | `fanout` | Broadcast a census request to ALL_SESSIONS. |
SELFCHECK_TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)" \ | `aggregate` | Fleet table + proposed-baseline summary from published censuses. |
bash .claude/skills/self-check/scripts/self-check.sh <mode>
``` `/self-check` is the slash-command runner for the same script.
| Mode | Purpose | ## What it checks
|------|---------|
| `report` (default) | Human RED/AMBER/GREEN report. Exit 0/1/2 = GREEN/AMBER/RED. | | Category | Checks |
| `--json` | Structured census JSON to stdout (for piping). | |----------|--------|
| `--publish` | Run + publish census to coord (component `selfcheck_<host>`). Softfails to `.claude/coord-queue.jsonl`. | | **identity** | identity.json exists + valid JSON; **all required fields present**; `claudetools_root` exists and equals the running repo; `vault_path` exists; `machine` == hostname; git user.name/email match identity. |
| `fanout` | Broadcast a census request to ALL_SESSIONS. | | **tooling** | required everywhere: bash, git, jq, curl, sops, age, ssh, a python. Missing = FAIL. |
| `aggregate` | Fleet table + proposed-baseline summary from published censuses. | | **capability** | ollama, cargo, node, gh, docker, op — presence is INFO, never a failure. Resolves the **Ollama tier** (local / remote / none) and prints the effective Tier-0 ruleset. |
| **files** | required scripts + hook files present and executable. |
`/self-check` is the slash-command runner for the same script. | **hooks** | the three `settings.json` hooks are wired (block-backslash PreToolUse, check-messages UserPromptSubmit, sync-memory SessionStart); `current-mode` present. |
| **git** | origin points at ACG Gitea (internal IP preferred); main-repo post-commit hook installed (AMBER if not). |
## What it checks | **skills/commands** | every skill dir and command file in the baseline is present; extras are reported as census candidates. |
| **duplicates** | command/skill names present in BOTH the repo and `~/.claude`. Divergent content = WARN (the "same `/cmd`, different behaviour on the Mac" bug); identical = INFO (redundant, will drift). CRLF-only differences are ignored. |
| Category | Checks | | **memory** | `MEMORY.md` index exists; no orphaned memory files; manifest-declared contradiction patterns (see semantic pass below). Never FAILs the grade. |
|----------|--------| | **vault** | vault repo exists; sops+age present; `vault.sh list` succeeds (decrypt wired). |
| **identity** | identity.json exists + valid JSON; **all required fields present**; `claudetools_root` exists and equals the running repo; `vault_path` exists; `machine` == hostname; git user.name/email match identity. | | **connectivity** | coord API (required), main API + internal Gitea (advisory; off-network is OK). |
| **tooling** | required everywhere: bash, git, jq, curl, sops, age, ssh, a python. Missing = FAIL. |
| **capability** | ollama, cargo, node, gh, docker, op — presence is INFO, never a failure. Resolves the **Ollama tier** (local / remote / none) and prints the effective Tier-0 ruleset. | ## Rogue-memory contradiction — semantic pass (do this when asked, or on a full check)
| **files** | required scripts + hook files present and executable. |
| **hooks** | the three `settings.json` hooks are wired (block-backslash PreToolUse, check-messages UserPromptSubmit, sync-memory SessionStart); `current-mode` present. | The engine's memory check is deterministic and conservative (index + orphans +
| **git** | origin points at ACG Gitea (internal IP preferred); main-repo post-commit hook installed (AMBER if not). | declared patterns) so it never produces false alarms. A *true* contradiction
| **skills/commands** | every skill dir and command file in the baseline is present; extras are reported as census candidates. | check — "does any memory directly contradict what this machine's settings say?"
| **duplicates** | command/skill names present in BOTH the repo and `~/.claude`. Divergent content = WARN (the "same `/cmd`, different behaviour on the Mac" bug); identical = INFO (redundant, will drift). CRLF-only differences are ignored. | — is a judgment task, so the model does it (route the prose/classification to
| **memory** | `MEMORY.md` index exists; no orphaned memory files; manifest-declared contradiction patterns (see semantic pass below). Never FAILs the grade. | Ollama Tier-0 per the house rules; Claude reviews the result):
| **vault** | vault repo exists; sops+age present; `vault.sh list` succeeds (decrypt wired). |
| **connectivity** | coord API (required), main API + internal Gitea (advisory; off-network is OK). | 1. Read `identity.json` (where things live + this box's capabilities),
`settings.json` (wired hooks/permissions), and `baseline/manifest.json`.
## Rogue-memory contradiction — semantic pass (do this when asked, or on a full check) 2. Read the memory index `.claude/memory/MEMORY.md`, then open any memory whose
one-line hook touches: paths/roots, python launcher, endpoints/IPs, OS/arch
The engine's memory check is deterministic and conservative (index + orphans + assumptions, tool choices, or model routing.
declared patterns) so it never produces false alarms. A *true* contradiction 3. Flag memories that **directly contradict** this machine's reality, e.g.:
check — "does any memory directly contradict what this machine's settings say?" - prescribes `python3`/`python` when `identity.python.command` is `py` (or vice-versa),
— is a judgment task, so the model does it (route the prose/classification to - hardcodes a repo/vault path that isn't this machine's `claudetools_root`/`vault_path`,
Ollama Tier-0 per the house rules; Claude reviews the result): - names an endpoint/IP that conflicts with `identity.coord_api` or the manifest,
- assumes a capability (local Ollama) this machine's tier says is absent.
1. Read `identity.json` (where things live + this box's capabilities), 4. Report each as: memory file, the contradicting claim, the setting it violates,
`settings.json` (wired hooks/permissions), and `baseline/manifest.json`. and a suggested correction. **Do not edit memories** — surface for the operator
2. Read the memory index `.claude/memory/MEMORY.md`, then open any memory whose (deletions/rewrites go through the human, mirroring memory-dream's posture).
one-line hook touches: paths/roots, python launcher, endpoints/IPs, OS/arch
assumptions, tool choices, or model routing. Genuinely machine-specific guidance in a *shared* memory is the usual culprit —
3. Flag memories that **directly contradict** this machine's reality, e.g.: the fix is to scope it ("on Windows…") or split it, not to globally flip it.
- prescribes `python3`/`python` when `identity.python.command` is `py` (or vice-versa),
- hardcodes a repo/vault path that isn't this machine's `claudetools_root`/`vault_path`, ## Fleet self-remediation loop (machines fix themselves)
- names an endpoint/IP that conflicts with `identity.coord_api` or the manifest,
- assumes a capability (local Ollama) this machine's tier says is absent. We never fix a remote machine. The flow is:
4. Report each as: memory file, the contradicting claim, the setting it violates,
and a suggested correction. **Do not edit memories** — surface for the operator 1. `fanout` — broadcast asks every instance to self-check + self-fix + re-publish.
(deletions/rewrites go through the human, mirroring memory-dream's posture). 2. Each operator runs `/self-check` locally, applies the printed fix commands on
their own box, re-runs to confirm GREEN, then `/self-check --publish`.
Genuinely machine-specific guidance in a *shared* memory is the usual culprit — 3. `aggregate` — shows who is still RED/AMBER and prints each machine's own fix
the fix is to scope it ("on Windows…") or split it, not to globally flip it. list. Relay it to that operator; do not run it for them.
4. Repeat until the fleet is consistently GREEN, then ratify the manifest.
## Fleet self-remediation loop (machines fix themselves)
## How to interpret a run
We never fix a remote machine. The flow is:
After running, summarize for the user:
1. `fanout` — broadcast asks every instance to self-check + self-fix + re-publish. - The **grade** and the PASS/WARN/FAIL/INFO tallies.
2. Each operator runs `/self-check` locally, applies the printed fix commands on - Each **FAIL** and **WARN** with its exact fix command. Do not auto-apply.
their own box, re-runs to confirm GREEN, then `/self-check --publish`. - The **capability tier** line — confirm the machine knows its Tier-0 fallback.
3. `aggregate` — shows who is still RED/AMBER and prints each machine's own fix - If publishing/aggregating, note how many machines have reported and which are RED.
list. Relay it to that operator; do not run it for them.
4. Repeat until the fleet is consistently GREEN, then ratify the manifest. Capability differences (no Ollama, no gh, ARM vs amd64, macOS vs Windows) are
expected and must never be reported as broken — they are the whole point of the
## How to interpret a run tier model.
After running, summarize for the user: ## Files
- The **grade** and the PASS/WARN/FAIL/INFO tallies.
- Each **FAIL** and **WARN** with its exact fix command. Do not auto-apply. ```
- The **capability tier** line — confirm the machine knows its Tier-0 fallback. .claude/skills/self-check/
- If publishing/aggregating, note how many machines have reported and which are RED. SKILL.md this file
scripts/self-check.sh the probe engine (report / --json / --publish / fanout / aggregate)
Capability differences (no Ollama, no gh, ARM vs amd64, macOS vs Windows) are baseline/manifest.json the provisional fleet baseline (single source of truth)
expected and must never be reported as broken — they are the whole point of the baseline/README.md the baseline model + how to refine/ratify it
tier model. .claude/commands/self-check.md the /self-check runner
```
## Files
## Extending the baseline
```
.claude/skills/self-check/ When a new tool/skill/command/hook becomes mandatory fleet-wide, edit
SKILL.md this file `baseline/manifest.json`, commit, and `/sync`. Every machine's next self-check
scripts/self-check.sh the probe engine (report / --json / --publish / fanout / aggregate) enforces it. Capability-only tools go in `capability_tools` with a matching entry
baseline/manifest.json the provisional fleet baseline (single source of truth) in `capability_rules` describing the fallback. See `baseline/README.md`.
baseline/README.md the baseline model + how to refine/ratify it
.claude/commands/self-check.md the /self-check runner
```
## Extending the baseline
When a new tool/skill/command/hook becomes mandatory fleet-wide, edit
`baseline/manifest.json`, commit, and `/sync`. Every machine's next self-check
enforces it. Capability-only tools go in `capability_tools` with a matching entry
in `capability_rules` describing the fallback. See `baseline/README.md`.

View File

@@ -18,10 +18,36 @@
serialized (per-article coord lock, TTL evict, coord-down=warn+proceed) + STAGED serialized (per-article coord lock, TTL evict, coord-down=warn+proceed) + STAGED
(`.claude/wiki_staging/` -> review diff -> apply -> commit -> release). No blind merge. (`.claude/wiki_staging/` -> review diff -> apply -> commit -> release). No blind merge.
- **P0 COMPLETE (2026-06-08, VERSION 1.2.0).** Fleet broadcast sent. - **P0 COMPLETE (2026-06-08, VERSION 1.2.0).** Fleet broadcast sent.
- NEXT — P1 context diet (Task 5a inventory -> Task 5 one-line descriptions -> Task 6 - [DONE] Task 6 — CLAUDE.md split: lean CORE (~1.2k tokens, was ~4.9k) + `CLAUDE_EXTENDED.md`
CLAUDE CORE/EXTENDED -> Task 7 thin save/sync). Then P2 delegation re-tune, P3 knowledge. (full original preserved verbatim). All safety-critical rules retained in CORE (v1.3.0).
- Promote the warn-only guard (Task 4) to FATAL after a clean warn window (check - [DONE] Task 9 (P2) — delegation re-tune folded into CORE: act-directly-by-default; delegate
`.claude/harness/guard.log` across the fleet). only for (a) high-volume output, (b) blast radius >3 files across layers, (c) domain shift,
(d) parallel. Single-agent-for-coupled-tasks kept.
- [DONE] Task 5 — one-line registry descriptions on the 8 biggest skills (remediation-tool,
gc-audit, packetdial, memory-dream, human-flow, self-check, impeccable, mailprotector);
skill-description injection ~3320 -> ~2123 tokens (~36% cut), keyword triggers preserved,
frontmatter still valid.
- [DONE] Task 7 — thinned `/save` + `/sync` command bodies: they now point to `sync.sh` as the
single source instead of re-documenting its internals; load-bearing LLM-judgment parts (Phase 0
save-vs-sync, cross-user note display, exit-75 reporting) kept verbatim. Mechanical sync never
depends on an LLM step.
- [DONE] Deterministic Bash fixes (from the "deferred to separate spec" carve-out): `now-phoenix.sh`
helper added (fixed UTC-7 epoch math — no `TZ=America/Phoenix` Git-Bash bug); `post-bot-alert.sh`
already builds JSON via `jq -nc --arg` (verified). Full Python port remains a separate future spec.
- [DONE] Task 10 (P3) — knowledge tiering: recall order already in CORE (wiki -> CONTEXT/log ->
coord); adopted `session-logs/YYYY-MM/` as a FORWARD convention for new logs (verified: nested
path still matches the Phase 0 log-detection grep and wiki slug derivation). Existing flat logs
left in place — recall grep covers both layouts; NO mass migration (would break wiki `source_logs`
refs for low payoff).
- **P1+P2+P3 COMPLETE (2026-06-08, VERSION 1.4.0).**
- REMAINING (ops follow-ups, not blocking):
- Promote the warn-only guard (Task 4) to FATAL after a clean warn window (check
`.claude/harness/guard.log` across the fleet).
- Schedule `memory-dream --apply-safe` per-machine (deliberate per-box ops setup; default is
read-only/proposals, so unattended --apply-safe is a judgment call left to the operator).
- Optional later: migrate existing flat session-logs into month folders if/when the flat dir
becomes unwieldy (low priority; grep recall already covers both).
- Task 8 (shard big command bodies) stays DEFERRED per Grok/Gemini.
## Task 0: Commit this spec ## Task 0: Commit this spec
Commit `specs/claudetools-harness-optimization/` (incl. `review-3way.md`). Commit `specs/claudetools-harness-optimization/` (incl. `review-3way.md`).