- graduation-push.sh: tar+scp scratch -> BEAST graduation-inbox over Tailscale (decoupled
from /save, soft-fail if BEAST off). Tested: 241 files -> BEAST.
- docs/graduation-pipeline.md: full spec (push -> Ollama triage on BEAST GPU via API ->
reviewed sanitize+git-mv). Secrets never enter git; ride the encrypted link to BEAST only.
- tmp-promotion-check.sh: rewritten pure-builtin (0.4s) after the per-file grep/fork loop
hung /save for 4 min on Windows at ~240 scratch files. Deep triage moves to the pipeline.
- forum-post: GRADUATED the canonical flarum poster from scratch ->
skills/forum-post/scripts/flarum-post.py (s9e markdown->XML + DB insert machinery), with
the hardcoded IX SSH + Flarum DB passwords swapped to vault lookups. First pipeline test case.
- Vaulted the Flarum DB cred (services/flarum-community.sops.yaml) + sanitized the two
plaintext copies in forum-post.md.
- errorlog: logged the WSL-stub correction + BEAST-Ollama-CPU(vram=0) finding + the
promotion-check hang, all via the new log helper.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add .claude/scripts/log-skill-error.sh — the canonical agent error log helper
(writes errorlog.md in DATE | MACHINE | skill | [type] error format, soft-fails).
Three categories: execution failures (default), user corrections (--correction),
and preventable self-inflicted friction (--friction; cite ref= when it repeats a
documented gotcha). Goal: stop paying tokens twice for the same avoidable mistake.
- CLAUDE.md: make logging mandatory for all skills + corrections + friction.
- skill-creator: new skills must wire in the helper (guidance + checklist).
- Retrofit every skill script's genuine failure branches to call the helper
(b2/bitdefender/mailprotector/packetdial/coord python CLIs; remediation-tool
+ onboard365 bash; vault, rmm-auth, post-bot-alert, agy, grok, 1password,
run-onboarding-diagnostic). Handled conditions + self-tests left alone.
- errorlog.md: broaden header to cover skills + harness + corrections; seed this
session's corrections (INKY, Mail.Send token-audience, omnibox-strictness) and
friction (git-bash /tmp, env-persistence, argv-limit, PowerShell var-case).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The skill/command DOCS instructed Claude to run a bare `py ...`, which is the
Windows py-launcher — absent on Linux/macOS (exit 127, hit on GURU-KALI). A blind
py->python3 swap is wrong too: python3 is a broken MS Store shim on some Windows
boxes where `py` is the correct launcher.
Fix mirrors the resolution the .sh skill scripts already do:
- New .claude/scripts/py.sh: picks the interpreter that actually RUNS —
identity.json python.command first, then py -> python3 -> python, each
validated with `-c 'import sys'` so the MS Store stub is skipped. exec's it.
- Repointed all DOC invocations (10 files, ~70 sites) from `py ...` to
`bash "$CLAUDETOOLS_ROOT/.claude/scripts/py.sh" ...` (incl. the `py -c` and
`py -` heredoc forms in checkpoint.md / mailbox.md).
- Left the .sh skill scripts untouched — they already resolve py/python/python3.
- errorlog.md: marked the GURU-KALI entry RESOLVED.
Depends on CLAUDETOOLS_ROOT (seeded by ensure-settings-env.py); py.sh also
self-resolves the repo root via git/cwd as a fallback.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Double-checked the 2026-06-08 BEC remediation for missed EXO-dependent items now that
the Exchange role is confirmed. Findings: malicious inbox rules gone (cleanup stuck);
all 14 mailboxes clean of fwd/redirect/delete/move rules; no mailbox forwarding; no
transport rules; no rogue delegates. Open (need Ken): Christina-Micek StopProcessing rule
+ Ken FullAccess to Accounting. Corrected stale 'Exchange Admin NOT assigned' note (it IS).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
EXO email-cleanup tasks (Search-UnifiedAuditLog, Get-MessageTrace, inbox rules) kept
401/403-ing per tenant because the Exchange Operator SP was missing the Exchange Admin
directory role — admin consent grants Exchange.ManageAsApp but never the directory role.
onboard-tenant.sh assigns it, but tenants consented before that step / by hand never got
it, and nothing audited for it. Hence the recurring 'next onboarding will fix it' (false
for already-onboarded tenants).
- NEW assign-exchange-role.sh: idempotent role assignment via the authoritative
roleManagement/directory/roleAssignments API (the legacy directoryRoles/members list
reads back unreliably). <domain|--all> + --verify/--dry-run.
- Backfilled the whole fleet (--all): 13 stragglers ASSIGNED, 12 already OK, 20 skipped
(tenant-admin not consented), 0 errors. Safe Site included.
- Standing audit documented (assign-exchange-role.sh --all --verify) + memory so no future
session repeats the empty promise.
- Adds wiki/clients/safesite.md (tenant + 4-source endpoint inventory + investigation).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Live-verified 2026-06-08: Security Investigator + User Manager + Tenant Admin Graph
tiers all consented and reading (subscribedSkus/organization HTTP 200) on
safesitellc.com (71b4e637-...). The reference's 'NO' was stale (last touched 2026-04-20).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Builds the false-positive/true-positive proof the plan requires before the guard can be
promoted to blocking, and fixes the one false-positive it surfaced.
- test-harness-guard.sh: 12-case matrix in a throwaway repo, runs the REAL guard, asserts
WARN/clean for real conflicts/secrets/keys vs legit content (setext underlines, dividers,
docs that mention a marker, encrypted sops, public keys, .example templates).
- harness-guard.sh: conflict rule now requires a real hunk (BOTH ^<<<<<<< AND ^>>>>>>>),
dropping the lone =======$ trigger that false-positived on a 7-char setext underline /
divider. Identical true-positive power (git writes all three markers); FP surface -> 0.
- /self-check: new harness.guard_selftest runs the matrix in an isolated temp repo (read-only
vs the real tree) so guard correctness is continuously proven.
Verified 12/12 pass, true positives intact, real-tree FP surface = 0. FATAL flip (todo
f1c11d0d, on/after 2026-06-22) is now evidence-backed + one-step.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Task 3 leftover. Adds a 'consistency' category to /self-check that catches a standard
drifting back into restating/contradicting the command that owns the rule -- the Syncro
timers failure mode (standard said 'always timer' while /syncro said 'outlier only').
Deterministic half: each manifest.command_standard_links pair's standard must still carry
its defer-to-SSOT pointer (must_reference regex). Lost pointer = WARN. Seeded with
syncro-billing (time-entry-protocol.md -> /syncro). Semantic contradiction pass delegated
to the model in SKILL.md, mirroring check_memory. Verified PASS; negative-tested.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds a 'harness' category to /self-check (Task 12, self-check half) so the harness-
optimization gains can't silently regress. All read-only / non-invasive:
- VERSION marker present + not older than manifest.harness.min_version
- skill-registry description budget (sum of all SKILL.md description: fields under
registry_desc_budget_chars) -- the metric that catches Task 5 bloating back
- global deploy targets ~/.claude/skills + ~/.claude/commands populated (Mac-wipe failure)
- harness-guard.sh present + wired into sync.sh
- core scripts parse (bash -n on sync/guard/now-phoenix); now-phoenix.sh emits a valid date
Tunables in baseline/manifest.json 'harness' block. Verified 9/9 PASS; budget WARN
negative-tested at a synthetic over-budget value.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
New `elevate` mode that goes beyond friction to make a UI top-notch and
flags when to redesign rather than patch. references/polish-and-redesign.md
holds 12 heuristics (hierarchy, signature moment, action gravity, narrative,
lonely states, density, rhythm, type, tokens, depth/finish, motion, redesign
triggers) synthesized from three independent model passes (Claude + Gemini +
Grok). Adds an Elevation Index (0-10), a Redesign Urgency score (>=4 leads
with a Structural Audit), and Opportunity-ranked Quick Wins / Elevations /
Redesign Candidates tiers. SKILL.md: command + mode section + extend note.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Upgrade the human-flow skill (Gemini-assisted, Claude-reviewed):
- scan.mjs rewritten to AST-based (@babel/parser/traverse) with 4
detectors: unlabeled-icon-button, tiny-target, missing-feedback-props,
click-without-keyboard; regex fallback on parse failure.
- Objective Friction Index (Motor 3.0 / Cognitive 2.5 / Keyboard 2.5 /
Feedback 2.0); 0-10 Human Workflow Score.
- New heuristics: State-Flow Audit, Precision Rail / Fumble Zones,
Restraint-o-Meter (1-5) for the fancy pass.
- `fix` command DISABLED for now (advisory only): the AST generator
reprints whole files and produces noisy diffs; agents apply surgical
fixes from the report. To be revisited with a string-splice editor.
- Add @babel/* deps + package-lock.json.
- Memory: agy review/review-files is NOT actually read-only (wrote files
+ ran npm despite documented plan-mode) — diff after every agy review.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
review/review-files resolve relative paths only against CWD or
$CLAUDETOOLS_ROOT, never a submodule/subdir — so submodule-relative
paths fail with "file not found". Add a [!WARNING] callout to both
SKILL.md files, fix the misleading "absolute or repo-relative" table
wording, and add inline GOTCHA comments at each resolution site in
both scripts. Bitten us repeatedly (latest: GuruConnect review).
image-analyze: independent second-model vision over OAuth (pins the
gemini-3.1-pro-preview vision model; the default flash-lite router
hallucinates image content) — reads an image via read_file and describes it.
search: Google-grounded live web results with citation URLs (google_web_search).
Both verified working on the keyless Google OAuth. Image GENERATION
(nano-banana) still needs an AI Studio key + extension and stays Grok's lane.
Includes a scoped best-effort output sanitizer for image-analyze (preview
model occasionally leaks reasoning tokens); text/verify/review/search
unchanged. migrate-identity.sh now upgrades the gemini capabilities array.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
If a grok read_file-based review (review/review-files/review-diff) returns
empty (the 0.2.20-style headless tool-gating regression), retry once with
the file(s)/diff embedded inline via the no-tools text path, when content
is under 256KB; otherwise emit a clear skip note. Keeps grok-reads-files as
the default happy path (works on 0.2.22) and degrades gracefully instead of
returning silence. text/verify/raw unchanged; Windows path handling intact.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Live Mailprotector CloudFilter REST client (emailservice.io/api/v1,
Bearer auth via vault msp-tools/mailprotector.sops.yaml). Lists mail-flow
logs and held/quarantined messages across client domains and releases them
(POST messages/{id}/deliver, deliver_many). Read-only by default; every
release/rule-add/config-change gated behind --confirm. Mirrors the
packetdial skill pattern. Built after diagnosing a Dataforth held-outbound
message that never reached ACG.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
role_assigned() sent an unencoded space in the OData $filter
(principalId eq '...'), so the query always failed and the function
always returned false -> onboard-tenant.sh always printed
"MISSING -> ASSIGNING" and relied on the conflict-tolerant POST for
idempotency. Fixed to %20; corrected the stale PIM-misdiagnosis comment.
Verified live against the ACG tenant. Roles still assign correctly;
PRESENT/MISSING reporting is now accurate.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sibling of the grok skill: routes text/verify/review (+ review-files,
review-diff, raw) to the official Google Gemini CLI (gemini, npm global,
v0.45.1) for an independent second model. ask-gemini.sh mirrors ask-grok.sh
(identity-aware gating, binary auto-locate, cygpath hardening, prompt-file
inputs, clean stdout/stderr separation, JSON .response extraction). review
modes copy targets into a temp dir + --include-directories to bypass
Gemini's gitignore/workspace sandbox. verify/review pinned to
gemini-3.1-pro-preview (GEMINI_MODEL overridable). migrate-identity.sh
auto-detects gemini and writes a per-machine identity.json gemini block.
Auth: Google OAuth (no key). Fleet Gemini host: GURU-5070.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fixes the two Windows pain points when routing code review to the Grok CLI
(native Windows grok.exe driven from Git Bash):
- winpath() (cygpath -w; no-op off Windows) on every path handed to grok.exe
(--prompt-file, --cwd) -> deterministic, space-safe; removes reliance on
MSYS's argv auto-conversion heuristic (the 'confounded by Windows paths').
- review mode resolves to an absolute Windows path (handles absolute/spaced paths).
- NEW review-files [-i instr] <f1> [f2...]: review a set of files together.
- NEW review-diff [-C <repo-dir>] [-i instr] <gitref> [-- <pathspec>]: review a
git diff; -C targets submodules (e.g. guru-rmm). Diff goes via --prompt-file,
not a shell arg -> no 'quote hell'.
Tested: text, review (spaced abs path), review-files (2 tray modules),
review-diff (self-review of these changes). SKILL.md updated.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The ask-grok.sh wrapper script used 'timeout' command which doesn't
exist on macOS by default. Updated to detect macOS (darwin) and use
'gtimeout' from GNU coreutils instead.
Tested on macOS with:
- Text reasoning queries (working)
- Live web + X/Twitter search (working)
Requires: brew install coreutils (provides gtimeout)