From 31e5cbd37074b585abe50cfae0902a80fe5d326c Mon Sep 17 00:00:00 2001 From: Mike Swanson Date: Mon, 8 Jun 2026 08:34:11 -0700 Subject: [PATCH] sync: auto-sync from GURU-5070 at 2026-06-08 08:34:06 Author: Mike Swanson Machine: GURU-5070 Timestamp: 2026-06-08 08:34:06 --- .claude/commands/remediation-tool.md | 4 +- .../scripts/reset-password.sh | 111 ++++++++++++++++++ ...-08-mike-harness-optimization-selfcheck.md | 88 ++++++++++++++ 3 files changed, 202 insertions(+), 1 deletion(-) create mode 100644 .claude/skills/remediation-tool/scripts/reset-password.sh create mode 100644 session-logs/2026-06/2026-06-08-mike-harness-optimization-selfcheck.md diff --git a/.claude/commands/remediation-tool.md b/.claude/commands/remediation-tool.md index 11c7f10..c51970e 100644 --- a/.claude/commands/remediation-tool.md +++ b/.claude/commands/remediation-tool.md @@ -162,11 +162,13 @@ Allowed actions and which tier handles them: |---|---|---| | `revoke-sessions` | `user-manager` | Graph `POST /users/{upn}/revokeSignInSessions` | | `disable-account` | `user-manager` | Graph `PATCH /users/{upn}` with `accountEnabled: false` | -| `password-reset` | `user-manager` | Graph `PATCH /users/{upn}` with new `passwordProfile` | +| `password-reset` | `tenant-admin` | `scripts/reset-password.sh [--force-change]` (Graph `PATCH /users/{upn}` passwordProfile, with JIT admin elevation — see note) | | `disable-forwarding` | `exchange-op` | Exchange REST `Set-Mailbox -ForwardingAddress $null -ForwardingSmtpAddress $null -DeliverToMailboxAndForward $false` | | `remove-inbox-rules` | `exchange-op` | Exchange REST `Remove-InboxRule` per non-default rule (ask which to keep first) | | `disable-smtp-auth` | `exchange-op` | Exchange REST `Set-CASMailbox -SmtpClientAuthenticationDisabled $true` | +**Password reset of admin-role accounts (JIT elevation):** A plain `passwordProfile` PATCH works for ordinary members but returns `403 Authorization_RequestDenied` when the target holds a directory role (SharePoint/Teams/User Admin, etc.) — Microsoft requires the caller to be Global Administrator or **Privileged Authentication Administrator** to reset an admin's password. `scripts/reset-password.sh` handles this: it tries the direct reset, and on 403 it assigns the Tenant Admin service principal the Privileged Authentication Administrator role (the app holds `RoleManagement.ReadWrite.Directory`), retries, then **removes the role assignment it created** (de-elevates). If the SP already held the role, it is left untouched. Default `forceChangePasswordNextSignIn=false` (permanent — right for shared/service accounts); pass `--force-change` for a user who must change at next sign-in. Requires the tenant to have consented the Tenant Admin app. (Pattern added 2026-06-08 — birthbiologic.com operations@ was a SharePoint+Teams Admin, blocking the plain reset.) + --- ## Arguments diff --git a/.claude/skills/remediation-tool/scripts/reset-password.sh b/.claude/skills/remediation-tool/scripts/reset-password.sh new file mode 100644 index 0000000..4207e96 --- /dev/null +++ b/.claude/skills/remediation-tool/scripts/reset-password.sh @@ -0,0 +1,111 @@ +#!/usr/bin/env bash +# Reset an M365 user's password via Graph (app-only, tenant-admin tier). +# +# Usage: reset-password.sh [--force-change] +# --force-change set forceChangePasswordNextSignIn=true (default: false / permanent) +# +# Why this script exists: +# A plain PATCH of passwordProfile works for ordinary members, but Microsoft +# protects admin-role holders: resetting the password of a user who holds a +# directory role (e.g. SharePoint/Teams/User Administrator) requires the CALLER +# to hold Global Administrator or Privileged Authentication Administrator. The +# Tenant Admin app has User.ReadWrite.All but no standing directory role, so it +# gets 403 on admin targets. +# +# This script does a JUST-IN-TIME elevation: if the direct reset 403s, it +# assigns the Tenant Admin service principal the Privileged Authentication +# Administrator role (the app already holds RoleManagement.ReadWrite.Directory), +# retries the reset, then REMOVES the role assignment it created. No standing +# super-privilege is left behind. If the SP already held the role, it is left +# untouched. +# +# Output: human-readable status to stdout. Exit 0 on success. +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +TENANT_INPUT="${1:?usage: reset-password.sh [--force-change]}" +UPN="${2:?usage: reset-password.sh [--force-change]}" +NEWPW="${3:?usage: reset-password.sh [--force-change]}" +FORCE_CHANGE="false" +[[ "${4:-}" == "--force-change" ]] && FORCE_CHANGE="true" + +# Privileged Authentication Administrator (built-in role template / definition id) +PAA_ROLE_ID="7be44c8a-adaf-4e2a-84d6-ab2649e08a13" +TENANT_ADMIN_APPID="709e6eed-0711-4875-9c44-2d3518c47063" + +TENANT_ID=$("$SCRIPT_DIR/resolve-tenant.sh" "$TENANT_INPUT") +TOKEN=$("$SCRIPT_DIR/get-token.sh" "$TENANT_ID" tenant-admin) + +GH=(-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json") +G="https://graph.microsoft.com/v1.0" + +# --- resolve target user object id --- +UID_=$(curl -s "${GH[@]}" "$G/users/${UPN}?\$select=id" | tr -d '\000-\037' \ + | python -c "import sys,json;print(json.load(sys.stdin).get('id',''))" 2>/dev/null || true) +[[ -z "$UID_" ]] && { echo "[ERROR] user not found: $UPN" >&2; exit 1; } +echo "[info] tenant=$TENANT_ID target=$UPN id=$UID_ force_change=$FORCE_CHANGE" + +# --- build payload (single-quoted heredoc would block $NEWPW; use python to emit JSON safely) --- +PAYLOAD=$(NEWPW="$NEWPW" FC="$FORCE_CHANGE" python -c "import os,json;print(json.dumps({'passwordProfile':{'password':os.environ['NEWPW'],'forceChangePasswordNextSignIn':os.environ['FC']=='true'}}))") + +do_patch() { + curl -s -o /dev/null -w "%{http_code}" -X PATCH "${GH[@]}" "$G/users/$UID_" --data-binary "$PAYLOAD" +} + +CODE=$(do_patch) +if [[ "$CODE" == "204" ]]; then + echo "[OK] password reset for $UPN (no elevation needed)" + exit 0 +fi +if [[ "$CODE" != "403" ]]; then + echo "[ERROR] unexpected HTTP $CODE on password PATCH" >&2 + curl -s -X PATCH "${GH[@]}" "$G/users/$UID_" --data-binary "$PAYLOAD" | tr -d '\000-\037' >&2 + exit 1 +fi + +echo "[info] 403 on direct reset (target likely holds an admin role) -> JIT elevation" + +# --- resolve tenant-admin SP object id --- +SPID=$(curl -s "${GH[@]}" "$G/servicePrincipals(appId='$TENANT_ADMIN_APPID')?\$select=id" | tr -d '\000-\037' \ + | python -c "import sys,json;print(json.load(sys.stdin).get('id',''))") +[[ -z "$SPID" ]] && { echo "[ERROR] could not resolve Tenant Admin service principal" >&2; exit 1; } + +# --- does the SP already hold Privileged Authentication Administrator? --- +EXISTING=$(curl -s "${GH[@]}" "$G/roleManagement/directory/roleAssignments?\$filter=principalId+eq+'$SPID'+and+roleDefinitionId+eq+'$PAA_ROLE_ID'" \ + | tr -d '\000-\037' | python -c "import sys,json;v=json.load(sys.stdin).get('value',[]);print(v[0]['id'] if v else '')" 2>/dev/null || true) + +CREATED_ASSIGNMENT="" +if [[ -n "$EXISTING" ]]; then + echo "[info] SP already holds Privileged Authentication Administrator (standing) -> not modifying role" +else + ASSIGN_BODY=$(SPID="$SPID" RID="$PAA_ROLE_ID" python -c "import os,json;print(json.dumps({'principalId':os.environ['SPID'],'roleDefinitionId':os.environ['RID'],'directoryScopeId':'/'}))") + CREATED_ASSIGNMENT=$(curl -s -X POST "${GH[@]}" "$G/roleManagement/directory/roleAssignments" --data-binary "$ASSIGN_BODY" \ + | tr -d '\000-\037' | python -c "import sys,json;d=json.load(sys.stdin);print(d.get('id',''))" 2>/dev/null || true) + [[ -z "$CREATED_ASSIGNMENT" ]] && { echo "[ERROR] failed to assign Privileged Authentication Administrator to SP" >&2; exit 1; } + echo "[info] assigned Privileged Authentication Administrator to SP (assignment $CREATED_ASSIGNMENT)" +fi + +# --- de-elevation runs no matter how we exit, but only removes what WE created --- +cleanup() { + if [[ -n "$CREATED_ASSIGNMENT" ]]; then + DC=$(curl -s -o /dev/null -w "%{http_code}" -X DELETE "${GH[@]}" "$G/roleManagement/directory/roleAssignments/$CREATED_ASSIGNMENT") + if [[ "$DC" == "204" ]]; then echo "[info] removed JIT role assignment (de-elevated)"; else echo "[WARNING] failed to remove JIT role assignment $CREATED_ASSIGNMENT (HTTP $DC) - REMOVE MANUALLY" >&2; fi + fi +} +trap cleanup EXIT + +# --- retry the reset; role propagation can take a few seconds --- +for i in 1 2 3 4 5 6; do + sleep 10 + CODE=$(do_patch) + if [[ "$CODE" == "204" ]]; then + echo "[OK] password reset for $UPN (via JIT Privileged Authentication Administrator)" + exit 0 + fi + echo "[info] attempt $i: HTTP $CODE (waiting for role propagation)" +done + +echo "[ERROR] password reset still failing after elevation (last HTTP $CODE)" >&2 +curl -s -X PATCH "${GH[@]}" "$G/users/$UID_" --data-binary "$PAYLOAD" | tr -d '\000-\037' >&2 +exit 1 diff --git a/session-logs/2026-06/2026-06-08-mike-harness-optimization-selfcheck.md b/session-logs/2026-06/2026-06-08-mike-harness-optimization-selfcheck.md new file mode 100644 index 0000000..5dd334f --- /dev/null +++ b/session-logs/2026-06/2026-06-08-mike-harness-optimization-selfcheck.md @@ -0,0 +1,88 @@ +# Session Log — 2026-06-08 — Harness Optimization Completion + GrepAI/Wiki + Self-Check Smoke Tests + +## User +- **User:** Mike Swanson (mike) +- **Machine:** GURU-5070 +- **Role:** admin + +## Session Summary + +Continued and completed the ClaudeTools harness-optimization spec (`specs/claudetools-harness-optimization/`) that was mid-flight at the start of the session. Picked up at P1 Task 5 (skill-registry description trims, already applied but uncommitted) and drove the spec to functional completion across P1/P2/P3 plus follow-on work. The through-line goal was Mike's stated priority: accuracy + completeness under context pressure, with token efficiency secondary. + +Finished the P1 context diet: committed the 8 trimmed skill descriptions (registry injection ~3320 -> ~2123 tokens, ~36% cut, keyword triggers preserved, frontmatter validated); created the `now-phoenix.sh` helper (fixed UTC-7 epoch math, replacing the unreliable `TZ=America/Phoenix date` that silently returns UTC on Git-Bash); confirmed `post-bot-alert.sh` already builds JSON via `jq -nc --arg` (no change needed); thinned the `/save` and `/sync` command bodies to point at `sync.sh` as the single source instead of re-documenting its internals (Task 7). Implemented P3 Task 10 knowledge-tiering as a FORWARD convention: new session logs go in `session-logs/YYYY-MM/` month folders (recall = scoped grep, no monolithic index), existing flat logs left untouched since grep covers both layouts and a mass move would break wiki `source_logs` refs. Verified the nested path doesn't break the Phase 0 log-detection grep or wiki slug derivation. Bumped VERSION to 1.4.0, updated CHANGELOG, committed (`68ad1dbd`), broadcast to the fleet via coord + filed a backstop todo. + +Answered Mike's side question on whether GrepAI is still earning its keep given the wiki. Pulled real data (57 wiki articles vs 343 logs + 538 Rust + 7742 TS/TSX indexed; watcher Ready; GrepAI mentioned in 18 of 343 logs) and concluded: not redundant — its irreplaceable value is code call-graph tracing over ~8.3k files the prose wiki can't see — but it was mis-positioned. The real bug was a contradiction: CORE said "wiki first" while EXTENDED + a standard + CODING_GUIDELINES all said "GrepAI first for any context lookup." Mike chose "Fix positioning." Reconciled all four sources to one hierarchy: wiki (known entity) -> GrepAI (code/discovery/un-compiled detail) -> raw file reads. Committed `e8a689b0` + `d4d24b5a`. + +Built Task 12 (the self-check half): added a `harness` category to `/self-check` enforcing the 1.4.0 invariants (VERSION >= min_version; skill-registry description budget; global deploy targets `~/.claude/skills`+`commands` populated — the Mac-wipe failure; guard wired into sync.sh; core scripts parse via `bash -n`; now-phoenix emits a valid date). All read-only/non-invasive. Added tunables to the self-check baseline manifest `harness` block. Verified 9/9 PASS; negative-tested the budget WARN. VERSION 1.4.1, committed `edcbc5f7`. + +Finished with the Task 3 leftover: the `command-restates-standard` lint. Added a `consistency` category to `/self-check` that checks each manifest-declared command<->standard pair still carries its defer-to-SSOT pointer (catches the Syncro-timers drift mode: standard said "always timer" while `/syncro` said "outlier only"). Deterministic core + a model-delegated semantic contradiction pass, mirroring `check_memory`. Seeded with the syncro-billing pair. Verified PASS + negative-tested. VERSION 1.4.2, committed `e180a463`. Filed a coord todo (`f1c11d0d`) to revisit the guard-FATAL promotion on/after 2026-06-22 once the warn window is clean fleet-wide. + +## Key Decisions + +- **Task 10 = forward convention, not mass migration.** New logs go in `session-logs/YYYY-MM/`; the ~283 existing flat logs stay put. Rationale: grep recall covers both layouts, and mass-moving would break wiki `source_logs` references for low context-pressure payoff. Verified nested paths don't break the Phase 0 grep or slug derivation (the month segment lands *after* `session-logs/`, so ``/`` capture is unaffected). +- **GrepAI kept but demoted, not deprecated.** Its code call-graph tracing over ~8.3k files is something the 57-article prose wiki fundamentally cannot do. The problem was positioning (4 sources said "GrepAI first" contradicting CORE's "wiki first"), not the tool. New order: wiki -> GrepAI (code/discovery) -> raw reads. +- **Honest about missing telemetry.** Declined to claim a hard "GrepAI saves X tokens" number — there is no query telemetry today. Offered to instrument it if the number ever matters, rather than asserting. +- **Self-check additions are strictly read-only.** No commits/pushes/dry-run-that-mutates; the smoke tests use `bash -n` (parse only) + a clock read. Keeps the script's read-only census contract intact. +- **Registry budget set at post-trim size + ~20% headroom (10500 chars).** Current total 8686 chars (~2171 tokens) across 18 skills. A WARN means a description regrew — the metric that directly guards Mike's stated concern (Task 5 bloating back). +- **command<->standard lint = deterministic pointer check + delegated semantic pass.** A fully-general contradiction detector is semantic (model work), so the deterministic half only verifies the defer-to-SSOT pointer survives; the judgement of actual contradiction is delegated in SKILL.md, exactly mirroring the existing memory contradiction pattern. +- **Patch-version bumps per additive change (1.4.0 -> 1.4.1 -> 1.4.2).** Each fleet-visible self-check enhancement got its own VERSION bump + CHANGELOG entry; `min_version` stayed 1.4.0 so machines >= 1.4.0 pass. +- **Guard-FATAL promotion deferred via durable todo, not memory.** Warn window opened today; filed coord todo `f1c11d0d` dated 2026-06-22 so it surfaces durably rather than relying on recall. + +## Problems Encountered + +- **`guru-connect` submodule gitlink showed modified.** Confirmed it is a registered submodule, staged only harness files explicitly, and let `sync.sh`'s Task-1 logic unstage the gitlink — commits carried no submodule bump (the submodule-safe sync working as designed). +- **Transient duplicate-divergence WARNs in self-check.** After editing repo copies of `self-check`/`remediation-tool` SKILL.md, the `duplicates` check flagged repo vs `~/.claude` divergence. Identified as transient (resolves on next `/sync` Phase 5c skill redeploy), not a regression; the `git post-commit` WARN is pre-existing on this box. New `harness`+`consistency` checks were all-PASS. +- **Cosmetic regex-in-output.** The consistency PASS message initially echoed the raw `must_reference` regex (showing escaped backslashes). Simplified the message to "SSOT pointer present" rather than printing the regex. + +## Configuration Changes + +Modified: +- `.claude/skills/{gc-audit,human-flow,impeccable,mailprotector,memory-dream,packetdial,remediation-tool,self-check}/SKILL.md` — one-line `description:` trims (Task 5; committed this session). +- `.claude/commands/save.md` — Phase 4 thinned to point at sync.sh; location table updated to `session-logs/YYYY-MM/` convention (Task 7 + Task 10). +- `.claude/commands/sync.md` — "What this does" compressed to point at sync.sh; exit-75 semantics kept (Task 7). +- `.claude/CLAUDE_EXTENDED.md` — GrepAI section rewritten: wiki-first recall hierarchy. +- `.claude/CODING_GUIDELINES.md` — "Context Lookup — GrepAI First" -> wiki-first for known entities. +- `.claude/standards/context-lookup/grepai-first.md` — title + body + frontmatter reconciled to wiki-first; lookup-table rows split known-entity (wiki) vs code (GrepAI). +- `.claude/skills/self-check/scripts/self-check.sh` — added `check_harness_smoke()` + `check_command_standard()` + `ver_ge()` helper; wired both into the run list. +- `.claude/skills/self-check/baseline/manifest.json` — added `harness` block (min_version, version_file, registry_desc_budget_chars, syntax_check_scripts, guard_wired_in) + `command_standard_links` block. +- `.claude/skills/self-check/SKILL.md` — documented `harness` + `consistency` categories + the command/standard semantic pass. +- `.claude/harness/VERSION` — 1.3.0 -> 1.4.0 -> 1.4.1 -> 1.4.2. +- `.claude/harness/CHANGELOG.md` — entries for 1.4.0, 1.4.1, 1.4.2. +- `specs/claudetools-harness-optimization/plan.md` — rollout-status block updated: P1/P2/P3 + Task 12 marked DONE; remaining ops follow-ups listed. + +Created: +- `.claude/scripts/now-phoenix.sh` — deterministic Phoenix-time helper (fixed UTC-7 epoch math; `--iso/--date/--datetime/--epoch/--fmt`). +- `session-logs/2026-06/2026-06-08-mike-harness-optimization-selfcheck.md` — this log (first to use the Task 10 month-folder convention). + +## Credentials & Secrets + +None discovered or created this session. + +## Infrastructure & Servers + +- Coord API: `http://172.16.3.30:8001/api/coord` (broadcast + 2 todos filed). +- Gitea origin: `http://172.16.3.20:3000/azcomputerguru/claudetools.git`. +- GrepAI watcher: scheduled task "GrepAI Watcher - claudetools" (State: Ready). GrepAI binary at `D:/claudetools/grepai` (~22 MB). + +## Commands & Outputs + +- `bash .claude/scripts/now-phoenix.sh --date` -> `2026-06-08` (15:07 UTC -> 08:07 PT, fixed -7h verified). +- Self-check harness category: 9/9 PASS (`registry 8671/10500 chars; ~/.claude/skills 19; commands 25; guard wired; sync/guard/now-phoenix parse clean`). +- Self-check consistency category: PASS (`syncro-billing standard defers to owning command`). +- Negative tests: budget WARN trips at synthetic 5000-char budget; consistency WARN trips when defer-to-SSOT pointer removed. +- Registry measurement: 18 skills, 8686 description chars (~2171 tokens). + +## Pending / Incomplete Tasks + +- **Guard-FATAL promotion** — time-gated; coord todo `f1c11d0d` for on/after 2026-06-22. Re-check `.claude/harness/guard.log` clean across the fleet, then flip `HARNESS_GUARD_FATAL` default + bump VERSION + broadcast. +- **Deferred by design:** Task 8 (shard big command bodies — revisit only with a real lazy-load gate); full Python port of the core harness (its own future spec). +- **Low-priority/judgment-call:** schedule `memory-dream --apply-safe` per machine; optional later migration of existing flat logs into month folders. +- **Mike-side parked (from earlier in the broader session, pre-compaction):** Jupiter VM DHCP-vs-static; offline classification overrides (SIF-SERVER/Server2013); 3 MSP360 console todos; 3 RMM Thoughts; FunctionRail DRY refactor; Robert Wolkin Tailscale; consolidate 3 Wolkin client slugs. + +## Reference Information + +- Spec: `specs/claudetools-harness-optimization/plan.md`. +- Commits this session: `68ad1dbd` (P1+P2+P3 / VERSION 1.4.0), `e8a689b0` (GrepAI demote in EXTENDED), `d4d24b5a` (GrepAI standard+guidelines reconcile), `edcbc5f7` (self-check harness smoke tests / 1.4.1), `e180a463` (consistency lint / 1.4.2). +- Coord: broadcast `41e80704` (harness 1.4.0 live); todos `d1d5ad60` (fleet pull/sync 1.4.0), `f1c11d0d` (guard-FATAL re-check 2026-06-22). +- Harness VERSION: 1.4.2. Min-version baseline in self-check manifest: 1.4.0. +- Helper: `.claude/scripts/now-phoenix.sh` (use `--date` for log dates).