sync: auto-sync from GURU-5070 at 2026-06-21 18:11:16
Author: Mike Swanson Machine: GURU-5070 Timestamp: 2026-06-21 18:11:16
@@ -0,0 +1,161 @@
# Session — RMM BUG-019 merge, gururmm-build skill, errorlog lint, coord purge

## User
- **User:** Mike Swanson (mike)
- **Machine:** GURU-5070
- **Role:** admin

## Session Summary

Started as routine `/sync` runs that surfaced a backlog of coord questions from Howard. Answered
all three open threads via coord: pfSense SSH backend cred-path (approved Option A: a slug arg
containing `/` is used as a literal vault path), BUG-018 approach (fast 202 + background purge), and
the Tier-1 RMM items (Mail.Send already lives in the suite; packetdial still has no NetSapiens key).
A verification step found the 2a/2b answers had been made in commits earlier in the day but never
communicated to Howard via coord, so a corrected reply was sent pointing at the actual decisions
rather than a non-existent prior message.

Merged GuruRMM BUG-019 to `main`. The guru-rmm submodule pushes to `git.azcomputerguru.com` (a
different host than the claudetools parent's HTTP remote), and the gitea skill can't inject creds
there, nor is the Gitea SSH key authorized on GURU-5070, so the working auth path was the vaulted
`services/gitea` API token via `GIT_ASKPASS`. Fast-forwarded `ed8cad3 -> 66a7f4e`; CI's version-bump
bot committed `8b5e0dc` on top, confirming the pipeline fired. Verified via the gururmm server's own
root RMM agent that the Linux agent build completed clean (v0.6.67, 0 errors, marker `8b5e0dc`),
proving the container-gated `is_docker_container()` compile gate (which Howard could not cross-build
from Windows) passed on Linux. Agent landed on the beta channel.

Mike asked that all Claude instances internalize the build process so the fleet builds the same way.
Built a `gururmm-build` skill (`verify.sh server|agent|dashboard|migrations`) plus a developer-facing
`docs/BUILD.md` in guru-rmm and refreshed the stale CONTEXT.md build-pipeline section, after reading
the actual `deploy/build-pipeline/*.sh` scripts to keep the docs accurate. Captured the core rules in
a `feedback_gururmm_build_verification` memory (merge-to-main = build+deploy; Windows cargo can't gate
Linux agent code; server needs SQLX_OFFLINE + fresh `.sqlx`; check migration-number collisions).
Audited the memory store for stale `fabb3421` references and fixed the one stale entry
(`reference_resource_map`), then broadcast both learnings (build verification +
fabb3421-deleted/use-the-suite) to the fleet.

Ran an errorlog lint at Mike's request. The dominant signal (~70% of entries) was bitdefender
`gz.py` logging expected GravityZone validation/probe errors; Howard had already built suppression,
but one marker gap remained (`"Missing name '...' in 'options' object"`), which was added and verified
against all 10 distinct spam phrasings. Secondary findings: the recurring auto-synced-submodule
detached-HEAD friction (captured as a new `feedback_submodule_autosync_discipline` memory) and the
Windows `/tmp`/quote-stripping traps that keep recurring despite living in memory (promoted to the
always-loaded CLAUDE.md CORE Windows bullet). Mike corrected the assumption that GuruRMM merges are
Mike-only (Howard can handle merges/deploys himself), which was logged as a correction and written
into the approval-workflow memory; Howard was then cleared to merge BUG-018 and told to merge it now.
Finally, purged 208 dealt-with coord messages (kept the 29 from 2026-06-19 onward), hitting the
known jq-on-Windows CRLF trap mid-loop, and added a reusable `msg purge --before` command (dry-run by
default) to the coord skill so future cleanup doesn't need an ad-hoc curl loop.

## Key Decisions
- **pfSense backend cred-path:** Option A. If the slug arg contains `/`, use it verbatim as the
  vault path; otherwise treat it as a client slug. No cred duplication; gw-audit/gw-control inherit it.
- **BUG-018 fix:** fast 202 Accepted + background/scheduled child-row purge (volume problem, not
  indexes); 204->202 contract change accepted. Assigned to Howard.
- **Mailbox architecture stays split (settled):** `/mailbox` (ACG own-mail) on single-tenant
  `1873b1b0`; client mail send on the suite's Exchange Operator `b43e7342` (already holds Graph
  Mail.Send). `fabb3421` is deleted; do not reference it.
- **Cross-host submodule auth on GURU-5070:** use the vaulted `services/gitea` API token via
  `GIT_ASKPASS` (gitea skill can't inject: parent is HTTP, submodule is a different host; SSH key
  not authorized). Push merges by explicit SHA.
- **Build knowledge as skill + doc + memory:** `gururmm-build` skill is local pre-merge verification
  only (does NOT trigger the prod build; that's the webhook on merge). `docs/BUILD.md` is the human
  guide; `deploy/build-pipeline/README.md` remains the server-side source of truth.
- **Windows /tmp + quote traps promoted to CLAUDE.md CORE** rather than left in memory; they recur
  because a memory may not be recalled, whereas CORE is always loaded.
- **Howard cleared for GuruRMM merges/deploys** (Mike). Mike still owns RMM architecture/direction;
  prepared+verified branches no longer bottleneck on Mike to land.
- **Coord purge cutoff:** delete everything before 2026-06-19 (208 msgs); keep the last few days +
  today's live threads (29). Coordination messages are ephemeral; durable record is session logs.
- **`msg purge` safety:** `--before` required (can't wipe the store by accident), dry-run by default,
  `--yes` to delete; mirrors `promote-dashboard.sh`.

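The pfSense cred-path rule (Option A) is small enough to sketch. This is a minimal illustration, not the backend's actual code: the function name `resolve_vault_path`, the `clients` prefix, and the `pfsense` leaf are hypothetical placeholders for whatever layout the vault really uses.

```python
def resolve_vault_path(slug_arg: str, client_prefix: str = "clients") -> str:
    """Return the vault path for a pfSense backend argument.

    Rule from the decision above: an argument containing "/" is already a
    vault path and is used verbatim; anything else is a client slug.
    The client-slug layout built here is an assumption for illustration.
    """
    arg = slug_arg.strip()
    if not arg:
        raise ValueError("slug argument is empty")
    if "/" in arg:
        # Literal vault path: pass through untouched.
        return arg
    # Client slug: expand into a (hypothetical) per-client location.
    return f"{client_prefix}/{arg}/pfsense"
```

Because the literal branch passes the argument through unchanged, callers such as gw-audit or gw-control would not need their own copy of the credential.
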
## Problems Encountered
- **gitea skill couldn't push the guru-rmm submodule:** it bails inside the submodule ("locate git
  repo") and its cred-injection only works for HTTPS-with-embedded-auth parents (claudetools is HTTP).
  Gitea SSH key not authorized on GURU-5070 (`Permission denied (publickey)`). Resolved with the
  vaulted API token via `GIT_ASKPASS`.
- **Docs push rejected non-fast-forward:** remote main moved (CI/other session). Resolved by fetch +
  rebase of the docs commit onto the new tip, then push (`572435f`).
- **RMM agent `sudo -u guru` blocked** when checking build state on `.30`: the agent runs under
  systemd `NoNewPrivileges`, so sudo failed. The non-sudo reads (build log, marker, downloads dir,
  Cargo.toml version) still answered the question, so no workaround needed.
- **Coord purge loop returned HTTP 000 on all 208 deletes:** jq-on-Windows emits CRLF, so each ID
  carried a trailing `\r` that broke the curl URL. A single manual DELETE worked (200), proving the
  endpoint is fine. Resolved by regenerating the ID list through `tr -d '\r'` + trimming in the read
  loop; re-ran clean (207/207 in the loop, plus the 1 manual delete = 208). Logged as `--friction`
  (documented-gotcha repeat).
- **`?limit=2000` on coord messages returned a validation error** (cap is 1000). 237 msgs fit in one
  page; the new `msg purge` paginates over the cap with `skip`.

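The CRLF trap from the purge loop can be reproduced and avoided in a few lines. The sketch below is an illustration only; `clean_ids` and `delete_url` are hypothetical helper names, and the base URL is passed in rather than assumed.

```python
from urllib.parse import quote


def clean_ids(raw: str) -> list[str]:
    """Split jq-style output into IDs, dropping CR debris and empty lines."""
    cleaned = []
    for line in raw.split("\n"):
        # On Windows each line ends in "\r\n", so a plain split leaves "\r".
        message_id = line.strip()
        if message_id:
            cleaned.append(message_id)
    return cleaned


def delete_url(base_url: str, message_id: str) -> str:
    """Build a DELETE URL; percent-encode the ID so stray bytes stay visible."""
    return f"{base_url.rstrip('/')}/messages/{quote(message_id, safe='')}"


raw_output = "abc123\r\ndef456\r\n"          # what jq emits on Windows
naive = raw_output.split("\n")               # ['abc123\r', 'def456\r', '']
safe = clean_ids(raw_output)                 # ['abc123', 'def456']
```

Percent-encoding is a second line of defence: an uncleaned ID would show up as `abc123%0D` in the URL instead of silently breaking the request.
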
## Configuration Changes
Created:
- `.claude/skills/gururmm-build/SKILL.md`: new skill (build + pre-merge verification).
- `.claude/skills/gururmm-build/scripts/verify.sh`: local verify (server/agent/dashboard/migrations).
- `.claude/memory/feedback_gururmm_build_verification.md`
- `.claude/memory/feedback_submodule_autosync_discipline.md`
- `projects/msp-tools/guru-rmm/docs/BUILD.md` (guru-rmm repo)
- `session-logs/2026-06/2026-06-21-mike-rmm-build-skill-errorlog-lint-coord-purge.md` (this file)

Modified:
- `.claude/skills/bitdefender/scripts/gz.py`: added `"missing name"` to `_EXPECTED_ERROR_MARKERS`.
- `.claude/skills/coord/scripts/coord.py` + `SKILL.md`: added `msg purge --before` command.
- `.claude/CLAUDE.md`: Windows CORE bullet: /tmp path + curl.exe/plink quote-stripping hard rules.
- `.claude/memory/MEMORY.md`: index lines for the two new memories + updated approval-workflow line.
- `.claude/memory/approval-workflow-tools-vs-projects.md`: Howard cleared for GuruRMM merges/deploys.
- `.claude/memory/reference_resource_map.md`: replaced stale `fabb3421` cred ref with the suite.
- `projects/msp-tools/guru-rmm/CONTEXT.md`: refreshed the stale Build Pipeline section.

## Credentials & Secrets
- No new credentials created or discovered. Used the existing vaulted Gitea API token at
  `services/gitea` field `credentials.api.api-token` (via `vault.sh get-field`) for cross-host
  guru-rmm pushes through `GIT_ASKPASS`. Username embedded in URL (`azcomputerguru@...`), token as
  password; never written to argv/history/git config.
- packetdial/NetSapiens: still NO key vaulted (`msp-tools/oitvoip.sops.yaml` empty/absent). Skill
  remains non-functional until the key is obtained; parked.

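The askpass mechanism used above works because git runs the program named in `GIT_ASKPASS` with the prompt text as its first argument and reads the answer from stdout. A minimal sketch of such a helper follows; the environment variable names `GITEA_USER` and `GITEA_TOKEN` are hypothetical, and the script actually used in this session may differ (on Windows a small wrapper may also be needed to make the script executable).

```python
#!/usr/bin/env python3
"""Hypothetical GIT_ASKPASS helper: answers git's credential prompts on stdout."""
import os
import sys
from collections.abc import Mapping


def answer(prompt: str, env: Mapping[str, str]) -> str:
    """Pick the reply for a git prompt such as "Password for 'https://...': "."""
    if prompt.lower().startswith("username"):
        return env.get("GITEA_USER", "")
    # Any other prompt is treated as the password prompt; the token is read
    # from the environment so it never appears in argv or shell history.
    return env.get("GITEA_TOKEN", "")


if __name__ == "__main__":
    prompt_text = sys.argv[1] if len(sys.argv) > 1 else ""
    print(answer(prompt_text, os.environ))
```

With the username embedded in the push URL, git only asks for the password, so the helper's username branch is a fallback.
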
## Infrastructure & Servers
- GuruRMM server: `172.16.3.30` (API `:3001`, coord API `:8001/api/coord`). Build host = same box;
  repo at `/home/guru/gururmm`; downloads `/var/www/gururmm/downloads`; build logs
  `/var/log/gururmm-build-{linux,windows,server,dashboard}.log`.
- gururmm git remote: `https://git.azcomputerguru.com/azcomputerguru/gururmm.git` (Gitea at
  `172.16.3.20`, HTTP `:3000`, SSH `:2222`). Gitea SSH `.20:2222` was UP from GURU-5070 (port open,
  sshd answers); Howard reported a transient "refused 3x" from the build host ~00:23, after the
  BUG-019 build completed at 00:12.
- Windows agent build hosts: Beast (`100.101.122.4`, Tailscale, primary) / Pluto (`172.16.3.36`,
  fallback). Agent beta channel; dashboard beta at `rmm-beta.azcomputerguru.com`.
- gururmm own RMM agent: hostname `gururmm`, id `9b92b187-98c7-41b0-9e97-1698d263c42d`, os linux
  (runs under systemd NoNewPrivileges, so no sudo via the agent).

## Commands & Outputs
- Cross-host submodule push (token via askpass, no token in argv):
  `GIT_ASKPASS=<script> GIT_TERMINAL_PROMPT=0 git -c credential.helper= push "https://azcomputerguru@git.azcomputerguru.com/azcomputerguru/gururmm.git" <sha>:refs/heads/main`
- BUG-019 Linux build confirmation (build-linux.log tail): `=== Linux build complete: v0.6.67 in 54s ===`,
  `Finished release profile [optimized] in 53.64s`, 0 errors; `last-built-commit-linux = 8b5e0dc`.
- Coord purge (after CRLF fix): `deleted ok: 207 failed: 0`; remaining total 29, oldest kept 2026-06-19.
- New command: `coord.py msg purge --before YYYY-MM-DD [--to <session>] [--yes]` (dry-run default).
- bitdefender suppression verify: all 10 real spam phrasings -> `_is_expected_error` True; genuine
  "Connection refused" -> still logs.

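The safety shape of `msg purge` (required `--before`, dry-run unless `--yes`) can be sketched with `argparse`. This is an assumed reconstruction, not the code in `coord.py`: the message fields `id`, `created_at`, and `to_session` and the helper names are hypothetical.

```python
import argparse
from datetime import date


def build_parser() -> argparse.ArgumentParser:
    """CLI shape: msg purge --before YYYY-MM-DD [--to <session>] [--yes]."""
    parser = argparse.ArgumentParser(prog="coord.py")
    groups = parser.add_subparsers(dest="group", required=True)
    msg = groups.add_parser("msg").add_subparsers(dest="action", required=True)
    purge = msg.add_parser("purge")
    # Required cutoff: there is no way to ask for "delete everything".
    purge.add_argument("--before", required=True, type=date.fromisoformat)
    purge.add_argument("--to", dest="to_session", default=None)
    purge.add_argument("--yes", action="store_true", help="actually delete")
    return parser


def select_for_purge(messages, before, to_session=None):
    """Return IDs of messages created before the cutoff (optionally filtered)."""
    selected = []
    for message in messages:
        created = date.fromisoformat(message["created_at"][:10])
        if created >= before:
            continue
        if to_session is not None and message.get("to_session") != to_session:
            continue
        selected.append(message["id"])
    return selected


def run_purge(messages, args, delete):
    """Dry-run by default; call delete(id) for each match only with --yes."""
    ids = select_for_purge(messages, args.before, args.to_session)
    if not args.yes:
        print(f"dry-run: would delete {len(ids)} message(s); pass --yes to delete")
        return 0
    for message_id in ids:
        delete(message_id)
    return len(ids)
```

Making the destructive path opt-in means a mistyped invocation prints a count instead of deleting anything.
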
## Pending / Incomplete Tasks
- **BUG-018** (`fix/bug-018-fast-delete`, commit `cea87d4`): Howard cleared + told to merge now
  (=deploy). Watch `/var/log/gururmm-build-server.log`; server-only change, no migration. Optional
  follow-ups: dashboard optimistic-remove for the 202; bulk-delete endpoint (#3).
- **packetdial:** needs the NetSapiens API key vaulted before the skill works. Blocked on the key/portal.
- **Migration order on gururmm main:** 060 BUG-019 (merged), 061 BUG-018 (#41), 062 MSP360 (#42),
  063 SPEC-021 (#40, renumbered from 060 to resolve the collision).
- **Optional:** `/wiki-compile` not run this session (decoupled). No single-article slug implied
  (root session log).

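The migration-number collision that forced the 060 to 063 renumbering is the kind of thing a pre-merge check can catch mechanically. The sketch below assumes migration files start with a numeric prefix (for example `060_something.sql`); the file names and the helper `find_collisions` are hypothetical, and this is not the logic in `verify.sh`.

```python
import re
from collections import defaultdict


def find_collisions(filenames):
    """Map each duplicated migration number to the files that share it."""
    by_number = defaultdict(list)
    for name in filenames:
        match = re.match(r"(\d+)", name)
        if match:  # ignore files without a leading number
            by_number[int(match.group(1))].append(name)
    return {
        number: sorted(names)
        for number, names in by_number.items()
        if len(names) > 1
    }


# Hypothetical file names illustrating two branches that both claimed 060.
branch_files = ["060_bug_019.sql", "060_spec_021.sql", "061_bug_018.sql"]
collisions = find_collisions(branch_files)
```

An empty result means every branch's migrations can land in any order without renumbering.
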
## Reference Information
- gururmm commits: BUG-019 merge `66a7f4e` (CI bump `8b5e0dc`, agent v0.6.67); docs `572435f`
  (BUILD.md + CONTEXT.md). Earlier fabb purge: `f55b8d25`, `1f65facb`.
- claudetools commits (this session): `f8c33c90` (gururmm-build skill + build-verification memory +
  fabb fix), `ef55121d` (errorlog lint follow-ups: bitdefender marker + submodule memory + Windows
  CORE), `dd033289`/`615b6f2a` (Howard merge authority), `c281d253`/`6e96ec42` (coord msg purge cmd).
- Coord messages sent: `60b119de` (3 decisions), `c8c28d80` (2a/2b correction), `f6fcd546` (BUG-019
  merged), `15207219` (fleet broadcast), `8b2989f1` (BUG-019 verified), `3a5e0ca1` (bitdefender fix),
  `3b2892df` (Howard cleared to merge), `1ed75523` (merge BUG-018 now).
- App IDs: Exchange Operator `b43e7342` (Graph Mail.Send), Security Investigator `bfbc12a4`, User
  Manager `64fac46b`, Tenant Admin `709e6eed`, Defender `dbf8ad1a`, ACG Mailbox `1873b1b0`. Deleted:
  `fabb3421` (Claude-MSP-Access, AADSTS700016).
- New endpoint used: `DELETE /api/coord/messages/{id}` (coord API).

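Since the coord message listing caps `limit` at 1000, a purge over a larger store has to page with `skip`. A minimal sketch of that loop follows; `fetch_page` is a hypothetical callable standing in for the HTTP GET with `limit` and `skip` query parameters, and the response shape (a plain list of messages) is an assumption.

```python
from typing import Callable, Iterator

PAGE_CAP = 1000  # the API rejects larger limits (limit=2000 failed validation)


def iter_messages(
    fetch_page: Callable[[int, int], list],
    page_size: int = PAGE_CAP,
) -> Iterator[dict]:
    """Yield every message by walking pages of at most PAGE_CAP items.

    fetch_page(skip, limit) must return the list of messages for that window.
    """
    if not 1 <= page_size <= PAGE_CAP:
        raise ValueError(f"page_size must be between 1 and {PAGE_CAP}")
    skip = 0
    while True:
        page = fetch_page(skip, page_size)
        yield from page
        if len(page) < page_size:
            return  # a short (or empty) page means the store is exhausted
        skip += len(page)
```

With the 237 messages seen in this session the loop makes a single request; the paging only matters once the store grows past the cap.
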