sync: auto-sync from HOWARD-HOME at 2026-06-21 16:16:19
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-21 16:16:19
## User
- **User:** Howard Enos (howard)
- **Machine:** Howard-Home
- **Role:** tech

## Session Summary

Worked through a batch of GuruRMM roadmap and bug items, ran a full codebase audit, and finished by root-causing and fixing a ClaudeTools sync bug that had disrupted the whole session. Started by listing outstanding skill/roadmap work (excluding bitdefender/unifi/guru-scan), then implemented in sequence: SPEC-021 (logged-in-user domain and account-type detection) plus the BUG-020 watchdog tray-teardown follow-up; the "Add Devices"/"Enroll an Agent" modal UX fix (top-X close, click-outside and Escape dismissal, refresh-on-close); BUG-018 (reliable agent deletion); and the "Open in MSP360" deep-link button (RMM_THOUGHTS Feature 3). Each landed full-stack on its own branch and was compile-validated: server `cargo check`, and dashboard `tsc -b && vite build` run server-side in a throwaway git worktree, since this machine has no `node_modules`.

The enrollment modal fix was the only item taken all the way to production this session: built the feature branch's dashboard onto the beta web root via a git worktree, the user tested it on rmm-beta, then promoted beta to prod with `promote-dashboard.sh --confirm`. Established (and saved as a feedback memory) a standing rule that GuruRMM **dashboard** changes go to beta first, before main, unless told otherwise. Also documented that the pipeline auto-builds beta from `main`, so getting a branch onto beta means a manual worktree build, not a merge.

Ran `/rmm-audit` (5 parallel Opus passes plus a build-pipeline pass over SSH). It surfaced one real HIGH bug: assigning or unassigning any policy silently wiped a connected agent's Event Log Watch monitoring. This was the third instance of a clobber class already fixed at two other config-push sites, and it was fixed by reusing the single rule-shaper. The audit also flagged a MEDIUM info-disclosure (raw error strings in 500 bodies) that turned out to be **already fixed in main** (commit `58c1a96`); it was a false positive because the audit agents read a stale working tree.

Lined up the four pending GuruRMM branches as Gitea PRs (#40-#43) with the migration merge order (060 → 061 → 062) spelled out, DM'd Mike the lineup, and synced. Finally, diagnosed why the guru-rmm submodule kept reverting to a stale commit (`2e469f1`) and discarding work all session: `sync.sh` ran `git submodule update --init` unconditionally on every sync, re-checking-out the intentionally lagging pinned gitlink in detached HEAD. Guarded it to populate only genuinely missing submodules, verified guru-rmm now stays on `main` through a sync, and pushed the fix fleet-wide.

## Key Decisions

- **Dashboard changes go to beta before main** (saved as memory `rmm-dashboard-beta-before-main`). Beta auto-builds from `main`, so a feature branch is previewed by building its `dashboard/` in a `git worktree` and rsyncing `dist/` to `/var/www/gururmm/dashboard-beta`; promotion to prod is the separate human-run `promote-dashboard.sh --confirm`.
- **BUG-018 root cause corrected against the live DB**, not the roadmap's guess: the two huge cascade children (`metrics` ~6.4 GB, `agent_logs` ~3.5 GB) were already indexed on `agent_id`; only five small FKs were unindexed (migration 061). The real operator fix is the new `POST /api/agents/bulk-delete` (one transaction), not the indexes.
- **MSP360 deep-link placement unified**: the backup-alert "Investigate" action already deep-links to `?tab=backup`, so a single button on the backup-tab plan card covers both placements; no separate alert-row button (the alert doesn't carry the console id).
- **Event Log Watch fix reused the existing shaper** (`watch_rules_for_agent` retargeted from `&AppState` to `&PgPool`) rather than duplicating the rule-mapping JSON, leaving one source of truth across all three push sites.
- **Submodule fix guarded the clobber, did not bump the gitlink**: CLAUDE.md intends the parent's pinned gitlink to lag `main`; the bug was `sync.sh` re-checking-out that pin over live work, so the fix is a populate-only guard, preserving the lag-by-design behavior.
- **Used a SHA-push workaround** (commit → cherry-pick onto `origin/main` → `git push origin <sha>:refs/heads/<branch>` → verify with `ls-remote`) to land branches reliably while the submodule kept detaching, before the root cause was found.

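
The SHA-push workaround in the last bullet can be sketched as a small shell helper. This is an illustration under assumptions, not the commands as they were run: the function names `push_sha_to_branch` and `remote_tip_matches` are hypothetical, the remote is assumed to be named `origin`, and the commit is assumed to already sit on top of `origin/main` (the cherry-pick step is done beforehand).

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeeds when the first line of `git ls-remote`
# output on stdin carries the expected object id in its first column.
remote_tip_matches() {
  local expected="$1" tip
  tip="$(awk 'NR == 1 { print $1 }')"
  [ -n "$tip" ] && [ "$tip" = "$expected" ]
}

# Hypothetical helper: push one commit straight to a branch ref on origin,
# then confirm the remote ref now points at that commit.
push_sha_to_branch() {
  local sha="$1" branch="$2" full
  # Expand an abbreviated id to the full id that ls-remote prints.
  full="$(git rev-parse --verify "${sha}^{commit}")" || return 1
  git push origin "${full}:refs/heads/${branch}" || return 1
  git ls-remote origin "refs/heads/${branch}" | remote_tip_matches "$full"
}
```

Pushing by object id rather than by local branch name means the push does not depend on which ref the working tree currently has checked out, which is why it kept working while the submodule sat in detached HEAD.
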
## Problems Encountered

- **Submodule kept reverting to `2e469f1` (detached HEAD), discarding uncommitted edits.** This caused multiple dangling-commit recoveries and one reverted sed edit. Worked around with SHA-pushes, then root-caused: the `git submodule update --init` at `sync.sh:339` ran unconditionally. Fixed with a `git -C "$ppath" rev-parse --git-dir` populate-only guard. Verified guru-rmm now survives a sync on `main @ ed8cad3`.
- **Audit ran against a stale checkout** (working tree pinned at `2e469f1`, 5 commits behind `origin/main`). One MEDIUM finding (500-body info-disclosure, 17 sites) was already fixed in main by `58c1a96`; caught before shipping a redundant change, deleted the bogus branch, corrected the report. Re-verified all other findings against real main; only that one was stale. Same submodule root cause as above.
- **Build-server SSH host key changed.** `172.16.3.30` was rebuilt as a new physical box on 2026-06-11 (per wiki), so the ED25519 key legitimately changed. Refreshed `known_hosts` after confirming the rebuild, rather than blindly bypassing the check.
- **No SSH key on Howard-Home for the build server.** Used the dedicated `gururmm-physical` key from the SOPS vault (extract to a temp keyfile, `ssh -i`, delete after).

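
The populate-only guard from the first item can be sketched as follows. This is an illustration, not the actual contents of `sync.sh`: the helper names and the `.gitmodules` loop are assumptions, and only the `git -C "$ppath" rev-parse --git-dir` check comes from the fix described above. The sketch additionally tests for `$ppath/.git`, because inside an empty submodule directory a bare `rev-parse --git-dir` resolves to the parent repository and reports success.

```bash
#!/usr/bin/env bash

# Hypothetical helper: true only when the submodule path has its own
# .git entry (file or directory) and git accepts it as a repository.
submodule_is_populated() {
  local ppath="$1"
  [ -e "$ppath/.git" ] && git -C "$ppath" rev-parse --git-dir >/dev/null 2>&1
}

# Hypothetical helper: initialise only the submodules that are missing,
# leaving populated ones on whatever commit or branch they are on.
# Run from the top level of the parent repository.
populate_missing_submodules() {
  local key ppath
  # Each output line is "submodule.<name>.path <path>".
  git config --file .gitmodules --get-regexp '^submodule\..*\.path$' |
    while read -r key ppath; do
      if submodule_is_populated "$ppath"; then
        echo "skip (already populated): $ppath"
      else
        git submodule update --init -- "$ppath" </dev/null
      fi
    done
}
```

Skipping populated paths is what preserves the lag-by-design gitlink: the pin is still used for a first checkout, but it is never re-applied over a submodule that is already on `main`.
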
## Configuration Changes

**ClaudeTools repo (committed + pushed):**
- `M .claude/scripts/sync.sh`: populate-only guard on the submodule update (Phase 1a) so already-populated submodules are never re-checked-out to the lagging gitlink.
- `A .claude/memory/rmm-dashboard-beta-before-main.md` + index pointer in `.claude/memory/MEMORY.md`.
- `M errorlog.md`: three entries, two `--friction` (detached-HEAD submodule; stale audit base) and the root-cause fix context.

**guru-rmm submodule (on feature branches, not yet merged):**
- `feat/spec-021-and-bug-020-tray-teardown`: `agent/src/metrics/mod.rs`, `agent/Cargo.toml`, `agent/src/watchdog/{mod,monitor,wts}.rs`, `server/migrations/060_logged_in_user_domain.sql`, `server/src/db/metrics.rs`, `server/src/ws/mod.rs`, `dashboard/src/{api/client.ts,pages/AgentDetail.tsx}`, `docs/specs/SPEC-021-*.md`, `docs/FEATURE_ROADMAP.md`.
- `fix/bug-018-agent-delete-indexes`: `server/migrations/061_index_agent_foreign_keys.sql`, `server/src/db/agents.rs`, `server/src/api/{agents,mod}.rs`, `dashboard/src/{api/client.ts,pages/Agents.tsx}`, `docs/FEATURE_ROADMAP.md`.
- `feat/msp360-deeplink`: `server/src/mspbackups/{client,sync}.rs`, `server/src/db/mspbackups.rs`, `server/src/api/mspbackups.rs`, `server/migrations/062_backup_provider_console_id.sql`, `dashboard/src/{api/client.ts,lib/provider-links.ts,components/BackupDetailTab.tsx}`, `docs/RMM_THOUGHTS.md`.
- `fix/eventlog-watch-policy-clobber`: `server/src/api/event_log_watches.rs`, `server/src/ws/mod.rs`, `server/src/api/policies.rs`.
- `fix/enrollment-modal-ux` (MERGED to main as `4027c86`): `dashboard/src/components/{EnrollmentModal,EnrollAgentModal}.tsx`, `dashboard/src/pages/{SiteDetail,ClientDetail}.tsx`.
- Untracked (NOT committed): `projects/msp-tools/guru-rmm/reports/2026-06-21-rmm-audit.md`.

## Credentials & Secrets

No new credentials created. Used (all already vaulted):
- SSH to build server: `gururmm-physical` ed25519 key, vault `infrastructure/gururmm-server-physical.sops.yaml` field `credentials.ssh-private-key` (user `guru@172.16.3.30`, key-only auth; password auth disabled on the new box).
- Sudo on build server (for promote): vault `infrastructure/gururmm-server.sops.yaml` `credentials.password` (value not recorded here).
- Postgres (schema query): vault `infrastructure/gururmm-server.sops.yaml` `credentials.databases.postgresql-password` (db `gururmm`, user `gururmm`, localhost:5432).
- Gitea API (PR creation): vault `services/gitea-howard.sops.yaml` `credentials.password` (user `howard`, internal `http://172.16.3.20:3000`).

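
The "extract to a temp keyfile, `ssh -i`, delete after" pattern used for the build-server key can be sketched like this. It is a sketch under assumptions rather than the exact commands that were run: `with_temp_keyfile` and `connect` are hypothetical names, and the vault file path in the usage comment is taken relative to wherever the vault is checked out.

```bash
#!/usr/bin/env bash

# Hypothetical helper: read key material from stdin into a private temp
# file, run the given command with the file path appended as its last
# argument, then delete the file and return the command's exit status.
with_temp_keyfile() {
  local keyfile rc
  # The umask change is confined to the command-substitution subshell.
  keyfile="$(umask 077 && mktemp)" || return 1
  cat > "$keyfile"
  "$@" "$keyfile"
  rc=$?
  rm -f "$keyfile"
  return "$rc"
}

# Usage sketch (not executed here):
#   connect() { ssh -i "$1" -o IdentitiesOnly=yes guru@172.16.3.30 hostname; }
#   sops -d --extract '["credentials"]["ssh-private-key"]' \
#     infrastructure/gururmm-server-physical.sops.yaml | with_temp_keyfile connect
```

Because the key arrives on a pipe, the wrapped command sees end-of-file on stdin, so this form suits one-shot remote commands rather than interactive sessions. The file is removed on normal return only; an interrupted run would still need manual cleanup.
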
## Infrastructure & Servers

- **GuruRMM server** `172.16.3.30` (hostname `gururmm`): physical Lenovo ThinkCentre M83, Ubuntu 26.04, replaced the Jupiter VM at this IP on 2026-06-11. Repo `/home/guru/gururmm`. Build pipeline `/opt/gururmm/*.sh`, webhook `:9000`. Postgres 18, MariaDB, GuruRMM API `:3001`, Coord API `:8001`.
- Dashboard web roots: beta `/var/www/gururmm/dashboard-beta` (https://rmm-beta.azcomputerguru.com), prod `/var/www/gururmm/dashboard` (https://rmm.azcomputerguru.com). API https://rmm-api.azcomputerguru.com.
- Channel model: webhook auto-builds beta from `origin/main`; prod is updated only by `promote-dashboard.sh --confirm` (backs up prod first; `--rollback` to undo).
- Gitea: git.azcomputerguru.com (internal `172.16.3.20:3000`), repo `azcomputerguru/gururmm`.

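
Because the webhook only builds beta from `origin/main`, previewing a feature branch on beta is a manual build. A minimal sketch of that flow, assuming it is run from the repo on the build server: the `run`, `run_in`, and `beta_preview` helpers and the `DRY_RUN` switch are hypothetical additions for illustration, and the final worktree cleanup is an assumption about what "throwaway" means here.

```bash
#!/usr/bin/env bash

# Hypothetical wrapper: with DRY_RUN=1, print the command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then printf '+ %s\n' "$*"; else "$@"; fi
}

# Hypothetical wrapper: same, but runs the command from inside a directory.
run_in() {
  local dir="$1"; shift
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '+ (cd %s && %s)\n' "$dir" "$*"
  else
    (cd "$dir" && "$@")
  fi
}

# Build <branch>'s dashboard in a throwaway worktree at <wt> and publish it
# to the beta web root. Never touches main or the pipeline's
# last-built-commit marker.
beta_preview() {
  local branch="$1" wt="$2" webroot="/var/www/gururmm/dashboard-beta"
  run git fetch origin "$branch" || return 1
  run git worktree add --detach "$wt" "origin/$branch" || return 1
  run_in "$wt/dashboard" npm install || return 1
  run_in "$wt/dashboard" npm run build || return 1
  run rsync -a --delete "$wt/dashboard/dist/" "$webroot/" || return 1
  run git worktree remove --force "$wt"
}
```

Running `DRY_RUN=1 beta_preview fix/enrollment-modal-ux /tmp/wt` prints the command sequence for review without changing anything.
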
## Commands & Outputs

- Beta preview build (per branch, no main merge):
  `git worktree add --detach <wt> origin/<branch>` → `cd <wt>/dashboard && npm install && npm run build` → `rsync -a --delete dist/ /var/www/gururmm/dashboard-beta/`. Do NOT touch `/opt/gururmm/last-built-commit-dashboard`.
- Promote beta→prod: `sudo /opt/gururmm/promote-dashboard.sh --confirm` (prod backup written to `/var/www/gururmm/.dashboard-backups/dashboard-20260621-191008`). Undo: `... --rollback`.
- Unindexed-FK / schema queries run via `PGPASSWORD=... psql -U gururmm -d gururmm -h localhost`.
- PR creation: `POST http://172.16.3.20:3000/api/v1/repos/azcomputerguru/gururmm/pulls` (basic auth `howard:<pw>`, JSON built with `jq -nc --arg`).
- Submodule-clobber fix verified: guru-rmm `main @ ed8cad3` before AND after a full `sync.sh` run (previously reverted to detached `2e469f1`).

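
The PR-creation call listed above can be sketched as two small functions. This is a hedged reconstruction rather than the exact invocation: `pr_payload` and `create_pr` are hypothetical names, `GITEA_USER` and `GITEA_PASS` are assumed to be set by the caller from the vault, and the payload uses only the `title`, `head`, `base`, and `body` fields of Gitea's create-pull-request request.

```bash
#!/usr/bin/env bash

# Build the JSON body with jq so titles and descriptions containing quotes
# or newlines are escaped correctly instead of being spliced into a string.
pr_payload() {
  jq -nc --arg title "$1" --arg head "$2" --arg base "$3" --arg body "$4" \
    '{title: $title, head: $head, base: $base, body: $body}'
}

# POST the payload to the Gitea pulls endpoint with basic auth.
# Arguments: title, head branch, base branch, description.
create_pr() {
  pr_payload "$@" | curl -fsS -u "${GITEA_USER}:${GITEA_PASS}" \
    -H 'Content-Type: application/json' --data-binary @- \
    "http://172.16.3.20:3000/api/v1/repos/azcomputerguru/gururmm/pulls"
}
```

For example, `create_pr "BUG-018 agent delete" fix/bug-018-agent-delete-indexes main "Merge after migration 060."` would open one PR; the title and description text in that call are placeholders.
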
## Pending / Incomplete Tasks

- **Four GuruRMM PRs await Mike's review/merge.** Merge migrations in order:
  - #40 SPEC-021 + BUG-020 (migration 060): FIRST. SPEC-021 also needs a Pluto signed-MSI agent build + fleet rollout.
  - #41 BUG-018 (migration 061): after 060.
  - #42 MSP360 deep-link (migration 062): last.
  - #43 Event Log Watch HIGH fix: no migration, merge anytime.
  - After each PR with a dashboard portion merges to main, it auto-builds to beta; promote to prod when validated.
- **Remaining audit findings** (all re-verified valid against real main): LOW DoS `cap_field` batch (DiscoveryResult / WatchdogEvent / NetworkState / Auth hostname in `ws/mod.rs`), SSE status-stream revocation bypass + JWT-in-URL (`agents.rs:758-762`), two `console.log` stubs (`AgentDetail.tsx:670`, `Logs.tsx:198`), `ContextTree.tsx` missing `isError`, and the roadmap reconciliation (5 shipped-but-unchecked items to flip `[ ]`→`[x]` + 3 partials to annotate).
- The audit report `reports/2026-06-21-rmm-audit.md` is untracked in the guru-rmm submodule working tree and not yet committed to the gururmm repo.

## Reference Information

- PRs: https://git.azcomputerguru.com/azcomputerguru/gururmm/pulls/40 (..43)
- Branch tips: SPEC-021/BUG-020 `7083e39`; BUG-018 `604c42f`; MSP360 `776b587`; eventlog fix `432b434`; enrollment (merged) `4027c86`.
- guru-rmm `origin/main` = `ed8cad3`; the already-shipped info-disclosure fix = `58c1a96`; the stale gitlink the submodule was pinned at = `2e469f1`.
- Discord DMs to Mike: prod-promote confirmation (msg 1518332316114489505), PR lineup (msg 1518361486509215955).
- Memory added: `.claude/memory/rmm-dashboard-beta-before-main.md`.