From 84055d62e1d6f3e13f3a44236665f108b5e5e437 Mon Sep 17 00:00:00 2001
From: Mike Swanson
Date: Sat, 6 Jun 2026 08:27:50 -0700
Subject: [PATCH] sync: auto-sync from GURU-5070 at 2026-06-06 08:27:44

Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-06 08:27:44
---
 ...4458e-2ad4-4b2c-c355-48eee105f467.run.lock |   0
 ...-06-mike-gururmm-site-counts-breadcrumb.md | 144 ++++++++++++++++++
 2 files changed, 144 insertions(+)
 create mode 100644 .sentry-native/b734458e-2ad4-4b2c-c355-48eee105f467.run.lock
 create mode 100644 session-logs/2026-06-06-mike-gururmm-site-counts-breadcrumb.md

diff --git a/.sentry-native/b734458e-2ad4-4b2c-c355-48eee105f467.run.lock b/.sentry-native/b734458e-2ad4-4b2c-c355-48eee105f467.run.lock
new file mode 100644
index 0000000..e69de29
diff --git a/session-logs/2026-06-06-mike-gururmm-site-counts-breadcrumb.md b/session-logs/2026-06-06-mike-gururmm-site-counts-breadcrumb.md
new file mode 100644
index 0000000..ef30986
--- /dev/null
+++ b/session-logs/2026-06-06-mike-gururmm-site-counts-breadcrumb.md
@@ -0,0 +1,144 @@
+# 2026-06-06 — GuruRMM: site agent_count fix, dev_admin bootstrap, SiteDetail breadcrumb
+
+## User
+- **User:** Mike Swanson (mike)
+- **Machine:** GURU-5070
+- **Role:** admin
+
+## Session Summary
+
+Resumed "beta site" work from the prior ClientDetail-redesign session. The first thread was the
+carried-over "0 agents vs 16 offline" discrepancy on the Cascades ClientDetail page. Investigation
+disproved the previous session's "orphaned/siteless agents" hypothesis: GuruRMM has **no `client_id`
+column on the agents table** — `agent.client_id` is derived everywhere via `LEFT JOIN clients c ON
+s.client_id = c.id` (i.e. through `site_id`). The real root cause was that the
+`GET /api/clients/{id}/sites` handler (`list_sites_by_client`) returned a hardcoded `agent_count: 0`
+("Would need separate query"), which fed both the ClientDetail strip total AND every per-site row in the
+Sites table.
Live DB confirmed Cascades' single site genuinely holds 32 agents (16 online / 16 offline) —
+not a data-hygiene issue at all.
+
+Fixed the server handler to route through the already-existing `db::list_sites_by_clients(&[client_id])`
+(which computes `COUNT(*) FROM agents WHERE site_id = s.id` per site, plus `client_name`), mirroring the
+working sibling `list_sites` handler. Mike also approved fixing a pre-existing authz gap in the same change:
+the endpoint took `_user` and did not org-scope, so any authenticated user could list any client's sites —
+added the standard `is_admin() || can_access_org()` guard used by `get_site`/`get_client`. Code Review
+APPROVED. Committed `bdac007`, server auto-built/deployed to **v0.3.44** on the shared API (affects beta and
+prod together — there is no separate beta API server). Verified live via the authenticated debug-Chrome
+admin session: Cascades now shows "32 agents · 1 site" and the site row shows 32.
+
+While verifying credentials in the UI, the credential "view" returned a bare `403 "Access denied"`. Traced
+to `reveal_credential` being **dev_admin-only by design** (the only endpoint returning decrypted plaintext).
+The logged-in account (`admin@azcomputerguru.com`) is role `admin`, and the system had **zero dev_admin
+accounts**, so reveal was effectively dead for everyone. Per Mike's decision, bootstrapped dev_admin via a
+direct DB update for `admin@` and `howard@` only (claude-api@ automation and test@ left as admin). Because
+the role lives in the JWT, the affected accounts must log out/in to mint a fresh token before reveal works.
+
+Mike then flagged that the bare 403 violates the "useful errors" standard. Confirmed that standard was
+**never implemented** in GuruRMM (SPEC-018, still Proposed) or GuruConnect (SPEC-008, also Proposed) — no
+`error.rs`/`AppError`/correlation_id, ~38 bare-string returns in `credentials.rs` alone, ~30 files
+product-wide.
Per Mike's choice, logged it as a future task (coord todo `16c2c7d6`) rather than building now.
+Finally, fixed a SiteDetail breadcrumb bug Mike reported: the "Sites" crumb and the back-arrow button were
+hardcoded to the global `/sites` list instead of the current client's site list. Pointed both at
+`/clients/{clientRecord.id}` (the ClientDetail page renders that client's Sites table) when the client is
+known, with `/sites` as fallback. Committed `081b4cc`, dashboard beta auto-built to **v0.2.52**, verified live.
+
+## Key Decisions
+
+- **Diagnosed "0 agents" as a server placeholder bug, not data hygiene.** Reasoned from the schema
+  (`agents` has no `client_id` column; it's derived through `site_id`), which makes an agent with
+  `client_id=Cascades` but no Cascades site logically impossible — then confirmed with a live count query.
+- **Fixed at the source (server) rather than a dashboard workaround.** A dashboard-only change would have
+  fixed only the strip; the server fix repairs the strip AND the per-site rows (which Mike also reported),
+  via one already-tested DB query path.
+- **Bundled the org-scoping fix into the same commit (Mike approved).** Mirrored the existing
+  `get_site`/`get_client` guard (`is_admin() || can_access_org()`); kept the change otherwise
+  behavior-preserving. Net security tightening, no regression for admins or legit org users.
+- **dev_admin bootstrap via direct DB, not code-relaxation.** Kept the intended hard posture (reveal stays
+  dev_admin-only) and elevated trusted human accounts instead of lowering the bar. Direct DB was required
+  because only a dev_admin can assign dev_admin and none existed (a bootstrap catch-22). Excluded the
+  `claude-api@` automation account and `test@` from elevation.
+- **Did not build SPEC-018 now.** Mike chose "log as future task"; filed coord todo with full findings.
+- **Breadcrumb "Sites" → client page.** No dedicated `/clients/:id/sites` route exists; the ClientDetail
+  page (`/clients/:id`) IS the per-client site list, so it's the correct target. Aligned the back-arrow and
+  its aria-label to match.
+
+## Problems Encountered
+
+- **Submodule working tree lagged `main`.** The gururmm submodule was checked out at the pinned commit
+  `226ba9f` (behind the live `3ff0da5` ClientDetail merge), so `ClientExceptionsBand`/current code wasn't
+  present. Resolved by `git checkout main` + `git pull --ff-only` in the submodule (synced to `be5ddc9`).
+- **GuruRMM DB connection not at the obvious path.** `/opt/gururmm/.env` wasn't readable by the first grep;
+  found `DATABASE_URL` via the systemd unit's `EnvironmentFile` (`systemctl cat gururmm-server`).
+- **MCP Chrome was logged in as a non-admin test user** (`testuser_antigravity`, role `user`), which showed
+  "Client not found" for Cascades — actually a correct demonstration of the new org-scoping. Switched to the
+  project's CDP debug driver (`cdp.py`, port 9222, profile authenticated as admin) per Mike ("use debug mode
+  for chrome"); had to `cdp.py launch` first (no Chrome on 9222 yet).
+- **`whoami-block.sh` reported "no identity.json"** during /save — caused by the shell cwd still being inside
+  the submodule. identity.json is present and correct at the repo root; re-ran from `/d/claudetools`.
+- **Push race on the breadcrumb commit.** Remote `main` had advanced by a CI version-bump (`c70fe6e`); Gitea
+  Agent rebased cleanly (commit SHA changed `04af3c8` → `081b4cc`) and pushed.
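The breadcrumb decision recorded under Key Decisions (point the "Sites" crumb and the back arrow at the client page when the client is known, otherwise at `/sites`) could be sketched roughly as follows. This is not the code in `SiteDetail.tsx`: the `ClientRecord` shape, the `sitesCrumbTarget` helper, and the fallback label "Back to Sites" are assumptions made for illustration; only the two target paths and the "Back to <client name>" label come from the log.

```typescript
// Hypothetical minimal shape of the client record; only id and name are used here.
interface ClientRecord {
  id: string;
  name: string;
}

// Hypothetical return type: where the crumb/back arrow points and how it is labelled.
interface CrumbTarget {
  href: string;
  ariaLabel: string;
}

// Resolve the shared target for the "Sites" crumb and the back-arrow button.
// When the client is known, link to its ClientDetail page (which renders that
// client's Sites table); otherwise fall back to the global /sites list.
function sitesCrumbTarget(clientRecord?: ClientRecord | null): CrumbTarget {
  if (clientRecord?.id) {
    return {
      href: `/clients/${clientRecord.id}`,
      ariaLabel: `Back to ${clientRecord.name}`,
    };
  }
  // The fallback label text is an assumption, not taken from the log.
  return { href: "/sites", ariaLabel: "Back to Sites" };
}
```

Keeping the crumb and the back arrow on one helper is one way to make sure the two never drift apart again, which is the alignment the decision calls for.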
+
+## Configuration Changes
+
+All in the **gururmm submodule** (`projects/msp-tools/guru-rmm`), merged to `main`:
+- MOD `server/src/api/sites.rs` — `list_sites_by_client`: switched fetch to `db::list_sites_by_clients(&state.db, &[client_id])`; mapped real `agent_count: s.agent_count` + `client_name: Some(s.client_name)`; added org-access guard (`is_admin() || can_access_org(client_id)`); param `_user` → `user`. Commit `bdac007`.
+- MOD `dashboard/src/pages/SiteDetail.tsx` — breadcrumb "Sites" link and back-arrow `onClick`/`aria-label` now target `/clients/${clientRecord.id}` (fallback `/sites`). Commit `081b4cc`.
+- `db::get_sites_by_client` left intact (still used by `clients.rs:144` for `site_count`).
+
+Database (GuruRMM Postgres):
+- `UPDATE users SET role='dev_admin' WHERE email IN ('admin@azcomputerguru.com','howard@azcomputerguru.com')` — 2 rows. Role distribution now: dev_admin 2, admin 2 (claude-api@, test@), user 1.
+
+ClaudeTools repo (this commit): this session log only. Submodule pointer intentionally NOT bumped (lagging main is expected). Merged throwaway branches `fix/site-agent-count` and `fix/sitedetail-breadcrumb` can be pruned later.
+
+## Credentials & Secrets
+
+- **GuruRMM Postgres** (newly located this session; NOT yet vaulted — follow-up):
+  `DATABASE_URL=postgres://gururmm:43617ebf7eb242e814ca9988cc4df5ad@localhost:5432/gururmm` on 172.16.3.30.
+  Source: `/opt/gururmm/.env` (systemd `EnvironmentFile` for `gururmm-server`). Reachable only on-host (localhost) — query via `ssh guru@172.16.3.30 "psql '' ..."`. Should be added to SOPS vault under `projects/gururmm/` or `infrastructure/`.
+- **Role grants:** `admin@azcomputerguru.com` and `howard@azcomputerguru.com` elevated to `dev_admin` (user id of admin@ = `490e2d0f-067d-4130-98fd-83f06ed0b932`). No passwords changed.
+- SSH `guru@172.16.3.30` is key-based from GURU-5070 for reads. Sudo (prod promote) password in vault `infrastructure/gururmm-server.sops.yaml`.
+
+## Infrastructure & Servers
+
+- **Shared API:** `https://rmm-api.azcomputerguru.com` (dashboard `API_BASE_URL` default; `VITE_API_URL` unset on beta). Beta AND prod dashboards both use it — no separate beta API. Rust server on 172.16.3.30:3001, systemd `gururmm-server`, `WorkingDirectory=/opt/gururmm`, binary `/opt/gururmm/gururmm-server`. Now **v0.3.44**.
+- **Beta dashboard:** `https://rmm-beta.azcomputerguru.com` — auto-builds from gururmm `main` push (webhook → `build-dashboard.sh`), deploys `/var/www/gururmm/dashboard-beta`. Now **v0.2.52** (`assets/index-BOagZ71U.js`). Prod = promote-only (`sudo /opt/gururmm/promote-dashboard.sh --confirm`); NOT promoted this session.
+- **Server build/deploy:** push to `main` → `build-server.sh` (cargo build --release ~5.5 min) → auto-deploys + restarts service. No staging; affects the shared API immediately.
+- Build logs: `/var/log/gururmm-build-server.log`, `/var/log/gururmm-build-dashboard.log`.
+- **CDP debug Chrome:** `.claude/scripts/cdp.py` (launch/status/nav/shot/eval), port 9222, profile `C:\Users\guru\.claude\cdp-chrome-profile` (authenticated as admin@ to beta). Left running.
+- Gitea (gururmm): internal `http://172.16.3.20:3000/azcomputerguru/gururmm.git`.
+
+## Commands & Outputs
+
+```bash
+# live Cascades agent counts (the truth the fix now serves)
+ssh guru@172.16.3.30 "psql '' -c \"SELECT s.name,(SELECT COUNT(*) FROM agents a WHERE a.site_id=s.id) agents,(SELECT COUNT(*) FROM agents a WHERE a.site_id=s.id AND a.status IN ('offline','error')) offline FROM sites s WHERE s.client_id='42e1b0e3-f8b7-4fc5-86bd-06bdbb073b7f';\""
+# -> CascadesTucson | GOLD-MOON-4620 | 32 | 16
+
+# locate DB url
+ssh guru@172.16.3.30 "systemctl cat gururmm-server | grep EnvironmentFile" # -> /opt/gururmm/.env
+
+# dev_admin bootstrap (2 rows)
+UPDATE users SET role='dev_admin' WHERE email IN ('admin@azcomputerguru.com','howard@azcomputerguru.com') AND role='admin' RETURNING email, role;
+
+# verify breadcrumb fix live (debug Chrome)
+py .claude/scripts/cdp.py nav "https://rmm-beta.azcomputerguru.com/sites/c157c399-82d3-4581-979a-b9fad70f4fef"
+py .claude/scripts/cdp.py eval "...breadcrumb hrefs..." # -> "Sites" href=/clients/42e1b0e3...; back aria="Back to Cascades of Tucson"
+```
+
+## Pending / Incomplete Tasks
+
+1. **Mike: re-login** (log out/in) so `admin@` picks up the `dev_admin` JWT — then credential reveal works. Same for Howard. Offer to drive debug Chrome to confirm a reveal returns 200 afterward.
+2. **Promote dashboard beta → prod** when happy with the breadcrumb fix (beta-only until then): `sudo /opt/gururmm/promote-dashboard.sh --confirm` on 172.16.3.30. The server agent_count + org-scoping fix is ALREADY on prod (shared API).
+3. **Vault the GuruRMM Postgres DATABASE_URL** (currently only in `/opt/gururmm/.env`).
+4. **SPEC-018 valuable error messages** — coord todo `16c2c7d6` (gururmm, assigned mike). Build AppError envelope + correlation-id middleware; convert handlers (credentials 403 → `AUTH_INSUFFICIENT_ROLE` + message + correlation_id as first adopter). Keep aligned with GuruConnect SPEC-008. Best as a phased/multi-agent effort.
+5. 
Prune merged branches `fix/site-agent-count`, `fix/sitedetail-breadcrumb`; bump claudetools submodule pointer for gururmm whenever desired.
+
+## Reference Information
+
+- Cascades client: `42e1b0e3-f8b7-4fc5-86bd-06bdbb073b7f` · Cascades site: `c157c399-82d3-4581-979a-b9fad70f4fef` (CascadesTucson, GOLD-MOON-4620)
+- gururmm commits: `bdac007` (server agent_count + org-scope), `081b4cc` (dashboard breadcrumb)
+- Versions: server **v0.3.44**, dashboard beta **v0.2.52**
+- Coord: lock `66b276b6` (ClientDetail) + `6e9533a9` (SiteDetail) released; components `gururmm/server`=deployed v0.3.44, `gururmm/dashboard`=deployed v0.2.52; todo `16c2c7d6` (SPEC-018)
+- Specs: `projects/msp-tools/guru-rmm/docs/specs/SPEC-018-valuable-error-messages.md` (Proposed); GuruConnect `SPEC-008` (Proposed)
+- Routes: `/clients/:id` (ClientDetail = per-client site list), `/sites` (global), `/sites/:id` (SiteDetail)
+- Screenshots: `.claude/tmp/cdp/cascades-verify-top.png`, `.claude/tmp/cdp/sitedetail-breadcrumb-fixed.png`
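To illustrate why the hardcoded `agent_count: 0` zeroed both the ClientDetail strip total and every per-site row, here is a minimal sketch of how a dashboard might derive the strip text from the per-site payload. It is not the actual dashboard code: the `SiteSummary` shape, the `totalAgents` and `stripLabel` helpers, and the pluralisation rule are assumptions; only the `agent_count` field, the site name, and the "32 agents · 1 site" text come from the log.

```typescript
// Hypothetical subset of one entry from GET /api/clients/{id}/sites.
interface SiteSummary {
  name: string;
  agent_count: number;
}

// Sum the per-site counts into the client-level total shown in the strip.
function totalAgents(sites: SiteSummary[]): number {
  return sites.reduce((sum, site) => sum + site.agent_count, 0);
}

// Build the strip text. The pluralisation rule is an assumption.
function stripLabel(sites: SiteSummary[]): string {
  const plural = (n: number, word: string): string =>
    `${n} ${word}${n === 1 ? "" : "s"}`;
  return `${plural(totalAgents(sites), "agent")} · ${plural(sites.length, "site")}`;
}

// With the placeholder response, every row and the total read as zero.
const placeholderSites: SiteSummary[] = [{ name: "CascadesTucson", agent_count: 0 }];
// With real per-site counts from the server, both the rows and the total recover.
const fixedSites: SiteSummary[] = [{ name: "CascadesTucson", agent_count: 32 }];

console.log(stripLabel(placeholderSites)); // "0 agents · 1 site"
console.log(stripLabel(fixedSites)); // "32 agents · 1 site"
```

Because the total is only a sum over the same payload that feeds the rows, correcting the count on the server repairs both views at once, which matches the reasoning for fixing it at the source.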