sync: auto-sync from GURU-5070 at 2026-06-06 08:27:44
Author: Mike Swanson Machine: GURU-5070 Timestamp: 2026-06-06 08:27:44
144
session-logs/2026-06-06-mike-gururmm-site-counts-breadcrumb.md
Normal file
@@ -0,0 +1,144 @@
# 2026-06-06 — GuruRMM: site agent_count fix, dev_admin bootstrap, SiteDetail breadcrumb

## User

- **User:** Mike Swanson (mike)
- **Machine:** GURU-5070
- **Role:** admin

## Session Summary

Resumed "beta site" work from the prior ClientDetail-redesign session. The first thread was the carried-over "0 agents vs 16 offline" discrepancy on the Cascades ClientDetail page. Investigation disproved the previous session's "orphaned/siteless agents" hypothesis: GuruRMM has **no `client_id` column on the agents table**; `agent.client_id` is derived everywhere via `LEFT JOIN clients c ON s.client_id = c.id` (i.e. through `site_id`). The real root cause was that the `GET /api/clients/{id}/sites` handler (`list_sites_by_client`) returned a hardcoded `agent_count: 0` ("Would need separate query"), which fed both the ClientDetail strip total AND every per-site row in the Sites table. Live DB confirmed Cascades' single site genuinely holds 32 agents (16 online / 16 offline), so this was not a data-hygiene issue at all.

Fixed the server handler to route through the already-existing `db::list_sites_by_clients(&[client_id])` (which computes `COUNT(*) FROM agents WHERE site_id = s.id` per site, plus `client_name`), mirroring the working sibling `list_sites` handler. Mike also approved fixing a pre-existing authz gap in the same change: the endpoint took `_user` and did not org-scope, so any authenticated user could list any client's sites. Added the standard `is_admin() || can_access_org()` guard used by `get_site`/`get_client`. Code Review APPROVED. Committed `bdac007`, server auto-built/deployed to **v0.3.44** on the shared API (affects beta and prod together; there is no separate beta API server). Verified live via the authenticated debug-Chrome admin session: Cascades now shows "32 agents · 1 site" and the site row shows 32.

While verifying credentials in the UI, the credential "view" returned a bare `403 "Access denied"`. Traced to `reveal_credential` being **dev_admin-only by design** (the only endpoint returning decrypted plaintext). The logged-in account (`admin@azcomputerguru.com`) is role `admin`, and the system had **zero dev_admin accounts**, so reveal was effectively dead for everyone. Per Mike's decision, bootstrapped dev_admin via a direct DB update for `admin@` and `howard@` only (claude-api@ automation and test@ left as admin). Because the role lives in the JWT, the affected accounts must log out/in to mint a fresh token before reveal works.

Mike then flagged that the bare 403 violates the "useful errors" standard. Confirmed that standard was **never implemented** in GuruRMM (SPEC-018, still Proposed) or GuruConnect (SPEC-008, also Proposed): no `error.rs`/`AppError`/correlation_id, ~38 bare-string returns in `credentials.rs` alone, ~30 files product-wide. Per Mike's choice, logged it as a future task (coord todo `16c2c7d6`) rather than building now.

Finally, fixed a SiteDetail breadcrumb bug Mike reported: the "Sites" crumb and the back-arrow button were hardcoded to the global `/sites` list instead of the current client's site list. Pointed both at `/clients/{clientRecord.id}` (the ClientDetail page renders that client's Sites table) when the client is known, with `/sites` as fallback. Committed `081b4cc`, dashboard beta auto-built to **v0.2.52**, verified live.

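
The re-login requirement follows from the role being a claim inside the token rather than something looked up per request. A minimal sketch of that effect, assuming an HS256-signed JWT with a `role` claim; the claim names, the signing key, and the `users` dict standing in for the database are illustrative assumptions, not GuruRMM's actual token format:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"example-secret"  # hypothetical signing key, for illustration only


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def sign(signing_input: str) -> str:
    return b64url(hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest())


def mint_token(email: str, role: str) -> str:
    """Build a minimal HS256 JWT whose payload carries the role claim."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps({"sub": email, "role": role}).encode())
    return f"{header}.{payload}.{sign(f'{header}.{payload}')}"


def role_from_token(token: str) -> str:
    """Verify the signature, then read the role from the token alone."""
    header, payload, signature = token.split(".")
    if not hmac.compare_digest(signature, sign(f"{header}.{payload}")):
        raise ValueError("bad signature")
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))["role"]


# Stand-in for the users table (hypothetical address).
users = {"admin@example.com": "admin"}

old_token = mint_token("admin@example.com", users["admin@example.com"])
users["admin@example.com"] = "dev_admin"  # the direct DB role update
new_token = mint_token("admin@example.com", users["admin@example.com"])

print(role_from_token(old_token))  # admin: the token minted before the update is stale
print(role_from_token(new_token))  # dev_admin: only a fresh login carries the new role
```

Under these assumptions the old token stays valid and keeps asserting `admin` until it is replaced, which is why a DB-only change does not take effect for an already logged-in session.
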
## Key Decisions

- **Diagnosed "0 agents" as a server placeholder bug, not data hygiene.** Reasoned from the schema (`agents` has no `client_id` column; it's derived through `site_id`), which makes an agent with `client_id=Cascades` but no Cascades site logically impossible, then confirmed with a live count query.
- **Fixed at the source (server) rather than a dashboard workaround.** A dashboard-only change would have fixed only the strip; the server fix repairs the strip AND the per-site rows (which Mike also reported), via one already-tested DB query path.
- **Bundled the org-scoping fix into the same commit (Mike approved).** Mirrored the existing `get_site`/`get_client` guard (`is_admin() || can_access_org()`); kept the change otherwise behavior-preserving. Net security tightening, no regression for admins or legit org users.
- **dev_admin bootstrap via direct DB, not code-relaxation.** Kept the intended hard posture (reveal stays dev_admin-only) and elevated trusted human accounts instead of lowering the bar. Direct DB was required because only a dev_admin can assign dev_admin and none existed (bootstrap catch). Excluded the `claude-api@` automation account and `test@` from elevation.
- **Did not build SPEC-018 now.** Mike chose "log as future task"; filed coord todo with full findings.
- **Breadcrumb "Sites" → client page.** No dedicated `/clients/:id/sites` route exists; the ClientDetail page (`/clients/:id`) IS the per-client site list, so it's the correct target. Aligned the back-arrow and its aria-label to match.

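
The org-scoping decision can be illustrated with a small model of the guard. This is a sketch only: the real handler is Rust, and the `User` class, the set of roles treated as admin, and the use of `PermissionError` for a denied request are all assumptions made for the example. Only the `is_admin() || can_access_org()` shape comes from the session.

```python
from dataclasses import dataclass, field

# Assumption: which roles count as admin is not stated in the log.
ADMIN_ROLES = {"admin", "dev_admin"}


@dataclass
class User:
    role: str
    org_ids: set = field(default_factory=set)  # hypothetical org membership

    def is_admin(self) -> bool:
        return self.role in ADMIN_ROLES

    def can_access_org(self, client_id: str) -> bool:
        return client_id in self.org_ids


def list_sites_by_client(user: User, client_id: str, sites: list) -> list:
    """Return the client's sites only if the caller passes the guard."""
    if not (user.is_admin() or user.can_access_org(client_id)):
        # Before the fix there was no such check: any authenticated user got the list.
        raise PermissionError("access denied for this client")
    return [s for s in sites if s["client_id"] == client_id]


sites = [
    {"id": "s1", "client_id": "c1", "agent_count": 32},
    {"id": "s2", "client_id": "c2", "agent_count": 5},
]

admin = User(role="admin")
member = User(role="user", org_ids={"c1"})
outsider = User(role="user", org_ids={"c2"})

print(len(list_sites_by_client(admin, "c1", sites)))   # 1
print(len(list_sites_by_client(member, "c1", sites)))  # 1
try:
    list_sites_by_client(outsider, "c1", sites)
except PermissionError as exc:
    print(exc)  # access denied for this client
```

Admins and members of the owning org see the same result as before, which matches the "no regression for admins or legit org users" note; only the cross-org request changes behavior.
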
## Problems Encountered

- **Submodule working tree lagged `main`.** The gururmm submodule was checked out at the pinned commit `226ba9f` (behind the live `3ff0da5` ClientDetail merge), so `ClientExceptionsBand`/current code wasn't present. Resolved by `git checkout main` + `git pull --ff-only` in the submodule (synced to `be5ddc9`).
- **GuruRMM DB connection not at the obvious path.** `/opt/gururmm/.env` wasn't readable by the first grep; found `DATABASE_URL` via the systemd unit's `EnvironmentFile` (`systemctl cat gururmm-server`).
- **MCP Chrome was logged in as a non-admin test user** (`testuser_antigravity`, role `user`), which showed "Client not found" for Cascades; that is actually a correct demonstration of the new org-scoping. Switched to the project's CDP debug driver (`cdp.py`, port 9222, profile authenticated as admin) per Mike ("use debug mode for chrome"); had to `cdp.py launch` first (no Chrome on 9222 yet).
- **`whoami-block.sh` reported "no identity.json"** during /save, caused by the shell cwd still being inside the submodule. identity.json is present and correct at the repo root; re-ran from `/d/claudetools`.
- **Push race on the breadcrumb commit.** Remote `main` had advanced by a CI version-bump (`c70fe6e`); Gitea Agent rebased cleanly (commit SHA changed `04af3c8` → `081b4cc`) and pushed.

## Configuration Changes

All in the **gururmm submodule** (`projects/msp-tools/guru-rmm`), merged to `main`:
- MOD `server/src/api/sites.rs`, `list_sites_by_client`: switched fetch to `db::list_sites_by_clients(&state.db, &[client_id])`; mapped real `agent_count: s.agent_count` + `client_name: Some(s.client_name)`; added org-access guard (`is_admin() || can_access_org(client_id)`); param `_user` → `user`. Commit `bdac007`.
- MOD `dashboard/src/pages/SiteDetail.tsx`: breadcrumb "Sites" `<Link>` and back-arrow `onClick`/`aria-label` now target `/clients/${clientRecord.id}` (fallback `/sites`). Commit `081b4cc`.
- `db::get_sites_by_client` left intact (still used by `clients.rs:144` for `site_count`).

Database (GuruRMM Postgres):
- `UPDATE users SET role='dev_admin' WHERE email IN ('admin@azcomputerguru.com','howard@azcomputerguru.com')` (2 rows). Role distribution now: dev_admin 2, admin 2 (claude-api@, test@), user 1.

ClaudeTools repo (this commit): this session log only. Submodule pointer intentionally NOT bumped (lagging main is expected). Merged throwaway branches `fix/site-agent-count` and `fix/sitedetail-breadcrumb` can be pruned later.

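
The breadcrumb change reduces to one target-selection rule. The sketch below models it in Python for illustration only; the actual change is in `SiteDetail.tsx`, and the function name, the returned dict shape, and the fallback label text are assumptions:

```python
from typing import Optional


def sites_crumb(client_id: Optional[str], client_name: Optional[str]) -> dict:
    """Pick the 'Sites' crumb / back-arrow target for a SiteDetail page."""
    if client_id:
        # Client known: go to the ClientDetail page, which is the per-client site list.
        return {
            "href": f"/clients/{client_id}",
            "aria_label": f"Back to {client_name or 'client'}",
        }
    # Client not known: fall back to the global list.
    return {"href": "/sites", "aria_label": "Back to Sites"}  # label text is hypothetical


print(sites_crumb("42e1b0e3-f8b7-4fc5-86bd-06bdbb073b7f", "Cascades of Tucson"))
print(sites_crumb(None, None))
```

Keeping the link and the aria-label in one place is one way to make sure the back-arrow and the crumb cannot drift apart again.
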
## Credentials & Secrets
- **GuruRMM Postgres** (newly located this session; NOT yet vaulted, follow-up pending):
`DATABASE_URL=postgres://gururmm:43617ebf7eb242e814ca9988cc4df5ad@localhost:5432/gururmm` on 172.16.3.30.
Source: `/opt/gururmm/.env` (systemd `EnvironmentFile` for `gururmm-server`). Reachable only on-host (localhost); query via `ssh guru@172.16.3.30 "psql '<url>' ..."`. Should be added to SOPS vault under `projects/gururmm/` or `infrastructure/`.
- **Role grants:** `admin@azcomputerguru.com` and `howard@azcomputerguru.com` elevated to `dev_admin` (user id of admin@ = `490e2d0f-067d-4130-98fd-83f06ed0b932`). No passwords changed.
- SSH `guru@172.16.3.30` is key-based from GURU-5070 for reads. Sudo (prod promote) password in vault `infrastructure/gururmm-server.sops.yaml`.
## Infrastructure & Servers

- **Shared API:** `https://rmm-api.azcomputerguru.com` (dashboard `API_BASE_URL` default; `VITE_API_URL` unset on beta). Beta AND prod dashboards both use it; there is no separate beta API. Rust server on 172.16.3.30:3001, systemd `gururmm-server`, `WorkingDirectory=/opt/gururmm`, binary `/opt/gururmm/gururmm-server`. Now **v0.3.44**.
- **Beta dashboard:** `https://rmm-beta.azcomputerguru.com`. Auto-builds from gururmm `main` push (webhook → `build-dashboard.sh`), deploys to `/var/www/gururmm/dashboard-beta`. Now **v0.2.52** (`assets/index-BOagZ71U.js`). Prod = promote-only (`sudo /opt/gururmm/promote-dashboard.sh --confirm`); NOT promoted this session.
- **Server build/deploy:** push to `main` → `build-server.sh` (`cargo build --release`, ~5.5 min) → auto-deploys + restarts service. No staging; affects the shared API immediately.
- Build logs: `/var/log/gururmm-build-server.log`, `/var/log/gururmm-build-dashboard.log`.
- **CDP debug Chrome:** `.claude/scripts/cdp.py` (launch/status/nav/shot/eval), port 9222, profile `C:\Users\guru\.claude\cdp-chrome-profile` (authenticated as admin@ to beta). Left running.
- Gitea (gururmm): internal `http://172.16.3.20:3000/azcomputerguru/gururmm.git`.

## Commands & Outputs

```bash
# live Cascades agent counts (the truth the fix now serves)
ssh guru@172.16.3.30 "psql '<DATABASE_URL>' -c \"SELECT s.name,(SELECT COUNT(*) FROM agents a WHERE a.site_id=s.id) agents,(SELECT COUNT(*) FROM agents a WHERE a.site_id=s.id AND a.status IN ('offline','error')) offline FROM sites s WHERE s.client_id='42e1b0e3-f8b7-4fc5-86bd-06bdbb073b7f';\""
# -> CascadesTucson | GOLD-MOON-4620 | 32 | 16

# locate DB url
ssh guru@172.16.3.30 "systemctl cat gururmm-server | grep EnvironmentFile"   # -> /opt/gururmm/.env

# dev_admin bootstrap (2 rows); SQL statement, shown here wrapped in the same ssh/psql pattern as above
ssh guru@172.16.3.30 "psql '<DATABASE_URL>' -c \"UPDATE users SET role='dev_admin' WHERE email IN ('admin@azcomputerguru.com','howard@azcomputerguru.com') AND role='admin' RETURNING email, role;\""

# verify breadcrumb fix live (debug Chrome)
py .claude/scripts/cdp.py nav "https://rmm-beta.azcomputerguru.com/sites/c157c399-82d3-4581-979a-b9fad70f4fef"
py .claude/scripts/cdp.py eval "...breadcrumb hrefs..."   # -> "Sites" href=/clients/42e1b0e3...; back aria="Back to Cascades of Tucson"
```

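
The count query above can be reproduced offline to see why `client_id` never needs to exist on `agents`. The following is a minimal sketch using an in-memory SQLite database as a stand-in for the GuruRMM Postgres; the table layouts are simplified assumptions (only the columns the query touches), and the IDs `c1`/`s1` are placeholders:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE clients (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE sites   (id TEXT PRIMARY KEY, client_id TEXT REFERENCES clients(id), name TEXT);
    -- Note: no client_id column here; an agent reaches its client only through site_id.
    CREATE TABLE agents  (id INTEGER PRIMARY KEY, site_id TEXT REFERENCES sites(id), status TEXT);
    """
)
conn.execute("INSERT INTO clients VALUES ('c1', 'Cascades of Tucson')")
conn.execute("INSERT INTO sites VALUES ('s1', 'c1', 'CascadesTucson')")
conn.executemany(
    "INSERT INTO agents (site_id, status) VALUES ('s1', ?)",
    [("online",)] * 16 + [("offline",)] * 16,
)

# Per-site counts plus the client name derived through the site row.
rows = conn.execute(
    """
    SELECT s.name,
           c.name AS client_name,
           (SELECT COUNT(*) FROM agents a WHERE a.site_id = s.id) AS agent_count,
           (SELECT COUNT(*) FROM agents a
             WHERE a.site_id = s.id AND a.status IN ('offline', 'error')) AS offline
    FROM sites s
    LEFT JOIN clients c ON s.client_id = c.id
    WHERE s.client_id = ?
    """,
    ("c1",),
).fetchall()

print(rows)  # [('CascadesTucson', 'Cascades of Tucson', 32, 16)]
```

With this shape, a hardcoded `agent_count: 0` in the handler is the only way the UI could show zero agents for a site that holds 32.
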
## Pending / Incomplete Tasks

1. **Mike: re-login** (log out/in) so `admin@` picks up the `dev_admin` JWT; credential reveal works after that. Same for Howard. Offer to drive debug Chrome to confirm a reveal returns 200 afterward.
2. **Promote dashboard beta → prod** when happy with the breadcrumb fix (beta-only until then): `sudo /opt/gururmm/promote-dashboard.sh --confirm` on 172.16.3.30. The server agent_count + org-scoping fix is ALREADY on prod (shared API).
3. **Vault the GuruRMM Postgres DATABASE_URL** (currently only in `/opt/gururmm/.env`).
4. **SPEC-018 valuable error messages:** coord todo `16c2c7d6` (gururmm, assigned mike). Build AppError envelope + correlation-id middleware; convert handlers (credentials 403 → `AUTH_INSUFFICIENT_ROLE` + message + correlation_id as first adopter). Keep aligned with GuruConnect SPEC-008. Best as a phased/multi-agent effort.
5. Prune merged branches `fix/site-agent-count`, `fix/sitedetail-breadcrumb`; bump claudetools submodule pointer for gururmm whenever desired.

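
For task 4, a rough picture of what the credentials 403 could look like once converted. This is a sketch under stated assumptions: SPEC-018 is not quoted in this log, so the envelope shape, the field names other than `correlation_id`, and the message wording are hypothetical; only the `AUTH_INSUFFICIENT_ROLE` code and the "code + message + correlation_id" triple come from the task description.

```python
import json
import uuid


def error_envelope(status: int, code: str, message: str) -> dict:
    """Build a structured error body with a fresh correlation id (hypothetical shape)."""
    return {
        "status": status,
        "error": {
            "code": code,
            "message": message,
            # Would also be written to the server log so support can match the two.
            "correlation_id": str(uuid.uuid4()),
        },
    }


body = error_envelope(
    403,
    "AUTH_INSUFFICIENT_ROLE",
    "Revealing a credential requires the dev_admin role; your account is admin.",
)
print(json.dumps(body, indent=2))
```

Compared with a bare `"Access denied"` string, a body like this tells the caller which role is missing and gives whoever investigates a single id to search for in the logs.
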
## Reference Information

- Cascades client: `42e1b0e3-f8b7-4fc5-86bd-06bdbb073b7f` · Cascades site: `c157c399-82d3-4581-979a-b9fad70f4fef` (CascadesTucson, GOLD-MOON-4620)
- gururmm commits: `bdac007` (server agent_count + org-scope), `081b4cc` (dashboard breadcrumb)
- Versions: server **v0.3.44**, dashboard beta **v0.2.52**
- Coord: lock `66b276b6` (ClientDetail) + `6e9533a9` (SiteDetail) released; components `gururmm/server`=deployed v0.3.44, `gururmm/dashboard`=deployed v0.2.52; todo `16c2c7d6` (SPEC-018)
- Specs: `projects/msp-tools/guru-rmm/docs/specs/SPEC-018-valuable-error-messages.md` (Proposed); GuruConnect `SPEC-008` (Proposed)
- Routes: `/clients/:id` (ClientDetail = per-client site list), `/sites` (global), `/sites/:id` (SiteDetail)
- Screenshots: `.claude/tmp/cdp/cascades-verify-top.png`, `.claude/tmp/cdp/sitedetail-breadcrumb-fixed.png`