sync: auto-sync from GURU-5070 at 2026-05-27 08:37:07
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-05-27 08:37:07
Submodule projects/msp-tools/guru-rmm updated: 879d42bdbe...de39e42562
@@ -78,3 +78,75 @@ That drift prompted the session's main work: making `FEATURE_ROADMAP.md` a livin
- Coord: Howard "Phase 2 migration done on HOWARD-HOME"; my replies 8618a252 (identity Phase 2), 5ab63a21 (migrate-identity heads-up to Howard). Deleted misrouted BUG-001 note (was 92468218).
- GuruScan (Howard's): projects/msp-tools/guru-scan/ — now GuruScan.psm1/.psd1 + README + scanners.json + GURUSCAN_RESULT_JSON. Hands-off until he asks (feedback_rmm_dev_is_mike.md).
- Report: projects/msp-tools/guru-rmm/reports/2026-05-27-rmm-audit-roadmap.md.
---
## Update: 08:40 PT — Vault-connectivity diagnosis, memory audit, RMM full audit + Phase 1 authz remediation (deployed)
### Session Summary
Diagnosed the reported external flap on `git.azcomputerguru.com`. SSHed into IX first (the ACG website host, which turned out to be unrelated), then traced the real path: the domain is served by **NPM (openresty) on Jupiter `172.16.3.20`** via the office Cox IP `72.194.62.10`, **not Cloudflare**. The flap was a transient NPM SSL-cert renewal (NPM log entry at `14:14:36 UTC`). Corrected the machine-local auto-memory `reference_gitea_internal.md`, which wrongly claimed that `git.azcomputerguru.com` sat behind Cloudflare and blocked curl.

Audited the shared in-repo memory (`.claude/memory/`): indexed 8 orphaned files into `MEMORY.md`, added frontmatter to 5 files, trimmed oversized index lines, de-duplicated entries, and fixed a broken backlink in the index (`../.claude/POWER_FAILURE_RUNBOOK` → `../POWER_FAILURE_RUNBOOK`).

Ran a full `/rmm-audit` (all six passes on Opus 4.7: agents A–D and F in parallel, then E, the build-pipeline pass, sequentially). **62 findings: 3 CRITICAL, 9 HIGH, 12 MEDIUM**, plus lows/info. Report: `projects/msp-tools/guru-rmm/reports/2026-05-27-rmm-audit.md`. The 3 CRITICALs share one authorization flaw: handlers that take `_auth: AuthUser` only authenticate the caller and perform **no** org-scope authorization, which leaves a BOLA/IDOR hole on credentials, command dispatch, and script execution.

On Mike's "fix all → start Phase 1, TODO the rest" direction, implemented **Phase 1 (the 3 CRITICALs)** on branch `remediation/2026-05-27`, plus the `create_credential` gate that Code Review flagged. While building, I discovered that **main did not compile**: Howard's `3b19ff0` changed `db::logs::get_fleet_logs` to a 5-arg signature but left 4 stale callers in `logs.rs` (E0061 ×4). That compile break is exactly why Howard's server deploy was "stuck" (binary frozen at the May 25 build). Folded the caller fix into the same branch (`4961923`) so the deploy ships the build fix and the authz fixes together. Code Review returned **APPROVE-WITH-NITS** (it caught the ungated `create_credential`, rated HIGH, since fixed). `cargo check` is green at `bdefb1f`. Merged the branch to main (fast-forward), CI bumped the version to `de39e42` (v0.3.30), and I deployed via `sudo /opt/gururmm/build-server.sh`. **Verified live:** release build took 4m45s, systemd restarted at `15:32 UTC`, and `ExecStart=/opt/gururmm/gururmm-server` is running the fresh binary. Phases 2–5 are captured as coord TODOs. Notified Howard of the in-flight fix, the remediation task list, the living-roadmap definition-of-done expectation, and (post-deploy) that his fleet-log fix is now live.
### Key Decisions
- **Option B — merge the whole branch + deploy at once** (vs. cherry-picking just the build fix). Ships the get_fleet_logs fix and all Phase 1 authz together; Mike acknowledged the authz changes are behavior-changing (org-scoped 403s where before any authed user passed).
- **`authorize_agent_access` is fail-closed** — an agent with no site / orphaned client_id returns **403**, stricter than the reference `get_agent` handler which fails open. A credential/command/script path must never default-allow on missing scope.
- **`reveal_credential` gated dev_admin-only BEFORE the DB fetch** — don't even read the secret out of the DB if the caller isn't authorized.
- **New commit `bdefb1f` for the create_credential fix, not an amend** — keeps `4961923` (the build fix) byte-stable and cherry-pickable, after an earlier `--amend` mistake rewrote its SHA.
- **Roadmap-compliance verification of Howard's sessions = no violation** — his only post-rule commit (`3b19ff0`) was a bug fix to an already-`[x]` feature, which requires no roadmap flip. The rule is brand-new, so the action is forward-looking: confirm his sessions pulled the updated DESIGN.md + memory.
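
A minimal sketch of the fail-closed `authorize_agent_access` decision listed above, using std-only stand-ins rather than the real server types. `Directory`, the `u32` IDs, and the `Access` enum are illustrative assumptions, not the actual `authorize_agent_access(state, auth, agent_id)` signature. The only point is that a missing site or client mapping falls through to a deny instead of an allow.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the server's authenticated user.
struct AuthUser {
    is_admin: bool,
    org_ids: Vec<u32>,
}

impl AuthUser {
    fn can_access_org(&self, org_id: u32) -> bool {
        self.org_ids.contains(&org_id)
    }
}

/// Outcome of the check; `Forbidden` is what a handler would map to HTTP 403.
#[derive(Debug, PartialEq)]
enum Access {
    Allowed,
    Forbidden,
}

/// In-memory stand-in for the agent -> site -> client lookups (assumed shape).
struct Directory {
    agent_site: HashMap<u32, u32>,  // agent_id -> site_id
    site_client: HashMap<u32, u32>, // site_id -> client_id (the org)
}

fn authorize_agent_access(dir: &Directory, user: &AuthUser, agent_id: u32) -> Access {
    // Admin bypass comes first, mirroring the order described in the log.
    if user.is_admin {
        return Access::Allowed;
    }
    // Resolve agent -> site -> client. Any missing link yields None.
    let client_id = dir
        .agent_site
        .get(&agent_id)
        .and_then(|site_id| dir.site_client.get(site_id));
    match client_id {
        Some(org) if user.can_access_org(*org) => Access::Allowed,
        // Fail closed: no site, orphaned client_id, or wrong org all deny.
        _ => Access::Forbidden,
    }
}
```

In this sketch an agent with no site and an agent whose site has no client both return `Access::Forbidden` for a non-admin, which is the stricter behavior the decision calls for.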
### Problems Encountered
- **main wouldn't compile (E0061 ×4 in logs.rs)** — pre-existing breakage from Howard's `3b19ff0` get_fleet_logs signature change; none of my authz files were in the errors. Root-caused, fixed callers to the 5-arg form (`&["ERROR"], None, since, 1000`), committed `4961923`.
- **Stale cargo check**: `git fetch origin <branch>` updates only the remote-tracking ref (`origin/<branch>`) and does NOT fast-forward the local branch, so the checks ran against old code. Fixed by checking out `origin/remediation/2026-05-27` in detached-HEAD state.
- **`git commit --amend` mistake** — amended the build commit, folding in the credentials fix and changing the `4961923` SHA I'd told Howard to cherry-pick. Recovered with `git reset --hard origin/remediation/2026-05-27`, re-applied the one-liner as the new commit `bdefb1f`.
- **`internal_err` not in scope (E0425)** in credentials.rs create_credential gate — `internal_err` isn't imported there; switched to the inline `.map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?` pattern the file already uses.
- **Deploy binary-path ambiguity** — post-deploy, `/opt/gururmm/gururmm-server` was fresh (May 27 15:32) but `/usr/local/bin/gururmm-server` was still May 25. Verified `systemctl cat` → `ExecStart=/opt/gururmm/gururmm-server`; the `/usr/local/bin` copy is vestigial and unused. No action needed (candidate cleanup item).
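
For the `internal_err` item above, a rough std-only sketch of the inline `.map_err(...)?` shape. `StatusCode` here is a local stand-in for the real HTTP status type, and `DbError`, `lookup_site_org`, and `resolve_org_for_gate` are hypothetical names invented for the example. It shows only how a lookup failure becomes a `(status, message)` tuple without a shared helper in scope.

```rust
use std::fmt;

/// Local stand-in for an HTTP status type (assumption, not the real crate type).
#[derive(Debug, PartialEq, Clone, Copy)]
struct StatusCode(u16);

impl StatusCode {
    const INTERNAL_SERVER_ERROR: StatusCode = StatusCode(500);
}

/// Hypothetical database error.
#[derive(Debug)]
struct DbError(String);

impl fmt::Display for DbError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "db error: {}", self.0)
    }
}

/// Hypothetical lookup that fails for site 0 and otherwise returns an org id.
fn lookup_site_org(site_id: u32) -> Result<u32, DbError> {
    if site_id == 0 {
        Err(DbError("connection reset".to_string()))
    } else {
        Ok(site_id * 10)
    }
}

/// Converts the lookup error inline, then propagates it with `?`.
fn resolve_org_for_gate(site_id: u32) -> Result<u32, (StatusCode, String)> {
    let org = lookup_site_org(site_id)
        .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;
    Ok(org)
}
```

The inline closure needs nothing imported beyond the status type, which is why it works in a file where `internal_err` is not in scope.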
### Configuration Changes (gururmm repo, branch merged to main)
- MODIFIED `server/src/api/mod.rs` — new `pub async fn authorize_agent_access(state, auth, agent_id)` helper (admin bypass; agent→site→client_id→`can_access_org`; fail-closed 403). Added imports `AuthUser`, `db`, `uuid::Uuid`.
- MODIFIED `server/src/api/credentials.rs` — `authorize_credential_access(state, user, cred)` branching on scope_type (global→`is_dev_admin`; client→`is_admin`|`can_access_org`; site→resolve→`can_access_org`; unknown→403). Gated list_global/list_client/list_site/get_credential_meta/reveal_credential (dev_admin-only, pre-fetch)/update/delete AND create_credential.
- MODIFIED `server/src/api/commands.rs` — `send_command` calls `authorize_agent_access` before dispatch.
- MODIFIED `server/src/api/scripts.rs` — `run_script_on_agent` → `authorize_agent_access(req.agent_id)`; library CRUD → `is_admin()` gate.
- MODIFIED `server/src/api/logs.rs` — fixed 4 stale `get_fleet_logs` callers to 5-arg signature (build fix; was breaking main).
- Commits: `4961923` (build fix), `bdefb1f` (create_credential gate err-map fix). Merged FF to main; CI auto-bump → `de39e42` (v0.3.30).
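
The `scope_type` branching and the pre-fetch `reveal_credential` gate described in the list above can be sketched as follows. This is a simplified std-only model under stated assumptions: `User`, `Credential`, `resolve_site_client`, and the closure-based `fetch_secret` are hypothetical stand-ins, and the real handlers take `state` and run database queries.

```rust
/// What a handler would map to HTTP 403.
#[derive(Debug, PartialEq)]
enum AuthzError {
    Forbidden,
}

/// Hypothetical user model with the three checks named in the log.
struct User {
    is_dev_admin: bool,
    is_admin: bool,
    org_ids: Vec<u32>,
}

impl User {
    fn can_access_org(&self, org_id: u32) -> bool {
        self.org_ids.contains(&org_id)
    }
}

/// Hypothetical credential row: only the fields the check needs.
struct Credential {
    scope_type: String,
    client_id: Option<u32>,
    site_id: Option<u32>,
}

/// Stand-in for the site -> client lookup; only site 10 resolves here.
fn resolve_site_client(site_id: u32) -> Option<u32> {
    match site_id {
        10 => Some(100),
        _ => None,
    }
}

fn authorize_credential_access(user: &User, cred: &Credential) -> Result<(), AuthzError> {
    let allowed = match cred.scope_type.as_str() {
        "global" => user.is_dev_admin,
        "client" => {
            user.is_admin || cred.client_id.map_or(false, |c| user.can_access_org(c))
        }
        "site" => cred
            .site_id
            .and_then(resolve_site_client)
            .map_or(false, |c| user.can_access_org(c)),
        // Unknown scope types are denied rather than guessed at.
        _ => false,
    };
    if allowed { Ok(()) } else { Err(AuthzError::Forbidden) }
}

/// The gate runs before `fetch_secret`, so a denied caller never triggers the read.
fn reveal_credential<F>(user: &User, fetch_secret: F) -> Result<String, AuthzError>
where
    F: FnOnce() -> String,
{
    if !user.is_dev_admin {
        return Err(AuthzError::Forbidden);
    }
    Ok(fetch_secret())
}
```

Passing the fetch as a closure is only a device for the sketch: it makes it easy to confirm that the secret is never read when the caller is not a dev admin.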
### Configuration Changes (claudetools repo)
- MODIFIED `.claude/memory/MEMORY.md` — indexed 8 orphans, fixed POWER_FAILURE_RUNBOOK backlink, trimmed oversized lines, dedup.
- MODIFIED 5 memory files — added frontmatter.
- MODIFIED (machine-local auto-memory) `reference_gitea_internal.md` — corrected the Cloudflare claim (git.azcomputerguru.com = office Cox 72.194.62.10 → NPM/openresty on Jupiter 172.16.3.20).
### Infrastructure & Servers
- **git.azcomputerguru.com path:** office Cox IP `72.194.62.10` → **NPM (openresty) on Jupiter `172.16.3.20`** → Gitea `172.16.3.20:3000`. NOT Cloudflare. External flaps = NPM SSL renewal events.
- **GuruRMM server:** `172.16.3.30:3001`, systemd `gururmm-server`, `ExecStart=/opt/gururmm/gururmm-server` (NOT `/usr/local/bin/`). Now **v0.3.30 / de39e42**, restarted `2026-05-27 15:32:28 UTC`, MainPID 598071. Deploy is manual: `sudo /opt/gururmm/build-server.sh` (git reset --hard origin/main → cargo build --release → stop/cp/start). No Phase 1 migrations, so `.sqlx` cache untouched.
### Commands & Outputs
- Deploy verify: `systemctl cat gururmm-server | grep ExecStart` → `/opt/gururmm/gururmm-server`; `ActiveEnterTimestamp=Wed 2026-05-27 15:32:28 UTC` (== fresh binary mtime); `SubState=running`.
- cargo check (warm, origin/remediation/2026-05-27 @ bdefb1f): `CARGO_EXIT=0`, Finished in 25.53s, 0 errors.
- get_fleet_logs caller fix shape: `get_fleet_logs(&state.db, &["ERROR"], None, since, 1000)` (was the 4-arg form, passing `"ERROR", since, 1000` after the db handle).
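
To illustrate why the old callers stopped compiling, here is a toy in-memory function with the same 5-argument shape. The log does not say what the third argument (`None` in the fixed call) means, so this sketch treats it as an optional agent filter purely for illustration. `LogEntry`, the slice-backed `db`, and the `u64` timestamps are also assumptions, not the real `db::logs::get_fleet_logs`.

```rust
/// Hypothetical log row for the sketch.
struct LogEntry {
    level: String,
    agent: String,
    ts: u64,
}

/// Toy 5-arg version: a slice of levels replaces the single level string,
/// and an optional filter (assumed meaning) is added before `since`.
fn get_fleet_logs<'a>(
    db: &'a [LogEntry],
    levels: &[&str],
    agent_filter: Option<&str>,
    since: u64,
    limit: usize,
) -> Vec<&'a LogEntry> {
    db.iter()
        // Keep rows whose level matches any requested level.
        .filter(|e| levels.iter().any(|l| *l == e.level.as_str()))
        // `None` means "no agent restriction" in this sketch.
        .filter(|e| agent_filter.map_or(true, |a| e.agent == a))
        .filter(|e| e.ts >= since)
        .take(limit)
        .collect()
}
```

A caller still passing a bare `"ERROR"` string and one fewer argument no longer matches this arity, which is the kind of mismatch rustc reports as E0061.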
### Pending / Incomplete Tasks (remediation Phases 2–5, coord TODOs)
- **Phase 2** (`9a1ed577`, HIGH authz/IDOR): org-scope checks.rs / inventory / user_inventory / commands reads / registry; auth on `/agents/status-stream` SSE.
- **Phase 3** (`54239760`, HIGH): `sqlx::query!`/`query_as!` → runtime (mspbackups, updates); build-linux.sh stray `n#` + duplicate beta block.
- **Phase 4** (`58c3fcad`, HIGH/MED): `internal_err` sweep (~127 sites); log redaction; MSPBackups mappings UI; React error boundary; AgentDetail client enrichment row.
- **Phase 5** (`fd677411`, MED/LOW): discovery IP validation, registry wire fields, defer_hours, ws api-key char-boundary, TS `any`, aria-labels, localhost fallback, /metrics+stats wiring.
- **Cleanup candidate:** remove the stale `/usr/local/bin/gururmm-server` (unused by systemd).
- (Carried) Lonestar Apple MDM enrollment; Glabman wifi quote (todo `1bf0cfef`, due 2026-05-27); quantumwms John Velez consent; 2× Business Premium before 2026-06-03; Western Tire #32199; Kittle HIGH; VWP discovery/deployment testbed.
### Reference Information
- gururmm: `4961923` (build fix), `bdefb1f` (create_credential gate), merged to main → `de39e42` (v0.3.30, deployed).
- Reports: `reports/2026-05-27-rmm-audit.md` (62 findings), `reports/2026-05-27-rmm-audit-roadmap.md`.
- Coord TODOs (gururmm, assigned mike): `9a1ed577` `54239760` `58c3fcad` `fd677411`.
- Coord messages to Howard: `114e6209` (fix in flight), `b14e1793` (task list + roadmap guidance + build-check nit), `44ac8984` (server deployed / log fix live). Component `gururmm/server` → `deployed` v0.3.30.