sync: auto-sync from GURU-KALI at 2026-05-25 09:36:22

Author: Mike Swanson
Machine: GURU-KALI
Timestamp: 2026-05-25 09:36:22
2026-05-25 09:36:23 -07:00
parent 3de9c16743
commit ba8ce9a06e

@@ -801,3 +801,92 @@ POST http://172.16.3.30:3001/api/logs/analyze {} (fleet scope)
- Webhook handler: `/opt/gururmm/webhook-handler.py` (port 9000, builds agents only, NOT server)
- gururmm Gitea: `http://172.16.3.20:3000/azcomputerguru/gururmm`
- Beast Ollama: `http://100.101.122.4:11434` (direct), `http://172.16.0.1:11434` (via socat relay from LAN)
---
## Update: 09:34 MST — GuruRMM full audit + submodule infrastructure fixes (Mike Swanson / GURU-KALI)
### Session Summary
Ran `/rmm-audit` against GuruRMM. Because GURU-KALI was freshly recovered (see the MacBook nvidia black-screen recovery earlier today), the `projects/msp-tools/guru-rmm` submodule was uninitialized and empty, so the audit was run against a fresh clone of the active `azcomputerguru/gururmm` repo at commit `7374e8a` placed in `/tmp/gururmm-audit`. Five passes ran: four codebase passes (API coverage, Rust quality+auth, TypeScript, data integrity) as parallel subagents — security/auth/migration passes on opus, the rest on sonnet — plus a sequential build-pipeline pass that SSHed read-only into the build server (172.16.3.30). Aggregated to 61 findings: 2 critical, 10 high, 16 medium, 7 low, 26 info.
The two CRITICALs share one root cause: the server has **no router-level/middleware auth**. Every route is protected only by whether its handler includes the `AuthUser` extractor, so a handler that omits it is silently public. Two whole modules omit it: `metrics.rs` (per-agent and fleet metrics readable anonymously) and `logs.rs` (fleet-wide raw logs, plus `POST /logs/analyze`, which fires an outbound Ollama call, and `POST /agents/:id/logs/request`, which commands an agent to upload logs; all anonymous). HIGH highlights: an unauthenticated fleet-wide agent-status SSE stream, an Entra SSO callback that never validates the ID-token signature, mac builds stuck 7 commits behind HEAD since the 2026-05-24 Pluto outage, and two dead frontend links (`Agent.client_id` / `Agent.update_channel` declared in TS but never returned by the agent endpoints). The agent↔server wire protocol (21 AgentMessage + 18 ServerMessage variants, all handled), policy system (5 sections all merge/default/route), migrations (001–045, no gaps), and build pipeline integrity came back clean.
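A coarse static check for the pattern behind the two CRITICALs can be sketched in shell. This is an illustration, not part of the audit tooling: `modules_without_authuser` is a hypothetical helper, and the handler directory passed to it is an assumption about the repo layout. It only flags files that never mention `AuthUser` at all, so a module with one unprotected handler among protected ones would not show up.

```bash
# Sketch only: list Rust source files in a directory that never mention
# the AuthUser extractor. The directory argument is an assumption; point
# it at wherever the route handler modules actually live.
modules_without_authuser() {
  # grep -L prints the names of files with NO matching line.
  grep -L 'AuthUser' "$1"/*.rs
}

# Hypothetical usage (path is illustrative, not confirmed by the audit):
#   modules_without_authuser server/src/api
```

A hit is a prompt to read the module, not proof of exposure: some routes (health checks, agent enrollment) may be public on purpose.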
The report was written to the gururmm repo's `reports/` and committed to a non-main branch, `audit/2026-05-25-rmm-audit` (commit `da1d4ee`). Verified via the webhook handler: a push to `main` triggers a full build (no path filtering) while a branch push triggers nothing, so the branch keeps the report off the build path. `docs/UI_GAPS.md` was updated in the same commit: Watchdog Alerts marked CLOSED, MSPBackups + Organizations downgraded to in-progress, and four new orphaned-route gaps (#12–15) added.
Mike then flagged that this Linux instance was mishandling the RMM submodule. Investigation found the real issues: (1) the submodule was never initialized on GURU-KALI and `sync.sh` Phase 1a used `git submodule foreach` (which only visits initialized submodules), so it silently skipped population yet reported success — the `/tmp` clone workaround was a symptom of this; (2) an orphaned `projects/solverbot` gitlink (mode 160000, committed at `8b6f0bc` with no `.gitmodules` entry) made bare `git submodule` commands throw `fatal: no submodule mapping`. The `.gitmodules` URL for guru-rmm points to the **active** `azcomputerguru/gururmm` repo — the "stale reference copy" wording in CLAUDE.md was misleading.
Fixes applied: initialized + populated the guru-rmm submodule at its proper path (pinned `7374e8a` at the time); rewrote `sync.sh` Phase 1a to explicitly init+populate each `.gitmodules`-declared submodule with credentials inherited from the parent origin URL (so non-interactive init authenticates), then advance to remote tip, with honest reporting; removed the solverbot orphan gitlink (per Mike's choice); normalized `git config user.name` from `Mike-Swanson` to `Mike Swanson`; and corrected the CLAUDE.md submodule wording. A later sync pulled a teammate commit (`6945b42`) bumping the guru-rmm pin to `0a4db53`, which `git submodule update` checked out cleanly — confirming the new flow works.
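The difference between `git submodule foreach` and iterating `.gitmodules` directly can be sketched as follows. This is not the actual `sync.sh` Phase 1a: both function names are hypothetical, and the planner only prints the commands it would run instead of executing them.

```bash
# Sketch only, not the real sync.sh Phase 1a.

# List every submodule path declared in .gitmodules, whether or not the
# submodule has been initialized (unlike `git submodule foreach`, which
# visits initialized submodules only).
declared_submodule_paths() {
  git config -f "${1:-.gitmodules}" --get-regexp '^submodule\..*\.path$' |
    cut -d' ' -f2-
}

# Read submodule paths on stdin and print the init + update commands
# for each one (a dry-run plan; nothing is executed here).
plan_submodule_init() {
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    printf 'git submodule init -- %s\n' "$path"
    printf 'git submodule update -- %s\n' "$path"
  done
}

# Hypothetical usage from the superproject root:
#   declared_submodule_paths | plan_submodule_init
```

Driving the loop from `.gitmodules` means an empty, never-initialized submodule still appears in the plan, which is the case `foreach` silently skipped.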
### Key Decisions
- **Audited a fresh clone, not the empty submodule:** the submodule was uninitialized; rather than block, cloned the active repo to `/tmp`. The *correct* long-term fix (done afterward) was to initialize the submodule properly — the `/tmp` clone was a stopgap, now removed.
- **Report committed to a branch, not main:** confirmed the webhook has no path filtering, so a docs-only push to main would trigger a full agent build. Branch push avoids it; Mike merges to main on his schedule.
- **Reclassified two agent severities during aggregation:** Agent A's "script-runs/:id has no client function" CRITICAL → MEDIUM (no security/data-loss/crash; workaround exists); Agent E's tray-EXE LOW → INFO (count within threshold). Applied the rubric consistently as aggregator.
- **Removed solverbot rather than registering it:** Mike's call. solverbot has its own Gitea repo (`azcomputerguru/solverbot` @ `0ec690f`) but doesn't belong as a claudetools submodule; dropping the gitlink clears the `fatal`. Its own repo is untouched.
- **Credential inheritance in sync.sh, not in `.gitmodules`:** submodule clone URLs get the parent origin's embedded creds written to local `.git/config` only; `.gitmodules` stays credential-free so nothing secret is committed.
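The credential-inheritance decision can be illustrated with a small URL rewrite. This is a sketch under assumptions, not the code in `sync.sh`: `inherit_origin_creds` is a hypothetical name, the example URLs are placeholders, and the same-host guard is an added precaution that the session notes do not mention.

```bash
# Sketch only: build a submodule clone URL that reuses the parent
# origin's scheme + userinfo + host while keeping the submodule's own path.
inherit_origin_creds() {
  local parent_url="$1" sub_url="$2"
  local scheme="${parent_url%%://*}"
  local parent_rest="${parent_url#*://}"
  local parent_authority="${parent_rest%%/*}"   # userinfo@host
  local parent_host="${parent_authority##*@}"
  local sub_rest="${sub_url#*://}"
  local sub_authority="${sub_rest%%/*}"
  local sub_host="${sub_authority##*@}"
  local sub_path="${sub_rest#*/}"
  # Assumption: only inherit credentials when both URLs share a host,
  # so a token is never attached to an unrelated server.
  if [ "$sub_host" = "$parent_host" ]; then
    printf '%s://%s/%s\n' "$scheme" "$parent_authority" "$sub_path"
  else
    printf '%s\n' "$sub_url"
  fi
}

# Hypothetical usage; the result goes to local .git/config only:
#   url=$(inherit_origin_creds "$(git config remote.origin.url)" \
#         "$(git config -f .gitmodules submodule.projects/msp-tools/guru-rmm.url)")
#   git config submodule."projects/msp-tools/guru-rmm".url "$url"
```

Because the rewritten URL is written with `git config` and never back to `.gitmodules`, the tracked file stays credential-free.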
### Problems Encountered
- **Submodule empty / `git submodule status` fatal:** root-caused to uninitialized submodule + orphaned solverbot gitlink. Resolved by `git submodule init`/`update` (path-scoped) and `git rm --cached projects/solverbot`.
- **sync.sh false success on submodules:** `git submodule foreach` no-ops on uninitialized submodules. Rewrote Phase 1a to iterate `.gitmodules` entries and init+populate explicitly.
- **Submodule pointer showed as modified after CLAUDE.md push:** the rebase pulled a teammate commit (`6945b42`) that advanced the guru-rmm pin; local submodule was still on the old commit. Resolved with `git submodule update` (checks out the recorded pin `0a4db53`) — not a real local change.
- **git user.name drift:** machine had `Mike-Swanson`; normalized to `Mike Swanson` per identity.json/protocol.
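An orphaned gitlink like `projects/solverbot` can be detected by comparing the index against `.gitmodules`. The sketch below is illustrative: both helper names are hypothetical, and the `awk` field split assumes submodule paths contain no whitespace.

```bash
# Sketch only: find gitlinks that have no .gitmodules entry.

# Paths recorded in the index with mode 160000 (gitlinks).
# Assumes paths without whitespace.
index_gitlinks() {
  git ls-files --stage | awk '$1 == "160000" { print $4 }'
}

# Print each path in file $1 (gitlinks) that does not appear as a whole
# line in file $2 (paths declared in .gitmodules).
orphaned_gitlinks() {
  grep -Fxv -f "$2" "$1"
}

# Hypothetical usage from the superproject root:
#   index_gitlinks > /tmp/gitlinks
#   git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
#     cut -d' ' -f2- > /tmp/declared
#   orphaned_gitlinks /tmp/gitlinks /tmp/declared
```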
### Configuration Changes
- `.claude/scripts/sync.sh` — Phase 1a rewritten (init+populate submodules w/ credential inheritance; honest reporting). Commit `413df93`.
- `projects/solverbot` — orphaned gitlink removed from index + empty dir deleted. Commit `413df93`.
- `.claude/CLAUDE.md` — corrected guru-rmm submodule wording (lines ~143, ~270). Commit `f2ece8e`.
- `.claude/current-mode` — set to `dev` (local, gitignored).
- guru-rmm submodule: initialized locally; `submodule.projects/msp-tools/guru-rmm.url` in `.git/config` set to the credentialed gururmm URL (local only).
- In the gururmm repo (branch `audit/2026-05-25-rmm-audit`, commit `da1d4ee`): `reports/2026-05-25-rmm-audit.md` (new), `docs/UI_GAPS.md` (modified).
- git `user.name`: `Mike-Swanson` → `Mike Swanson`.
### Credentials & Secrets
- No new credentials created. Submodule clones reuse the shared Gitea account credentials already embedded in the claudetools `remote.origin.url` (account `azcomputerguru`); sync.sh copies that scheme+userinfo+host into each submodule's local `.git/config` URL at init time. Nothing secret is written to tracked files (`.gitmodules` stays credential-free).
- GuruRMM API admin creds used by the build-pipeline pass: vault `infrastructure/gururmm-server.sops.yaml` (admin-email `claude-api@azcomputerguru.com`).
### Infrastructure & Servers
- GuruRMM server / build server: `172.16.3.30` — API `:3001`, webhook handler `:9000` (`/opt/gururmm/webhook-handler.py`, multi-platform split handler, `PLATFORMS`×3). Builds only on push to `refs/heads/main`; no path filtering; skip token `[ci-version-bump]`. Live repo `/home/guru/gururmm`.
- Build artifacts: flat in `/var/www/gururmm/downloads/` with `-latest` symlinks (NOT the `windows/amd64` subdirs the rmm-audit skill assumes — skill Pass 6 paths should be updated). Current artifacts v0.6.39 built 2026-05-25.
- Per-platform last-built-commit: Linux/Windows at HEAD `7374e8a`; mac stuck at `1ed5596` (7 behind) since the 2026-05-24 Pluto outage.
- Pluto (Windows MSI builder): SSH from build-windows.sh pins `StrictHostKeyChecking=yes` against `/opt/gururmm/pluto_known_hosts` (3 entries).
- gururmm Gitea repos: `azcomputerguru/gururmm` (active, main was `7374e8a`→`f5df7a53`→`0a4db53` during/after the session) and `azcomputerguru/guru-rmm` (abandoned hyphenated duplicate). `azcomputerguru/solverbot` @ `0ec690f` exists but is not a claudetools submodule.
### Commands & Outputs
```bash
# Properly initialize the previously-empty submodule (the correct fix):
git submodule init -- projects/msp-tools/guru-rmm
git config submodule."projects/msp-tools/guru-rmm".url \
  "https://azcomputerguru:<TOKEN>@git.azcomputerguru.com/azcomputerguru/gururmm.git"
git submodule update -- projects/msp-tools/guru-rmm
# -> checked out 7374e8a...
# Remove the orphaned solverbot gitlink:
git rm --cached projects/solverbot && rmdir projects/solverbot
# git submodule status -> now exits 0, no fatal
# After a pull bumped the pin, sync the submodule working tree to the recorded commit:
git submodule update -- projects/msp-tools/guru-rmm
# -> checked out 0a4db53... ; git status clean
```
- Webhook finding: a docs/reports-only push to `main` DOES trigger a full build (no path inspection in `webhook-handler.py`); a non-main branch push triggers nothing (`return 200 Ignored push to {ref}`).
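The webhook behavior described above reduces to a small decision rule, sketched here in shell for illustration. This is not the code in `webhook-handler.py` (which is Python): `webhook_decision` is a hypothetical name, and matching the `[ci-version-bump]` skip token against the commit message is an assumption about where the handler looks for it.

```bash
# Sketch only: model of the push-handling rule, not the real handler.
# $1 = git ref from the push payload
# $2 = head commit message (assumed location of the skip token)
webhook_decision() {
  local ref="$1" message="${2-}"
  # Only pushes to main are considered; there is no path filtering.
  if [ "$ref" != "refs/heads/main" ]; then
    printf 'Ignored push to %s\n' "$ref"
    return 0
  fi
  case "$message" in
    *'[ci-version-bump]'*) printf 'skip\n' ;;
    *) printf 'build\n' ;;
  esac
}

# webhook_decision refs/heads/main 'docs: add audit report'   # build
# webhook_decision refs/heads/audit/2026-05-25-rmm-audit 'x'  # Ignored push to ...
```

Note that the changed file list never enters the decision, which is why a reports-only commit on `main` still builds.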
### Pending / Incomplete Tasks
- **GuruRMM CRITICAL auth fixes (not started):** add `AuthUser` to all `metrics.rs` (`:29`, `:57`) and `logs.rs` (`:88`, `:101`, `:112`, `:124`, `:133`, `:178`) handlers and scope them to accessible orgs; then add a router-level auth layer so "public" must be opt-in (kills the whole class). Offered to start; awaiting Mike's go.
- HIGH follow-ups: validate Entra ID-token signature (`sso.rs:212`); auth+scope the agent-status SSE (`agents.rs:583`); bring the mac builder back online (gate stuck at `1ed5596`); add `client_id`/`update_channel` to the agent response structs (dead frontend links).
- Audit report lives only on branch `audit/2026-05-25-rmm-audit` — merge to main when bundling code fixes (will trigger a build).
- Optional: update the rmm-audit skill's Pass 6 artifact paths (flat `downloads/`, not `windows/amd64`).
### Reference Information
- Audited gururmm commit: `7374e8a`. Audit report: `reports/2026-05-25-rmm-audit.md` on branch `audit/2026-05-25-rmm-audit`, commit `da1d4ee` (gururmm remote). PR URL: `https://git.azcomputerguru.com/azcomputerguru/gururmm/pulls/new/audit/2026-05-25-rmm-audit`
- claudetools commits this session: `413df93` (sync.sh submodule fix + solverbot removal), `f2ece8e` (CLAUDE.md wording).
- Findings tally: API Coverage 14 (0C/5H/4M/1L), Rust+Auth 10 (2C/2H/1M), TypeScript 17 (0C/2H/7M/6L), Data Integrity 10 (0C/0H/4M), Build Pipeline 10 (0C/1H). Total 61 (2C/10H/16M/7L/26I).
- Prior GuruRMM audits: `reports/2026-05-23-rmm-audit.md`, `reports/2026-05-19-rmm-audit.md`.
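
The findings tally above can be cross-checked with a few lines of shell arithmetic. `sum_args` is a hypothetical helper; the numbers are copied from the tally, and the info count is derived as the remainder because the per-pass breakdown does not list it.

```bash
# Sketch only: add up the integers passed as arguments.
sum_args() {
  local total=0 n
  for n in "$@"; do
    total=$((total + n))
  done
  printf '%s\n' "$total"
}

# Per-pass order: API Coverage, Rust+Auth, TypeScript, Data Integrity, Build Pipeline
sum_args 14 10 17 10 10   # all findings: 61
sum_args 5 2 2 0 1        # high: 10
sum_args 4 1 7 4 0        # medium: 16
sum_args 1 0 6 0 0        # low: 7
# Info is what remains after critical + high + medium + low:
echo $((61 - $(sum_args 2 10 16 7)))   # 26
```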