sync: auto-sync from GURU-5070 at 2026-05-30 17:59:38
Author: Mike Swanson Machine: GURU-5070 Timestamp: 2026-05-30 17:59:38
@@ -411,3 +411,82 @@ Client-IP investigation: the relay logged repeated agent rejections "from 172.16
- guru-connect submodule HEAD: abc55ab. Server component: deployed v0.2.1.
- Deploy memory: .claude/memory/project_guruconnect_deploy.md.
- Verified reject log post-fix: "Agent connection rejected: 795cbc06-... from 98.172.64.243 - invalid API key".

---

## Update: 00:56 PT — GuruConnect/GuruRMM feature specs, RMM CI docs-guard, GC v2 sprint planning
### Session Summary

Started with a clean `/sync` (both repos already in sync). Then handled an infra request: a Pavon machine was hammering the GuruConnect relay with auth failures. Used `/rmm` to identify the offending endpoint: only `DESKTOP-I66IM5Q` (Pavon/Raiders, external IP 98.172.64.243) carried the GuruConnect client; the Curves box did not. Removed it cleanly (killed the running `guruconnect-pavon-raidersreef` process, deleted the GeoVision HKCU Run-key entry, the desktop launcher, and the `C:\Program Files\GuruConnect` copy). Mike pushed back twice that I should match the offending IP to the agent rather than run recon on every candidate; the RMM agent record carries no IP fields at all, a gap that became GuruRMM todo `7459428e` (capture local + external agent IPs). Saved a feedback memory.

Answered a Claude Code question (Windows Snipping-Tool clipboard images no longer paste with Alt+V; this is a confirmed DIB-vs-CF_HDROP regression, and copied image files still work). Drafted a `/feedback` writeup and a GitHub issue; Mike submitted the feedback.

Filed a large batch of feature requests as researched specs. GuruConnect (via `/gc-feature-request`): SPEC-003 machine inventory, SPEC-004 stable machine identity + session lifecycle reaping + operator removal, SPEC-005 machines list view (dual Host/Guest indicators + rich rows), SPEC-006 universal machine search, SPEC-007 managed-agent installer builder, SPEC-008 valuable error messages, SPEC-009 feature-rich documented API. GuruRMM (via `/feature-request`): SPEC-018 valuable error messages, SPEC-019 feature-rich documented API, SPEC-020 migrate CI/CD from webhook+shell to Gitea Actions. Pushed everything to Gitea.

Implemented the SPEC-020 Phase-0 interim fix live: added a docs-only build guard to the GuruRMM build webhook handler (`/opt/gururmm/webhook-handler.py` on 172.16.3.30) so that pushes touching only `docs/`, `*.md`, `.claude/`, or `session-logs/` skip the build. Patched on-host (a local `/tmp` path-mapping bug made the edit round-trip unreliable), backed up the original, unit-tested 12 cases and syntax-checked before deploy, restarted the service, and verified live (a real docs push and a test POST both returned "build skipped", no build locks). Recorded a project memory.

Closed with GC sprint planning. Mike chose "v2 reset first." While scoping Sprint 0 (the hotfix for the 3 relay-auth CRITICALs), discovered from the git log and the running server that v2 Phase 1 (secure-session-core Tasks 1-7) is ALREADY implemented and DEPLOYED, and the 3 CRITICALs are already closed in production. The roadmap banner written minutes earlier (claiming the bypasses were live) was wrong; corrected it and re-baselined. Created a 5-task tracked list for the actual remainder (verification + code review, not building).

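The docs-only guard from the SPEC-020 Phase-0 fix can be pictured roughly as follows. This is a minimal sketch, not the deployed code: the log does not show the handler's internals, so the names `DOCS_ONLY_DIRS`, `DOCS_ONLY_GLOBS`, `is_non_buildable`, and `should_skip_build` are hypothetical, and only the four path patterns come from the summary above.

```python
from fnmatch import fnmatchcase

# Hypothetical constants: only the patterns themselves come from the log.
DOCS_ONLY_DIRS = ("docs/", ".claude/", "session-logs/")
DOCS_ONLY_GLOBS = ("*.md",)


def is_non_buildable(path: str) -> bool:
    """Return True when a single changed path cannot affect a build."""
    normalized = path.replace("\\", "/").lstrip("/")
    # Anything with a parent-directory segment is not "provably" docs-only.
    if ".." in normalized.split("/"):
        return False
    if normalized.startswith(DOCS_ONLY_DIRS):
        return True
    return any(fnmatchcase(normalized, pattern) for pattern in DOCS_ONLY_GLOBS)


def should_skip_build(changed_files) -> bool:
    """Skip only when every changed file is provably non-buildable.

    An empty or missing file list proves nothing, so it falls through
    to a build (fail-safe toward building).
    """
    files = [f for f in (changed_files or []) if f]
    if not files:
        return False
    return all(is_non_buildable(f) for f in files)
```

The design choice worth noting is the direction of the default: any doubt (no file list, an unrecognized path) results in a build, so the guard can only ever save work, never suppress a needed build.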

### Key Decisions

- Accepted Mike's correction to identify the offending Pavon endpoint by matching the known external IP rather than running recon on all candidates; root-caused that GuruRMM stores no agent IPs and filed todo 7459428e.
- For SPEC-004, made stable machine-derived identity (deterministic `machine_uid`, MachineGuid-based, bound to the per-agent key) the PRIMARY fix per Mike; reaping/removal became defense-in-depth. Flagged that a client-asserted hash is spoofable and must be auth-bound.
- RMM CI: chose the minimal host-script path guard (Phase 0) over migrating to Gitea Actions immediately; the full migration is SPEC-020. Guard is fail-safe toward building (skips only when every changed file is provably non-buildable).
- GC direction: v2 reset first (Mike). Then corrected course on discovering Phase 1 is already done: the planned Sprint 0 CRITICAL hotfix was a no-op. Re-scoped to verification.
- Did NOT patch the stale repo copy `scripts/webhook-handler.py` (109 lines vs deployed 206): patching it would have triggered a wasteful build and implied the copy is maintained, which it is not. Host is source of truth until SPEC-020.

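To make the SPEC-004 decision concrete, here is one way a deterministic `machine_uid` could be bound to the per-agent key. This is an illustrative sketch only: the spec's actual derivation is not reproduced in this log, and the use of HMAC-SHA256, the `machine_uid:` domain prefix, and the function names are all assumptions.

```python
import hashlib
import hmac


def derive_machine_uid(machine_guid: str, agent_key: bytes) -> str:
    """Derive a stable identifier from the machine GUID, keyed by the agent key.

    Assumption: HMAC-SHA256 is used as the binding; SPEC-004 may choose otherwise.
    """
    # Normalize so "{ABC-...}" and "abc-..." yield the same identifier.
    normalized = machine_guid.strip().strip("{}").lower().encode("utf-8")
    return hmac.new(agent_key, b"machine_uid:" + normalized, hashlib.sha256).hexdigest()


def verify_machine_uid(claimed_uid: str, machine_guid: str, agent_key: bytes) -> bool:
    """Server-side check: recompute with the stored per-agent key and compare."""
    expected = derive_machine_uid(machine_guid, agent_key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("ascii"), claimed_uid.encode("utf-8"))
```

Under these assumptions, a bare hash of the MachineGuid would be spoofable by anyone who learns the GUID, whereas the keyed form can only be produced by a holder of that agent's key, which is the "auth-bound" property the decision calls for.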

### Problems Encountered

- Local `/tmp` path mismatch: the editor tools and the Bash shell resolved `/tmp/webhook-handler.py` to different physical files, so `pscp` uploaded the un-edited copy. Resolved by patching the file on-host via a Python script piped over SSH stdin.
- `py_compile` failed writing to root-owned `/tmp/__pycache__`; used `python3 -B` / `ast.parse` instead.
- importlib refused a `.py.new` extension; tested via `exec(open(...).read(), ns)` into a namespace.
- Roadmap banner factual error (CRITICALs "live"): self-introduced from the stale 2026-05-29 audit narrative; caught by reading the actual relay code + git log, then corrected.

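The two workarounds above (a syntax check that writes no bytecode, and loading a file whose extension importlib rejects) can be sketched as below. The helper names are hypothetical, and the sketch assumes the candidate file keeps any server startup behind an `if __name__ == "__main__":` guard, which the log does not confirm.

```python
import ast
from pathlib import Path


def syntax_check(path) -> None:
    """Parse the file in memory; raises SyntaxError on bad source.

    Unlike py_compile, nothing is written to a __pycache__ directory.
    """
    source = Path(path).read_text(encoding="utf-8")
    ast.parse(source, filename=str(path))


def load_namespace(path) -> dict:
    """Execute a source file into a fresh dict, whatever its extension."""
    source = Path(path).read_text(encoding="utf-8")
    # A non-"__main__" name keeps any main guard in the file from firing.
    namespace = {"__name__": "candidate_under_test", "__file__": str(path)}
    exec(compile(source, str(path), "exec"), namespace)
    return namespace
```

Functions defined in the candidate file are then available as `namespace["function_name"]` for unit tests before the file is moved into place.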

### Configuration Changes

- guru-connect repo: added `docs/specs/SPEC-003..009*.md`; edited `docs/FEATURE_ROADMAP.md` (entries, v2-first banner, then v2 re-baseline correction).
- guru-rmm repo: added `docs/specs/SPEC-018/019/020*.md`; edited `docs/FEATURE_ROADMAP.md`.
- claudetools (parent): submodule pointer bumps for both; new memories `.claude/memory/feedback_rmm_identify_by_ip.md`, `.claude/memory/project_rmm_webhook_docs_guard.md`; updated `.claude/memory/MEMORY.md`.
- BUILD HOST 172.16.3.30 (NOT in git): `/opt/gururmm/webhook-handler.py` patched with the docs-only guard; backup `/opt/gururmm/webhook-handler.py.bak-20260530-guard`.
- Endpoint `DESKTOP-I66IM5Q`: removed GuruConnect client (Run-key, desktop exe, Program Files dir).


### Credentials & Secrets

- Build/host SSH used: `guru@172.16.3.30:22`; already vaulted at `infrastructure/gururmm-server.sops.yaml` (sudo password same as SSH). No new secrets created.
- RMM API admin creds: `infrastructure/gururmm-server.sops.yaml` `credentials.gururmm-api.*`.
- Gitea webhook secret `gururmm-build-secret`: `projects/gururmm/ci-cd.sops.yaml`.


### Infrastructure & Servers

- 172.16.3.30 (Ubuntu 22.04): hosts BOTH GuruConnect (`guruconnect.service`, listening :3002, deployed checkout `abc55ab`) and GuruRMM (server :3001, build host). GuruRMM build webhook: `gururmm-webhook.service` → `/opt/gururmm/webhook-handler.py` (binds 127.0.0.1:9000, nginx proxies `/webhook/build`); per-platform builds via `build-shared.sh` + `build-{linux,windows,mac}.sh`; Pluto (172.16.3.36) does the Windows/MSI build over SSH.
- Gitea internal: `http://172.16.3.20:3000` (preferred on-network).
- GC v2 secure-session-core: Tasks 1-7 committed; CRITICALs closed in deployed prod (verified `abc55ab` descends from CRITICAL#1 fix `a453e79` + Task 7 `f9bdecb`).

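The "descends from" verification in the last bullet can be repeated with `git merge-base --is-ancestor`, which exits 0 when the first commit is an ancestor of the second and 1 when it is not. The log does not say which command was actually used, so this wrapper is only one possible approach; `is_ancestor`, its `run` parameter, and the repository path in the usage comment are hypothetical.

```python
import subprocess


def is_ancestor(ancestor: str, descendant: str, repo_dir: str = ".", run=subprocess.run) -> bool:
    """Return True when `ancestor` is reachable from `descendant`.

    `run` is injectable so the function can be exercised without a real repo.
    """
    result = run(
        ["git", "-C", repo_dir, "merge-base", "--is-ancestor", ancestor, descendant],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Any other exit code means git itself failed (unknown commit, bad repo).
    raise RuntimeError(f"git merge-base failed ({result.returncode}): {result.stderr.strip()}")


# Example use against a local checkout (path is a placeholder):
# for fix in ("a453e79", "f9bdecb"):
#     print(fix, is_ancestor(fix, "abc55ab", repo_dir="path/to/guru-connect"))
```

Treating exit codes other than 0 and 1 as errors matters here: a typo in a hash should fail loudly instead of reading as "fix not deployed".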

### Commands & Outputs

- RMM GuruConnect removal verified: `guruconnect procs running: none`, Run-key gone, files deleted.
- Webhook guard live test: docs-only POST → `Docs-only change -- build skipped`; non-main ref → `Ignored push`; no build locks; `last-built-commit` unchanged (`ef0830f`).
- GC prod check: `guruconnect.service active running`, `ss -tlnp` shows `:3002 guruconnect-ser pid 1287186`.

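The two responses seen in the webhook live test suggest a triage order of ref check first, then the docs-only check. A rough sketch of that flow follows, assuming the push payload carries a `ref` and lists changed files per commit under `added`, `removed`, and `modified` (as Gitea-style push payloads generally do). The function names, the `refs/heads/main` default, and the `"build"` return label are hypothetical; only the two quoted messages come from the log.

```python
def changed_paths(payload: dict) -> list:
    """Collect every path touched by the push (added, removed, modified)."""
    paths = []
    for commit in payload.get("commits") or []:
        for key in ("added", "removed", "modified"):
            paths.extend(commit.get(key) or [])
    return paths


def triage_push(payload: dict, is_docs_only, build_ref: str = "refs/heads/main") -> str:
    """Decide what to do with a push event.

    `is_docs_only` is a caller-supplied predicate over the list of paths.
    """
    if payload.get("ref") != build_ref:
        return "Ignored push"
    paths = changed_paths(payload)
    # No file list means nothing is proven, so fall through to a build.
    if paths and is_docs_only(paths):
        return "Docs-only change -- build skipped"
    return "build"  # placeholder label; the real handler's build path is not shown
```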

### Pending / Incomplete Tasks

Tracked list (TaskCreate #1-5), the real GC Phase-1-exit remainder:

1. Code-review secure-session-core Tasks 3-5 (pending review; written without a compiler, built and deployed since). Highest priority.
2. Security re-audit: `/gc-audit --pass=security` + 4 manual CRITICAL checks.
3. Functional verification: consent flow, key fidelity (Win+R/clipboard/Ctrl+Alt+Del/no stuck modifiers), rate limiting, fresh-DB migrations. Needs a real Windows desktop.
4. Live HW-H.264 validation: GPU needed on the AGENT (encode: QuickSync/NVENC/AMF) and the VIEWER (decode); server needs NO GPU. Then flip `DEFAULT_PREFER_H264`. Non-blocking (raw is default).
5. Retire deprecated shared `AGENT_API_KEY` fallback. GATED on confirming zero agents depend on it.

Other open threads:

- SPEC-020 (RMM CI → Gitea Actions) staged as a spec; Phase-0 guard is live. Ratify as RMM ADR-009 when started.
- GuruRMM todo `7459428e`: capture agent local + external IPs.
- GC SPEC-003..009 fold into v2 Phase 2/3 (annotated on the roadmap).


### Reference Information

- GC commits: SPEC-003 `abf499c` → SPEC-009 `7ab8738`; roadmap v2 banner `03f62d4`; v2 re-baseline `786d3e4`.
- RMM commits: SPEC-018 `be2b6f0`, SPEC-019 `ef0830f`, SPEC-020 `950fa08`.
- GC secure-session-core: plan at `specs/v2-secure-session-core/plan.md`; Tasks: 1 `fef8111`, 2 `41691bf`, 3 `0f25878`, CRITICAL#1 split `a453e79`, 4 `bfcdbb5`, 5 `9082e11`, 6 `bb73ba6`, 7 `f9bdecb`.
- Pavon/Raiders endpoint: `DESKTOP-I66IM5Q`, WAN 98.172.64.243; Curves: `DESKTOP-VRBQ6LM`, WAN 174.78.94.186 / LAN 192.168.1.128 / MAC 04:42:1A:0C:8C:A6.
- Claude Code clipboard regression: Alt+V is the correct Windows binding; DIB bitmap (Snipping Tool) fails, CF_HDROP file paste works; CLI v2.1.158.