# Session Log — 2026-05-29
## User
- **User:** Mike Swanson (mike)
- **Machine:** GURU-5070
- **Role:** admin
## Session Summary
Shaped a pre-implementation spec for native integrated remote control in the GuruRMM ecosystem, then restructured how the guru-connect product is tracked in the monorepo. The session began as a "fix lint errors" request that was redirected into a GuruRMM feature request for the guru-connect (GC) project: native, integrated remote control comparable to ScreenConnect/Splashtop, built entirely on our own Rust stack to avoid third-party agents and supply-chain exposure.
Research established that GC already implements the full remote-control engine (DXGI/GDI capture, input injection, viewer, `guruconnect://` protocol handler, persistent/unattended + support-code/attended modes, protobuf over WSS) and that GuruRMM already has the orchestration rails (per-agent command dispatch, stable `device_id` identity, the AgentDetail action-button pattern, and a half-built generic `tunnel` scaffold). Two parallel Explore agents mapped the exact integration surfaces with file:line references. The feature is therefore ~80% wiring against existing capability, not greenfield. Architecture decisions were captured via the user: broker model (RMM orchestrates the separate GC agent), both unattended and attended access, multi-monitor in scope, file transfer / session recording / non-Windows agents out of scope, priority P2.
The `/shape-spec` skill produced four files in `projects/msp-tools/guru-connect/specs/native-remote-control/` (shape, plan, references, standards). The user then clarified that GC is a standalone product with its own pipeline/cadence, and the real intent is a durable, versioned integration contract so the two products stay integration-compatible without coupling. The spec was rewritten around a GC-owned, semver'd integration contract (`/api/integration/v1/`, capability discovery, embedded session viewer). A concrete blocker was identified: GC's `security_headers.rs` sets `frame-ancestors 'none'`, which must be relaxed to a scoped RMM-origin allowlist for the embedded viewer. RMM-side hints were added (ADR-008 + `docs/GURU_CONNECT_INTEGRATION.md`) recording that RMM consumes GC via the contract and does no active dev on GC. The spec and hints were committed across both repos (commit-only, no push).
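
The `frame-ancestors` relaxation described above could look roughly like the following. This is a minimal sketch, not GC's actual `security_headers.rs` logic: the `/viewer` route prefix, the function name, and the example origin are assumptions made for illustration. The only behavior taken from the spec is that the viewer route gets a scoped allowlist while every other route keeps `'none'`.

```python
# Hypothetical route prefix; the real viewer route in GC may differ.
VIEWER_PREFIX = "/viewer"


def frame_ancestors_header(path: str, allowed_origins: list[str]) -> str:
    """Return the CSP frame-ancestors directive for a request path.

    Only the viewer route is relaxed, and only when an allowlist is
    configured. Every other case falls back to 'none' (fail closed).
    """
    if path.startswith(VIEWER_PREFIX) and allowed_origins:
        return "frame-ancestors " + " ".join(allowed_origins)
    return "frame-ancestors 'none'"


# Example: the allowlist would come from configuration
# (for instance CONNECT_EMBED_ALLOWED_ORIGINS, format assumed here).
origins = ["https://rmm.azcomputerguru.com"]
print(frame_ancestors_header("/viewer.html", origins))
print(frame_ancestors_header("/dashboard", origins))
```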
The user then asked to wire GC as a submodule like guru-rmm. Investigation revealed the remote `azcomputerguru/guru-connect` repo was ~4 months stale (frozen 2026-01-18) while the local monorepo copy was far ahead (entire `middleware`/`metrics`/`utils` modules, token blacklist, Phase-1 security/deploy work, the new spec). Per the user's decisions (publish local to the existing repo as a snapshot commit; preserve history), the Gitea Agent published the local working state to GC main (fast-forward `5b7cf5f..e3e95f8`, history preserved, KEEP paths `.gitignore`/`.cargo`/`server/static/downloads` retained), then converted the vendored directory into a submodule pinned at `e3e95f8`. Confirmed that GC `deploy.yml` triggers only on `v*.*.*` tags / manual dispatch, so the push ran CI build/test but did not deploy to production.
Finally, the user confirmed RMM and GC are the only versionable products; everything else stays in the monorepo. This policy was recorded to memory (`project_versionable_products.md`).
## Key Decisions
- Broker architecture: RMM orchestrates the separate GC agent (two agents coexist) rather than merging GC into the RMM agent — reuses GC's existing engine, ships sooner, keeps GC standalone.
- The deliverable is a GC-owned, semver'd integration contract + capability discovery, not one-off broker wiring, so the two products stay in sync via the contract without sharing pipelines or releasing in lockstep.
- Stable cross-product identity = RMM `device_id` passed as the GC `agent_id`, so brokered sessions deterministically match the endpoint.
- Supply-chain guard made concrete: the RMM agent downloads the GC binary only from GC's release channel and verifies SHA-256 before launch (reusing GC's `releases.checksum_sha256`).
- Embedded viewer over native-only: relax `frame-ancestors`/`X-Frame-Options` on the viewer route to a scoped RMM-origin allowlist; keep `'none'` everywhere else.
- Spec lives in the GC repo (GC owns the contract); RMM gets ADR-008 + a pointer doc reminding it not to perform active dev on GC.
- Submodule reconciliation: publish the local (authoritative) state up to the stale GC repo as a snapshot commit on top of existing main (preserve history), then submodule-add — nothing lost.
- Only GuruRMM and GuruConnect are versionable products (own repos/submodules); all other projects stay in the claudetools monorepo. Split only for an independent pipeline OR a versioned external consumer.
- All git operations committed but NOT pushed (claudetools), per the established pattern of leaving the push to the user; the GC repo push was mandatory for the submodule to resolve.
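
The supply-chain guard in the decisions above (verify SHA-256 before launch) can be sketched as follows. The real RMM agent is Rust, so this is an illustration of the check itself, not the agent's code. The function names are hypothetical, and the expected value is assumed to arrive as a bare hex digest, as a field like `releases.checksum_sha256` suggests.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large binaries are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_before_launch(path: Path, expected_hex: str) -> bool:
    """Return True only if the downloaded file matches the published digest.

    The expected value is normalized (whitespace, case) before comparing.
    Assumption: expected_hex is a bare hex string, not 'hash  filename'.
    """
    actual = sha256_of(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A caller would refuse to start the binary whenever `verify_before_launch` returns `False`, rather than logging and continuing.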
## Problems Encountered
- Initial "fix lint errors" request was ambiguous (clean tree, multiple lintable projects). Asked which project; user redirected to the GC feature request instead.
- CLAUDE.md warns that a Gitea repo named `guru-connect` is an "abandoned duplicate." Verified by inspecting the remote repo's contents (`proto/guruconnect.proto`, `agent/`, `server/`, `dashboard/`) that `azcomputerguru/guru-connect` is the real GC product, not the abandoned RMM duplicate the warning refers to.
- The remote GC repo was 4 months stale and the local monorepo copy had diverged substantially (whole modules + Phase-1 work never pushed). A naive `submodule add` would have reverted that work. Resolved by diffing local vs remote, surfacing the divergence, and publishing local→remote before converting.
- Production-deploy risk on push: checked GC's `.gitea/workflows`; confirmed `deploy.yml` triggers only on `v*.*.*` tags / `workflow_dispatch`, so pushing to main runs CI but does not deploy.
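
The deploy-trigger reasoning in the last item can be modeled in a few lines. This is an approximation for illustration only: it uses Python's `fnmatch`, whose `*` also matches `/`, so it is slightly more permissive than the workflow engine's own glob matching. The function name is hypothetical.

```python
from fnmatch import fnmatchcase

# Tag filter as recorded for deploy.yml.
TAG_PATTERN = "v*.*.*"


def triggers_deploy(event: str, ref: str) -> bool:
    """Model: deploy runs on manual dispatch or on a push of a matching tag."""
    if event == "workflow_dispatch":
        return True
    if event == "push" and ref.startswith("refs/tags/"):
        return fnmatchcase(ref[len("refs/tags/"):], TAG_PATTERN)
    return False


print(triggers_deploy("push", "refs/heads/main"))   # branch push: CI only
print(triggers_deploy("push", "refs/tags/v0.2.2"))  # version tag: deploy
```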
## Configuration Changes
Created (committed `afbe5a8`, then moved into the GC repo via the submodule conversion):
- `projects/msp-tools/guru-connect/specs/native-remote-control/shape.md`
- `projects/msp-tools/guru-connect/specs/native-remote-control/plan.md`
- `projects/msp-tools/guru-connect/specs/native-remote-control/references.md`
- `projects/msp-tools/guru-connect/specs/native-remote-control/standards.md`
guru-rmm submodule (committed `7701d26` in the submodule):
- Modified `docs/ARCHITECTURE_DECISIONS.md` — added ADR-008 (GC is a separate product consumed via versioned contract)
- Created `docs/GURU_CONNECT_INTEGRATION.md` — RMM-side boundary/pointer doc
Repo structure:
- `.gitmodules` — added `projects/msp-tools/guru-connect` submodule entry (branch main)
- `projects/msp-tools/guru-connect` — converted from vendored directory to submodule (gitlink mode 160000 at `e3e95f8`)
Memory:
- Created `.claude/memory/project_versionable_products.md`
- Updated `.claude/memory/MEMORY.md` index (Project section)
## Credentials & Secrets
None discovered or created this session. The spec references secrets to be sourced from env/SOPS at implementation time (`CONNECT_INTEGRATION_KEY`, `CONNECT_SERVER_URL`, per-machine GC agent keys, `CONNECT_EMBED_ALLOWED_ORIGINS`) — none provisioned yet.
## Infrastructure & Servers
- Gitea (internal): http://172.16.3.20:3000 — used for repo inspection + GC push (per internal-API preference)
- GC relay server: 172.16.3.30:3002, proxied via NPM to connect.azcomputerguru.com
- GuruRMM server: 172.16.3.30:3001, dashboard rmm.azcomputerguru.com
- GC repo CI: `.gitea/workflows/{build-and-test,test,deploy}.yml` — deploy only on `v*.*.*` tags / manual dispatch
## Commands & Outputs
Repo divergence check (local vs remote GC), shallow clone + `diff -rq` — confirmed local far ahead; cleaned up temp clone afterward.
GC publish (Gitea Agent):
- `git push origin main` → `5b7cf5f..e3e95f8 main -> main` (fast-forward, 73 files changed, 15611 insertions, 5760 deletions; `Cargo.lock` dropped: not tracked in the authoritative copy)
Submodule conversion (Gitea Agent):
- `git rm -r --cached projects/msp-tools/guru-connect` + `rm -rf` + `git submodule add -b main <url>`
- `git submodule status` → `e3e95f8 ... guru-connect (heads/main)`, `7701d26 ... guru-rmm (heads/main)`
## Pending / Incomplete Tasks
- claudetools commits are LOCAL, not pushed: `53e14da` (submodule conversion) + `1fc2401`/`afbe5a8` (spec + pointer bump) from earlier. Push when ready.
- GC repo housekeeping: re-add `Cargo.lock` (dropped in the snapshot; wanted for reproducible builds).
- GC submodule URL uses the internal IP `172.16.3.20:3000`; guru-rmm uses the public `git.azcomputerguru.com`. Off-network clones (Howard's Mac) won't resolve the internal IP — consider switching to the public hostname for parity.
- GC CI run kicked off by the publish push may be red (the snapshot may not build cleanly; Cargo.lock removed). Check the Actions run.
- Implementation of the feature itself has not started — Task 0 of the spec (commit the spec) is effectively satisfied; Tasks 1+ are not begun.
## Reference Information
- Spec: `projects/msp-tools/guru-connect/specs/native-remote-control/` (4 files) — now in the GC repo at `e3e95f8`
- ADR: `projects/msp-tools/guru-rmm/docs/ARCHITECTURE_DECISIONS.md` ADR-008
- RMM pointer: `projects/msp-tools/guru-rmm/docs/GURU_CONNECT_INTEGRATION.md`
- GC repo: `azcomputerguru/guru-connect`; published `5b7cf5f → e3e95f8`
- guru-rmm submodule commit: `7701d26`
- claudetools commits: `afbe5a8` (spec), `1fc2401` (submodule ptr bump for ADR), `53e14da` (GC submodule conversion)
- Roadmap context: `projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md:635-675`, `docs/UI_GAPS.md:155-186`
- Key GC integration files: `server/src/middleware/security_headers.rs:30,37-39` (frame-ancestors), `server/static/viewer.html`, `server/src/relay/mod.rs:187` (agent key validation), `server/src/main.rs:300` (`/api/version`)
- Key RMM files: `server/src/api/commands.rs:87-157` (command dispatch), `agent/src/device_id.rs`, `dashboard/src/pages/AgentDetail.tsx:1893-1931`
---
## Update: 17:52 PT — GuruConnect operational tooling + Pluto native CI build (green)
### Session Summary
Brought GuruConnect to operational parity with GuruRMM and stood up native Windows CI on Pluto.
Established GC's `docs/` (FEATURE_ROADMAP, ARCHITECTURE_DECISIONS ADR-001/002, SPEC-001, CHANGELOG),
added the `/gc-feature-request` skill, and registered the `guruconnect` coord project_key. Built CI
in Gitea Actions: conventional-commit auto-versioning, git-cliff changelog + `/api/changelog`
endpoint, and Azure Trusted Signing (jsign, reusing RMM's cert profile) on a workflow_dispatch-gated
release. Decisions: modernize in Gitea Actions (not RMM's webhook/script model), reuse RMM's exact
Trusted Signing cert profile, leave RMM's own pipeline untouched (its beta→stable promotion already
provides release control — better than tag-gating).
Native Windows agent build: rather than mingw cross-compile, provisioned Pluto (Unraid VM
"Claude-Builder", hostname PLUTO, 172.16.3.36) as a Gitea Actions runner driven entirely through its
GuruRMM agent (no SSH — GURU-5070's key isn't authorized). Installed act_runner (label windows-msvc,
host-mode SYSTEM, scheduled-task autostart), Node 20, PowerShell 7, protoc 28.3; confirmed rc.exe +
MSVC cargo 1.95 present. Iterated the CI to green through a stack of pre-existing breakage: cargo fmt
drift (ran `cargo fmt --all`), clippy made informational, `.cargo/config` windows-msvc default-target
leaking into Linux clippy/test (CARGO_BUILD_TARGET override), PROTOC env + protoc PATH in the Windows
jobs, workspace-root artifact paths (binary is at root `target/`, not `agent/target/`), committed the
missing root `Cargo.lock` (fixes cargo audit), audit made informational, and removed the
redundant/broken `test.yml`. build-and-test run #17 is fully GREEN (Server Linux, Agent native MSVC on Pluto,
Security Audit, Build Summary).
Also located the portal and recorded infra knowledge (see below).
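
For readers unfamiliar with conventional-commit auto-versioning, the idea can be sketched as below. This is not the GC workflow's actual logic, which is not recorded here. It assumes the common convention (`fix` bumps patch, `feat` bumps minor, a `!` marker or `BREAKING CHANGE:` bumps major), and the function names are hypothetical.

```python
import re

# Matches headers such as "fix: ...", "feat(ci): ...", "feat!: ...".
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<bang>!)?: ")


def bump_kind(messages):
    """Return 'major', 'minor', 'patch', or None for a list of commit messages."""
    kind = None
    for msg in messages:
        match = HEADER.match(msg)
        if not match:
            continue  # not a conventional commit; ignored
        if match.group("bang") or "BREAKING CHANGE:" in msg:
            return "major"
        if match.group("type") == "feat":
            kind = "minor"
        elif match.group("type") == "fix" and kind is None:
            kind = "patch"
    return kind


def next_version(current, messages):
    """Apply the highest-ranked bump found in messages to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in current.split("."))
    kind = bump_kind(messages)
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    if kind == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current  # nothing release-worthy


print(next_version("0.2.1", ["fix(ci): pin jsign 7.1"]))
```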
### Key Decisions
- GC operational tooling in Gitea Actions; reuse RMM's Azure Trusted Signing cert profile (ADR-002).
- Native MSVC build on Pluto via a Gitea Actions runner (drop mingw cross-compile); sign on Linux via jsign (artifact handoff).
- RMM pipeline left as-is — promotion/rollback already provides deliberate release control.
- clippy + cargo audit are informational (warn-only) until the GC re-spec refreshes deps/wires API.
- Release is workflow_dispatch-gated (no auto-release on push).
### Problems Encountered
- No Gitea Actions runner existed (RMM uses webhook+scripts) → provisioned act_runner on Pluto.
- act_runner registered but `.runner` not written (ErrorActionPreference=Stop aborted on stderr) → re-registered with `*>` redirection.
- Host-mode Windows runner needs node + pwsh for JS actions and BOM-free GITHUB_PATH → installed Node 20 + PowerShell 7.
- RMM command 180s reaper killed slow installs (PS7 extract) → used .NET ZipFile extract; cached RMM JWT to avoid login rate-limiting.
- Agent CI failures were config, not code: missing protoc, workspace-root artifact path, missing Cargo.lock. Native build itself compiles clean (verified directly on Pluto, 4m20s).
### Configuration Changes
- GC repo: `docs/FEATURE_ROADMAP.md`, `docs/ARCHITECTURE_DECISIONS.md`, `docs/specs/SPEC-001-operational-tooling-parity.md`, `CHANGELOG.md`, `cliff.toml`, `Cargo.lock` (new); `.gitea/workflows/build-and-test.yml` + `release.yml` (native Pluto build, PROTOC, paths, audit); `.gitea/workflows/test.yml` (deleted); `server/src/api/changelog.rs` + routing; `server/.env.example` (CHANGELOG_DIR).
- claudetools: `.claude/commands/gc-feature-request.md` (new); CLAUDE.md project-keys (+guruconnect); memory `feedback_no_botalerts_internal_rmm.md`, `feedback_autonomous_infra_setup.md`, `project_versionable_products.md`; updated `reference_pluto_build_server.md`, `.claude/machines/pluto.md`, `wiki/systems/pluto.md` (Claude-Builder=PLUTO).
- Pluto (172.16.3.36): act_runner (C:\actrunner, scheduled task GiteaActRunner-guruconnect), Node 20 (C:\node), PowerShell 7 (C:\pwsh), protoc 28.3 (C:\protoc; PROTOC machine env) — all added to machine PATH.
### Credentials & Secrets
- Added 8 Gitea Actions secrets to `guru-connect` repo (values from `services/azure-trusted-signing.sops.yaml` / `/etc/gururmm-signing.env`): AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, TS_ENDPOINT, TS_ACCOUNT, TS_CERT_PROFILE, TS_TIMESTAMP_URL, CI_PUSH_TOKEN (CI_PUSH_TOKEN reuses the azcomputerguru Gitea api-token from `services/gitea.sops.yaml`).
- No new secrets created. Azure Trusted Signing = account `gururmm-signing`, profile `gururmm-public-trust`, `wus2.codesigning.azure.net`.
### Infrastructure & Servers
- PLUTO = Unraid VM "Claude-Builder" = 172.16.3.36 (Windows Server 2019, 16c/16GB). RMM agent id 07a11ece-… (changes on re-enroll; resolve by hostname PLUTO). Drive via /rmm; no `pluto` vault entry.
- Gitea runners: `guruconnect-builder` (Linux 172.16.3.30, ubuntu-latest) + `pluto-guruconnect` (Pluto, windows-msvc) — both online.
- GC portal: tech dashboard live at https://connect.azcomputerguru.com/dashboard (NPM → 172.16.3.30:3002, DNS 72.194.62.4). End-user support-code portal NOT built (gap).
### Commands & Outputs
- RMM login: `POST http://172.16.3.30:3001/api/auth/login` (creds `infrastructure/gururmm-server.sops.yaml` credentials.gururmm-api.*); run cmds via `POST /api/agents/:id/command`, poll `/api/commands/:id`. Login is rate-limited on repeated calls, so cache the JWT.
- Gitea Actions runner mgmt via API token (`services/gitea.sops.yaml` credentials.api.api-token): runners at `/api/v1/repos/azcomputerguru/guru-connect/actions/runners`; logs at `http://172.16.3.20:3000/<repo>/actions/runs/<n>/jobs/<idx>/logs`; terminal state is in task `status` (NOT `conclusion`, which stays null).
- Native build verified: `cargo build --release --target x86_64-pc-windows-msvc` on Pluto → `target/x86_64-pc-windows-msvc/release/guruconnect.exe`, 4m20s clean.
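
The login, dispatch, and poll flow noted above, with the JWT cached, might be wrapped like this. It is a sketch under stated assumptions: the three endpoint paths come from the notes, but the JSON field names (`token`, `id`, `status`), the request body shape, and the terminal status values are guesses. The HTTP transport is injected so the sketch stays independent of any particular client library.

```python
import time
from typing import Callable, Optional

# (method, url, json_body, bearer_token) -> decoded JSON reply
Http = Callable[[str, str, Optional[dict], Optional[str]], dict]

# Assumed terminal states; the real API's values are not recorded here.
TERMINAL = {"completed", "failed"}


class RmmClient:
    def __init__(self, base_url: str, http: Http, credentials: dict):
        self.base_url = base_url.rstrip("/")
        self.http = http
        self.credentials = credentials
        self._token: Optional[str] = None

    def token(self) -> str:
        """Log in once and reuse the JWT, since repeated logins are rate-limited."""
        if self._token is None:
            reply = self.http("POST", f"{self.base_url}/api/auth/login",
                              self.credentials, None)
            self._token = reply["token"]  # assumed field name
        return self._token

    def run(self, agent_id: str, command: str,
            poll_seconds: float = 2.0, max_polls: int = 60) -> dict:
        """Dispatch a command to an agent and poll until it reaches a terminal state."""
        created = self.http("POST",
                            f"{self.base_url}/api/agents/{agent_id}/command",
                            {"command": command}, self.token())
        command_id = created["id"]  # assumed field name
        for _ in range(max_polls):
            state = self.http("GET",
                              f"{self.base_url}/api/commands/{command_id}",
                              None, self.token())
            if state["status"] in TERMINAL:
                return state
            time.sleep(poll_seconds)
        raise TimeoutError(f"command {command_id} did not finish")
```

Keeping the total polling window below the 180s command reaper noted earlier would be a sensible default for anything expected to complete.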
### Pending / Incomplete Tasks
- Validate the gated `release.yml` end-to-end (version bump → native build → Azure Trusted Signing → Gitea release). NEXT STEP this session.
- GC re-spec: re-tighten clippy + cargo audit to hard gates after a dependency refresh; build the end-user support-code portal.
### Reference Information
- GC commits: `60519be` (tooling), `f2e0456` (gate), `1c5c1e7` (cargo fmt), `b2f9cbc` (clippy/target), `cd88fac` (clippy informational), `8a47332` (native Pluto build), `39e9ac4` (workflow_dispatch), `4ddced1` (CI suite fixes). build-and-test run #17 green.
- claudetools: `…ab78de2` (submodule bumps), `7d326f2` (Pluto memory/wiki docs).
- SPEC-001: `projects/msp-tools/guru-connect/docs/specs/SPEC-001-operational-tooling-parity.md`.
---
## Update: 19:21 PT — Release pipeline validated (signed v0.2.2 published)
### Session Summary
Validated the GuruConnect `release.yml` pipeline end-to-end by dispatching it (workflow_dispatch).
It took three dispatches, each surfacing one real bug, all fixed:
1. Run 18: version-bump + native Pluto build succeeded; sign failed — jsign 6.0 lacks the
`TRUSTEDSIGNING` keystore type (Azure Trusted Signing needs jsign >= 7.0). Fixed by pinning
jsign 7.1 (matches `/usr/share/jsign/jsign-7.1.jar` on the build host).
2. Run 20: jsign 7.1 signed the binary successfully ("Adding Authenticode signature... [OK]"),
but the separate verify step called `jsign --info` (not a real jsign subcommand) and wrongly
failed the job. Removed the bogus verify; jsign's non-zero exit under `set -euo pipefail`
already gates signing fail-closed.
3. Run 22: ALL GREEN. Published release `v0.2.2` (draft=false) with assets `guruconnect.exe`
(Azure-Trusted-Signing-signed), `guruconnect.exe.sha256`, `CHANGELOG.md`.
Confirmed the full chain works: conventional-commit version bump -> git-cliff changelog -> native
MSVC build on the Pluto runner -> Azure Trusted Signing (jsign 7.1) -> Gitea REST release. Deleted
the two orphan tags (v0.2.0, v0.2.1) from the failed attempts; v0.2.2 is the sole tag/release. GC
manifest versions now start at 0.2.2 (legitimate first signed release).
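
The fail-closed reasoning from run 20 (the signer's exit code is the gate, so no separate verify step is needed) can be illustrated outside of shell. In this sketch `check=True` plays the role that `set -euo pipefail` plays in the workflow script. The function name is hypothetical, and a trivial Python subprocess stands in for the real jsign invocation.

```python
import subprocess
import sys


def sign_then_publish(sign_cmd: list[str], publish) -> None:
    """Run the signer; publish only if it exits with status 0.

    check=True raises CalledProcessError on any non-zero exit, so the
    publish step is never reached after a failed signature.
    """
    subprocess.run(sign_cmd, check=True)
    publish()


# Stand-in for a failing signer: the publish callback must not run.
failing_signer = [sys.executable, "-c", "import sys; sys.exit(1)"]
try:
    sign_then_publish(failing_signer, lambda: print("published"))
except subprocess.CalledProcessError as err:
    print(f"signing failed with exit code {err.returncode}; nothing published")
```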
### Key Decisions
- jsign 7.1 (not 6.0) for Azure Trusted Signing in CI; matches the build host's version.
- Removed the jsign-based verify step rather than replacing it: jsign's exit code is the fail-closed gate, and `jsign --info` does not exist.
- Kept the validation release real (v0.2.2) rather than reverting version churn; cleaned up only the orphan tags.
### Problems Encountered
- Two-workflow confusion in the CI poller (build-and-test + test.yml/deploy.yml sharing pushes) → filtered the poller by `workflow_id` and run_number threshold.
- Release dispatch queued ~10 min behind the push-triggered build-and-test before starting (single shared runner concurrency).
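
The poller fix in the first item amounts to a two-condition filter. A minimal sketch, assuming each run record exposes `workflow_id` and `run_number` keys (the names used in the note; the exact shape of the Gitea response is not recorded here). The sample records are made up for illustration.

```python
def runs_for_dispatch(runs, workflow_id, min_run_number):
    """Keep only runs of the wanted workflow that started at or after the threshold."""
    return [
        run for run in runs
        if run["workflow_id"] == workflow_id and run["run_number"] >= min_run_number
    ]


# Illustrative data only: push-triggered and dispatched runs interleaved.
sample = [
    {"workflow_id": "build-and-test.yml", "run_number": 21},
    {"workflow_id": "release.yml", "run_number": 20},
    {"workflow_id": "release.yml", "run_number": 22},
]
print(runs_for_dispatch(sample, "release.yml", 21))
```

The threshold would be the highest run number seen before dispatching, plus one, so that older runs of the same workflow are not mistaken for the new one.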
### Configuration Changes
- `projects/msp-tools/guru-connect/.gitea/workflows/release.yml`: JSIGN_VERSION 6.0 -> 7.1 (commit `e7f38ce`/rebased `5727ccf`); removed broken `jsign --info` verify step (commit `5727ccf`).
- Gitea: deleted tags v0.2.0, v0.2.1 (HTTP 204 each); v0.2.2 release published.
### Credentials & Secrets
- No new secrets. Signing used the 8 Actions secrets set earlier (Azure Trusted Signing SP + CI_PUSH_TOKEN); source `services/azure-trusted-signing.sops.yaml` / `/etc/gururmm-signing.env`.
### Infrastructure & Servers
- jsign on build host (172.16.3.30): `/usr/bin/jsign` wrapper -> `/usr/share/jsign/jsign-7.1.jar` (the known-good Trusted Signing version).
- Published release: `http://172.16.3.20:3000/azcomputerguru/guru-connect/releases/tag/v0.2.2`.
### Commands & Outputs
- Dispatch a workflow: `POST /api/v1/repos/azcomputerguru/guru-connect/actions/workflows/release.yml/dispatches` `{"ref":"main"}` (HTTP 204).
- Delete a tag: `DELETE /api/v1/repos/.../tags/<tag>` (HTTP 204).
- jsign 6.0 error: `Unknown keystore type 'TRUSTEDSIGNING'`. jsign sign success marker: `Adding Authenticode signature to guruconnect.exe`.
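
The dispatch call above can be built with the standard library alone. This sketch only constructs the request. The endpoint path and `{"ref":"main"}` body come from the notes; the helper name is hypothetical, and the `Authorization: token ...` header follows Gitea's usual API-token convention.

```python
import json
import urllib.request


def dispatch_request(base_url: str, repo: str, workflow: str,
                     ref: str, token: str) -> urllib.request.Request:
    """Build the POST that asks Gitea to run a workflow on the given ref."""
    url = (f"{base_url.rstrip('/')}/api/v1/repos/{repo}"
           f"/actions/workflows/{workflow}/dispatches")
    return urllib.request.Request(
        url,
        data=json.dumps({"ref": ref}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


req = dispatch_request("http://172.16.3.20:3000", "azcomputerguru/guru-connect",
                       "release.yml", "main", "<api-token>")
print(req.get_method(), req.full_url)
```

Sending it with `urllib.request.urlopen(req)` and checking for status 204 would match the success response recorded above.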
### Pending / Incomplete Tasks
- GC re-spec: re-tighten clippy + cargo audit to hard gates after dependency refresh; build the end-user support-code portal.
- 5 unrelated `temp/` scratch files remain untracked on GURU-5070 (datto/ksteen — another session's; left untouched).
### Reference Information
- GC release commits: `e7f38ce`/`5727ccf` (jsign 7.1 + verify fix). Release run #22 green. Release `v0.2.2`.
- claudetools: `…70d2190` (submodule bump for verify fix).