claudetools/wiki/projects/guruconnect.md
Mike Swanson ae0efb87ca wiki: seed guruconnect + fix Gonzvar Syncro, Golden Corral mail/colocation
- guruconnect: seeded wiki/projects/guruconnect.md (v0.3.0 production; artifact-based
  from guru-connect repo @ origin/main ded99c5 + session logs + project_guruconnect
  memory). [[guruconnect]] backlinks now resolve. Indexed.
- gonzvar-tax-services: found in Syncro via fuzzy `query=` — customer is "Gonzvar Tax
  Service" (singular), id 1830740, break-fix/~$175hr, 6 assets. Billing fields corrected.
- tucson-golden-corral: email platform set to Neptune Exchange (per owner/Mike); IX
  cPanel kept as a caveat to reconcile. TGC-SERVER documented as colocated at ACG main
  office (behind ACG office network, not a naked public box at the restaurant).
2026-06-12 08:21:58 -07:00

---
type: project
name: guruconnect
display_name: GuruConnect
last_compiled: 2026-06-12
compiled_by: GURU-5070/claude-main
sources:
- projects/msp-tools/guru-connect/ (origin/main @ ded99c5)
- .claude/memory/project_guruconnect.md
- projects/msp-tools/guru-connect/session-logs/2025-12-29-session.md
- session-logs/2026-05-29-session.md
- session-logs/2026-05-30-guruconnect-specs.md
- session-logs/2026-05-31-mike-guruconnect-h264-decoder.md
- session-logs/2026-06-02-mike-guruconnect-enrollment.md
backlinks:
- gururmm
- msp-tools
---
# GuruConnect
## Summary
GuruConnect is ACG's proprietary remote-access and remote-support tool (ScreenConnect/ConnectWise Control class). It provides real-time screen capture, remote control, and managed-agent enrollment for Windows endpoints, serving MSP technicians. It is a **standalone product** with its own Gitea repo (`azcomputerguru/guru-connect`), independent release cadence, and versioned integration contract with [[gururmm]] (ADR-001). Sibling project to GuruRMM within the [[msp-tools]] portfolio.
Current version: **v0.3.0** (live in production at `connect.azcomputerguru.com` since 2026-06-01). v2 Phase 1 (secure session core) exited 2026-05-31 with a clean security audit (0 CRITICAL/HIGH/MEDIUM/LOW). Active development continues on Phase 2 (dashboard enrichment) and SPEC-018 Phase 2 (SYSTEM service session broker, the critical-path item for true unattended access).
---
## Capabilities / Feature Set
### Connectivity / Session
- **Attended (support-code) sessions:** agent presents or receives a support code (`XXX-XXX-XXX` format, high-entropy base32-style, single-use via `consumed_at`); technician enters code in dashboard to initiate a viewer session.
- **Unattended (persistent) sessions:** agent registers with a per-machine credential (`cak_` key) and maintains a persistent WSS connection. Accessible any time from the operator console without requiring an active user.
- **Zero-touch per-site enrollment (SPEC-016, shipped Phase A + B):** a signed per-site installer carries a rotatable enrollment key (`cek_` prefix, hashed Argon2id server-side). On first run the agent calls `POST /api/enroll` with `site_code` + enrollment key + hardware-derived `machine_uid`; server mints a per-machine `cak_` returned once in the response. Auto-approve posture; `machine_uid` collision (e.g. template-cloned VMs) drops the machine to `enrollment_state = 'pending'` and alerts the operator. Site-key rotation blocks new enrollments from old installers; already-enrolled agents (holding `cak_`) are unaffected.
- **Deterministic machine identity (SPEC-004):** `machine_uid` is derived from SMBIOS UUID (primary) with motherboard/disk serial fallback, hardware-salted — survives OS re-image on the same hardware. Unique per-tenant `(tenant_id, machine_uid)` in the DB.
- **Session lifecycle management (SPEC-004):** stale persistent sessions reaped automatically (TTL); same-machine supersede for racing connect-path agents; operator soft-delete (`?purge=true` on `DELETE /api/machines/:id` or `DELETE /api/sessions/:id` sets `deleted_at`) preserves audit history while removing ghost rows from the console.
- **Attended-consent state:** schema column `consent_state` (not_required / pending / granted / denied) on sessions. `not_required` for managed/unattended sessions; attended consent flow (Task 5) shipped; dashboard surfaces "awaiting consent" state.
- **Session source tagging:** `source` column on sessions: `standalone` (ad-hoc support code) vs `gururmm` (managed via RMM integration contract).
- **Tenancy-ready schema:** `tenant_id` (nullable, backfilled to fixed default `00000000-0000-0000-0000-000000000001`) on every scoped table (machines, sessions, support_codes, events, users, agent_keys). Multi-tenancy not enforced until Phase 4; schema flip requires no migration rewrite.
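
The attended-session bullet above describes a `XXX-XXX-XXX` support code that is single-use via `consumed_at`. A minimal sketch of that rule follows. It is not the server's code: the RFC 4648 base32 alphabet, the `SupportCode` struct, and the `redeem` function are illustrative assumptions, and the real implementation would presumably enforce single use in the database rather than in memory.

```rust
use std::time::SystemTime;

/// Assumed alphabet: RFC 4648 base32 (A-Z, 2-7). The page only says
/// "base32-style", so the exact character set is a guess.
fn is_base32_char(c: char) -> bool {
    c.is_ascii_uppercase() || ('2'..='7').contains(&c)
}

/// True when `code` has the `XXX-XXX-XXX` shape.
pub fn is_well_formed(code: &str) -> bool {
    let groups: Vec<&str> = code.split('-').collect();
    groups.len() == 3
        && groups
            .iter()
            .all(|g| g.chars().count() == 3 && g.chars().all(is_base32_char))
}

/// Hypothetical in-memory stand-in for a `support_codes` row.
pub struct SupportCode {
    pub code: String,
    pub consumed_at: Option<SystemTime>,
}

#[derive(Debug, PartialEq)]
pub enum RedeemError {
    Malformed,
    Mismatch,
    AlreadyConsumed,
}

/// Redeems a code at most once by stamping `consumed_at`.
pub fn redeem(
    row: &mut SupportCode,
    presented: &str,
    now: SystemTime,
) -> Result<(), RedeemError> {
    if !is_well_formed(presented) {
        return Err(RedeemError::Malformed);
    }
    if row.code != presented {
        return Err(RedeemError::Mismatch);
    }
    if row.consumed_at.is_some() {
        return Err(RedeemError::AlreadyConsumed);
    }
    row.consumed_at = Some(now);
    Ok(())
}
```

A second `redeem` call on the same row returns `AlreadyConsumed`, which is the property the `consumed_at` column exists to provide.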
### Remote Control (Agent Surface, Windows only)
- **Screen capture:** DXGI primary (GPU-accelerated, handles D3D content), GDI fallback. `agent/src/capture/` module.
- **Input injection:** mouse (absolute/relative movement, button events) and keyboard (Win32 `SendInput`). Full key fidelity including Win+R, Ctrl+C/V, and Ctrl+Alt+Del (via separate `guruconnect-sas-service` SAS helper service). `agent/src/input/` module.
- **Video encoding:** H.264 hardware encoder via Windows Media Foundation async-MFT (`agent/src/encoder/h264.rs`) with raw+Zstd fallback. Current shipping default: **raw+Zstd** (`DEFAULT_PREFER_H264=false`); H.264 enabled per-agent via `h264-test` tag on the server pending cross-GPU validation (beast→5070 run not yet completed).
- **Native viewer:** windowed Rust viewer (`agent/src/viewer/`) using winit + softbuffer for rendering, decodes H.264 via Windows Media Foundation. Three viewer bugs fixed 2026-05-31 (MF back-pressure handling, output-type negotiation, event-loop pump removal). Launched via `guruconnect://` protocol handler or `guruconnect view <session_id>` CLI.
- **Transport:** Protobuf-over-WSS. Wire format: `proto/guruconnect.proto`. Frames compressed with Zstd (raw path) or H.264 NAL units.
- **Agent run modes:**
  - `agent`: background persistent agent (currently `HKCU Run` autostart; SPEC-018 Phase 1 shipped the SYSTEM service scaffolding, but the session broker / capture worker is Phase 2)
  - `view <session_id>`: viewer window
  - `install`: install and register `guruconnect://` protocol handler
  - `launch <url>`: handle protocol URL
  - `service-run`: internal service entry point (SPEC-018 Phase 1, LocalSystem, SCM-controlled)
- **SYSTEM service (SPEC-018 Phase 1, shipped):** the managed agent registers as an auto-start LocalSystem Windows service (`GuruConnectAgent`, `sc failure` restart/5000ms). Phase 1 covers service install/lifecycle and SCM dispatch; Phase 2 (session broker — `CreateProcessAsUserW` into the active desktop, ACL'd named pipe IPC, `SERVICE_CONTROL_SESSIONCHANGE`) is the critical-path next item.
- **Agent binary:** single `.exe` (`guruconnect.exe`); also ships a separate `guruconnect-sas-service.exe` for Ctrl+Alt+Del injection. Statically linked, no .NET/VC++ runtime dependencies.
- **Per-machine credential store (SPEC-016 Phase B):** `cak_` key stored DPAPI-machine-encrypted at `%ProgramData%\GuruConnect\credentials\agent.cak`, ACL'd SYSTEM + Administrators. Readable only when the agent runs as SYSTEM; interim fail-fast guard in `agent/src/main.rs:resolve_agent_credential` until SPEC-018 Phase 2 is live.
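
The video-encoding bullet above combines a global default (`DEFAULT_PREFER_H264=false`) with a per-agent opt-in via the `h264-test` tag and a raw+Zstd fallback. A minimal sketch of that decision, under stated assumptions: `choose_encoder`, the `Encoder` enum, and the `h264_encoder_available` flag are hypothetical names, not taken from `agent/src/encoder/`.

```rust
/// Shipping default named on this page: raw+Zstd unless H.264 is preferred.
pub const DEFAULT_PREFER_H264: bool = false;

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Encoder {
    H264,
    RawZstd,
}

/// Hypothetical helper: pick the encoder for one agent.
/// `agent_tags` are the server-side tags for the machine;
/// `h264_encoder_available` stands in for "the hardware encoder initialised".
pub fn choose_encoder(agent_tags: &[&str], h264_encoder_available: bool) -> Encoder {
    // Per-agent opt-in via the `h264-test` tag.
    let opted_in = agent_tags.iter().any(|t| *t == "h264-test");
    let prefer_h264 = DEFAULT_PREFER_H264 || opted_in;
    if prefer_h264 && h264_encoder_available {
        Encoder::H264
    } else {
        // Fallback path, and the default for untagged agents.
        Encoder::RawZstd
    }
}
```

With the default left at `false`, only tagged agents whose encoder is available get H.264; every other combination stays on raw+Zstd.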
### Security
- **Authentication model (dashboard/viewer):** JWT (HS256, short-lived). Login via `POST /api/auth/login` (Argon2id password verify). Logout revokes the JWT by adding it to the blacklist and stops the log-tail.
- **Viewer tokens:** session-scoped JWTs (sub = viewer, `session_id` claim, `access: control | view-only` split). Issued server-side; viewer presents token in `?token=` query param on the WS handshake. Wrong-session tokens → 403.
- **Agent authentication:** per-agent `cak_`-prefixed high-entropy key (SHA-256 hashed in `connect_agent_keys`). Relay (`server/src/relay/mod.rs::validate_agent_api_key`) accepts `cak_` keys and falls back with a deprecation warning to the old shared `AGENT_API_KEY` env var. Login-JWT rejected on the agent plane (401).
- **Enrollment security:** two-tier — per-site `cek_` (rotatable gate) vs per-machine `cak_` (durable operating credential). Argon2id KDF for both. Timing-oracle mitigated by a constant-time dummy verify on all reject paths. Cross-site enroll default-refused (409 `ENROLL_SITE_CONFLICT`).
- **Rate limiting:** in-memory per-IP limiter (`server/src/middleware/rate_limit.rs`; replaced `tower_governor`). Applied to login and enrollment endpoints.
- **Trusted-proxy IP extraction:** `CONNECT_TRUSTED_PROXIES` env var (comma-separated; required for correct client-IP audit keying when behind NPM). Fail-closed to loopback if unset.
- **Security headers:** applied via `tower-http`.
- **Code signing:** Windows agent `.exe` signed via Azure Trusted Signing (`jsign`, TRUSTEDSIGNING) in `release.yml`, fail-closed. Reuses GuruRMM's cert profile (ADR-002). `beta` channel input signs and publishes a prerelease tag without semver bump.
- **Security audit (2026-05-31):** `/gc-audit --pass=security` — PASS, 0 CRITICAL/HIGH/MEDIUM/LOW. Three CRITICALs from the May 29 pre-reset audit closed: (1) session-scoped viewer tokens + session-claim match, (2) JWT blacklist on WS logout, (3) agent plane rejects user-JWTs via `cak_` key check.
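
The viewer-token bullet above scopes each token to one session and splits access into `control` and `view-only`. The check can be sketched as below. This is an illustration, not the relay's code: signature and expiry verification are assumed to have happened already, the type and function names are hypothetical, and rejecting input from a view-only token with 403 is an assumption (the page only specifies 403 for wrong-session tokens).

```rust
/// Access level carried in a viewer token's `access` claim.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Access {
    Control,
    ViewOnly,
}

/// Claims from a viewer token that is assumed to be signature- and
/// expiry-verified before this point.
pub struct ViewerClaims {
    pub session_id: String,
    pub access: Access,
}

/// What the viewer is trying to do on the session.
pub enum ViewerAction {
    Watch,
    SendInput,
}

/// Returns Ok(()) when allowed, or an HTTP-style status code when refused.
pub fn authorize(
    claims: &ViewerClaims,
    target_session: &str,
    action: ViewerAction,
) -> Result<(), u16> {
    // Session-claim match: a token minted for another session is refused.
    if claims.session_id.as_str() != target_session {
        return Err(403);
    }
    match (claims.access, action) {
        // Assumption: view-only tokens may watch but not inject input.
        (Access::ViewOnly, ViewerAction::SendInput) => Err(403),
        _ => Ok(()),
    }
}
```

Keeping the session match ahead of the access check means a token for the wrong session never reaches the control/view-only decision.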
### Dashboard
- **Tech:** React/TypeScript SPA (`dashboard/src/`), Vite build, served from `server/static/app/` (SPA deep-link fallback). API client (`dashboard/src/api/`).
- **Machines list:** live status, hostname, OS, tags, company/site columns. Operator purge (soft-delete) UI.
- **Sessions view:** active + historical sessions, viewer count, consent state.
- **Support codes view:** create, list, revoke support codes.
- **Users admin:** create/edit/disable users, role assignment (admin / viewer).
- **Authentication context:** JWT-based, AdminRoute guard for admin-only views.
### Server API Surface
| Endpoint | Auth | Purpose |
|---|---|---|
| `POST /api/auth/login` | None | Issue JWT |
| `DELETE /api/auth/logout` | JWT | Revoke JWT + stop WS log |
| `GET/POST/DELETE /api/users` | JWT (admin) | User management |
| `GET/POST/PUT/DELETE /api/machines` | JWT | Machine list + management |
| `GET/POST/DELETE /api/sessions` | JWT | Session history + management |
| `GET/POST /api/codes` | JWT | Support code CRUD |
| `POST /api/enroll` | None (cek_) | Zero-touch agent enrollment |
| `GET/POST /api/sites` | JWT (admin) | Site management |
| `POST /api/sites/:id/keys` | JWT (admin) | Issue/rotate enrollment keys |
| `GET/POST /api/machine_keys` | JWT (admin) | Manage per-machine cak_ keys |
| `GET /api/releases` | JWT | Agent release metadata |
| `GET /api/changelog` | JWT | Version changelogs |
| `GET /api/downloads` | None | Agent binary download |
| `WS /ws/agent` | cak_ / support-code | Agent relay plane |
| `WS /ws/viewer` | session-scoped JWT | Viewer relay plane |
### Integrations (Planned)
- **GuruRMM native remote control (ADR-001):** versioned contract `/api/integration/v1/` + capability discovery. GuruRMM acts as broker; GuruConnect owns the contract. Spec: `docs/specs/native-remote-control/`. Status: planned (v2 Phase 3).
- **End-user (sub-user) portal (SPEC-017):** deny-by-default `end_user` login role; locked-down portal listing only granted machines. Grant primitive already in DB (`user_client_access`, migration 002). Spec authored by Mike. Status: proposed, post v2-console.
---
## Architecture
### Components
| Component | Location | Tech | State |
|---|---|---|---|
| Relay server | 172.16.3.30 port 3002, systemd `guruconnect.service` | Rust, Axum 0.7, sqlx 0.8 | Live, v0.3.0 |
| PostgreSQL DB | 172.16.3.30 `guruconnect` DB (shared PG cluster with GuruRMM) | Postgres 14 | Live |
| Dashboard SPA | Served from server/static/app/ | React 18, TypeScript, Vite | Live, v0.3.0 |
| Windows agent | Distributed .exe (Pluto CI build) | Rust, winit, Win32 APIs | v0.3.0 signed |
| SAS helper service | Bundled with agent | Rust, Windows service | Shipped |
| NPM reverse proxy | Jupiter 172.16.3.20 | Nginx Proxy Manager | Live |
| CI runners | 172.16.3.30 (ubuntu-latest), Pluto 172.16.3.36 (windows-msvc) | Gitea Actions | Active |
### Key Files and Repos
- **Repo (Gitea):** `azcomputerguru/guru-connect` (standalone, not a pinned submodule)
- **ClaudeTools pointer:** `projects/msp-tools/guru-connect/` (local clone, NOT a pinned submodule — working tree may differ from origin/main)
- **Proto schema:** `proto/guruconnect.proto`
- **Migrations:** `server/migrations/001` through `010`
- **Specs:** `docs/specs/SPEC-001` through `SPEC-018`
- **ADRs:** `docs/ARCHITECTURE_DECISIONS.md`
- **Roadmap:** `docs/FEATURE_ROADMAP.md`
- **Server env:** `server/.env` (gitignored; contains `JWT_SECRET`, `DATABASE_URL`, `CONNECT_TRUSTED_PROXIES`)
- **DB credentials:** vault at `projects/guruconnect/database.sops.yaml`
- **Portal credentials:** vault at `projects/guruconnect/portal.sops.yaml`
- **Service unit installed:** `/etc/systemd/system/guruconnect.service` — NOTE: do NOT overwrite with the repo's `server/guruconnect.service` (repo unit has `WatchdogSec=30s`; installed unit does not — copying it causes restart loops)
- **Public URL:** `https://connect.azcomputerguru.com`
- **Agent WS:** `wss://connect.azcomputerguru.com/ws/agent`
- **Viewer WS:** `wss://connect.azcomputerguru.com/ws/viewer`
- **Issue tracker:** `https://git.azcomputerguru.com/azcomputerguru/guru-connect/issues`
---
## Development
### Current Focus
- **SPEC-018 Phase 2 (critical path):** session broker — `CreateProcessAsUserW` into the active interactive desktop session, ACL'd named pipe IPC between the SYSTEM service and per-session worker, `SERVICE_CONTROL_SESSIONCHANGE` handling for logon/logoff/fast-user-switch. Required before the managed agent can capture and inject on a machine running as SYSTEM. Requires a Windows VM for integration testing (not runnable on dev host alone).
- **H.264 cross-GPU validation:** beast→5070 run pending. Three viewer-side bugs fixed 2026-05-31 (MF back-pressure, output-type negotiation, event-loop freeze). Raw+Zstd remains the shipping default; `DEFAULT_PREFER_H264=false` until sign-off.
- **v2 Phase 2 (dashboard + data layer):** machine inventory (SPEC-003), machines list view parity (SPEC-005), universal search (SPEC-006), installer builder (SPEC-007).
### Architecture Decisions
| ADR | Decision |
|---|---|
| ADR-001 | GuruConnect is standalone; integrates with GuruRMM via versioned `/api/integration/v1/` contract. GuruConnect owns the contract; GuruRMM is a consumer only. |
| ADR-002 | Release engineering via Gitea Actions; reuses GuruRMM's Azure Trusted Signing cert profile (same ACG identity, no new Azure provisioning). |
### Patterns and Anti-Patterns
- **Per-agent `cak_` keys, not a shared secret.** The legacy `AGENT_API_KEY` shared env var is deprecated; relay warns on use. All new enrollments mint per-machine keys.
- **Migrations are idempotent and runtime-applied.** `sqlx::migrate!()` runs on server startup (`db.migrate()`). Never pre-apply via psql. Never edit a committed migration (breaks sqlx checksum, crash-loops the server on startup — confirmed in production 2026-05-30 via `0059b21`).
- **`DEFAULT_PREFER_H264=false`.** Do not flip to `true` without a confirmed cross-GPU validation run.
- **Soft-delete, not hard-delete, for machines and sessions.** `deleted_at IS NULL` is the live filter; purged rows stay for the audit trail. The legacy hard-delete path on `/api/machines/:agent_id` still exists (no `?purge=true`), but new removal UI uses the soft-delete path.
- **Sign once, wrap per-site.** The agent binary is signed once by CI (`release.yml`). Per-site config is delivered via a per-site wrapper/MSI that writes the site config around the signed bytes — never appended into the PE (which voids Authenticode). The legacy `download_agent` append path is a known signing-incompatible debt (SPEC-007).
- **Agent logging requires `--verbose`.** `FmtSubscriber::with_max_level(INFO)` in `agent/src/main.rs` ignores `RUST_LOG`. Debug output only via `--verbose` / `-v` CLI flag. This burned an investigation session (2026-05-31).
- **Use a login shell for SSH builds.** The server's cargo/protoc are only on PATH in a login shell (`bash -lc "..."`). A non-interactive, non-login SSH command silently misses them. Confirmed on the 2026-05-30 deploy.
- **sqlx uses runtime queries.** No `query!` macros, no `.sqlx` cache directory. Server builds without a live DB connection; no `DATABASE_URL` required at compile time.
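
The soft-delete pattern above can be sketched in memory: purging stamps `deleted_at`, and every live view filters on `deleted_at` being unset (the `deleted_at IS NULL` filter). The `MachineRow` struct and the `purge` and `live` helpers are illustrative assumptions, not the server's types.

```rust
use std::time::SystemTime;

/// Hypothetical stand-in for a `connect_machines` row.
pub struct MachineRow {
    pub agent_id: String,
    pub hostname: String,
    pub deleted_at: Option<SystemTime>,
}

/// Soft-deletes one live machine. Returns false when there is no live row
/// with that id (unknown, or already purged).
pub fn purge(rows: &mut [MachineRow], agent_id: &str, now: SystemTime) -> bool {
    match rows
        .iter_mut()
        .find(|r| r.agent_id.as_str() == agent_id && r.deleted_at.is_none())
    {
        Some(row) => {
            row.deleted_at = Some(now);
            true
        }
        None => false,
    }
}

/// The console's view: only rows where `deleted_at` is unset.
pub fn live(rows: &[MachineRow]) -> Vec<&MachineRow> {
    rows.iter().filter(|r| r.deleted_at.is_none()).collect()
}
```

The purged row stays in `rows`, which is what keeps the audit trail intact while the console list shrinks.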
### Build and Deploy
#### Windows agent (CI, Pluto)
```
CI: .gitea/workflows/release.yml
Runner: pluto-guruconnect (windows-msvc, Pluto 172.16.3.36)
Signing: jsign Azure Trusted Signing (GuruRMM cert profile)
Output: guruconnect.exe + guruconnect-sas-service.exe
```
- Beta builds: `channel: beta` workflow_dispatch input → prerelease tag, signed, no semver bump.
- Build locally on GURU-5070 (for dev/test): `cargo build -p guruconnect --release --target x86_64-pc-windows-msvc` (~36-43s).
#### Linux server (on 172.16.3.30 itself)
```bash
# Login shell required (PATH includes cargo/protoc only in login context):
ssh guru@172.16.3.30 'bash -lc "..."'
# Backup DB before deploy:
pg_dump "$DATABASE_URL" | gzip > ~/backups/guruconnect/pre-deploy-$(date +%F-%H%M).sql.gz
# Sync to live origin/main (if history was rewritten, ff-only will refuse — use reset):
git fetch origin && git reset --hard origin/main
# Build dashboard SPA:
cd dashboard && npm ci && npm run build
# Emits to ../server/static/app/ (gitignored)
# Build server binary (scoped to server crate; skips Windows-only agent crate):
cargo build --release -p guruconnect-server --target x86_64-unknown-linux-gnu
# ~3 min. Output: target/x86_64-unknown-linux-gnu/release/guruconnect-server
# Cutover (migrations auto-run on startup via sqlx::migrate!()):
sudo systemctl restart guruconnect
# Watch: journalctl -u guruconnect -f
# Expect: "Migrations complete" then "Server listening"
```
#### Gotchas (all hit 2026-05-30 deploy)
- **Installed service unit has no `WatchdogSec`.** The repo's `server/guruconnect.service` has `WatchdogSec=30s`. The installed `/etc/systemd/system/guruconnect.service` does not. Do NOT run `setup-systemd.sh` or copy the repo unit — it causes restart loops every 30s.
- **`CONNECT_TRUSTED_PROXIES` must include Jupiter (172.16.3.20).** Without it, every agent's client IP logs as 172.16.3.20 instead of the real public IP. Set `CONNECT_TRUSTED_PROXIES=127.0.0.1,::1,172.16.3.20` in `server/.env`. (Jupiter is the NPM host, not the relay host.)
- **`connect_machines.tags` NULL rows** break the startup machine reconcile and the machines list. Migration 007 added a NOT NULL constraint + backfill; if a row somehow regains NULL (e.g. old agent insert path), hot-patch: `UPDATE connect_machines SET tags='{}' WHERE tags IS NULL`.
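
The `CONNECT_TRUSTED_PROXIES` gotcha above follows from how a trusted-proxy list is usually applied: a forwarded address is believed only when the direct peer is itself trusted. A minimal sketch, under stated assumptions: the function names are hypothetical, the forwarded address is assumed to arrive in an `X-Forwarded-For`-style comma-separated header, and only plain IPs (no CIDR ranges) are handled.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Parses a comma-separated proxy list. Unset or empty input fails closed
/// to loopback only, matching the behaviour described on this page.
pub fn parse_trusted_proxies(value: Option<&str>) -> Vec<IpAddr> {
    let parsed: Vec<IpAddr> = value
        .unwrap_or("")
        .split(',')
        .filter_map(|s| s.trim().parse::<IpAddr>().ok())
        .collect();
    if parsed.is_empty() {
        vec![
            IpAddr::V4(Ipv4Addr::LOCALHOST),
            IpAddr::V6(Ipv6Addr::LOCALHOST),
        ]
    } else {
        parsed
    }
}

/// Resolves the client IP used for audit and rate-limit keying.
/// `forwarded_for` is the raw header value (assumed header; see lead-in).
pub fn client_ip(peer: IpAddr, forwarded_for: Option<&str>, trusted: &[IpAddr]) -> IpAddr {
    // An untrusted peer cannot vouch for a forwarded address.
    if !trusted.contains(&peer) {
        return peer;
    }
    let header = match forwarded_for {
        Some(h) => h,
        None => return peer,
    };
    // Walk hops right to left, skipping proxies we trust.
    for hop in header.rsplit(',') {
        match hop.trim().parse::<IpAddr>() {
            Ok(ip) if trusted.contains(&ip) => continue,
            Ok(ip) => return ip,
            Err(_) => break, // malformed hop: stop trusting the header
        }
    }
    peer
}
```

With the list unset, a request relayed by Jupiter resolves to 172.16.3.20 because the peer is not trusted; with `127.0.0.1,::1,172.16.3.20` the forwarded public address is used instead.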
#### Rollback
```bash
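# Return the working tree to the commit recorded before the deploy:
git reset --hard <saved-sha>
# Rebuild the server binary and restart the service (same build and cutover steps as the deploy).
# If a migration is incompatible, restore the pre-deploy dump. It is gzip-compressed, so decompress first:
#   gunzip -c <backup>.sql.gz | psql "$DATABASE_URL"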
```
---
## Active State
v0.3.0 deployed and stable. SPEC-018 Phase 2 (session broker) is the next major milestone. GuruConnect is in active development; the feature roadmap is at `docs/FEATURE_ROADMAP.md` in the repo.
Key open items:
- SPEC-018 Phase 2: session broker + per-session capture worker (blocks true unattended access as SYSTEM; blocks SPEC-013 session selection)
- H.264 cross-GPU validation (beast→5070); raw+Zstd is current shipping default
- ~18s StartStream latency in agent message pump (`agent/src/transport/websocket.rs:try_recv` 1ms-timeout spin)
- Branch protection on `guru-connect` `main` not yet enabled (auto-sync bypassed it once in June 2026)
- GC agent not yet deployed to Howard's test machines (parked until SPEC-018 Phase 2)
---
## History Highlights
| Date | Event |
|---|---|
| 2025-12-29 | Machine deletion API added; AdminCommand protobuf message; 12 offline ghost machines purged from DB. |
| 2026-01-17 | Phase 1 security/infrastructure work begins; project in early MVP state. |
| 2026-05-29 | Security audit finds 3 CRITICALs in relay auth; v2 modernization direction set (SPEC-002). Operational tooling parity sprint begins: Gitea Actions CI, Azure Trusted Signing, Pluto windows-msvc runner, `/gc-feature-request` skill, coord project registration. |
| 2026-05-30 | v2 Phase 1 "secure session core" Task 17 committed and deployed. 3 audit CRITICALs closed. v2 dashboard (React SPA) replaces v1 portal. Specs SPEC-010/011/012/013/014/015 authored. First production deployment under the new architecture at `connect.azcomputerguru.com`. |
| 2026-05-31 | v2 Phase 1 formally exited. `/gc-audit --pass=security` PASS, 0 findings. v0.3.0 released. SPEC-004 (machine_uid, session reaping, operator removal) fully implemented and deployed; 11 live ghost rows purged. H.264 viewer bugs (MF back-pressure, output-type negotiation, event-loop freeze) diagnosed and fixed on GURU-5070. |
| 2026-06-01 | v0.3.0 tag published; changelogs finalized. |
| 2026-06-02 | SPEC-016 Phase A (enrollment backend, `POST /api/enroll`, per-site key issue/rotate, migration 010) built, reviewed, merged via PR #5. SPEC-016 Phase B (hardware-salted `machine_uid`, first-run enroll client, DPAPI+ACL `cak_` store, run-mode wiring) built, reviewed, merged. SPEC-017 (end-user access, authored by Mike) added. SPEC-018 Phase 1 (LocalSystem service scaffolding, SCM dispatch) built, reviewed, merged via PR #7. Beta release channel added to `release.yml`. |
---
## Backlinks
- [[gururmm]] — sibling product; planned integration via `/api/integration/v1/` contract (ADR-001 / GuruRMM ADR-008)
- [[msp-tools]] — portfolio context; both products share the 172.16.3.30 host and PG cluster