Commit Graph

149 Commits

Author SHA1 Message Date
7602b4346a feat(agent): SPEC-018 Phase 1 managed-agent SYSTEM service host
Run the managed/persistent GuruConnect agent as a LocalSystem Windows
service so it is reachable at the login screen and across reboots, and
so the SPEC-016 per-machine cak_ store (ACL-restricted to SYSTEM +
Administrators) is finally readable in-context.

Phase 1 scope (host + lifecycle only):
- New agent/src/service/mod.rs: registers "GuruConnectAgent" with the
  SCM via the windows-service dispatcher, reports a correct lifecycle
  (StartPending -> Running -> StopPending -> Stopped), handles
  Stop/Shutdown via an AtomicBool the agent loop polls (graceful WS
  close), and provides install/uninstall/start (LocalSystem, AutoStart,
  sc-failure crash recovery). Idempotent install/uninstall.
- main.rs: hidden `service-run` subcommand routes the SCM-launched
  process into the dispatcher; new run_managed_agent_service() runs the
  existing RunMode::PermanentAgent logic (resolve/enroll cak_, hold the
  relay) as SYSTEM. run_agent() now takes an optional SCM shutdown flag,
  skips the HKCU Run autostart and the tray when run as the service, and
  interrupts the reconnect backoff promptly on stop. An interactive
  launch of a managed binary now installs+starts the service and exits
  instead of double-running.
- install.rs: a managed install (embedded config present) installs the
  LocalSystem service as the single autostart and removes the legacy
  HKCU Run entry; uninstall stops+deletes the service (idempotent).
  Attended/viewer installs are untouched.
- Kept the SPEC-016 Phase B fail-fast guard as a harmless safety net for
  any non-SYSTEM invocation; updated its comment to name this service as
  the managed run context.

Phase 2 NOT built (seams documented): session broker, per-session
capture/input worker, CreateProcessAsUserW token handoff, service/worker
IPC, and SERVICE_CONTROL_SESSIONCHANGE. Phase 1 enrolls/connects as
SYSTEM but does not capture a desktop (a Session-0 process cannot).

No service is installed/started on the dev host; that is a VM/admin
integration step. fmt + clippy -D warnings + release build + 55 tests
all pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 13:43:01 -07:00
55b9c97b28 fix(agent): point Phase B fail-fast guard at SPEC-018
Some checks failed
Build and Test / Build Agent (Windows) (push) Failing after 11m16s
Build and Test / Build Server (Linux) (push) Successful in 12m22s
Build and Test / Security Audit (push) Successful in 8m19s
Build and Test / Build Summary (push) Has been skipped
The SPEC-016 Phase B credential-store guard referenced "SPEC-017" for the
forthcoming SYSTEM service host, but 017 is now Mike's end-user-access
spec; the service host is SPEC-018. Comment + error-string text only, no
logic change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 13:13:13 -07:00
94c07c2431 spec: add SPEC-018 managed-agent SYSTEM service host + session broker
LocalSystem service that runs the persistent agent unattended and brokers
per-session capture/input workers (Session 0 can't capture directly).
Unblocks SPEC-016 Phase B end-to-end (SYSTEM-ACL'd cak_ store readable;
removes the Phase B fail-fast guard) and is the broker primitive SPEC-013
builds on. 017 was taken by Mike's end-user-access spec, so this is 018.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 13:13:04 -07:00
4c49b73a71 spec: add SPEC-017 end-user (sub-user) remote access
Some checks failed
Build and Test / Build Agent (Windows) (pull_request) Successful in 10m54s
Build and Test / Build Server (Linux) (push) Has been cancelled
Build and Test / Build Agent (Windows) (push) Has started running
Build and Test / Security Audit (push) Has been cancelled
Build and Test / Build Summary (push) Has been cancelled
Build and Test / Build Server (Linux) (pull_request) Successful in 15m39s
Build and Test / Security Audit (pull_request) Successful in 5m54s
Build and Test / Build Summary (pull_request) Successful in 36s
2026-06-02 12:56:15 -07:00
367906bd54 fix(agent): SPEC-016 Phase B review fixes (re-image-stable machine_uid, ACL TOCTOU, load_cak error classes, PS timeout, fail-fast guard)
H1: derive machine_uid from the durable hardware salt ALONE (SMBIOS UUID, or
board+disk serial) plus a fixed namespace, so it survives an OS re-image (which
regenerates MachineGuid). MachineGuid is demoted to a last-resort signal used
only when no hardware salt is readable (volatile, reboot-only floor). Re-image
stability proven by salted_uid_is_reimage_stable_independent_of_machine_guid.

H2: in store_cak, lock the directory ACL BEFORE any secret bytes are written;
the temp file is created inside the already-locked dir, then renamed. No
ciphertext ever exists at an inherited/world-readable path. Ordering made an
explicit precondition, not an unstated inheritance assumption.

M1: load_cak now returns a LoadCakError enum distinguishing Io (incl.
PermissionDenied — operational) from Decrypt (the real tamper/wrong-machine
signal). Only a successful READ whose DPAPI decrypt fails hard-stops.

M2: the PowerShell SMBIOS/board/disk shell-out is spawned and waited on with a
10s wall-clock bound; on timeout the child is killed and the signal is treated
as missing (falls back through the chain), never panics. Keeps
CREATE_NO_WINDOW -NonInteractive -NoProfile.

L1: warn! breadcrumb when the salted derivation degrades to MachineGuid-only,
so the server-side collision-gate operator has a clue. No secret values logged.

C1: keep the SYSTEM+Administrators ACL (Option A target). store_cak now does a
read-back verification immediately after writing and fails at ENROLL time if
this context cannot read its own store; resolve_agent_credential fails fast with
an actionable SPEC-017 message on an access-denied store instead of silently
re-enrolling/bricking. Guarded comment notes this is satisfied once the SYSTEM
service host lands.

Deferred items (clear_cak placeholder, legacy api_key path) left as-is.

Verification on x86_64-pc-windows-msvc: cargo fmt --check clean, clippy
-D warnings clean, release build OK, 52 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 12:54:18 -07:00
52477e4c4a feat(agent): first-run enrollment client + run-mode wiring (SPEC-016 Phase B items 3,5)
New enroll module: on a managed agent with no stored cak_ but with
enrollment_key + site_code, POST machine_uid + hostname + labels to
<https-base>/api/enroll and persist the minted cak_. Handles every Phase A
status code distinctly:
  - 201 new / 200 reuse -> persist cak_ (DPAPI store) and connect
  - 202 collision_pending -> log "pending operator confirmation", slow
    re-check loop (no key issued; cannot connect until confirmed)
  - 401 ENROLL_REJECTED / 409 ENROLL_SITE_CONFLICT -> distinct actionable
    errors, long backoff (won't fix without operator action, but recovers
    automatically once it does) — no tight loop
  - 429 -> honor Retry-After, short backoff
  - network / 5xx / decode -> short backoff
The enrollment_key and cak_ are never logged. Uses the existing reqwest
client and the update path's TLS posture (rustls; dev-insecure only in
debug + opt-in). Wire-contract unit tests pin the request shape against
the server's EnrollRequest/EnrollLabels and decode active + pending bodies.

main.rs run-mode wiring: before a managed agent connects, resolve the
operating credential by precedence — stored cak_ (steady state, no
network) -> first-run enrollment -> DEPRECATED legacy api_key (transition
only, logged at WARNING) -> error. The relay already accepts the cak_ as
the api_key query param, so the persistent transport authenticates with it
unchanged. Attended/support-code and viewer paths are untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 11:44:40 -07:00
87c6e17d4a feat(agent): cak_ at-rest credential store (SPEC-016 Phase B item 4)
Store the per-machine cak_ with BOTH layers Mike locked: DPAPI-machine
encryption (CryptProtectData with CRYPTPROTECT_LOCAL_MACHINE — a copied
blob is inert off the box) inside a SYSTEM/Administrators-only ACL'd file
at %ProgramData%\GuruConnect\credentials\agent.cak. The directory + file
ACL is hardened via icacls (/inheritance:r + grant to the well-known SIDs
*S-1-5-18 and *S-1-5-32-544, locale-independent) — auditable, with far
less unsafe FFI than building a registry-key security descriptor by hand.
Co-locates with the existing %ProgramData%\GuruConnect config/seed dir.

Provides store_cak / load_cak / clear_cak. store_cak writes atomically
(temp file + rename in the locked dir). load_cak treats a present-but-
undecryptable blob as a hard error (tamper / cross-machine copy) rather
than silently re-enrolling over it. The plaintext is never logged; the
transient plaintext copy is scrubbed after encryption. DPAPI output blobs
are LocalFree'd. Enables the Win32_Security_Cryptography windows feature.

Round-trip unit tests cover encrypt/decrypt recovery across lengths and
that a tampered blob fails to decrypt (DPAPI authenticates its blobs).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 11:44:23 -07:00
6a000d012f feat(agent): extend config contract for enrollment (SPEC-016 Phase B item 2)
Add enrollment_key + site_code to EmbeddedConfig and the resolved Config
alongside the existing labels, and add department/device_type label fields
(SPEC-007 AgentStatus parity). The legacy api_key is retained but made
optional/defaulted so a SPEC-016 site installer can carry only the
enrollment credentials; existing pre-enrollment installers still parse.

The enrollment fields are #[serde(skip)] on Config so they are never
written to the on-disk TOML (install-time material only); apply_enrollment_env
layers them from GURUCONNECT_ENROLLMENT_KEY / GURUCONNECT_SITE_CODE on the
file and env load paths. The embedded path carries them from the install
blob. Config delivery itself (signed wrapper) is Phase C and unchanged here.

Add Config::https_base() deriving the REST API base (https://host[:port])
from the wss:// server_url so the enroll client and the persistent
transport share one authority.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 11:44:09 -07:00
d0b8db070f feat(agent): hardware-salt machine_uid (SPEC-016 Phase B item 1)
Extend the SPEC-004 machine_uid derivation with the locked SPEC-016
hardware salt: combine the Windows MachineGuid with the SMBIOS system
UUID (Win32_ComputerSystemProduct.UUID), falling back to motherboard
serial (Win32_BaseBoard.SerialNumber) + primary disk serial when the
SMBIOS UUID is absent or a degenerate placeholder (all-zeros / all-FFs,
emitted by some OEMs and hypervisor templates).

Signals are read via narrow PowerShell CIM queries (hidden window, no
profile) rather than adding a WMI crate or hand-rolling COM IWbemServices
for two scalar reads. Values are normalized (trim + upper-case) so vendor
case/space drift never perturbs the digest. The combined string is
SHA-256'd into the existing opaque muid_<hex> shape, preserving the wire
identity the relay connect path already reports while making it survive an
OS re-image on the same hardware. Which signal set fed the result is
logged (source label only, never the secret values).

Adds unit tests for derivation determinism + signal-sensitivity,
degenerate-SMBIOS rejection, and signal normalization.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 11:43:56 -07:00
89c3718266 Merge pull request 'SPEC-016 Phase A: zero-touch enrollment backend + migration' (#5) from feat/spec-016-enrollment into main
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 10m37s
Build and Test / Build Server (Linux) (push) Successful in 15m25s
Build and Test / Security Audit (push) Successful in 5m28s
Build and Test / Build Summary (push) Successful in 23s
2026-06-02 11:19:37 -07:00
4106fc4bc4 style(enroll): cargo fmt --all (satisfy CI fmt gate)
All checks were successful
Build and Test / Build Agent (Windows) (pull_request) Successful in 16m35s
Build and Test / Build Server (Linux) (pull_request) Successful in 19m7s
Build and Test / Security Audit (pull_request) Successful in 5m27s
Build and Test / Build Summary (pull_request) Successful in 26s
The Phase A work passed cargo check + clippy + tests locally but missed
`cargo fmt --all -- --check` (the first step of the Linux CI job): module
ordering in db/mod.rs and two trailing-comment alignments in rate_limit.rs.
No logic change. Agent build failure on the prior run was transient infra
(verified: agent crate compiles clean locally).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 10:48:51 -07:00
0f02f23765 fix(enroll): SPEC-016 Phase A review fixes (cross-site guard, timing oracle, TOCTOU)
Some checks failed
Build and Test / Build Agent (Windows) (pull_request) Failing after 10m11s
Build and Test / Build Server (Linux) (pull_request) Failing after 10m5s
Build and Test / Security Audit (pull_request) Successful in 8m5s
Build and Test / Build Summary (pull_request) Has been skipped
Applies the four review fixes to POST /api/enroll, all in server/src/api/enroll.rs
(+ a new ENROLL_SITE_CONFLICT event type in server/src/db/events.rs):

1. HIGH — close the within-tenant cross-site silent-move hijack. A valid key for
   site B presented for a machine_uid already bound to a DIFFERENT site is now
   REFUSED (409 ENROLL_SITE_CONFLICT) instead of silently repointing the row and
   minting a fresh cak_. No move, no key. Emits an ENROLL_SITE_CONFLICT audit event
   + alert TODO. Same-site match still resolves to reuse; a NULL prior site_id is a
   first relational bind, not a move. The unauthenticated site_move mint path is
   removed; deliberate moves are deferred to the Phase-B --reassign flow + dashboard.

2. MEDIUM — kill the timing/enumeration oracle. Unknown site_code and no-active-key
   early rejects now pay a dummy Argon2id verify against a fixed, valid throwaway PHC
   constant (TIMING_EQUALIZER_PHC) before returning the identical 401, so every
   rejection path pays one KDF. The constant is asserted valid + verifying in tests.

3. LOW — fix the new-enroll TOCTOU. The dedup lookup + INSERT is wrapped in a bounded
   retry loop: a concurrent first-enroll of the same machine_uid whose INSERT loses
   the unique-index race (classified by is_machine_uid_conflict on SQLSTATE 23505 +
   machine_uid constraint) now re-looks-up and converges to reuse instead of 500ing.
   A non-machine_uid unique violation still surfaces as 500.

4. LOW — make the collision-gate doc honest + leave an enforcement TODO. The module
   doc now states the gate withholds only a NEWLY minted cak_ (a prior clean cak_
   survives) and that nothing consults enrollment_state at control time yet, with a
   TODO(SPEC-016 Phase B/D) marker for relay/control-plane enforcement + revocation.

Verify: cargo check, cargo clippy --all-targets, and cargo test all clean on this
Windows host (104 tests pass). Two DB-gated tests (cross-site bound-site_id exposure,
machine_uid-vs-agent_id conflict classification) no-op without TEST_DATABASE_URL and
run against real Postgres in CI; the Linux target / real-Postgres handler path is
validated there, not on this host.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 10:28:31 -07:00
59e40c8019 feat(enroll): SPEC-016 Phase A — enrollment backend + migration
Server-side zero-touch per-site enrollment (Phase A: backend + DB only;
agent-side machine_uid derivation is Phase B, server treats it as opaque).

Migration 010_spec016_enrollment.sql:
- connect_sites: relational site anchor (site_code natural key, per-tenant
  unique). The spec assumed a sites table existed; it did not (site/company
  were free-text columns on connect_machines), so this creates a minimal one.
- site_enrollment_keys: rotatable, Argon2id-hashed cek_ secret + monotonic
  version + hex fingerprint + active flag; one-active-per-site partial unique.
- connect_machines: + site_id (FK), + enrollment_state ('active'|'pending')
  collision gate, + per-tenant (tenant_id, machine_uid) unique index added
  ALONGSIDE the 008 global index (the connect-path upsert_machine ON CONFLICT
  arbiter binds to 008 — dropping it would break live reconnect).
- connect_sites.enrollment_policy: reserved (default auto-approve), not enforced.

auth/enrollment_keys.rs: cek_ mint (256-bit, OS CSPRNG), Argon2id hash/verify
(reuses auth::password), and hex fingerprint vN (XXXX) per resolved-decision #3.

db/sites.rs + db/enrollment_keys.rs: runtime sqlx persistence; rotate_key
deactivates+inserts in one tx to hold the one-active-key invariant.

POST /api/enroll (public, api/enroll.rs): site_code+cek_ verify against active
key -> dedup on (tenant, machine_uid) -> new / reuse / site-move / collision.
Collision gate (PROVISIONAL heuristic: online existing row + different hostname)
-> pending, no usable cak_, alert. Mints cak_ via existing agent_keys path in the
exact form relay::validate_agent_api_key expects. Per-(site_code,IP) rate-limit +
lockout (EnrollLimiter). Audit events + [ENROLL] alert markers with
TODO(SPEC-016) #dev-alerts notes.

Admin (JWT) api/sites.rs: POST /api/sites/:id/enrollment-key/rotate (plaintext +
fingerprint once) and GET .../enrollment-key (fingerprint/version, no secret).

Routes wired in main.rs (enroll public, rotation admin). 13 new unit tests;
full server suite 99 passing. cargo check + clippy clean on the host (Windows)
target — Linux cross-target not installed here; server crate is platform-neutral
Rust. No sqlx offline cache needed (codebase uses runtime queries, no query!).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 10:12:35 -07:00
c286a29b9d spec: SPEC-016 resolve all 5 open questions (enrollment design decisions)
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 14m25s
Build and Test / Build Server (Linux) (push) Successful in 20m31s
Build and Test / Security Audit (push) Successful in 8m28s
Build and Test / Build Summary (push) Successful in 30s
Fold the 2026-06-02 interview decisions into SPEC-016:
- Installer wrapper: ship BOTH signed .exe and signed MSI per site
- cak_ at-rest storage: DPAPI-machine-encrypted blob in a SYSTEM-ACL'd location
- Fingerprint: hex (7F2A), deliberately unlike RMM word-codes
- machine_uid: per-tenant scope + hardware-derived salt (survives re-image,
  separates distinct boxes) + collision-gated activation (template-cloned VMs
  sharing a hardware UUID drop to pending + alert, need dashboard confirm)
- Attended support-code path: unchanged (filename-based, already signing-safe)

Open Questions section -> Resolved decisions + a short Remaining-for-planning
list (exact hardware salt signal set, WiX/MSI authoring approach).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 09:54:19 -07:00
18429f6fe3 spec: add SPEC-016 zero-touch per-site agent enrollment
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 10m46s
Build and Test / Build Server (Linux) (push) Successful in 15m33s
Build and Test / Security Audit (push) Successful in 6m3s
Build and Test / Build Summary (push) Successful in 25s
ScreenConnect-class managed enrollment: one signed installer per site,
machines self-register on first run and the server mints a per-machine
cak_ key bound to a deterministic machine_uid (dedups re-installs).
Per-site rotatable enrollment key (long secret + vN (XXXX) fingerprint);
rotating blocks new enrollments from old installers, leaves enrolled
agents untouched. Auto-approve + new-enrollment/site-move alert.

Resolves SPEC-007's signature-vs-appended-config open question:
sign the base agent once in CI + per-site signed wrapper that writes
site config around the signed bytes (never appended into the PE).

Deferred (room reserved): enrollment policy + per-seat licensing,
--enroll-key/--site-code/--reassign flag overrides, technician-assisted
interactive install. Tracking todo dbfe6a56.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 09:13:59 -07:00
3b9e4068c9 docs(roadmap): mark release signing shipped; add signed beta channel as P1-NOW
All checks were successful
Build and Test / Build Server (Linux) (push) Successful in 14m11s
Build and Test / Build Agent (Windows) (push) Successful in 8m3s
Build and Test / Security Audit (push) Successful in 5m38s
Build and Test / Build Summary (push) Successful in 17s
Release-path Azure Trusted Signing and auto-versioning were already
shipped with v0.3.0 (stale [ ] -> [x]). Add a new P1/NOW item for a
signed beta/test release channel: the auto build-and-test.yml agent
artifact is unsigned, so testers can receive unsigned binaries. The
beta channel (now implemented in release.yml) closes that gap.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 07:57:04 -07:00
87f229509b ci(release): add signed beta/test release channel
Some checks failed
Build and Test / Build Server (Linux) (push) Has started running
Build and Test / Build Agent (Windows) (push) Has started running
Build and Test / Security Audit (push) Has been cancelled
Build and Test / Build Summary (push) Has been cancelled
Add a `channel: stable | beta` workflow_dispatch input to release.yml.
`stable` is unchanged (byte-for-byte). `beta` produces a Windows agent
binary signed by the identical fail-closed Azure Trusted Signing path,
but skips the semver bump, changelog, and release commit, and publishes
a prerelease-tagged Gitea release (vX.Y.Z-beta.<run_number>) at HEAD.

So every binary handed to a tester is signed, not just formal releases.

- prerelease tags excluded from stable LAST_TAG detection (both lookups)
  so a beta tag can't corrupt the next stable version computation
- beta tag force-created/pushed -> idempotent on failed-run re-runs
- changelog download gated to stable; release prerelease flag plumbed
  through to the Gitea REST payload

Reviewed-by: Code Review Agent (APPROVE WITH NITS; N1 hardened)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 07:56:17 -07:00
40c7d860cc spec(v2-session-core): add Task 9 — cak_ auto-enroll provisioning (TOFU) + shared-key retirement
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 7m10s
Build and Test / Build Server (Linux) (push) Successful in 10m31s
Build and Test / Security Audit (push) Successful in 4m1s
Build and Test / Build Summary (push) Successful in 9s
2026-06-01 14:40:14 -07:00
0059b21db6 fix(server): revert migration 008 comment edit — modifying an applied sqlx migration breaks its checksum and crash-loops the server on startup; machines.rs ON CONFLICT fix retained
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 7m33s
Build and Test / Build Server (Linux) (push) Successful in 11m57s
Build and Test / Security Audit (push) Successful in 4m33s
Build and Test / Build Summary (push) Successful in 11s
2026-06-01 10:05:38 -07:00
f950511e3e fix(server): bind machine_uid upsert ON CONFLICT to the partial index (WHERE machine_uid IS NOT NULL)
Some checks failed
Build and Test / Build Agent (Windows) (push) Successful in 8m16s
Build and Test / Build Server (Linux) (push) Successful in 11m58s
Build and Test / Security Audit (push) Has started running
Build and Test / Build Summary (push) Has been cancelled
Bare ON CONFLICT (machine_uid) could not bind to migration 008's partial unique index, so no connect_machines row was persisted for any agent reporting a machine_uid. Confirmed live on 172.16.3.30 with a signed 0.3.0 test agent.
2026-06-01 09:50:34 -07:00
16017456aa docs: 2026-05-31 security re-audit (Phase-1 EXIT) + roadmap reconcile
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 6m59s
Build and Test / Build Server (Linux) (push) Successful in 10m35s
Build and Test / Security Audit (push) Successful in 4m3s
Build and Test / Build Summary (push) Successful in 7s
/gc-audit --pass=security re-pass over the deployed v0.3.0 code: PASS,
0 CRITICAL/HIGH/MEDIUM/LOW. The 3 relay CRITICALs stay closed (verified in
code AND live against the deployed binary), the prior agent-update-TLS HIGH
and chat-logging LOW are fixed, and the net-new SPEC-004 surface (machine_uid
dedup gate, session reaper/supersede, operator removal API) audits clean —
no non-admin removal path, no uid-spoof hijack, no auth-plane crossover.

Marks v2 Phase 1 formally exited (secure-session-core Task 8 complete).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 18:19:09 -07:00
guruconnect-ci
e967cce1a1 chore: release v0.3.0 [skip ci] v0.3.0 2026-06-01 00:10:58 +00:00
16586c4a1b chore: reconcile manifest versions to v0.2.2 baseline
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 7m14s
Build and Test / Build Server (Linux) (push) Successful in 11m25s
Build and Test / Security Audit (push) Successful in 7m13s
Build and Test / Build Summary (push) Successful in 1m2s
agent + server Cargo.toml hardcoded 0.2.0 (below the workspace.package
0.2.2 and the last release tag v0.2.2); dashboard was on a divergent
2.0.0 scheme. Align all component manifests + the dashboard lockfile to
the v0.2.2 baseline so the next release bumps them coherently to 0.3.0
rather than decreasing the dashboard. No code change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 16:50:59 -07:00
96f9c0ab45 feat(dashboard): operator removal UI for stale machines/sessions (SPEC-004 Task 5)
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 7m13s
Build and Test / Build Server (Linux) (push) Successful in 11m21s
Build and Test / Security Audit (push) Successful in 4m12s
Build and Test / Build Summary (push) Successful in 11s
Admin-only per-row Remove + multi-select bulk removal on the machines view, plus
per-row purge Remove on the sessions view, wired to the Task-5 admin API
(DELETE /api/machines|sessions/:id?purge=true, POST /api/machines/bulk-remove).
Confirm modals (danger-styled, focus-trapped), TanStack refetch so purged rows
leave the console, structured ApiError surfacing, honest partial-bulk summary,
and admin-gating via useAuth().isAdmin as defense-in-depth over the server 403.
Replaces the legacy all-user delete trigger. typecheck/lint/build clean.

Implements specs/v2-stable-identity/plan.md Task 5 (dashboard portion).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 14:14:49 -07:00
5ee6675337 feat(server): operator removal of stale sessions/machines (SPEC-004 Task 5, server)
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 7m29s
Build and Test / Build Server (Linux) (push) Successful in 10m58s
Build and Test / Security Audit (push) Successful in 4m4s
Build and Test / Build Summary (push) Successful in 8s
Admin-gated soft-delete + purge so operators can clear ghost machines/sessions
(the ~15-rows-for-one-host accumulation) from the console.

- migration 009: deleted_at on connect_sessions + connect_machines, with partial
  indexes WHERE deleted_at IS NULL.
- DELETE /api/machines/:agent_id?purge=true and DELETE /api/sessions/:id?purge=true
  soft-delete the row and purge the in-memory session (remove_session); the
  non-purge path keeps the legacy hard-delete / live-only disconnect. POST
  /api/machines/bulk-remove handles multi-select (batch cap 500). All admin-gated
  (AdminUser -> 403; tightens the prior any-user delete) and audited to
  connect_session_events (actor + target + trusted client IP).
- list/get queries filter deleted_at IS NULL so removed units leave the console;
  upsert revives (deleted_at = NULL) a genuinely-reconnecting machine. The
  keyed-reattach identity resolver (get_machine_by_id) is intentionally unfiltered.

Dashboard removal UI is the A3b follow-up. 86 server tests pass; fmt/clippy/test
clean. Implements specs/v2-stable-identity/plan.md Task 5 (server portion).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 13:52:36 -07:00
cef1928379 style(server): cargo fmt for SPEC-004 Task 2 + Task 4
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 6m40s
Build and Test / Build Server (Linux) (push) Successful in 10m18s
Build and Test / Security Audit (push) Successful in 4m12s
Build and Test / Build Summary (push) Successful in 12s
Pure rustfmt reflow of the Task 2 (machine_uid dedup) and Task 4 (session
reaping) code; no logic change. The CI Build-Server-Linux job gates on
cargo fmt --check, which the two feature commits failed because local
validation ran check/clippy/test but not fmt --check. fmt --check, check,
and clippy -D warnings all clean now.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:27:01 -07:00
4e80573cbd feat(server): reap stale persistent sessions + same-machine supersede (SPEC-004 Task 4)
Some checks failed
Build and Test / Build Server (Linux) (push) Failing after 3m32s
Build and Test / Build Agent (Windows) (push) Has started running
Build and Test / Security Audit (push) Has started running
Build and Test / Build Summary (push) Has been cancelled
A periodic reaper removes persistent, offline, viewerless sessions whose last
heartbeat is older than a 10-minute TTL (60s sweep spawned at startup), and a
same-machine supersede on the new-session path drops a stranded prior session
when a legacy no-uid agent upgrades to a fresh agent_id + machine_uid. Both
removals re-assert the predicate under the write lock (remove_session_if) to
close a snapshot->remove TOCTOU.

Security: keyed (cak_) agents pass machine_uid=None, so they never trigger
supersede and are never reaped as a uid victim; online, viewer-attached, and
support sessions are never reaped. 82 server tests pass; clippy clean.

Implements specs/v2-stable-identity/plan.md Task 4.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:21:15 -07:00
ffca7f0cee feat(server): dedup machines on machine_uid (SPEC-004 Task 2)
Some checks failed
Build and Test / Build Server (Linux) (push) Failing after 3m48s
Build and Test / Build Agent (Windows) (push) Successful in 7m34s
Build and Test / Security Audit (push) Successful in 4m44s
Build and Test / Build Summary (push) Has been skipped
Persist the agent-reported machine_uid and dedup connect_machines on it so a
single physical machine can't register duplicate rows when its config-file
agent_id regenerates (the ghost-session root cause).

- migration 008: nullable connect_machines.machine_uid + partial unique index
  (WHERE machine_uid IS NOT NULL); idempotent, startup-applied.
- upsert_machine: two-path dedup (ON CONFLICT machine_uid when present, else
  the legacy ON CONFLICT agent_id path, unchanged).
- session reattach: a machine_uid index consulted before agent_id, with all
  removal paths purging it.
- security: keyed (cak_) agents stay authoritative — their claimed machine_uid
  is dropped (effective_machine_uid=None); uid is dedup-only for un-keyed /
  support-code agents. Startup restore skips uid-indexing keyed machines and
  fails closed if the keyed-set query errors.

74 server tests pass; clippy clean. Implements specs/v2-stable-identity/plan.md Task 2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:06:50 -07:00
97780304e7 fix(agent): make native H.264 viewer render live frames
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 7m2s
Build and Test / Build Server (Linux) (push) Successful in 10m24s
Build and Test / Security Audit (push) Successful in 4m15s
Build and Test / Build Summary (push) Successful in 9s
The native viewer's H.264 path (Task 7 first-cut, compile-verified only)
never rendered a frame. Three stacked bugs, all confirmed via live loopback:

1. decoder: MF_E_NOTACCEPTING (0xC00D36B5) was treated as fatal and only
   one output was drained per call, so once the MFT filled it rejected
   every subsequent frame. decode() now returns Vec<DecodedFrame>, drains
   on back-pressure and retries the unconsumed sample, then drains all
   ready outputs.
2. decoder: the NV12 output type was hand-built and rejected by the MS
   H.264 decoder MFT (MF_E_TRANSFORM_TYPE_NOT_SET, 0xC00D6D60). It is now
   negotiated by enumerating GetOutputAvailableType on STREAM_CHANGE /
   TYPE_NOT_SET.
3. render: a manual pump_messages() in about_to_wait stole winit's own
   thread messages and froze the event loop after one iteration, so frames
   were never drained from the channel. Removed; winit's run_app pump
   already services the WH_KEYBOARD_LL hook.

Validated on a 5070 loopback: 0 decode errors, frames decode/paint/present
(present count 0 -> 1740). Reviewed (APPROVE-WITH-NITS); diagnostics stripped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 11:25:05 -07:00
afbf0d81b8 spec: add SPEC-015 Configurable Notification Overlay
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 8m0s
Build and Test / Build Server (Linux) (push) Successful in 11m26s
Build and Test / Security Audit (push) Successful in 4m37s
Build and Test / Build Summary (push) Successful in 12s
Comprehensive specification for on-screen notification when technician connects.

- Semi-transparent topmost window with configurable message, position, duration
- Dashboard admin settings page (enable/disable, message template, position, duration)
- Template variables: {{technician_name}}, {{company}}, {{time}}
- Agent displays overlay on StartStream, auto-hides after duration or manual dismiss
- Database: notification_config singleton table
- Protobuf: NotificationConfig message in StartStream
- Priority: P2, Effort: Medium (3-4 weeks)
- Added to roadmap under Core Remote Control

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-31 08:40:53 -07:00
b45c683a51 spec: add SPEC-014 Branding and White-Label Configuration
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 8m16s
Build and Test / Build Server (Linux) (push) Successful in 11m48s
Build and Test / Security Audit (push) Successful in 4m35s
Build and Test / Build Summary (push) Successful in 13s
Comprehensive specification for branding/whitelabel configuration.

- Dashboard admin settings page (logo, brand hue, product name, company name, favicon)
- OKLCH color system with CSS variables for dynamic theming
- Agent tray tooltip customization via registry key
- Singleton database table with public GET endpoint
- Priority: P2, Effort: Medium (4-6 weeks)
- Added to roadmap under Server/API (v2 Phase 2)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-31 08:12:37 -07:00
5637e4c1f9 spec: add SPEC-013 Windows Session Selection and Backstage Mode
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 8m5s
Build and Test / Build Server (Linux) (push) Successful in 11m24s
Build and Test / Security Audit (push) Successful in 4m30s
Build and Test / Build Summary (push) Successful in 12s
2026-05-31 07:54:25 -07:00
b3e8f32734 feat(agent): derive + report deterministic machine_uid (SPEC-004 Task 1)
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 7m4s
Build and Test / Build Server (Linux) (push) Successful in 9m41s
Build and Test / Security Audit (push) Successful in 4m11s
Build and Test / Build Summary (push) Successful in 10s
Agent now derives a recomputable, opaque machine_uid (Windows: SHA-256 of the OS
MachineGuid at HKLM\SOFTWARE\Microsoft\Cryptography\MachineGuid -> muid_<hex>;
non-Windows / registry-failure: persisted random UUID, warn-logged). Raw GUID
never exposed; OnceLock-cached. Reported ALONGSIDE agent_id (unchanged) on
AgentStatus (new additive proto field 12) and in the connect handshake query.
This is the stable identity that fixes config-loss duplicate registrations
(DESKTOP-I66IM5Q x9); server-side dedup keying that consumes it is SPEC-004
Task 2. Non-breaking, isolated. 5 unit tests; cargo fmt/clippy(-D warnings)/test
green on GURU-5070.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 21:23:11 -07:00
92bc522c3a spec: add v2-stable-identity implementation plan (SPEC-004 breakdown)
Some checks failed
Build and Test / Build Server (Linux) (push) Has started running
Build and Test / Build Agent (Windows) (push) Has started running
Build and Test / Security Audit (push) Has been cancelled
Build and Test / Build Summary (push) Has been cancelled
Ordered, execution-ready plan for SPEC-004 (stable machine identity + session
reaping + operator removal). Works out the core integration: machine_uid =
deterministic MachineGuid-based hardware identity (recomputable, so config loss
can't duplicate); per-agent cak_ key stays the credential/trust boundary; they
compose so one cak_ key per machine_uid = one key per real machine (the
prerequisite the fleet key-migration #7 needs). Root cause grounded in code:
agent_id is a random UUID (config.rs:90), connect_machines dedups on ON CONFLICT
(agent_id), so config loss -> duplicate rows (DESKTOP-I66IM5Q x9 live). 5 ordered
tasks (agent uid -> server dedup -> reconcile/age-out -> reaping -> operator
removal). Unblocks #7 -> #5.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 21:17:49 -07:00
df51d40094 feat(server): per-agent H.264 test override (h264-test tag) [Task 8 prep]
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 7m32s
Build and Test / Build Server (Linux) (push) Successful in 10m55s
Build and Test / Security Audit (push) Successful in 4m14s
Build and Test / Build Summary (push) Successful in 11s
Lets the HW-H.264 path be live-validated on tagged test agents without affecting
the live client fleet. Adds H264_TEST_TAG="h264-test" + a pure prefer_h264_for(tags)
helper (DEFAULT_PREFER_H264 || tags contains the tag, case-insensitive); StartStream
codec negotiation now computes prefer_h264 from the agent's reported tags instead of
the bare const, and logs the computed value. SAFETY: untagged sessions are byte-for-
byte unchanged (prefer_h264 == DEFAULT_PREFER_H264 == false -> raw); the supports_h264
guard still forces raw for a no-HW agent even when tagged. DEFAULT_PREFER_H264 stays
false (flipping the global default is a separate future step). 3 unit tests added.
cargo fmt/clippy(-D warnings)/test green on GURU-5070 (37 agent + 64 server).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 20:17:38 -07:00
7be8f454e0 Merge remote security fixes with local specs
All checks were successful
Build and Test / Build Server (Linux) (push) Successful in 9m56s
Build and Test / Build Agent (Windows) (push) Successful in 6m21s
Build and Test / Security Audit (push) Successful in 4m21s
Build and Test / Build Summary (push) Successful in 9s
2026-05-30 19:21:42 -07:00
c98692e424 fix(server): revoke viewer tokens on logout + stop logging chat content
Some checks failed
Build and Test / Build Server (Linux) (push) Has started running
Build and Test / Build Agent (Windows) (push) Has started running
Build and Test / Security Audit (push) Has been cancelled
Build and Test / Build Summary (push) Has been cancelled
Security follow-ups (audit 2026-05-30, both reviewed APPROVE):
- MEDIUM: viewer tokens were never blacklisted on logout, so a minted
  session-scoped viewer token stayed valid up to its 5-min TTL after the user
  logged out. Add a per-user ViewerTokenRegistry (Arc<Mutex<HashMap<sub,
  Vec<(token, expires_at)>>>>, prune-on-insert) on AppState; mint_viewer_token
  registers each token under the user sub; logout drains take_for_user(sub) and
  blacklists each via the existing token_blacklist. The viewer WS already calls
  is_revoked, so no WS change. Key chain user.user_id == ViewerClaims.sub ==
  registry key verified consistent. 8 new tests.
- LOW: relay chat logs now emit content length, not the chat body (support-chat
  can carry secrets/PII).
cargo fmt/clippy(-D warnings)/test green on GURU-5070 (37 agent + 61 server).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 19:20:15 -07:00
761bae5d01 spec: update SPEC-012 to include both Serial Console + PTY Shell modes
Major update to SPEC-012 adding dual-mode terminal access:

Mode 1: Serial Console Mode (True Remote Console)
- Direct access to system serial console (/dev/ttyS0 or /dev/console)
- Sees GRUB bootloader, kernel boot messages, login prompts, kernel panics
- Boot-time interaction: select GRUB entries, edit kernel parameters, single-user mode
- Requires root privileges or CAP_SYS_TTY_CONFIG capability
- Setup: GRUB + kernel parameters configured for serial console output
- Like KVM-over-IP or IPMI Serial-over-LAN (text-mode equivalent)

Mode 2: PTY Shell Mode (Interactive Shell)
- Spawn pseudo-TTY with bash/zsh shell session
- Normal server management (package updates, log review, etc.)
- Runs as unprivileged agent service user
- Standard interactive shell with full ANSI/VT100 support

Architecture:
- Agent mode selection based on viewer request (console vs. shell)
- Dashboard shows two buttons: "Console" and "Shell" for headless agents
- Same xterm.js viewer handles both modes transparently
- Protobuf extensions: TerminalModeRequest enum, console_mode flag

Security:
- Console mode requires root (boot-level control risk)
- Recommend RBAC: separate console_access and shell_access permissions
- Console sessions should require MFA (Phase 2)
- Audit logging for both modes

Setup Requirements:
- One-time GRUB configuration for serial console
- systemd service with CAP_SYS_TTY_CONFIG for console mode
- serial-getty@ttyS0.service enabled for login prompt

Updated effort: Medium (5-7 weeks, up from 4-6)
Priority remains P2

Addresses user request for "remote console" (as if at the machine)
not just shell access.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-30 19:02:27 -07:00
8119292bcd fix(agent): close auto-update TLS bypass (MITM -> RCE) [HIGH]
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 7m24s
Build and Test / Build Server (Linux) (push) Successful in 10m41s
Build and Test / Security Audit (push) Successful in 4m19s
Build and Test / Build Summary (push) Successful in 9s
The auto-update path built both reqwest clients with an unconditional
danger_accept_invalid_certs(true), so a network MITM could serve an arbitrary
update .exe (checksum is no defense — same unverified channel) and gain RCE on
every managed endpoint. Replace with dev_insecure_tls() = cfg!(debug_assertions)
&& env GURUCONNECT_DEV_INSECURE_TLS: the cfg gate compiles out of release builds,
so a shipped agent ALWAYS verifies certs; dev keeps a self-signed escape hatch.
Loud warn when the insecure path is taken; verify_checksum kept + documented as
transport-integrity (not tamper) defense; TODO + follow-up for embedded-key
update signing (defense-in-depth). Release-invariant unit test added.
cargo fmt/clippy(-D warnings)/test green on GURU-5070 (90 tests). Closes the
2026-05-30 security-audit HIGH (reports/2026-05-30-gc-audit.md).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 19:02:23 -07:00
9f44807230 audit: security pass re-audit (2026-05-30) — 3 CRITICALs verified CLOSED
Some checks failed
Build and Test / Build Agent (Windows) (push) Successful in 7m1s
Build and Test / Build Server (Linux) (push) Successful in 10m17s
Build and Test / Security Audit (push) Has started running
Build and Test / Build Summary (push) Has been cancelled
Independent /gc-audit --pass=security re-derivation of the v2 secure-session-core
rebuild: all three 2026-05-29 relay CRITICALs confirmed closed with no bypass
(any-JWT-joins-session, viewer-WS blacklist, JWT-as-agent-key). Relay plane clean;
consent/code paths fail closed; abuse surface bounded; rate limiting proxy-aware.
Net-new: 1 HIGH (agent auto-update disables TLS cert verification -> MITM-RCE,
agent/src/update.rs:45,111 — outside the relay plane), 1 LOW (chat content logged),
2 INFO. Report: reports/2026-05-30-gc-audit.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 18:48:48 -07:00
a062a825ea spec: add SPEC-012 Headless Linux Mode (Direct TTY Access)
Comprehensive specification for terminal-based remote access to headless
Linux servers (no X11/Wayland GUI):

Core Capabilities:
- PTY spawn via openpty() + fork/exec shell (/bin/bash or $SHELL)
- Terminal I/O: PTY output → TerminalData protobuf → WebSocket relay
- Input: keyboard → TerminalInput protobuf → PTY master write
- Resize: SIGWINCH on terminal window resize, TIOCSWINSZ ioctl
- Auto-detection: agent detects headless environment (no DISPLAY) at runtime

Viewer:
- xterm.js-based web terminal (80x24 default, resizable)
- Full ANSI/VT100 support (colors, cursor control, vim/nano/htop)
- Same protobuf-over-WSS protocol, support-code/agent-key auth
- Dashboard shows "Terminal" badge, routes to terminal viewer

Use Cases:
- Server management (headless Ubuntu Server, VMs, containers)
- Emergency recovery (systemd rescue mode, single-user mode)
- Container debugging (exec into running containers)
- SSH replacement with centralized audit logging

Protobuf Extensions:
- TerminalData, TerminalInput, TerminalResize messages
- AgentStatus.terminal_mode flag

Security:
- Run agent as unprivileged user + sudo for privileged commands
- Session recording to terminal_recordings table (asciicast format)
- Same auth model as GUI agents (support-code / per-agent key)

Estimated effort: Medium (4-6 weeks)
Priority: P2 (server management is market-critical)

Extends SPEC-010 Linux agent with PTY alternative to screen capture.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-30 18:28:34 -07:00
b1862800a1 spec: add SPEC-011 Mobile Agent Support (iOS and Android)
Comprehensive specification for iOS/Android devices as remote control targets:

iOS Agent (View-Only):
- ReplayKit 2 screen capture (user consent required)
- VideoToolbox H.264 encoding
- NO input injection (iOS sandboxing limitation)
- APNs push notifications for session requests
- Foreground-only operation (OS requirement)

Android Agent (View + Control):
- MediaProjection API screen capture (user consent)
- MediaCodec H.264 encoding
- Accessibility Service for input injection (tap/swipe/type)
- FCM push notifications
- Foreground service with persistent notification

Architecture:
- Native Swift/SwiftUI (iOS) and Kotlin/Jetpack Compose (Android) apps
- Same protobuf-over-WSS protocol as desktop agents
- Support-code authentication (persistent mode deferred to Phase 2)
- Minor protobuf additions: MobileCapabilities, TouchEvent
- Server push module: APNs (a2 crate) + FCM HTTP v1

Key constraints:
- Attended-only sessions (user must grant permission)
- Foreground-only (cannot capture in background on either platform)
- iOS view-only (platform sandbox prevents input injection)
- Consent-first model (MediaProjection/ReplayKit user prompts)

Estimated effort: X-Large (16-20 weeks, requires mobile expertise)
Priority: P3

Distinct from GuruRMM SPEC-017 (MDM/inventory) — this is remote
control, not device management.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-30 18:24:16 -07:00
442eecefc0 fix(server,agent): apply Tasks 3-5 review fixes (non-blocking)
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 7m6s
Build and Test / Build Server (Linux) (push) Successful in 10m39s
Build and Test / Security Audit (push) Successful in 4m14s
Build and Test / Build Summary (push) Successful in 8s
From the secure-session-core Tasks 3-5 code review (APPROVE-WITH-FIXES):
- MEDIUM-2: delete the dead `validate_agent_key` "accept-any-key" placeholder +
  its AuthenticatedAgent/AuthState scaffolding (zero callers; the real agent
  auth is validate_agent_api_key + per-agent cak_ keys). Removes an auth landmine.
- LOW-3: stop interpolating support-code values into 3 relay log lines (bearer
  credentials).
- LOW-1: document the X-Real-IP trust requirement in ip_extract.rs (NPM must set
  it from $remote_addr); behavior unchanged.
- LOW-2: correct the consent/heartbeat comment in agent session loop (the loop
  awaits the dialog; safe because CONSENT_TIMEOUT 60s < HEARTBEAT_TIMEOUT 90s).
cargo fmt/clippy(-D warnings)/test all green on GURU-5070 (89 tests, 0 warnings).
MEDIUM-1 (viewer-token logout revocation) remains a tracked follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 18:23:03 -07:00
5e2325507f spec: add SPEC-010 Cross-Platform Agent Support (macOS and Linux)
Comprehensive specification for expanding agent support beyond Windows:

macOS Agent (Priority 1):
- ScreenCaptureKit API (macOS 13+) with AVFoundation fallback
- CGEvent input injection
- VideoToolbox H.264 encoding
- NSStatusItem menu bar icon
- Universal binary (x86_64 + arm64)
- Code signing and notarization

Linux Agent (Priority 2):
- X11 XShm screen capture with Wayland detection
- XTest input injection
- VA-API hardware H.264 encoding with software fallback
- StatusNotifier system tray
- .deb and .rpm packaging

Architecture:
- Platform abstraction layer (traits for capture/input/encoder/tray)
- Refactor existing Windows code behind PlatformCapture/Input/Encoder
- No protobuf protocol changes
- Same authentication (support codes and agent keys)

Estimated effort: X-Large (12-16 weeks)
Priority: P2 (market-critical for multi-platform MSP adoption)

Updated roadmap: promoted from P3 to P2 with full spec link.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-30 18:15:16 -07:00
c736a710a1 docs: record Tasks 3-5 code review (APPROVE-WITH-FIXES) in plan status
Some checks failed
Build and Test / Build Server (Linux) (push) Failing after 3m43s
Build and Test / Build Agent (Windows) (push) Successful in 7m43s
Build and Test / Security Audit (push) Successful in 4m57s
Build and Test / Build Summary (push) Has been skipped
Formal review on GURU-5070: cargo fmt/clippy/test green (89 tests, 0 warnings);
the 3 audit CRITICALs verified closed with no bypass; all security paths fail
closed. Non-blocking follow-ups tracked (viewer-token logout revocation, delete
dead validate_agent_key placeholder, X-Real-IP/log hygiene). Remaining for
Phase-1 exit: Task 8 e2e verification + /gc-audit security re-audit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 18:14:02 -07:00
786d3e47af docs: correct roadmap — v2 Phase 1 already landed, not a future sprint
Some checks failed
Build and Test / Build Server (Linux) (push) Failing after 3m12s
Build and Test / Security Audit (push) Successful in 4m53s
Build and Test / Build Agent (Windows) (push) Successful in 7m14s
Build and Test / Build Summary (push) Has been skipped
Re-baseline against actual git/deploy state: secure-session-core Tasks 1-7 are
committed and DEPLOYED; the 3 audit CRITICALs are closed and live in prod
(verified: deployed checkout abc55ab descends from the CRITICAL#1 fix + Task 7;
guruconnect.service running on :3002). The prior "Sprint 0: bypasses are live"
banner was wrong (stale 2026-05-29 audit narrative) and is removed. Remaining
to exit Phase 1 = secure-session-core Task 8 (e2e verification + security
re-audit) + Code-Review sign-off on Tasks 3-5. Schema note corrected
(connect_agent_keys + tenancy already exist via migration 004).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 17:36:18 -07:00
03f62d413f docs: annotate roadmap with v2-first direction + phase mapping
Some checks failed
Build and Test / Build Server (Linux) (push) Failing after 4m54s
Build and Test / Build Agent (Windows) (push) Has started running
Build and Test / Security Audit (push) Has started running
Build and Test / Build Summary (push) Has been cancelled
Mark SPEC-003..009 as work-items inside the SPEC-002 v2 phases (not standalone
v1 backlog): banner records the v2-reset decision + the Sprint-0 relay-auth
CRITICAL hotfix, a phase-mapping table (004->P1, 008->P0/1, 003/005/006/007->P2,
009->P3), inline [-> v2 Phase N] tags per spec, and a note to bake SPEC-003
inventory cols + SPEC-004 machine_uid + connect_agent_keys into the Phase-0
fresh schema. Sprint planning 2026-05-30 (Mike: v2 reset first).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 17:26:47 -07:00
7ab87384a7 spec: add SPEC-009 feature-rich documented API
Some checks failed
Build and Test / Build Server (Linux) (push) Failing after 3m42s
Build and Test / Build Agent (Windows) (push) Successful in 7m39s
Build and Test / Security Audit (push) Successful in 4m34s
Build and Test / Build Summary (push) Has been skipped
Everything the console does should be callable by API, documented and
discoverable. Adds: OpenAPI 3.x generated from code (utoipa) + Swagger/Redoc at
/api/docs (drift-proof, route<->spec parity test); long-lived revocable scoped
API tokens (connect_api_tokens, hashed like agent keys) distinct from the 24h
dashboard JWT and agent keys; an API-completeness gap audit (folds in SPEC-004/
006/007 endpoints); consistent pagination/filtering + versioning policy. Today
there is zero API doc tooling and no programmatic token. Depends on SPEC-008 for
the documented error envelope; distinct from the ADR-001 integration contract.
Large. Parallel guru-rmm SPEC-019. Requested by Mike 2026-05-30.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 16:35:57 -07:00
65eff5cf50 spec: add SPEC-008 valuable error messages
Cross-cutting error-quality initiative: one structured AppError envelope
(stable error_code + message + correlation_id) replacing the current ad-hoc
mix (bare (StatusCode,&str) tuples, per-file ErrorResponse, two JSON envelopes
the dashboard already unions); correlation-id middleware tied to tracing spans
+ response header so a reported id greps the log; contextual error logging with
identifiers + error chain; sweep the 37 server `let _ =` swallows (the pattern
that silently hid migration-005's missing columns); dashboard renders the real
cause + correlation id (drop the hardcoded generic at MachinesPage.tsx:202);
agent logs why/where auth/connection failed (the auth-loop incident gave no
local signal). Phaseable; Large. Parallel RMM request keeps conventions aligned.
Requested by Mike 2026-05-30.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 16:30:07 -07:00
008d2bf30b spec: add SPEC-007 managed-agent installer builder
Dashboard "Build Installer" wizard for pre-labeled managed/persistent agents
(Name/Company/Site/Department/Device Type/Tag/Type) with Download / Copy URL /
Send Link, ScreenConnect-style. The embed-config build path already exists
(downloads.rs appends EmbeddedConfig GURUCONFIG blob; AgentDownloadParams takes
company/site/tags/api_key; agent reads it at config.rs:223) - missing is the UI,
department + device_type fields (EmbeddedConfig/AgentStatus/connect_machines),
name strategy, and Copy-URL/Send-Link actions. Labels persist at install time,
feeding SPEC-003/005/006. Embedded key should be revocable per-machine/site
(pairs with SPEC-004). Biggest open question: appending config after Authenticode
signing invalidates the signature. Requested by Mike 2026-05-30.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 16:24:56 -07:00