# SPEC-018: Managed-Agent SYSTEM Service Host + Session Broker

**Status:** Proposed
**Priority:** P1 (blocks SPEC-016 Phase B end-to-end runtime and SPEC-013)
**Requested By:** Mike (2026-06-02)
**Estimated Effort:** X-Large

## Overview

Convert the managed/persistent GuruConnect agent from a user-context `HKCU\…\Run` autostart into a **Windows SYSTEM service** that runs unattended (at the login screen, with no user logged in, across reboots) and **brokers per-session capture/input worker processes** into the active interactive desktop. A SYSTEM service lives in the isolated **Session 0** and cannot capture or inject the interactive desktop directly, so the service spawns a worker into the target user session (the ScreenConnect architecture).

This is foundational, not cosmetic. It unblocks three things at once:

1. **SPEC-016 Phase B end-to-end runtime:** the per-machine `cak_` store is ACL'd to SYSTEM + Administrators; today the agent runs as the interactive *user* and can't read its own store (the Phase B C1 *fail-fast guard* exists precisely because of this). Running as SYSTEM makes the store readable and removes the guard.
2. **True unattended access:** a user-context agent only runs while that user is logged in. Reaching a rebooted server or a machine sitting at the login screen (table-stakes for remote support) requires SYSTEM.
3. **SPEC-013 session selection / backstage:** the session-broker primitive built here is the substrate SPEC-013's session-switching UX drives.

**Success criteria:** the managed agent installs as an auto-start SYSTEM service; it holds the relay connection and performs SPEC-016 enrollment as SYSTEM (reading/writing the SYSTEM-ACL'd `cak_`); it spawns a capture/input worker into the active interactive session and relays frames; the worker is respawned/retargeted on logon/logoff/console-connect; and the Phase B fail-fast guard is removed because the store is now readable in-context.
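The respawn/retarget criterion above can be pictured as a small, pure decision function that the service-host consults on each session-change notification. The following is a minimal sketch under stated assumptions, not the agent's actual design: the `SessionEvent` and `BrokerAction` enums and the `decide` function are hypothetical names, and the choice to leave the worker untouched on lock (and to treat unlock like a logon) is an assumption this spec does not settle.

```rust
/// Session-change events the service-host reacts to. Hypothetical enum;
/// the variants loosely mirror the WTS_* notifications named in this spec.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionEvent {
    ConsoleConnect,
    ConsoleDisconnect,
    Logon,
    Logoff,
    Lock,
    Unlock,
}

/// What the broker should do next. Hypothetical enum for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BrokerAction {
    /// No worker exists yet: start one in this session.
    Spawn(u32),
    /// A worker exists in another session: stop it and start one in `to`.
    Retarget { from: u32, to: u32 },
    /// The served session went away: stop its worker.
    Stop(u32),
    /// Nothing to do.
    Keep,
}

/// Decide the next action given the session currently served (if any),
/// the event, and the session ID the event refers to.
pub fn decide(current: Option<u32>, event: SessionEvent, session: u32) -> BrokerAction {
    match event {
        // A session became (or became again) the interactive one.
        SessionEvent::Logon | SessionEvent::ConsoleConnect | SessionEvent::Unlock => {
            match current {
                Some(cur) if cur == session => BrokerAction::Keep,
                Some(cur) => BrokerAction::Retarget { from: cur, to: session },
                None => BrokerAction::Spawn(session),
            }
        }
        // A session went away; only act if it is the one being served.
        SessionEvent::Logoff | SessionEvent::ConsoleDisconnect => match current {
            Some(cur) if cur == session => BrokerAction::Stop(cur),
            _ => BrokerAction::Keep,
        },
        // Assumption: a lock leaves the worker in place (secure-desktop
        // capture is out of scope for v1).
        SessionEvent::Lock => BrokerAction::Keep,
    }
}
```

Keeping the decision separate from the Win32 calls means the retarget rules can be unit-tested without a service, a token, or a second session.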
## Background: why this is needed (confirmed in code)

- The persistent agent autostarts via `HKCU\…\Run` (`agent/src/startup.rs:21`, `STARTUP_KEY` = HKCU) → interactive-user token, not SYSTEM. The only SYSTEM service today is the separate `sas_service` (Secure Attention Sequence helper).
- SPEC-016 Phase B (`agent/src/credential_store.rs`) ACLs the `cak_` store to `*S-1-5-18` (SYSTEM) + `*S-1-5-32-544` (Administrators). In the current user context the agent writes but cannot read it back → the Phase B fail-fast guard (`agent/src/main.rs` `resolve_agent_credential`) emits "must run as the GuruConnect SYSTEM service (see SPEC-018)" instead of bricking.
- Capture/input live in the agent process (`agent/src/capture/`, `agent/src/input/`); a Session-0 SYSTEM service cannot drive these against the interactive desktop without a per-session worker.

## Scope

### Included in v1

1. **Windows service install/lifecycle** (`agent/src/install.rs` + a new service module): register the managed agent as a **LocalSystem auto-start service** (`CreateServiceW` / a service crate), configure failure/recovery (restart on crash), and **replace the HKCU `Run` autostart for managed mode** (remove the Run entry on service install). Clean uninstall (stop + delete service).
2. **Service control loop** (Session 0, SYSTEM): owns the persistent WSS connection to the relay, performs SPEC-016 enrollment as SYSTEM (now able to read/write the `cak_` store), and dispatches session/connect requests to workers. Handles `SERVICE_CONTROL_STOP`/`SHUTDOWN` and `SERVICE_CONTROL_SESSIONCHANGE`.
3. **Session broker:** enumerate sessions (`WTSEnumerateSessionsW`), resolve the active interactive session (`WTSGetActiveConsoleSessionId`), obtain its user token (`WTSQueryUserToken` → `DuplicateTokenEx`), and spawn a **per-session capture/input worker** into that session's desktop (`CreateProcessAsUserW`, `winsta0\default`).
   The worker does DXGI capture + input injection in the user's session; the service relays frames over the existing transport.
4. **Service ↔ worker IPC:** a local, ACL'd channel (named pipe `\\.\pipe\guruconnect-`) carrying frames/input/control; pipe ACL restricted to SYSTEM + the target session user.
5. **Session-change handling:** on logon/logoff/console-connect/disconnect/lock/unlock, (re)spawn or retarget the worker so the active desktop is always the one being served.
6. **Remove the SPEC-016 Phase B fail-fast guard** once the service runs as SYSTEM (the store is readable in-context); keep the SYSTEM+Administrators ACL.

### Explicitly out of scope (anticipated, separate specs)

- **Session-selection / backstage UX:** the operator-facing picker and Session-0/secure-desktop command surface are **SPEC-013**; this spec only provides the broker primitive it drives.
- **Login-screen / secure-desktop (winlogon) capture** beyond the broker hook: the hard Secure-Desktop case is coordinated with SPEC-013; v1 here targets the active interactive session.
- **macOS/Linux service equivalents:** future SPEC-010 (cross-platform agents).

## Architecture

- **Agent splits into two roles:**
  - **service-host** (LocalSystem, Session 0): service lifecycle, relay transport, SPEC-016 enrollment + `cak_` store, session broker, IPC server.
  - **session-worker** (per interactive session, user token): DXGI/GDI capture, input injection, IPC client. Spawned by the service via `CreateProcessAsUserW`.
- **Service install** (`install.rs`): `CreateServiceW` with `SERVICE_AUTO_START`, `SERVICE_WIN32_OWN_PROCESS`, recovery actions; uninstall stops + deletes. Replaces managed-mode `HKCU Run`.
- **Token handoff:** `WTSGetActiveConsoleSessionId` → `WTSQueryUserToken` → `DuplicateTokenEx` (primary token) → `CreateProcessAsUserW` with `lpDesktop = "winsta0\\default"`.
- **IPC:** named pipe per session, length-prefixed protobuf (reuse `proto/` message types where sensible), pipe security descriptor granting only SYSTEM + the session user.
- **Session events:** the service registers for `SERVICE_CONTROL_SESSIONCHANGE` and reacts to `WTS_CONSOLE_CONNECT`, `WTS_SESSION_LOGON/LOGOFF`, `WTS_SESSION_LOCK/UNLOCK`.

## Security considerations

- **LocalSystem is maximal privilege:** minimize the service's attack surface; validate every relay-delivered command; never spawn a worker except into a legitimately-enumerated active session.
- **IPC pipe must be ACL'd** (SYSTEM + the specific session user only) so a non-admin user can't inject capture/input commands by connecting to the pipe.
- **Token hygiene:** close duplicated tokens promptly; don't leak SYSTEM or user primary tokens.
- The SPEC-016 `cak_` store (SYSTEM-ACL'd) is now correctly readable; the fail-fast guard is removed but the ACL stays.
- **Audit:** service start/stop, enrollment-as-SYSTEM, worker spawn, and session attach/retarget are written to the existing event pipeline.

## Implementation details

- New service module (e.g. `agent/src/service/{mod.rs, broker.rs, ipc.rs}`); worker entry split out of the current capture path. New `Commands` variants or an internal `--service`/`--session-worker` dispatch in `agent/src/main.rs`.
- `install.rs`: service create/recovery/delete; drop the managed-mode HKCU `Run` write.
- `windows` crate features: `Win32_System_Services`, `Win32_System_RemoteDesktop` (`WTS*`), `Win32_Security`, `Win32_System_Threading` (`CreateProcessAsUserW`), `Win32_System_Pipes`.
- Remove the `resolve_agent_credential` fail-fast guard branch added in SPEC-016 Phase B.

## Testing strategy

- **Service:** install → auto-start on boot → stop → uninstall on a clean VM.
- **`cak_` end-to-end:** SYSTEM service enrolls (SPEC-016), stores + reads the `cak_`, connects (the integration test SPEC-016 Phase B currently cannot run).
- **Session broker:** worker spawns into the active session; capture/input work; survives logoff→logon (respawn) and console-connect (retarget); fast-user-switch retarget.
- **Security:** non-admin cannot connect to the IPC pipe; worker runs with the user's token (not SYSTEM) in the user's desktop.

## Effort estimate & dependencies

- **Size:** X-Large (service host + worker split + token-handoff + IPC + session-change handling + install/uninstall).
- **Depends on:** SPEC-016 (enrollment + `cak_` store); the existing capture/input cores.
- **Unblocks:** SPEC-016 Phase B end-to-end runtime (and the parked managed-agent enrollment test on the internal beta machines); **SPEC-013** (session selection builds on this broker).

## Open questions

1. **Service vs. SYSTEM scheduled task:** a true Windows service (recovery, SCM integration) is the standard, robust choice; recommend service. Lock in planning.
2. **One multi-session worker vs. one worker per session:** per-session worker is simpler to reason about and isolates a crash to one session; confirm.
3. **IPC transport:** named pipe (recommended) vs. local TCP/loopback; pipe ACLing is the cleaner security story.
4. **Login-screen / Secure-Desktop capture:** how much (if any) in this spec vs. deferred to SPEC-013 (it needs a worker in the winlogon/secure desktop, a distinct hard problem).
5. **Migration:** on upgrade, cleanly transition existing HKCU-`Run` managed installs to the service (remove the Run entry, install the service) without a gap.
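
Whichever transport open question 3 settles on, the length-prefixed framing named in the Architecture section is independent of it. The sketch below shows one possible shape, written against `std::io::Read`/`Write` so it would apply to a named pipe handle or a loopback socket alike. It is illustrative only: `write_frame`, `read_frame`, and `MAX_FRAME_LEN` are hypothetical names, the 4-byte little-endian length prefix and the 16 MiB cap are assumptions, and the payload is treated as opaque bytes (the protobuf encoding itself is not shown).

```rust
use std::io::{self, Read, Write};

/// Upper bound on a single frame. Hypothetical value chosen for
/// illustration; the spec does not define a limit.
pub const MAX_FRAME_LEN: u32 = 16 * 1024 * 1024;

/// Write one frame: a 4-byte little-endian length, then the payload.
/// The prefix width and byte order are assumptions of this sketch.
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32;
    w.write_all(&len.to_le_bytes())?;
    w.write_all(payload)
}

/// Read one frame, rejecting an oversized length before allocating so a
/// misbehaving peer cannot force a huge buffer.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_le_bytes(len_buf);
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds limit"));
    }
    let mut payload = vec![0u8; len as usize];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the declared length on the read side matters even with a correctly ACL'd pipe, since the worker runs with the user's token and should be treated as less trusted than the SYSTEM service-host.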