LocalSystem service that runs the persistent agent unattended and brokers per-session capture/input workers (Session 0 can't capture directly). Unblocks SPEC-016 Phase B end-to-end (SYSTEM-ACL'd cak_ store readable; removes the Phase B fail-fast guard) and is the broker primitive SPEC-013 builds on. 017 was taken by Mike's end-user-access spec, so this is 018. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
SPEC-018: Managed-Agent SYSTEM Service Host + Session Broker
Status: Proposed
Priority: P1 (blocks SPEC-016 Phase B end-to-end runtime and SPEC-013)
Requested By: Mike (2026-06-02)
Estimated Effort: X-Large
Overview
Convert the managed/persistent GuruConnect agent from a user-context HKCU\…\Run autostart into a
Windows SYSTEM service that runs unattended — at the login screen, with no user logged in, across
reboots — and brokers per-session capture/input worker processes into the active interactive
desktop. A SYSTEM service lives in the isolated Session 0 and cannot capture or inject the
interactive desktop directly, so the service spawns a worker into the target user session (the
ScreenConnect architecture).
This is foundational, not cosmetic. It unblocks three things at once:
- SPEC-016 Phase B end-to-end runtime: the per-machine cak_ store is ACL'd to SYSTEM + Administrators; today the agent runs as the interactive user and can't read its own store (the Phase B C1 fail-fast guard exists precisely because of this). Running as SYSTEM makes the store readable and removes the guard.
- True unattended access: a user-context agent only runs while that user is logged in. Reaching a rebooted server or a machine sitting at the login screen (table-stakes for remote support) requires SYSTEM.
- SPEC-013 session selection / backstage: the session-broker primitive built here is the substrate SPEC-013's session-switching UX drives.
Success criteria: the managed agent installs as an auto-start SYSTEM service; it holds the relay
connection and performs SPEC-016 enrollment as SYSTEM (reading/writing the SYSTEM-ACL'd cak_); it
spawns a capture/input worker into the active interactive session and relays frames; the worker is
respawned/retargeted on logon/logoff/console-connect; and the Phase B fail-fast guard is removed
because the store is now readable in-context.
Background — why this is needed (confirmed in code)
- The persistent agent autostarts via HKCU\…\Run (agent/src/startup.rs:21, STARTUP_KEY = HKCU) → interactive-user token, not SYSTEM. The only SYSTEM service today is the separate sas_service (Secure Attention Sequence helper).
- SPEC-016 Phase B (agent/src/credential_store.rs) ACLs the cak_ store to *S-1-5-18 (SYSTEM) + *S-1-5-32-544 (Administrators). In the current user context the agent writes but cannot read it back → the Phase B fail-fast guard (agent/src/main.rs resolve_agent_credential) emits "must run as the GuruConnect SYSTEM service (see SPEC-018)" instead of bricking.
- Capture/input live in the agent process (agent/src/capture/, agent/src/input/); a Session-0 SYSTEM service cannot drive these against the interactive desktop without a per-session worker.
Scope
Included in v1
- Windows service install/lifecycle (agent/src/install.rs + a new service module): register the managed agent as a LocalSystem auto-start service (CreateServiceW / a service crate), configure failure/recovery (restart on crash), and replace the HKCU Run autostart for managed mode (remove the Run entry on service install). Clean uninstall (stop + delete service).
- Service control loop (Session 0, SYSTEM): owns the persistent WSS connection to the relay, performs SPEC-016 enrollment as SYSTEM (now able to read/write the cak_ store), and dispatches session/connect requests to workers. Handles SERVICE_CONTROL_STOP/SHUTDOWN and SERVICE_CONTROL_SESSIONCHANGE.
- Session broker: enumerate sessions (WTSEnumerateSessionsW), resolve the active interactive session (WTSGetActiveConsoleSessionId), obtain its user token (WTSQueryUserToken → DuplicateTokenEx), and spawn a per-session capture/input worker into that session's desktop (CreateProcessAsUserW, winsta0\default). The worker does DXGI capture + input injection in the user's session; the service relays frames over the existing transport.
- Service ↔ worker IPC: a local, ACL'd channel (named pipe \\.\pipe\guruconnect-<sessionId>) carrying frames/input/control; pipe ACL restricted to SYSTEM + the target session user.
- Session-change handling: on logon/logoff/console-connect/disconnect/lock/unlock, (re)spawn or retarget the worker so the active desktop is always the one being served.
- Remove the SPEC-016 Phase B fail-fast guard once the service runs as SYSTEM (the store is readable in-context); keep the SYSTEM+Administrators ACL.
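The session-change rule in the list above can be sketched as a small pure state machine, which keeps the respawn/retarget policy testable without a Windows host. This is only an illustration: the SessionEvent, BrokerAction and Broker names are hypothetical, and treating lock/unlock as "keep the current worker" is an assumption that the planning phase may overturn.

```rust
/// Session-change notifications the service reacts to. The variants mirror the
/// WTS_* event names used in this spec; the Win32 numeric codes are not modelled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionEvent {
    ConsoleConnect,
    ConsoleDisconnect,
    Logon,
    Logoff,
    Lock,
    Unlock,
}

/// What the broker decides to do with its worker (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BrokerAction {
    /// Start a worker in the given session.
    Spawn(u32),
    /// Stop the worker serving the given session.
    Stop(u32),
    /// Stop the worker in `from` and start one in `to`.
    Retarget { from: u32, to: u32 },
    /// Leave the current worker (if any) untouched.
    Keep,
}

/// Tracks which session is currently served by a worker.
#[derive(Debug, Default)]
pub struct Broker {
    active: Option<u32>,
}

impl Broker {
    /// Applies one session-change event and returns the resulting action.
    pub fn on_event(&mut self, event: SessionEvent, session_id: u32) -> BrokerAction {
        match (event, self.active) {
            // Nothing is being served yet: start a worker in the new session.
            (SessionEvent::Logon | SessionEvent::ConsoleConnect, None) => {
                self.active = Some(session_id);
                BrokerAction::Spawn(session_id)
            }
            // A different session became active: move the worker over.
            (SessionEvent::Logon | SessionEvent::ConsoleConnect, Some(cur))
                if cur != session_id =>
            {
                self.active = Some(session_id);
                BrokerAction::Retarget { from: cur, to: session_id }
            }
            // The served session went away: stop its worker.
            (SessionEvent::Logoff | SessionEvent::ConsoleDisconnect, Some(cur))
                if cur == session_id =>
            {
                self.active = None;
                BrokerAction::Stop(session_id)
            }
            // Lock/unlock and events for sessions we do not serve (assumption).
            _ => BrokerAction::Keep,
        }
    }

    /// Session currently served, if any.
    pub fn active(&self) -> Option<u32> {
        self.active
    }
}
```

In a real service the SERVICE_CONTROL_SESSIONCHANGE handler would translate the Win32 event code into a SessionEvent and then carry out the returned action with the token-handoff calls named above.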
Explicitly out of scope (anticipated, separate specs)
- Session-selection / backstage UX — the operator-facing picker and Session-0/secure-desktop command surface are SPEC-013; this spec only provides the broker primitive it drives.
- Login-screen / secure-desktop (winlogon) capture beyond the broker hook — the hard Secure-Desktop case is coordinated with SPEC-013; v1 here targets the active interactive session.
- macOS/Linux service equivalents — future SPEC-010 (cross-platform agents).
Architecture
- Agent splits into two roles:
  - service-host (LocalSystem, Session 0): service lifecycle, relay transport, SPEC-016 enrollment + cak_ store, session broker, IPC server.
  - session-worker (per interactive session, user token): DXGI/GDI capture, input injection, IPC client. Spawned by the service via CreateProcessAsUserW.
- Service install (install.rs): CreateServiceW with SERVICE_AUTO_START, SERVICE_WIN32_OWN_PROCESS, recovery actions; uninstall stops + deletes. Replaces managed-mode HKCU Run.
- Token handoff: WTSGetActiveConsoleSessionId → WTSQueryUserToken → DuplicateTokenEx (primary token) → CreateProcessAsUserW with lpDesktop = "winsta0\\default".
- IPC: named pipe per session, length-prefixed protobuf (reuse proto/ message types where sensible), pipe security descriptor granting only SYSTEM + the session user.
- Session events: the service registers for SERVICE_CONTROL_SESSIONCHANGE and reacts to WTS_CONSOLE_CONNECT, WTS_SESSION_LOGON/LOGOFF, WTS_SESSION_LOCK/UNLOCK.
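The length-prefixed framing mentioned for the IPC channel could look roughly like the sketch below. It treats the protobuf body as opaque bytes and works over any Read/Write, so the same code would sit on top of a named-pipe handle. The 4-byte little-endian prefix, the MAX_FRAME_LEN cap and the function names are assumptions for illustration; the spec does not fix any of them.

```rust
use std::io::{self, Read, Write};

/// Upper bound on one frame body. An assumed value, not taken from the spec.
pub const MAX_FRAME_LEN: u32 = 16 * 1024 * 1024;

/// Writes one frame: a 4-byte little-endian length, then the payload bytes.
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32;
    w.write_all(&len.to_le_bytes())?;
    w.write_all(payload)
}

/// Reads one frame. The length prefix is checked before allocating so a
/// misbehaving peer cannot force a huge allocation.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 4];
    r.read_exact(&mut header)?;
    let len = u32::from_le_bytes(header);
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut body = vec![0u8; len as usize];
    r.read_exact(&mut body)?;
    Ok(body)
}
```

Capping the length on the read side matters most on the service end, since the worker runs with the user's token and the service should not trust what it sends.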
Security considerations
- LocalSystem is maximal privilege — minimize the service's attack surface; validate every relay-delivered command; never spawn a worker except into a legitimately-enumerated active session.
- IPC pipe must be ACL'd (SYSTEM + the specific session user only) so a non-admin user can't inject capture/input commands by connecting to the pipe.
- Token hygiene: close duplicated tokens promptly; don't leak SYSTEM or user primary tokens.
- The SPEC-016 cak_ store (SYSTEM-ACL'd) is now correctly readable; the fail-fast guard is removed but the ACL stays.
- Audit: service start/stop, enrollment-as-SYSTEM, worker spawn, session attach/retarget, all written to the existing event pipeline.
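One way to express the "SYSTEM + the specific session user only" pipe ACL is an SDDL string that is later converted to a security descriptor (for example with ConvertStringSecurityDescriptorToSecurityDescriptorW). The sketch below only builds the strings. The helper names are hypothetical, and the rights chosen for the user (generic read + write) are an assumption that should be reviewed, since a narrower mask may be preferable.

```rust
/// Per-session pipe path, in the form this spec gives:
/// \\.\pipe\guruconnect-<sessionId>
pub fn pipe_name(session_id: u32) -> String {
    format!(r"\\.\pipe\guruconnect-{}", session_id)
}

/// Loose syntactic check for a SID string such as S-1-5-18. It rejects
/// anything that could smuggle extra ACEs into the SDDL string; it does not
/// prove the SID exists.
fn is_plausible_sid(s: &str) -> bool {
    let parts: Vec<&str> = s.split('-').collect();
    parts.len() >= 4
        && parts[0] == "S"
        && parts[1..]
            .iter()
            .all(|p| !p.is_empty() && p.bytes().all(|b| b.is_ascii_digit()))
}

/// Builds the SDDL for the pipe's security descriptor.
/// "D:P" is a protected DACL (no inherited ACEs); "SY" is LocalSystem with
/// GENERIC_ALL; the session user gets GENERIC_READ | GENERIC_WRITE (assumed).
pub fn pipe_sddl(user_sid: &str) -> Result<String, String> {
    if !is_plausible_sid(user_sid) {
        return Err(format!("not a valid SID string: {:?}", user_sid));
    }
    Ok(format!("D:P(A;;GA;;;SY)(A;;GRGW;;;{})", user_sid))
}
```

Because the DACL lists only those two principals, any other account, including other non-admin users on the machine, has no ACE and is denied when it tries to open the pipe.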
Implementation details
- New service module (e.g. agent/src/service/{mod.rs, broker.rs, ipc.rs}); worker entry split out of the current capture path. New Commands variants or an internal --service / --session-worker dispatch in agent/src/main.rs.
- install.rs: service create/recovery/delete; drop the managed-mode HKCU Run write.
- windows crate features: Win32_System_Services, Win32_System_RemoteDesktop (WTS*), Win32_Security, Win32_System_Threading (CreateProcessAsUserW), Win32_System_Pipes.
- Remove the resolve_agent_credential fail-fast guard branch added in SPEC-016 Phase B.
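The internal --service / --session-worker dispatch could be as small as the sketch below. The Role enum, the parse_role name and the idea that --session-worker takes the target session id as its next argument are all assumptions for illustration; the existing Commands type in agent/src/main.rs may call for a different shape.

```rust
/// Which role this process should run as (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Role {
    /// LocalSystem service host in Session 0.
    ServiceHost,
    /// Capture/input worker for one interactive session.
    SessionWorker { session_id: u32 },
    /// Anything else: fall through to the agent's existing command handling.
    Existing,
}

/// Picks the role from the arguments that follow the program name.
pub fn parse_role(args: &[&str]) -> Result<Role, String> {
    match args {
        ["--service", ..] => Ok(Role::ServiceHost),
        // The session id argument is an assumption, not part of the spec.
        ["--session-worker", raw, ..] => raw
            .parse::<u32>()
            .map(|session_id| Role::SessionWorker { session_id })
            .map_err(|e| format!("invalid session id {:?}: {}", raw, e)),
        ["--session-worker"] => Err("--session-worker requires a session id".to_string()),
        _ => Ok(Role::Existing),
    }
}
```

A caller would collect std::env::args().skip(1) into a Vec<String>, borrow the entries as &str, and branch on the returned Role before any relay or capture code is initialised.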
Testing strategy
- Service: install → auto-start on boot → stop → uninstall on a clean VM.
- cak_ end-to-end: SYSTEM service enrolls (SPEC-016), stores + reads the cak_, connects. This is the integration test SPEC-016 Phase B currently cannot run.
- Session broker: worker spawns into the active session; capture/input work; survives logoff→logon (respawn) and console-connect (retarget); fast-user-switch retarget.
- Security: non-admin cannot connect to the IPC pipe; worker runs with the user's token (not SYSTEM) in the user's desktop.
Effort estimate & dependencies
- Size: X-Large (service host + worker split + token-handoff + IPC + session-change handling + install/uninstall).
- Depends on: SPEC-016 (enrollment + cak_ store); the existing capture/input cores.
- Unblocks: SPEC-016 Phase B end-to-end runtime (and the parked managed-agent enrollment test on the internal beta machines); SPEC-013 (session selection builds on this broker).
Open questions
- Service vs. SYSTEM scheduled task: a true Windows service (recovery, SCM integration) is the standard, robust choice; recommend service. Lock this in during planning.
- One multi-session worker vs. one worker per session: a per-session worker is simpler to reason about and isolates a crash to one session; confirm.
- IPC transport: named pipe (recommended) vs. local TCP/loopback; pipe ACLing is the cleaner security story.
- Login-screen / Secure-Desktop capture: how much (if any) belongs in this spec vs. deferred to SPEC-013 (it needs a worker in the winlogon/secure desktop, a distinct hard problem).
- Migration: on upgrade, cleanly transition existing HKCU-Run managed installs to the service (remove the Run entry, install the service) without a gap.