LocalSystem service that runs the persistent agent unattended and brokers per-session capture/input workers (Session 0 can't capture directly). Unblocks SPEC-016 Phase B end-to-end (SYSTEM-ACL'd cak_ store readable; removes the Phase B fail-fast guard) and is the broker primitive SPEC-013 builds on. 017 was taken by Mike's end-user-access spec, so this is 018. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
# SPEC-018: Managed-Agent SYSTEM Service Host + Session Broker

**Status:** Proposed
**Priority:** P1 (blocks SPEC-016 Phase B end-to-end runtime and SPEC-013)
**Requested By:** Mike (2026-06-02)
**Estimated Effort:** X-Large

## Overview

Convert the managed/persistent GuruConnect agent from a user-context `HKCU\…\Run` autostart into a
**Windows SYSTEM service** that runs unattended — at the login screen, with no user logged in, across
reboots — and **brokers per-session capture/input worker processes** into the active interactive
desktop. A SYSTEM service lives in the isolated **Session 0** and cannot capture or inject the
interactive desktop directly, so the service spawns a worker into the target user session (the
ScreenConnect architecture).

This is foundational, not cosmetic. It unblocks three things at once:
1. **SPEC-016 Phase B end-to-end runtime** — the per-machine `cak_` store is ACL'd to SYSTEM +
   Administrators; today the agent runs as the interactive *user* and can't read its own store (the
   Phase B C1 *fail-fast guard* exists precisely because of this). Running as SYSTEM makes the store
   readable and removes the guard.
2. **True unattended access** — a user-context agent only runs while that user is logged in. Reaching
   a rebooted server or a machine sitting at the login screen (table-stakes for remote support)
   requires SYSTEM.
3. **SPEC-013 session selection / backstage** — the session-broker primitive built here is the
   substrate SPEC-013's session-switching UX drives.

**Success criteria:** the managed agent installs as an auto-start SYSTEM service; it holds the relay
connection and performs SPEC-016 enrollment as SYSTEM (reading/writing the SYSTEM-ACL'd `cak_`); it
spawns a capture/input worker into the active interactive session and relays frames; the worker is
respawned/retargeted on logon/logoff/console-connect; and the Phase B fail-fast guard is removed
because the store is now readable in-context.

## Background — why this is needed (confirmed in code)

- The persistent agent autostarts via `HKCU\…\Run` (`agent/src/startup.rs:21`, `STARTUP_KEY` = HKCU)
  → interactive-user token, not SYSTEM. The only SYSTEM service today is the separate `sas_service`
  (Secure Attention Sequence helper).
- SPEC-016 Phase B (`agent/src/credential_store.rs`) ACLs the `cak_` store to `*S-1-5-18` (SYSTEM) +
  `*S-1-5-32-544` (Administrators). In the current user context the agent writes but cannot read it
  back → the Phase B fail-fast guard (`agent/src/main.rs` `resolve_agent_credential`) emits
  "must run as the GuruConnect SYSTEM service (see SPEC-018)" instead of bricking.
- Capture/input live in the agent process (`agent/src/capture/`, `agent/src/input/`); a Session-0
  SYSTEM service cannot drive these against the interactive desktop without a per-session worker.

## Scope

### Included in v1

1. **Windows service install/lifecycle** (`agent/src/install.rs` + a new service module): register the
   managed agent as a **LocalSystem auto-start service** (`CreateServiceW` / a service crate),
   configure failure/recovery (restart on crash), and **replace the HKCU `Run` autostart for managed
   mode** (remove the Run entry on service install). Clean uninstall (stop + delete service).
2. **Service control loop** (Session 0, SYSTEM): owns the persistent WSS connection to the relay,
   performs SPEC-016 enrollment as SYSTEM (now able to read/write the `cak_` store), and dispatches
   session/connect requests to workers. Handles `SERVICE_CONTROL_STOP`/`SHUTDOWN` and
   `SERVICE_CONTROL_SESSIONCHANGE`.
3. **Session broker:** enumerate sessions (`WTSEnumerateSessionsW`), resolve the active interactive
   session (`WTSGetActiveConsoleSessionId`), obtain its user token (`WTSQueryUserToken` →
   `DuplicateTokenEx`), and spawn a **per-session capture/input worker** into that session's desktop
   (`CreateProcessAsUserW`, `winsta0\default`). The worker does DXGI capture + input injection in the
   user's session; the service relays frames over the existing transport.
4. **Service ↔ worker IPC:** a local, ACL'd channel (named pipe `\\.\pipe\guruconnect-<sessionId>`)
   carrying frames/input/control; pipe ACL restricted to SYSTEM + the target session user.
5. **Session-change handling:** on logon/logoff/console-connect/disconnect/lock/unlock, (re)spawn or
   retarget the worker so the active desktop is always the one being served.
6. **Remove the SPEC-016 Phase B fail-fast guard** once the service runs as SYSTEM (the store is
   readable in-context); keep the SYSTEM+Administrators ACL.

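
The session-change rules in items 3 and 5 can be sketched as a small, platform-independent state machine. This is only an illustration under stated assumptions: `SessionEvent`, `WorkerAction`, and `Broker` are hypothetical names, not existing agent types, and the policy shown (keep the worker across a lock, make sure one exists on unlock) is one possible reading of "(re)spawn or retarget", not a decided design. The real service would feed it from `SERVICE_CONTROL_SESSIONCHANGE` notifications and carry out each action through the token handoff described in this spec.

```rust
/// Session-change notifications the service host reacts to. The variants mirror
/// the WTS_* events named in this spec; the enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionEvent {
    ConsoleConnect(u32),
    Logon(u32),
    Logoff(u32),
    Lock(u32),
    Unlock(u32),
}

/// What the broker should do with the capture/input worker (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkerAction {
    /// No worker is running yet: start one in this session.
    Spawn(u32),
    /// A worker is serving `from`: move service to session `to`.
    Retarget { from: u32, to: u32 },
    /// The served session went away: stop its worker.
    Stop(u32),
    /// Nothing to change.
    Keep,
}

/// Tracks which session, if any, currently has a worker.
#[derive(Debug, Default)]
pub struct Broker {
    served: Option<u32>,
}

impl Broker {
    /// Pure decision step: updates the tracked session and returns the action
    /// the caller should perform. No Win32 calls happen here.
    pub fn on_event(&mut self, event: SessionEvent) -> WorkerAction {
        match event {
            // Assumption: logon, console-connect and unlock all mean
            // "this session should now be the one being served".
            SessionEvent::Logon(id)
            | SessionEvent::ConsoleConnect(id)
            | SessionEvent::Unlock(id) => {
                let action = match self.served {
                    Some(current) if current == id => WorkerAction::Keep,
                    Some(current) => WorkerAction::Retarget { from: current, to: id },
                    None => WorkerAction::Spawn(id),
                };
                self.served = Some(id);
                action
            }
            // Only a logoff of the served session tears the worker down.
            SessionEvent::Logoff(id) if self.served == Some(id) => {
                self.served = None;
                WorkerAction::Stop(id)
            }
            // Assumption: a lock keeps the worker (secure-desktop capture is
            // out of scope here); a logoff elsewhere is ignored.
            SessionEvent::Logoff(_) | SessionEvent::Lock(_) => WorkerAction::Keep,
        }
    }
}
```

Keeping the decision logic free of Win32 calls would let it be unit-tested without a VM. For example, a logon in session 1 followed by a console connect in session 2 yields `Spawn(1)` and then `Retarget { from: 1, to: 2 }`.
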
### Explicitly out of scope (anticipated, separate specs)

- **Session-selection / backstage UX** — the operator-facing picker and Session-0/secure-desktop
  command surface are **SPEC-013**; this spec only provides the broker primitive it drives.
- **Login-screen / secure-desktop (winlogon) capture** beyond the broker hook — the hard
  Secure-Desktop case is coordinated with SPEC-013; v1 here targets the active interactive session.
- **macOS/Linux service equivalents** — future SPEC-010 (cross-platform agents).

## Architecture

- **Agent splits into two roles:**
  - **service-host** (LocalSystem, Session 0): service lifecycle, relay transport, SPEC-016
    enrollment + `cak_` store, session broker, IPC server.
  - **session-worker** (per interactive session, user token): DXGI/GDI capture, input injection,
    IPC client. Spawned by the service via `CreateProcessAsUserW`.
- **Service install** (`install.rs`): `CreateServiceW` with `SERVICE_AUTO_START`, `SERVICE_WIN32_OWN_PROCESS`,
  recovery actions; uninstall stops + deletes. Replaces managed-mode `HKCU Run`.
- **Token handoff:** `WTSGetActiveConsoleSessionId` → `WTSQueryUserToken` → `DuplicateTokenEx`
  (primary token) → `CreateProcessAsUserW` with `lpDesktop = "winsta0\\default"`.
- **IPC:** named pipe per session, length-prefixed protobuf (reuse `proto/` message types where
  sensible), pipe security descriptor granting only SYSTEM + the session user.
- **Session events:** the service registers for `SERVICE_CONTROL_SESSIONCHANGE` and reacts to
  `WTS_CONSOLE_CONNECT`, `WTS_SESSION_LOGON/LOGOFF`, `WTS_SESSION_LOCK/UNLOCK`.

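
The length-prefixed framing named in the IPC bullet could look roughly like the sketch below. It is written against `std::io::Read`/`Write` so it runs anywhere, and it treats the payload as opaque bytes (the encoded protobuf message would go there). The 4-byte little-endian prefix, the `MAX_FRAME_LEN` cap, and the function names are assumptions for illustration; the spec does not fix a wire layout.

```rust
use std::io::{self, Read, Write};

/// Hypothetical upper bound on one frame, so a misbehaving peer cannot make
/// the reader allocate an arbitrary amount of memory.
pub const MAX_FRAME_LEN: u32 = 16 * 1024 * 1024;

/// Writes one frame: a 4-byte little-endian length followed by the payload.
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32; // safe: bounded by MAX_FRAME_LEN above
    w.write_all(&len.to_le_bytes())?;
    w.write_all(payload)?;
    w.flush()
}

/// Reads one frame written by `write_frame`. Rejects an oversized length
/// before allocating, and fails with `UnexpectedEof` on a truncated stream.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 4];
    r.read_exact(&mut header)?;
    let len = u32::from_le_bytes(header);
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len as usize];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Because the worker runs with the user's token, the service-host side should treat every incoming frame as untrusted input; checking the declared length before allocating is the minimum such validation.
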
## Security considerations

- **LocalSystem is maximal privilege** — minimize the service's attack surface; validate every
  relay-delivered command; never spawn a worker except into a legitimately-enumerated active session.
- **IPC pipe must be ACL'd** (SYSTEM + the specific session user only) so a non-admin user can't
  inject capture/input commands by connecting to the pipe.
- **Token hygiene:** close duplicated tokens promptly; don't leak SYSTEM or user primary tokens.
- The SPEC-016 `cak_` store (SYSTEM-ACL'd) is now correctly readable; the fail-fast guard is removed
  but the ACL stays.
- **Audit:** service start/stop, enrollment-as-SYSTEM, worker spawn, session attach/retarget — written
  to the existing event pipeline.

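
One way to express the pipe ACL requirement is as an SDDL string built per session, which the service could then hand to `ConvertStringSecurityDescriptorToSecurityDescriptorW` when creating the pipe. The sketch below only builds the strings and is not tied to any existing agent code: `pipe_name`, `pipe_sddl`, and `is_plausible_sid` are hypothetical helpers, and the rights chosen for the session user (generic read + write) are an assumption that planning would need to confirm.

```rust
/// Per-session pipe name, following the `guruconnect-<sessionId>` pattern
/// given in this spec.
pub fn pipe_name(session_id: u32) -> String {
    format!(r"\\.\pipe\guruconnect-{}", session_id)
}

/// Builds an SDDL string with a protected DACL ("D:P") containing two ACEs:
/// full access for LocalSystem ("SY") and read/write for the session user.
/// No other principal is listed, so nobody else is granted access.
/// Returns `None` if `user_sid` is not a plain "S-1-..." SID string, so a
/// malformed value cannot smuggle extra ACEs into the descriptor.
pub fn pipe_sddl(user_sid: &str) -> Option<String> {
    if !is_plausible_sid(user_sid) {
        return None;
    }
    // Assumption: GRGW is enough for the worker to act as the pipe client.
    Some(format!("D:P(A;;GA;;;SY)(A;;GRGW;;;{})", user_sid))
}

/// Syntactic check only: "S", revision "1", then at least two all-digit
/// components. It does not verify that the SID exists.
fn is_plausible_sid(sid: &str) -> bool {
    let parts: Vec<&str> = sid.split('-').collect();
    parts.len() >= 4
        && parts[0] == "S"
        && parts[1] == "1"
        && parts[2..]
            .iter()
            .all(|p| !p.is_empty() && p.bytes().all(|b| b.is_ascii_digit()))
}
```

Deriving the SID from the session's own token, rather than from anything the relay sends, keeps the ACL tied to the session the broker actually enumerated.
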
## Implementation details

- New service module (e.g. `agent/src/service/{mod.rs, broker.rs, ipc.rs}`); worker entry split out of
  the current capture path. New `Commands` variants or an internal `--service`/`--session-worker`
  dispatch in `agent/src/main.rs`.
- `install.rs`: service create/recovery/delete; drop the managed-mode HKCU `Run` write.
- `windows` crate features: `Win32_System_Services`, `Win32_System_RemoteDesktop`
  (`WTS*`), `Win32_Security`, `Win32_System_Threading` (`CreateProcessAsUserW`),
  `Win32_System_Pipes`.
- Remove the `resolve_agent_credential` fail-fast guard branch added in SPEC-016 Phase B.

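
The internal `--service`/`--session-worker` dispatch mentioned above might be as small as the following sketch. It is not how `agent/src/main.rs` currently parses arguments: `Role` and `parse_role` are hypothetical, and passing the session id as the argument after `--session-worker` is an assumption made for illustration. Anything that is not one of the two internal flags falls through to the existing command handling.

```rust
/// Which role this process should take (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Role {
    /// LocalSystem service host running in Session 0.
    ServiceHost,
    /// Capture/input worker for one interactive session.
    SessionWorker { session_id: u32 },
    /// Not an internal flag: defer to the agent's existing CLI handling.
    ExistingCli,
}

/// Decides the role from the arguments that follow the program name.
pub fn parse_role<I, S>(args: I) -> Result<Role, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut args = args.into_iter();
    let first = match args.next() {
        Some(arg) => arg,
        None => return Ok(Role::ExistingCli),
    };
    match first.as_ref() {
        "--service" => Ok(Role::ServiceHost),
        "--session-worker" => {
            // Assumption: the service passes the target session id as the
            // next argument when it spawns the worker.
            let raw = args
                .next()
                .ok_or_else(|| "--session-worker requires a session id".to_string())?;
            raw.as_ref()
                .parse::<u32>()
                .map(|session_id| Role::SessionWorker { session_id })
                .map_err(|e| format!("invalid session id {:?}: {}", raw.as_ref(), e))
        }
        _ => Ok(Role::ExistingCli),
    }
}
```

A caller would typically pass `std::env::args().skip(1)`. A worker started without a valid session id is rejected rather than silently defaulting to some session.
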
## Testing strategy

- **Service:** install → auto-start on boot → stop → uninstall on a clean VM.
- **`cak_` end-to-end:** SYSTEM service enrolls (SPEC-016), stores + reads the `cak_`, connects — the
  integration test SPEC-016 Phase B currently cannot run.
- **Session broker:** worker spawns into the active session; capture/input work; survives logoff→logon
  (respawn) and console-connect (retarget); fast-user-switch retarget.
- **Security:** non-admin cannot connect to the IPC pipe; worker runs with the user's token (not
  SYSTEM) in the user's desktop.

## Effort estimate & dependencies

- **Size:** X-Large (service host + worker split + token-handoff + IPC + session-change handling +
  install/uninstall).
- **Depends on:** SPEC-016 (enrollment + `cak_` store); the existing capture/input cores.
- **Unblocks:** SPEC-016 Phase B end-to-end runtime (and the parked managed-agent enrollment test on
  the internal beta machines); **SPEC-013** (session selection builds on this broker).

## Open questions

1. **Service vs. SYSTEM scheduled task** — a true Windows service (recovery, SCM integration) is the
   standard, robust choice; recommend service. Lock in planning.
2. **One multi-session worker vs. one worker per session** — per-session worker is simpler to reason
   about and isolates a crash to one session; confirm.
3. **IPC transport** — named pipe (recommended) vs. local TCP/loopback; pipe ACLing is the cleaner
   security story.
4. **Login-screen / Secure-Desktop capture** — how much (if any) in this spec vs. deferred to SPEC-013
   (it needs a worker in the winlogon/secure desktop, a distinct hard problem).
5. **Migration** — on upgrade, cleanly transition existing HKCU-`Run` managed installs to the service
   (remove the Run entry, install the service) without a gap.