ScreenConnect-class managed enrollment: one signed installer per site, machines self-register on first run and the server mints a per-machine cak_ key bound to a deterministic machine_uid (dedups re-installs). Per-site rotatable enrollment key (long secret + vN (XXXX) fingerprint); rotating blocks new enrollments from old installers, leaves enrolled agents untouched. Auto-approve + new-enrollment/site-move alert. Resolves SPEC-007's signature-vs-appended-config open question: sign the base agent once in CI + per-site signed wrapper that writes site config around the signed bytes (never appended into the PE). Deferred (room reserved): enrollment policy + per-seat licensing, --enroll-key/--site-code/--reassign flag overrides, technician-assisted interactive install. Tracking todo dbfe6a56. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
SPEC-016: Zero-Touch Per-Site Agent Enrollment
Status: Proposed | Priority: P1 | Requested By: Mike (2026-06-02) | Estimated Effort: X-Large
Overview
Give GuruConnect a ScreenConnect-class managed-agent enrollment flow: a technician runs
one signed installer per site on every machine at that site — no per-machine key
minting, no flags, no typing — and each machine self-registers on first run, the
server minting it a per-machine cak_ key bound to a stable, machine-derived
machine_uid. Each site installer carries a rotatable per-site enrollment key (a long
server-generated secret) plus a short human-readable fingerprint (vN (XXXX)) so an
operator can tell at a glance whether an installer is current. Rotating a site's key blocks
new enrollments from old installers while leaving already-enrolled machines untouched
(they hold their own cak_).
This is the missing piece that turns the v2 secure-session-core (SPEC-004 per-agent keys +
machine_uid) into a real product workflow, and it resolves SPEC-007's open
signature-vs-appended-config question: the agent binary is signed once in CI
(already shipped via release.yml), and per-site customization rides in a thin signed
wrapper that writes site config to the endpoint at install time — never appended into the
signed PE.
Success criteria:
- A tech installs one site installer on N machines; all N appear in the console under the correct company/site, each as a distinct, deduplicated machine — zero per-machine setup.
- Re-installing / re-imaging the same hardware reuses the existing machine row (no ghost duplicates — the failure mode SPEC-004 documents).
- Rotating a site's enrollment key makes old installers unable to enroll new machines, while every already-enrolled agent keeps working.
- Every distributed installer is validly Authenticode-signed (SmartScreen/WDAC clean).
Background — what exists today (confirmed in code)
- Embedded config is append-based and breaks signing. server/src/api/downloads.rs (download_agent, ~:152) reads static/downloads/guruconnect.exe and appends MAGIC_MARKER + len:u32 + JSON (:196) to the end of the PE. The agent reads it back in agent/src/config.rs (read_embedded_config, :223). Appending bytes after a signed PE invalidates the Authenticode signature, so the current customization path and the newly-shipped CI signing are mutually exclusive.
- No self-registration exists. Per-agent cak_ keys are minted admin-only in server/src/api/machine_keys.rs (create_key, :119; "Admin issued a per-agent key", :146). There is no endpoint where an agent first-run exchanges an enrollment credential for its own key.
- Relay already accepts per-agent keys. server/src/relay/mod.rs (validate_agent_api_key, :417) calls crate::auth::agent_keys::verify_agent_key (:422), which is the cak_ path, then falls back to the deprecated shared AGENT_API_KEY (:444, logs a "migrate to per-agent cak_" warning).
- Key primitives exist. server/src/auth/agent_keys.rs: generate_agent_key mints a cak_-prefixed high-entropy key (:36/:46); verify_agent_key (:71). server/src/db/agent_keys.rs already inserts into connect_agent_keys (machine_id, key_hash, tenant_id) (:47); the v2 tenancy column is present (migration 004_v2_secure_session_core.sql).
- Identity is a random config UUID, not machine-derived: the root cause of duplicates per SPEC-004 (agent/src/config.rs generate_agent_id, :90).
- Agent mode dispatch: agent/src/main.rs Commands::Install (:160) → run_install; agent/src/config.rs detect_run_mode (:162) returns RunMode::PermanentAgent when embedded config is present.
Scope
Included in v1 (CORE)
- machine_uid: deterministic machine identity. Derive a stable id from the Windows MachineGuid (HKLM\SOFTWARE\Microsoft\Cryptography\MachineGuid), independent of the config-file agent_id. (Shared root with SPEC-004; whichever lands first owns the impl, the other consumes it.) Used as the dedup key for register/move.
- Per-site enrollment key + fingerprint.
  - Long (≥256-bit) server-generated secret per site, stored hashed (Argon2id, same as cak_/passwords), never recoverable in plaintext after issue.
  - A non-secret fingerprint = monotonic version + short derived code, rendered vN (XXXX) (e.g. v3 (7F2A)), shown in the dashboard, baked into the installer filename, and reported by the agent at enrollment.
  - Rotate regenerates the secret and bumps the version; old installers are rejected for new enrollments; existing agents (holding cak_) are unaffected.
- Self-registration endpoint. New POST /api/enroll (public, unauthenticated by JWT; gated by the enrollment key) accepting { site_code, enrollment_key, machine_uid, hostname, labels{company,site,department,device_type,tags} }:
  - Verify (site_code, enrollment_key) against the current per-site key.
  - Dedup by machine_uid within the site: if the machine exists, reuse the row and rotate its cak_; else create the machine row.
  - Mint a cak_ (reuse generate_agent_key), store hashed via db::agent_keys bound to machine_id (+ tenant_id from the site), return the plaintext cak_ once.
  - Emit an audit event + new-enrollment alert (and a site-move alert when an existing machine_uid enrolls under a different site).
  - Rate-limit + lockout per (site_code, source-IP) as defense-in-depth (the key is long, so this is belt-and-suspenders, not load-bearing).
- Agent first-run enrollment. On RunMode::PermanentAgent with no stored cak_: read site config → call /api/enroll with machine_uid → persist the returned cak_ to a SYSTEM-only protected store (HKLM under a SYSTEM-only ACL, or DPAPI-machine) → connect to wss://connect.azcomputerguru.com/ws/agent using the cak_. On subsequent runs, use the stored cak_ directly (no re-enroll).
- Sign-once base + per-site signed wrapper (resolves SPEC-007 open question).
  - The base agent is signed once in CI (release.yml, already shipped) and stays byte-identical for everyone.
  - Per-site customization (labels + enrollment key + fingerprint) is delivered to the endpoint at install time via a signing-safe channel, NOT appended to the signed PE. v1 mechanism: a small signed wrapper/bootstrapper (or signed MSI) that carries the site config, lays down the signed agent, and writes the site config to the protected config location. Decision to lock in planning: wrapper-exe vs MSI (see Open Questions).
  - Deprecate the append path in downloads.rs for managed installs (keep only for attended/support-code if still needed), eliminating the signature-invalidation defect.
- Auto-approve posture. A self-registered machine is live and controllable immediately (ScreenConnect parity). The new-enrollment alert is the tripwire.
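The verify-then-dedup decision in the self-registration item can be sketched as below. This is a minimal in-memory illustration, not the planned implementation: Registry, Site, EnrollOutcome, and enroll are hypothetical names, a plain string comparison stands in for the Argon2id verification, and cak_ minting, audit events, and alerts are left out.

```rust
use std::collections::HashMap;

/// Result of one enrollment attempt (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
pub enum EnrollOutcome {
    Created { machine_id: u64 },
    Reused { machine_id: u64 },
    SiteMoved { machine_id: u64, from_site: String },
    Rejected,
}

pub struct Site {
    /// Assumption: stands in for the stored Argon2id hash of the current key.
    pub current_key: String,
}

pub struct Registry {
    pub sites: HashMap<String, Site>,
    /// machine_uid -> (machine_id, site_code)
    pub machines: HashMap<String, (u64, String)>,
    next_id: u64,
}

impl Registry {
    pub fn new() -> Self {
        Registry { sites: HashMap::new(), machines: HashMap::new(), next_id: 0 }
    }

    pub fn enroll(&mut self, site_code: &str, enrollment_key: &str, machine_uid: &str) -> EnrollOutcome {
        // Step 1: the (site_code, enrollment_key) pair must match the site's current key.
        let key_ok = self
            .sites
            .get(site_code)
            .map_or(false, |s| s.current_key == enrollment_key);
        if !key_ok {
            return EnrollOutcome::Rejected;
        }
        // Step 2: dedup by machine_uid. A known machine is reused, or moved when the site differs.
        if let Some((id, site)) = self.machines.get_mut(machine_uid) {
            if site.as_str() == site_code {
                return EnrollOutcome::Reused { machine_id: *id };
            }
            let from_site = std::mem::replace(site, site_code.to_string());
            return EnrollOutcome::SiteMoved { machine_id: *id, from_site };
        }
        // Step 3: unknown machine_uid, so create a new machine row.
        self.next_id += 1;
        self.machines
            .insert(machine_uid.to_string(), (self.next_id, site_code.to_string()));
        EnrollOutcome::Created { machine_id: self.next_id }
    }
}
```

In a real handler, each non-rejected outcome would be followed by minting or rotating the cak_ and emitting the matching audit event and alert.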
Explicitly out of scope (ANTICIPATED — reserve room, do NOT build in v1)
The v1 data model and agent mode-dispatch must leave room for these without building them:
- Per-site enrollment POLICY: a sites.enrollment_policy field (default auto-approve; future pending-approval) plus per-seat/per-endpoint licensing controls. Commercial, multi-tenant (the tenant_id column already exists). Its own future SPEC.
- Flag overrides: --enroll-key / --site-code (generic installer, key supplied on the command line) and --reassign (move an existing machine to a new site, gated by possession of the destination site's key, with an explicit accidental-move guard: a different-site re-run refuses unless --reassign is passed) + cross-client move policy. Backend (machine_uid + authorized site + cak_) is designed to support it; CLI surface is deferred.
- Technician-assisted interactive install (--technician on a generic installer): prompts for the tech's own server credentials, and on auth presents a validated Company/Site/tags picker from the live authorized list (authz-by-identity, full audit trail). Heaviest path (interactive UI + auth/list callback); deferred.
All three converge on the same backend operation delivered in v1: machine_uid +
authorized site + issued cak_. v1 only ships the per-site-embedded-key door.
Architecture
- Agent (agent/): compute machine_uid; first-run enroll → store cak_; use stored cak_ thereafter; read site config from the wrapper-written location instead of an appended PE blob. Touches config.rs (EmbeddedConfig/detect_run_mode/storage), main.rs (Install/run-mode), a new enroll client module, transport auth.
- Relay-server (server/): new POST /api/enroll; per-site key issue/rotate/verify; machine_uid dedup + site-move on register; audit + alert emission; rate-limit/lockout. Touches api/ (new enroll.rs, sites key endpoints), auth/agent_keys.rs, db/agent_keys.rs, relay/mod.rs (enrollment vs. connect), main.rs routes.
- Dashboard: per-site enrollment-key display (fingerprint vN (XXXX)), Rotate action, "current installer" download wired to the signed wrapper build. (Builder UI is SPEC-007; this spec supplies the key/fingerprint/rotation it consumes.)
- DB migration: site_enrollment_keys (or columns on the site): site_id, key_hash, version, fingerprint, created_at, rotated_at, active. Reserve sites.enrollment_policy (nullable, default auto-approve) for the anticipated policy work. connect_machines gains machine_uid (unique per tenant/site).
- Protobuf (proto/guruconnect.proto): no wire change required for enrollment if /api/enroll is REST; AgentStatus label fields per SPEC-007 (department, device_type) ride along if landed together.
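To make the key record and rotation concrete, here is a rough sketch of a site_enrollment_keys row and a rotate step. The field names follow the migration bullet (timestamps omitted for brevity); SiteEnrollmentKey, render_fingerprint, and rotate are hypothetical names. The spec does not say how the short code is derived, so FNV-1a over the stored key hash is used purely as a stand-in.

```rust
/// Illustrative row shape; created_at / rotated_at are left out of this sketch.
#[derive(Debug, Clone)]
pub struct SiteEnrollmentKey {
    pub site_id: u64,
    pub key_hash: String,
    pub version: u32,
    pub fingerprint: String,
    pub active: bool,
}

/// FNV-1a (32-bit). Assumption: a placeholder for whatever derivation is chosen.
fn fnv1a32(data: &[u8]) -> u32 {
    let mut h: u32 = 0x811c_9dc5;
    for b in data {
        h ^= u32::from(*b);
        h = h.wrapping_mul(0x0100_0193);
    }
    h
}

/// Render the non-secret fingerprint as "vN (XXXX)" from the version and key hash.
pub fn render_fingerprint(version: u32, key_hash: &str) -> String {
    let code = fnv1a32(key_hash.as_bytes()) & 0xFFFF;
    format!("v{} ({:04X})", version, code)
}

/// Deactivate the current key and return its successor with a bumped version.
pub fn rotate(current: &mut SiteEnrollmentKey, new_key_hash: String) -> SiteEnrollmentKey {
    current.active = false;
    let version = current.version + 1;
    SiteEnrollmentKey {
        site_id: current.site_id,
        fingerprint: render_fingerprint(version, &new_key_hash),
        key_hash: new_key_hash,
        version,
        active: true,
    }
}
```

Deriving the code from the stored hash rather than the plaintext secret keeps the fingerprint computable at any time after issue; whether that is the right input is a design choice the spec leaves open.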
Security considerations
- Two-tier credential model: low-sensitivity enrollment key (gates "may register", shared per site, rotatable) vs. high-sensitivity per-machine cak_ (operating credential, per-machine revocation). Compromise of an enrollment key is recovered by rotating one site; no fleet-wide re-key.
- Enrollment keys stored hashed (Argon2id); plaintext shown once at issue/rotate.
- cak_ at rest on the endpoint must be SYSTEM-only (HKLM SYSTEM ACL or DPAPI-machine) so a non-admin user can't read it.
- machine_uid binding is the spoof-guard SPEC-004 wants: a cak_ is bound to a machine_uid; a different box presenting another box's cak_ is detectable.
- Authorization model for moves/enrolls is possession-of-destination-key in v1 (identity-based authz deferred to the technician-assisted path).
- Open registration risk is mitigated by requiring (site_code + long key) and rate-limit/lockout; auto-approve is acceptable because the enrollment key is the gate and every enrollment/site-move fires an alert.
- Audit events: enroll, re-enroll/reuse, site-move, key-rotate, all logged with machine_uid, site, and source IP.
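The rate-limit and lockout per (site_code, source IP) could look roughly like the following. This is a sketch under assumptions: EnrollLimiter and its methods are hypothetical, the thresholds are caller-supplied, state lives in process memory, and the clock is passed in as seconds so the logic stays deterministic.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Tracks failed enrollment attempts per (site_code, source IP).
pub struct EnrollLimiter {
    max_failures: u32,
    lockout_secs: u64,
    /// key -> (consecutive failures, locked-until timestamp in seconds)
    state: HashMap<(String, IpAddr), (u32, u64)>,
}

impl EnrollLimiter {
    pub fn new(max_failures: u32, lockout_secs: u64) -> Self {
        EnrollLimiter { max_failures, lockout_secs, state: HashMap::new() }
    }

    /// True while the pair is inside its lockout window.
    pub fn is_locked(&self, site_code: &str, ip: IpAddr, now: u64) -> bool {
        self.state
            .get(&(site_code.to_string(), ip))
            .map_or(false, |&(_, until)| now < until)
    }

    /// Count a failed key check; start a lockout once the threshold is reached.
    pub fn record_failure(&mut self, site_code: &str, ip: IpAddr, now: u64) {
        let entry = self.state.entry((site_code.to_string(), ip)).or_insert((0, 0));
        entry.0 += 1;
        if entry.0 >= self.max_failures {
            entry.0 = 0;
            entry.1 = now + self.lockout_secs;
        }
    }

    /// A successful enrollment clears the failure history for the pair.
    pub fn record_success(&mut self, site_code: &str, ip: IpAddr) {
        self.state.remove(&(site_code.to_string(), ip));
    }
}
```

A handler would call is_locked before verifying the key and record_failure or record_success afterwards; a multi-instance deployment would need shared state instead of a local map.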
Testing strategy
- Unit: machine_uid derivation stability; enrollment-key verify/rotate; fingerprint derivation; cak_ mint/hash/verify; dedup decision (new vs. reuse vs. move).
- Integration: enroll new → row + cak_ issued; re-enroll same machine_uid → reuse, no duplicate; enroll with rotated (old) key → rejected; old cak_ still connects after rotation; rate-limit/lockout trips; site-move emits alert.
- Manual: build a site wrapper installer → run on a clean VM → appears in console under correct site, immediately controllable; re-image VM → same row reused; signtool verify /pa passes on the distributed wrapper and the laid-down agent.
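As an illustration of the machine_uid derivation-stability check listed under Unit, the sketch below normalizes a MachineGuid string so that casing, braces, and surrounding whitespace do not change the result. machine_uid_from_guid is a hypothetical helper; the real derivation may hash or salt the value, and reading the registry is out of scope here.

```rust
/// Canonicalize a MachineGuid-style string into a stable identifier.
/// Assumption: the uid is the lowercase 8-4-4-4-12 hex form of the GUID.
pub fn machine_uid_from_guid(raw: &str) -> Option<String> {
    let s = raw
        .trim()
        .trim_start_matches('{')
        .trim_end_matches('}')
        .to_ascii_lowercase();
    if s.len() != 36 {
        return None;
    }
    for (i, c) in s.chars().enumerate() {
        // Hyphens sit at fixed positions; every other character must be hex.
        let ok = match i {
            8 | 13 | 18 | 23 => c == '-',
            _ => c.is_ascii_hexdigit(),
        };
        if !ok {
            return None;
        }
    }
    Some(s)
}
```

A stability test then asserts that differently formatted readings of the same GUID map to the same machine_uid and that malformed input is rejected rather than silently accepted.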
Effort estimate & dependencies
- Size: X-Large (agent + relay + DB migration + CI build/sign wrapper + dashboard key/rotation surface).
- Depends on: SPEC-004 machine_uid (shared root); the CI signing already shipped (SPEC-001 §2 / release.yml).
- Unblocks: SPEC-007 (installer builder gets a real per-site key + the signing resolution), and the parked managed-agent test deployment on the internal beta machines.
- Relationship to v2 phases: sits with the Phase-1 secure-session-core (per-agent keys + identity) and feeds Phase-2 dashboard work.
Open questions
- Wrapper shape: signed standalone bootstrapper .exe vs. signed MSI for the per-site installer. MSI gives clean install/uninstall + GPO/Intune deploy; bootstrapper is lighter. Lock in planning.
- cak_ storage: HKLM SYSTEM-ACL registry value vs. DPAPI-machine-protected file; pick one for the protected store.
- Fingerprint code style: raw hex (7F2A) vs. the RMM-house word style (GREEN-FALCON). Cosmetic; pick for operator readability.
- Cross-tenant machine_uid collisions (same hardware imaged across tenants): scope machine_uid uniqueness per tenant, not globally.
- Attended (support-code) path: confirm whether the append-based download_support path is retained as-is or also migrated off appending.