SPEC-016 Phase B: agent enrollment (machine_uid, first-run enroll, cak_ storage) #6
Phase B of SPEC-016 (zero-touch per-site enrollment) — agent side. Builds on the merged Phase A server backend.
What's here
Review
REQUEST CHANGES -> all 6 findings fixed + focused re-review CONFIRMED CLOSED:
Dependency / not-yet-runnable
The managed agent must run as a SYSTEM service for the SYSTEM-ACL'd cak_ store to be readable end-to-end; that service host is a SEPARATE upcoming spec (SPEC-017, per Mike's decision to split it from SPEC-013). Until it lands, the agent fails fast with a clear message rather than bricking. Enrollment logic is unit-tested (52 tests), but the live enroll->store->connect cycle can only be integration-tested once the service host exists.
Local verify (Windows host): fmt --check, clippy -D warnings, release build (x86_64-pc-windows-msvc), 52 tests — all green.
Spec: docs/specs/SPEC-016-zero-touch-enrollment.md.
New enroll module: on a managed agent with no stored cak_ but with enrollment_key + site_code, POST machine_uid + hostname + labels to <https-base>/api/enroll and persist the minted cak_. Handles every Phase A status code distinctly:

- 201 new / 200 reuse -> persist cak_ (DPAPI store) and connect
- 202 collision_pending -> log "pending operator confirmation", slow re-check loop (no key issued; cannot connect until confirmed)
- 401 ENROLL_REJECTED / 409 ENROLL_SITE_CONFLICT -> distinct actionable errors, long backoff (won't fix without operator action, but recovers automatically once it does); no tight loop
- 429 -> honor Retry-After, short backoff
- network / 5xx / decode -> short backoff

The enrollment_key and cak_ are never logged. Uses the existing reqwest client and the update path's TLS posture (rustls; dev-insecure only in debug + opt-in). Wire-contract unit tests pin the request shape against the server's EnrollRequest/EnrollLabels and decode active + pending bodies.

main.rs run-mode wiring: before a managed agent connects, resolve the operating credential by precedence: stored cak_ (steady state, no network) -> first-run enrollment -> DEPRECATED legacy api_key (transition only, logged at WARNING) -> error. The relay already accepts the cak_ as the api_key query param, so the persistent transport authenticates with it unchanged. Attended/support-code and viewer paths are untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Phase B agent commits (d0b8db0..367906b) are already on main (fast-forwarded); SPEC-018 + the SPEC-017->018 ref fix landed separately on main. Closing as already-merged.
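For readers skimming the commit message, the status-code handling and the credential precedence it describes can be sketched roughly as follows. This is a minimal illustration only, not the code in the enroll module: the names `EnrollOutcome`, `Credential`, `classify` and `resolve_credential` are hypothetical, and the backoff durations are assumed placeholder values, since the PR does not state the real ones.

```rust
use std::time::Duration;

// Assumed placeholder durations; the PR only says "short", "long" and "slow".
const SHORT_BACKOFF: Duration = Duration::from_secs(30);
const LONG_BACKOFF: Duration = Duration::from_secs(3600);
const PENDING_RECHECK: Duration = Duration::from_secs(300);

/// Hypothetical outcome type for one POST to /api/enroll.
#[derive(Debug, PartialEq)]
enum EnrollOutcome {
    /// 201 new / 200 reuse: persist the cak_ and connect.
    Active,
    /// 202 collision_pending: no key issued, re-check slowly.
    Pending { recheck: Duration },
    /// 401 / 409: will not fix itself without operator action.
    OperatorAction { code: &'static str, retry: Duration },
    /// 429, 5xx and anything unexpected: retry after a short wait.
    Transient { retry: Duration },
}

/// Map an HTTP status (plus an optional Retry-After, in seconds) to an outcome.
fn classify(status: u16, retry_after_secs: Option<u64>) -> EnrollOutcome {
    match status {
        200 | 201 => EnrollOutcome::Active,
        202 => EnrollOutcome::Pending { recheck: PENDING_RECHECK },
        401 => EnrollOutcome::OperatorAction { code: "ENROLL_REJECTED", retry: LONG_BACKOFF },
        409 => EnrollOutcome::OperatorAction { code: "ENROLL_SITE_CONFLICT", retry: LONG_BACKOFF },
        // Honor Retry-After when the server sent one.
        429 => EnrollOutcome::Transient {
            retry: retry_after_secs.map(Duration::from_secs).unwrap_or(SHORT_BACKOFF),
        },
        _ => EnrollOutcome::Transient { retry: SHORT_BACKOFF },
    }
}

/// Hypothetical result of credential resolution. Debug is deliberately not
/// derived so that key material cannot end up in a formatted log line.
enum Credential {
    StoredCak(String),
    NeedsEnrollment,
    LegacyApiKey(String),
}

/// Precedence: stored cak_ -> first-run enrollment -> legacy api_key -> error.
fn resolve_credential(
    stored_cak: Option<String>,
    has_enroll_config: bool, // enrollment_key and site_code are both present
    legacy_api_key: Option<String>,
) -> Result<Credential, &'static str> {
    if let Some(cak) = stored_cak {
        return Ok(Credential::StoredCak(cak)); // steady state, no network
    }
    if has_enroll_config {
        return Ok(Credential::NeedsEnrollment);
    }
    if let Some(key) = legacy_api_key {
        // Warn about the deprecated path without printing the key itself.
        eprintln!("WARNING: using DEPRECATED legacy api_key");
        return Ok(Credential::LegacyApiKey(key));
    }
    Err("no stored cak_, no enrollment config, and no legacy api_key")
}
```

In this sketch a caller would run `resolve_credential` once at startup and enter the enrollment loop only on `NeedsEnrollment`, sleeping for the duration carried by each non-`Active` outcome before retrying. Network and decode failures never produce a status code, so a real implementation would map them to the short-backoff case separately.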
Pull request closed