The Phase A work passed cargo check + clippy + tests locally but missed
`cargo fmt --all -- --check` (the first step of the Linux CI job): module
ordering in db/mod.rs and two trailing-comment alignments in rate_limit.rs.
No logic change. Agent build failure on the prior run was transient infra
(verified: agent crate compiles clean locally).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Applies the four review fixes to POST /api/enroll, all in server/src/api/enroll.rs
(+ a new ENROLL_SITE_CONFLICT event type in server/src/db/events.rs):
1. HIGH — close the within-tenant cross-site silent-move hijack. A valid key for
site B presented for a machine_uid already bound to a DIFFERENT site is now
REFUSED (409 ENROLL_SITE_CONFLICT) instead of silently repointing the row and
minting a fresh cak_. No move, no key. Emits an ENROLL_SITE_CONFLICT audit event
+ alert TODO. Same-site match still resolves to reuse; a NULL prior site_id is a
first relational bind, not a move. The unauthenticated site_move mint path is
removed; deliberate moves are deferred to the Phase-B --reassign flow + dashboard.
2. MEDIUM — kill the timing/enumeration oracle. Unknown site_code and no-active-key
early rejects now pay a dummy Argon2id verify against a fixed, valid throwaway PHC
constant (TIMING_EQUALIZER_PHC) before returning the identical 401, so every
rejection path pays one KDF. The constant is asserted valid + verifying in tests.
3. LOW — fix the new-enroll TOCTOU. The dedup lookup + INSERT is wrapped in a bounded
retry loop: a concurrent first-enroll of the same machine_uid whose INSERT loses
the unique-index race (classified by is_machine_uid_conflict on SQLSTATE 23505 +
machine_uid constraint) now re-looks-up and converges to reuse instead of 500ing.
A non-machine_uid unique violation still surfaces as 500.
4. LOW — make the collision-gate doc honest + leave an enforcement TODO. The module
doc now states the gate withholds only a NEWLY minted cak_ (a prior clean cak_
survives) and that nothing consults enrollment_state at control time yet, with a
TODO(SPEC-016 Phase B/D) marker for relay/control-plane enforcement + revocation.
Verify: cargo check, cargo clippy --all-targets, and cargo test all clean on this
Windows host (104 tests pass). Two DB-gated tests (cross-site bound-site_id exposure,
machine_uid-vs-agent_id conflict classification) no-op without TEST_DATABASE_URL and
run against real Postgres in CI; the Linux target / real-Postgres handler path is
validated there, not on this host.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Server-side zero-touch per-site enrollment (Phase A: backend + DB only;
agent-side machine_uid derivation is Phase B, server treats it as opaque).
Migration 010_spec016_enrollment.sql:
- connect_sites: relational site anchor (site_code natural key, per-tenant
unique). The spec assumed a sites table existed; it did not (site/company
were free-text columns on connect_machines), so this creates a minimal one.
- site_enrollment_keys: rotatable, Argon2id-hashed cek_ secret + monotonic
version + hex fingerprint + active flag; one-active-per-site partial unique.
- connect_machines: + site_id (FK), + enrollment_state ('active'|'pending')
collision gate, + per-tenant (tenant_id, machine_uid) unique index added
ALONGSIDE the 008 global index (the connect-path upsert_machine ON CONFLICT
arbiter binds to 008 — dropping it would break live reconnect).
- connect_sites.enrollment_policy: reserved (default auto-approve), not enforced.
auth/enrollment_keys.rs: cek_ mint (256-bit, OS CSPRNG), Argon2id hash/verify
(reuses auth::password), and hex fingerprint vN (XXXX) per resolved-decision #3.
db/sites.rs + db/enrollment_keys.rs: runtime sqlx persistence; rotate_key
deactivates+inserts in one tx to hold the one-active-key invariant.
POST /api/enroll (public, api/enroll.rs): site_code+cek_ verify against active
key -> dedup on (tenant, machine_uid) -> new / reuse / site-move / collision.
Collision gate (PROVISIONAL heuristic: online existing row + different hostname)
-> pending, no usable cak_, alert. Mints cak_ via existing agent_keys path in the
exact form relay::validate_agent_api_key expects. Per-(site_code,IP) rate-limit +
lockout (EnrollLimiter). Audit events + [ENROLL] alert markers with
TODO(SPEC-016) #dev-alerts notes.
Admin (JWT) api/sites.rs: POST /api/sites/:id/enrollment-key/rotate (plaintext +
fingerprint once) and GET .../enrollment-key (fingerprint/version, no secret).
Routes wired in main.rs (enroll public, rotation admin). 13 new unit tests;
full server suite 99 passing. cargo check + clippy clean on the host (Windows)
target — Linux cross-target not installed here; server crate is platform-neutral
Rust. No sqlx offline cache needed (codebase uses runtime queries, no query!).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>