feat(server): dedup machines on machine_uid (SPEC-004 Task 2)

Persist the agent-reported machine_uid and dedup connect_machines on it so a
single physical machine can't register duplicate rows when its config-file
agent_id regenerates (the ghost-session root cause).
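A minimal in-memory sketch of the dedup idea described above, not the server's actual code. `MachineRow`, `MachineTable`, and `upsert` are hypothetical names, the numeric `id` is an assumption, and the real change is expressed as SQL `ON CONFLICT` clauses rather than a `Vec` scan.

```rust
/// Hypothetical in-memory stand-in for a `connect_machines` row.
#[derive(Debug, Clone, PartialEq)]
pub struct MachineRow {
    pub id: u64,
    pub agent_id: String,
    pub machine_uid: Option<String>,
    pub hostname: String,
}

/// Hypothetical table model used only to illustrate the two dedup paths.
#[derive(Default)]
pub struct MachineTable {
    rows: Vec<MachineRow>,
    next_id: u64,
}

impl MachineTable {
    /// Upsert keyed on `machine_uid` when one is supplied, otherwise on
    /// `agent_id` (the legacy path). Returns the id of the affected row.
    pub fn upsert(&mut self, agent_id: &str, machine_uid: Option<&str>, hostname: &str) -> u64 {
        let existing = match machine_uid {
            // uid path: a regenerated agent_id still lands on the same row.
            Some(uid) => self
                .rows
                .iter_mut()
                .find(|r| r.machine_uid.as_deref() == Some(uid)),
            // Legacy path: no uid reported, so agent_id is the only key.
            None => self.rows.iter_mut().find(|r| r.agent_id == agent_id),
        };
        if let Some(row) = existing {
            row.agent_id = agent_id.to_string();
            row.hostname = hostname.to_string();
            return row.id;
        }
        // No matching row: insert a new one.
        self.next_id += 1;
        self.rows.push(MachineRow {
            id: self.next_id,
            agent_id: agent_id.to_string(),
            machine_uid: machine_uid.map(str::to_string),
            hostname: hostname.to_string(),
        });
        self.next_id
    }

    /// Number of rows currently stored.
    pub fn len(&self) -> usize {
        self.rows.len()
    }
}
```

In this model, two connects that report the same uid under different `agent_id` values resolve to one row, which is the ghost-session case the commit targets.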

- migration 008: nullable connect_machines.machine_uid + partial unique index
  (WHERE machine_uid IS NOT NULL); idempotent, startup-applied.
- upsert_machine: two-path dedup (ON CONFLICT machine_uid when present, else
  the legacy ON CONFLICT agent_id path, unchanged).
- session reattach: consult a machine_uid index before falling back to
  agent_id; every session-removal path also purges the uid index entry.
- security: keyed (cak_) agents stay authoritative — their claimed machine_uid
  is dropped (effective_machine_uid=None); uid is dedup-only for un-keyed /
  support-code agents. Startup restore skips uid-indexing keyed machines and
  fails closed if the keyed-set query errors.
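The reattach and security bullets above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this commit: `SessionIndex` and its methods are hypothetical, session ids are modeled as `u64`, and only the name `effective_machine_uid` comes from the message itself (its signature here is a guess).

```rust
use std::collections::HashMap;

/// Keyed (`cak_`) agents have their claimed uid dropped; un-keyed agents keep it.
/// Assumed signature; the commit only names the concept.
pub fn effective_machine_uid(is_keyed: bool, claimed: Option<&str>) -> Option<&str> {
    if is_keyed {
        None
    } else {
        claimed
    }
}

/// Hypothetical reattach index: uid is consulted first, then agent_id.
#[derive(Default)]
pub struct SessionIndex {
    by_uid: HashMap<String, u64>,
    by_agent: HashMap<String, u64>,
}

impl SessionIndex {
    /// Register a session under its agent_id and, if present, its uid.
    pub fn insert(&mut self, session_id: u64, agent_id: &str, uid: Option<&str>) {
        self.by_agent.insert(agent_id.to_string(), session_id);
        if let Some(uid) = uid {
            self.by_uid.insert(uid.to_string(), session_id);
        }
    }

    /// Look up a session to reattach: uid index first, agent_id as fallback.
    pub fn find_for_reattach(&self, agent_id: &str, uid: Option<&str>) -> Option<u64> {
        uid.and_then(|u| self.by_uid.get(u).copied())
            .or_else(|| self.by_agent.get(agent_id).copied())
    }

    /// Remove a session and purge it from both indexes.
    pub fn remove(&mut self, session_id: u64) {
        self.by_uid.retain(|_, id| *id != session_id);
        self.by_agent.retain(|_, id| *id != session_id);
    }
}
```

Because a keyed agent's effective uid is `None`, a uid it claims never reaches the uid index, so it cannot be used to reattach another machine's session.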

74 server tests pass; clippy clean. Implements specs/v2-stable-identity/plan.md Task 2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:06:50 -07:00
parent 97780304e7
commit ffca7f0cee
6 changed files with 697 additions and 50 deletions

@@ -245,9 +245,39 @@ async fn main() -> Result<()> {
             "Reconciling {} managed session(s) from database",
             machines.len()
         );
+        // Machines bound to an active `cak_` key. For these the key→machine
+        // binding is authoritative (SPEC-004 Task 2), so we must NOT index a
+        // restored keyed session by its stored `machine_uid`: doing so would
+        // let an un-keyed agent spoofing that uid reattach the keyed machine's
+        // offline session after a restart. The connect path never writes a uid
+        // for keyed agents, so a non-NULL uid on a keyed row can only come from
+        // a legacy pre-keying row — but close the gap regardless. On query
+        // failure, fail closed (treat all machines as keyed: index none by uid)
+        // rather than risk indexing a keyed machine.
+        let keyed_ids = match db::agent_keys::keyed_machine_ids(db.pool()).await {
+            Ok(ids) => ids,
+            Err(e) => {
+                tracing::warn!(
+                    "Could not load keyed-machine set; suppressing uid reattach index for all restored machines: {}",
+                    e
+                );
+                machines.iter().map(|m| m.id).collect()
+            }
+        };
         for machine in machines {
+            // Keyed machines get None (uid reattach disabled); un-keyed
+            // machines keep their stored uid for legitimate reattach.
+            let restore_uid = if keyed_ids.contains(&machine.id) {
+                None
+            } else {
+                machine.machine_uid.as_deref()
+            };
             sessions
-                .restore_offline_machine(&machine.agent_id, &machine.hostname)
+                .restore_offline_machine(
+                    &machine.agent_id,
+                    &machine.hostname,
+                    restore_uid,
+                )
                 .await;
         }
     }
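The fail-closed selection in this hunk can be isolated into a small, synchronous sketch for reasoning or unit testing. The `Machine` struct, the `u64` id type, and both function names are assumptions made for illustration; the real code calls `db::agent_keys::keyed_machine_ids` and works on the server's own row type.

```rust
use std::collections::HashSet;

/// Hypothetical, trimmed-down machine row (the id type is an assumption).
pub struct Machine {
    pub id: u64,
    pub machine_uid: Option<String>,
}

/// On query failure, treat every machine as keyed so that none is indexed by uid.
pub fn keyed_or_fail_closed<E>(
    result: Result<HashSet<u64>, E>,
    machines: &[Machine],
) -> HashSet<u64> {
    match result {
        Ok(ids) => ids,
        // Fail closed: the "keyed" set becomes the full set of machine ids.
        Err(_) => machines.iter().map(|m| m.id).collect(),
    }
}

/// Keyed machines restore with no uid; un-keyed machines keep their stored uid.
pub fn restore_uid<'a>(machine: &'a Machine, keyed: &HashSet<u64>) -> Option<&'a str> {
    if keyed.contains(&machine.id) {
        None
    } else {
        machine.machine_uid.as_deref()
    }
}
```

The trade-off in this shape is availability for safety: after a failed keyed-set query, un-keyed machines lose uid-based reattach until the next successful restore, but no keyed session can be claimed through a spoofed uid.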