feat(server): dedup machines on machine_uid (SPEC-004 Task 2)

Persist the agent-reported machine_uid and dedup connect_machines on it so a
single physical machine can't register duplicate rows when its config-file
agent_id regenerates (the ghost-session root cause).

- migration 008: nullable connect_machines.machine_uid + partial unique index
  (WHERE machine_uid IS NOT NULL); idempotent, startup-applied.
- upsert_machine: two-path dedup (ON CONFLICT machine_uid when present, else
  the legacy ON CONFLICT agent_id path, unchanged).
- session reattach: a machine_uid index consulted before agent_id, with all
  removal paths purging it.
- security: keyed (cak_) agents stay authoritative — their claimed machine_uid
  is dropped (effective_machine_uid=None); uid is dedup-only for un-keyed /
  support-code agents. Startup restore skips uid-indexing keyed machines and
  fails closed if the keyed-set query errors.
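
The security bullet above can be sketched as a small pure function. This is a minimal illustration, not the handler code itself: the free function `effective_machine_uid` is a hypothetical helper (in the diff the same logic is an inline `let` binding in `agent_ws_handler`), and the parameter names are assumptions for readability.

```rust
/// Hypothetical helper mirroring the gate described above.
///
/// - Keyed (`cak_`) agents: the claimed uid is dropped, so dedup stays on the
///   authenticated agent_id.
/// - Un-keyed agents: the claimed uid passes through, except that an empty or
///   whitespace-only claim is treated as absent.
fn effective_machine_uid(is_keyed_agent: bool, claimed: Option<String>) -> Option<String> {
    if is_keyed_agent {
        // The key-to-machine binding is authoritative; ignore the client claim.
        None
    } else {
        // Client-supplied and spoofable, but acceptable as a dedup-only hint.
        claimed.filter(|u| !u.trim().is_empty())
    }
}
```

Keeping the gate in one place means everything downstream (session registration and the DB upsert) only ever sees an already-gated `Option`, and never needs to know how the agent authenticated.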

74 server tests pass; clippy clean. Implements specs/v2-stable-identity/plan.md Task 2.
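
The session-reattach bullet above (a machine_uid index consulted before agent_id, purged on every removal path) might look roughly like the following. This is a sketch under stated assumptions: `ReattachIndex`, its fields, its method names, and the `u64` session id are all hypothetical and are not taken from the session manager in this commit.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory index from identity to session id.
#[derive(Default)]
struct ReattachIndex {
    by_machine_uid: HashMap<String, u64>,
    by_agent_id: HashMap<String, u64>,
}

impl ReattachIndex {
    /// Record a session under its agent_id and, when present, its machine_uid.
    fn insert(&mut self, session_id: u64, agent_id: &str, machine_uid: Option<&str>) {
        self.by_agent_id.insert(agent_id.to_string(), session_id);
        if let Some(uid) = machine_uid {
            self.by_machine_uid.insert(uid.to_string(), session_id);
        }
    }

    /// Look up an existing session: machine_uid first, then agent_id.
    fn find(&self, agent_id: &str, machine_uid: Option<&str>) -> Option<u64> {
        machine_uid
            .and_then(|uid| self.by_machine_uid.get(uid))
            .or_else(|| self.by_agent_id.get(agent_id))
            .copied()
    }

    /// Purge a session from both maps so no stale uid entry survives removal.
    fn remove(&mut self, session_id: u64) {
        self.by_machine_uid.retain(|_, s| *s != session_id);
        self.by_agent_id.retain(|_, s| *s != session_id);
    }
}
```

With this shape, an agent whose config-file agent_id regenerated still finds its old session through the uid, while agents that report no uid fall back to the legacy agent_id lookup.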

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:06:50 -07:00
parent 97780304e7
commit ffca7f0cee
6 changed files with 697 additions and 50 deletions

@@ -74,6 +74,13 @@ pub struct AgentParams {
/// API key for persistent (managed) agents
#[serde(default)]
api_key: Option<String>,
/// Deterministic, recomputable hardware identity reported by the agent
/// (v2-stable-identity Task 1; `transport/websocket.rs`). Used to dedup
/// registrations for UN-KEYED agents only — see the security note where it is
/// consumed below. CLIENT-SUPPLIED and therefore spoofable: it is never used to
/// override a `cak_` key's authoritative machine binding.
#[serde(default)]
machine_uid: Option<String>,
}
#[derive(Debug, Deserialize)]
@@ -110,6 +117,12 @@ pub async fn agent_ws_handler(
.unwrap_or_else(|| agent_id.clone());
let support_code = params.support_code.clone();
let api_key = params.api_key.clone();
// CLIENT-SUPPLIED machine_uid (v2-stable-identity Task 1). Spoofable; see the
// security gate below. It is used for dedup ONLY on the un-keyed path; for
// `cak_`-keyed agents it is suppressed so the key's machine binding stays
// authoritative. `is_keyed_agent` records which path authenticated us.
let claimed_machine_uid = params.machine_uid.clone();
let mut is_keyed_agent = false;
// Real client IP via the trusted-proxy-aware extractor (shared with the rate
// limiter and audit log). Behind NPM on loopback, `addr.ip()` is 127.0.0.1;
// this resolves the actual remote agent IP from forwarding headers when the
@@ -268,6 +281,11 @@ pub async fn agent_ws_handler(
);
}
agent_id = trusted_agent_id;
// KEYED agent: the key→machine binding is AUTHORITATIVE. Suppress
// the client-claimed machine_uid for dedup (below) so a valid key
// for machine X can never repoint machine Y's row by claiming Y's
// uid. Dedup for this agent stays on its authenticated agent_id.
is_keyed_agent = true;
info!(
"Agent {} from {} authenticated via per-agent key",
agent_id, client_ip
@@ -324,6 +342,20 @@ pub async fn agent_ws_handler(
let support_codes = state.support_codes.clone();
let db = state.db.clone();
// SECURITY GATE for machine_uid dedup (v2-stable-identity Task 2):
// - KEYED (`cak_`) agents: dedup stays on the authenticated agent_id; the
// claimed uid is DROPPED so it cannot override the key's machine binding.
// - UN-KEYED agents (support-code, deprecated shared key, no auth-identity):
// the claimed uid is a dedup-only correctness aid and is passed through.
// A client-asserted machine_uid is spoofable, so it must never be the dedup key
// for a keyed agent. This is the only place the distinction is enforced.
let effective_machine_uid = if is_keyed_agent {
None
} else {
// Treat an empty/whitespace-only claim as absent.
claimed_machine_uid.filter(|u| !u.trim().is_empty())
};
// Bounded relay: cap inbound frame/message size before the socket is upgraded
// so oversized agent frames are rejected by the WS layer, never `to_vec()`'d
// and broadcast. (WS-OOM HIGH.)
@@ -340,6 +372,7 @@ pub async fn agent_ws_handler(
agent_id,
agent_name,
support_code,
effective_machine_uid,
Some(client_ip),
)
}))
@@ -548,6 +581,10 @@ async fn handle_agent_connection(
agent_id: String,
agent_name: String,
support_code: Option<String>,
// Dedup identity for the UN-KEYED path only (already security-gated to `None`
// for `cak_`-keyed agents by `agent_ws_handler`). Drives both the in-memory
// session reattach key and the DB `ON CONFLICT (machine_uid)` upsert.
machine_uid: Option<String>,
client_ip: Option<std::net::IpAddr>,
) {
info!(
@@ -581,14 +618,27 @@ async fn handle_agent_connection(
// Persistent agents (no support code) keep their session when disconnected
let is_persistent = support_code.is_none();
let (session_id, frame_tx, mut input_rx) = sessions
.register_agent(agent_id.clone(), agent_name.clone(), is_persistent)
.register_agent(
agent_id.clone(),
agent_name.clone(),
is_persistent,
machine_uid.as_deref(),
)
.await;
info!("Session created: {} (agent in idle mode)", session_id);
// Database: upsert machine and create session record
let _machine_id = if let Some(ref db) = db {
match db::machines::upsert_machine(db.pool(), &agent_id, &agent_name, is_persistent).await {
match db::machines::upsert_machine(
db.pool(),
&agent_id,
&agent_name,
is_persistent,
machine_uid.as_deref(),
)
.await
{
Ok(machine) => {
// Create session record
let _ = db::sessions::create_session(