feat(server): v2 secure-session-core Task 2 - auth rebuild
Some checks failed
Build and Test / Build Server (Linux) (push) Failing after 3m37s
Build and Test / Build Agent (Windows) (push) Successful in 6m37s
Build and Test / Security Audit (push) Successful in 4m10s
Build and Test / Build Summary (push) Has been skipped

SPEC-002 Phase 1 Task 2 (specs/v2-secure-session-core), code-reviewed APPROVED.

- DELETE the JWT-as-agent-key branch in relay validate_agent_api_key (audit
  CRITICAL): agent auth now = per-agent cak_ key (SHA-256 -> connect_agent_keys,
  revoked filtered) OR support code OR deprecated shared AGENT_API_KEY (warned).
  A user JWT can no longer authenticate an agent.
- auth/agent_keys.rs: cak_ gen (OsRng 256-bit) + SHA-256 hash + verify.
- auth/jwt.rs: ViewerClaims + create/validate_viewer_token (5-min TTL,
  purpose=viewer, session_id+tenant_id claims; non-interchangeable with login).
- Admin key issuance: POST/GET/DELETE /api/machines/:agent_id/keys.
- POST /api/sessions/:id/viewer-token mints a session-bound short-lived token.
- Migration 005: organization/site/tags on connect_machines (fixes the silent
  update_machine_metadata write, coord todo faf39fe0).

NOTE: viewer-token minting is gated by AuthenticatedUser only; the AUTHORIZATION
check (admin/permission gate) that closes audit CRITICAL #1 lands in Task 3 (the
viewer WS verification). The viewer WS path (relay/mod.rs:285) is untouched here.
Not cargo-check-verified (no toolchain on the authoring host) - self-reviewed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-29 18:57:12 -07:00
parent fef8111ff3
commit 41691bfb2c
12 changed files with 788 additions and 16 deletions

View File

@@ -1,7 +1,10 @@
# v2 Secure Session Core — Implementation Plan
> Spec created: 2026-05-29
> Status: in progress — Task 1 (schema) DONE 2026-05-29; Task 2 (auth) next
> Status: in progress — Tasks 1-2 DONE 2026-05-29; Task 3 (relay WS) next.
> CARRY-FORWARD: Task 3 MUST add a viewer-token AUTHORIZATION check (admin/permission gate) — Task 2
> fixed only the token *mechanism*; the authz gate is what actually closes audit CRITICAL #1. Policy
> (admin-only vs admin-or-view-permission) pending Mike's decision.
> Parent: `docs/specs/SPEC-002-v2-modernization-architecture.md` (Phase 1)
> Keystone: Tasks 14 are the "get-right-first" secure auth/session core — every audit CRITICAL/HIGH
> is closed there. Tasks 57 deliver the product capability on top. Do them in order.
@@ -47,7 +50,22 @@ Reference: `specs/native-remote-control/plan.md` Task 2 (`connect_agent_keys`);
---
## Task 2 (KEYSTONE): Rebuilt auth model — plane separation + session-scoped viewer tokens
## Task 2 (KEYSTONE) [DONE 2026-05-29 — code-reviewed APPROVED]: Rebuilt auth model — plane separation + session-scoped viewer tokens
> CARRY-FORWARD TO TASK 3: viewer-token minting is gated only by `AuthenticatedUser` (authentication,
> not authorization). GC has a real `admin|operator|viewer` role + permissions model, so this is intra-
> tenant privilege escalation until a permission check is added. **The mechanism is fixed here; the authz
> check in Task 3 is what closes audit CRITICAL #1.** Metadata bug todo faf39fe0 resolved (migration 005).
> [IMPLEMENTED] `auth/agent_keys.rs` [new] (cak_ mint/SHA-256 hash/verify), `auth/jwt.rs`
> (`ViewerClaims` + `create_viewer_token`/`validate_viewer_token`, 5-min TTL, `purpose:"viewer"`),
> `auth/mod.rs` (module + re-export). Deleted the JWT-as-agent-key branch in `relay/mod.rs`
> `validate_agent_api_key` — now per-agent `cak_` key OR deprecated shared `AGENT_API_KEY` (WARNING-logged),
> never a user JWT. New endpoints: `POST/GET /api/machines/:agent_id/keys`,
> `DELETE /api/machines/:agent_id/keys/:key_id` (admin), `POST /api/sessions/:id/viewer-token` (dashboard JWT).
> db helpers added: `agent_keys::{list_for_machine,key_belongs_to_machine}`. Folded in migration
> `005_machine_metadata.sql` + Machine struct org/site/tags mapping (coord todo faf39fe0). No Rust
> toolchain on this machine — self-reviewed; not yet `cargo check`-verified.
Files touched: `server/src/auth/` (`mod.rs`, `jwt.rs`, `agent_keys.rs` [new], `token_blacklist.rs`,
`password.rs`), `server/src/api/auth.rs`, `server/src/api/sessions.rs`.
@@ -76,6 +94,13 @@ Files touched: `server/src/relay/mod.rs`, `server/src/session/mod.rs`.
- **`viewer_ws_handler`** (`relay/mod.rs:242`): verify the viewer token's **signature + expiry +
blacklist + `session_id` claim == requested `session_id`** before `handle_viewer_connection`
(`relay/mod.rs:595`). Reject otherwise. (Fixes the any-JWT-joins-any-session + blacklist-bypass CRITICALs.)
- **Viewer-token AUTHORIZATION (carry-forward from Task 2 review) — this is what actually closes audit
CRITICAL #1.** The minting endpoint `POST /api/sessions/:id/viewer-token` (in `server/src/api/sessions.rs`)
must enforce a real permission predicate, not just `AuthenticatedUser`: `user.is_admin() ||
user.has_permission(<policy>)`. GC's role model (`admin|operator|viewer`) + permissions table already
exist (`server/src/auth/mod.rs`), so honoring the intra-tenant role distinction is cheap. **Policy
decision (Mike): admin-only, or admin-or-`view_sessions`-permission.** Multi-tenant client-access
isolation stays deferred to Phase 4; this is only the intra-tenant role gate.
- **`agent_ws_handler`** (`relay/mod.rs:55`): authenticate via per-agent key OR support code only
(Task 2). Persistent reattach must bind to the authenticated machine identity, not a query-string
`agent_id` alone (`session/mod.rs:98`).