feat(server): v2 secure-session-core Task 2 - auth rebuild
Some checks failed
Build and Test / Build Server (Linux) (push) Failing after 3m37s
Build and Test / Build Agent (Windows) (push) Successful in 6m37s
Build and Test / Security Audit (push) Successful in 4m10s
Build and Test / Build Summary (push) Has been skipped

SPEC-002 Phase 1 Task 2 (specs/v2-secure-session-core), code-reviewed APPROVED.

- DELETE the JWT-as-agent-key branch in relay validate_agent_api_key (audit
  CRITICAL): agent auth now = per-agent cak_ key (SHA-256 -> connect_agent_keys,
  revoked filtered) OR support code OR deprecated shared AGENT_API_KEY (warned).
  A user JWT can no longer authenticate an agent.
- auth/agent_keys.rs: cak_ gen (OsRng 256-bit) + SHA-256 hash + verify.
- auth/jwt.rs: ViewerClaims + create/validate_viewer_token (5-min TTL,
  purpose=viewer, session_id+tenant_id claims; non-interchangeable with login).
- Admin key issuance: POST/GET/DELETE /api/machines/:agent_id/keys.
- POST /api/sessions/:id/viewer-token mints a session-bound short-lived token.
- Migration 005: organization/site/tags on connect_machines (fixes the silent
  update_machine_metadata write, coord todo faf39fe0).

NOTE: viewer-token minting is gated by AuthenticatedUser only; the AUTHORIZATION
check (admin/permission gate) that closes audit CRITICAL #1 lands in Task 3 (the
viewer WS verification). The viewer WS path (relay/mod.rs:285) is untouched here.
Not cargo-check-verified (no toolchain on the authoring host) - self-reviewed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-29 18:57:12 -07:00
parent fef8111ff3
commit 41691bfb2c
12 changed files with 788 additions and 16 deletions

View File

@@ -0,0 +1,26 @@
-- Migration: 005_machine_metadata.sql
-- Purpose: Add the organization / site / tags columns to connect_machines.
--
-- db::machines::update_machine_metadata (called from relay/mod.rs on every
-- AgentStatus update) has always written `organization`, `site`, and `tags`
-- to connect_machines, but no prior migration ever created those columns.
-- The caller swallows the error (`let _ = ...`), so the write failed silently
-- and the metadata never persisted. This adds the columns so the write
-- succeeds, and the Machine struct now maps them (so SELECT * returns them).
-- Resolves coord todo faf39fe0.
--
-- Idempotent: ADD COLUMN IF NOT EXISTS. Applied on server startup by
-- sqlx::migrate!(); never pre-applied via psql. Ordered after 004.
-- See .claude/standards/gururmm/sqlx-migrations.md.
-- organization / site: nullable free-form text reported by the agent
-- (AgentStatus.organization / AgentStatus.site).
ALTER TABLE connect_machines ADD COLUMN IF NOT EXISTS organization TEXT;
ALTER TABLE connect_machines ADD COLUMN IF NOT EXISTS site TEXT;
-- tags: TEXT[] matching the &[String] bound by update_machine_metadata and the
-- repeated-string AgentStatus.tags proto field. NOT NULL DEFAULT '{}' so every
-- row (existing and new) holds an array and decodes cleanly into Vec<String>;
-- the metadata update only overwrites it when the agent reports a non-empty set.
ALTER TABLE connect_machines
ADD COLUMN IF NOT EXISTS tags TEXT[] NOT NULL DEFAULT '{}';

View File

@@ -0,0 +1,253 @@
//! Per-agent key issuance endpoints (admin plane).
//!
//! Lets an administrator mint, list, and revoke per-agent `cak_` keys for a
//! managed machine. These keys are the v2 replacement for the shared
//! `AGENT_API_KEY`: each is per-machine, individually revocable, and stored
//! only as a SHA-256 hash.
//!
//! Auth: dashboard JWT + admin role (the [`AdminUser`] extractor). All routes
//! are mounted behind the JWT `auth_layer`, so the extractor has the JWT config
//! and blacklist available.
//!
//! SECURITY:
//! - The plaintext key is returned exactly once, in the create response. It is
//! never persisted and never logged.
//! - List responses expose only metadata (id, timestamps) — never the hash and
//! never the plaintext.
//! - Errors returned to clients use the standard envelope and never surface raw
//! `sqlx`/`anyhow` strings (which can leak schema/host detail).
use axum::{
extract::{Path, State},
http::StatusCode,
Json,
};
use chrono::{DateTime, Utc};
use serde::Serialize;
use uuid::Uuid;
use crate::auth::{agent_keys, AdminUser};
use crate::db;
use crate::AppState;
/// Standard error envelope (see `.claude/standards/api/response-format.md`).
#[derive(Debug, Serialize)]
pub struct ApiError {
pub detail: String,
pub error_code: String,
pub status_code: u16,
}
impl ApiError {
fn new(status: StatusCode, code: &str, detail: &str) -> (StatusCode, Json<ApiError>) {
(
status,
Json(ApiError {
detail: detail.to_string(),
error_code: code.to_string(),
status_code: status.as_u16(),
}),
)
}
}
type ApiResult<T> = Result<T, (StatusCode, Json<ApiError>)>;
/// Response for a freshly minted key. `key` is present ONLY here, once.
#[derive(Debug, Serialize)]
pub struct CreatedKey {
pub id: Uuid,
pub machine_id: Uuid,
/// The plaintext `cak_` key. Shown exactly once — store it now; the server
/// keeps only its hash.
pub key: String,
pub created_at: DateTime<Utc>,
}
/// Public metadata for an existing key. Never includes the hash or plaintext.
#[derive(Debug, Serialize)]
pub struct KeyMetadata {
pub id: Uuid,
pub machine_id: Uuid,
pub created_at: DateTime<Utc>,
pub last_used_at: Option<DateTime<Utc>>,
pub revoked_at: Option<DateTime<Utc>>,
}
/// Resolve the database handle or a 503 envelope.
fn require_db(state: &AppState) -> ApiResult<&db::Database> {
state.db.as_ref().ok_or_else(|| {
ApiError::new(
StatusCode::SERVICE_UNAVAILABLE,
"DATABASE_UNAVAILABLE",
"Database not available",
)
})
}
/// Resolve an `agent_id` path segment to the machine's UUID primary key (or a
/// 404 envelope). The route param is `agent_id` (string) to stay consistent
/// with the other `/api/machines/:agent_id` routes — matchit 0.7 panics if the
/// same path position uses two different param names — while the key tables key
/// on the machine UUID.
async fn resolve_machine_id(db: &db::Database, agent_id: &str) -> ApiResult<Uuid> {
let machine = db::machines::get_machine_by_agent_id(db.pool(), agent_id)
.await
.map_err(|e| {
tracing::error!("DB error resolving machine: {}", e);
ApiError::new(
StatusCode::INTERNAL_SERVER_ERROR,
"INTERNAL_ERROR",
"Internal server error",
)
})?
.ok_or_else(|| {
ApiError::new(
StatusCode::NOT_FOUND,
"MACHINE_NOT_FOUND",
"Machine not found",
)
})?;
Ok(machine.id)
}
/// POST /api/machines/:agent_id/keys — mint a new per-agent key for a machine.
///
/// `:agent_id` is the machine's `agent_id`; it is resolved to the machine UUID
/// (`connect_machines.id`) that the `connect_agent_keys.machine_id` FK targets.
/// Returns the plaintext key once.
pub async fn create_key(
AdminUser(admin): AdminUser,
State(state): State<AppState>,
Path(agent_id): Path<String>,
) -> ApiResult<(StatusCode, Json<CreatedKey>)> {
let db = require_db(&state)?;
let machine_id = resolve_machine_id(db, &agent_id).await?;
// Mint plaintext + hash. Only the hash is persisted.
let plaintext = agent_keys::generate_agent_key();
let key_hash = agent_keys::hash_agent_key(&plaintext);
let tenant_id = db::tenancy::current_tenant_id();
let key_id =
db::agent_keys::insert_agent_key(db.pool(), machine_id, &key_hash, Some(tenant_id))
.await
.map_err(|e| {
tracing::error!("DB error inserting agent key: {}", e);
ApiError::new(
StatusCode::INTERNAL_SERVER_ERROR,
"INTERNAL_ERROR",
"Failed to create agent key",
)
})?;
// Audit the issuance WITHOUT the key material.
tracing::info!(
"Admin {} issued a per-agent key {} for machine {}",
admin.username,
key_id,
machine_id
);
Ok((
StatusCode::CREATED,
Json(CreatedKey {
id: key_id,
machine_id,
key: plaintext,
created_at: Utc::now(),
}),
))
}
/// GET /api/machines/:agent_id/keys — list key metadata for a machine.
///
/// Returns only non-secret metadata. Useful for an admin to see which keys
/// exist, when they were last used, and which are revoked.
pub async fn list_keys(
AdminUser(_admin): AdminUser,
State(state): State<AppState>,
Path(agent_id): Path<String>,
) -> ApiResult<Json<Vec<KeyMetadata>>> {
let db = require_db(&state)?;
let machine_id = resolve_machine_id(db, &agent_id).await?;
let keys = db::agent_keys::list_for_machine(db.pool(), machine_id)
.await
.map_err(|e| {
tracing::error!("DB error listing agent keys: {}", e);
ApiError::new(
StatusCode::INTERNAL_SERVER_ERROR,
"INTERNAL_ERROR",
"Failed to list agent keys",
)
})?;
let out = keys
.into_iter()
.map(|k| KeyMetadata {
id: k.id,
machine_id: k.machine_id,
created_at: k.created_at,
last_used_at: k.last_used_at,
revoked_at: k.revoked_at,
})
.collect();
Ok(Json(out))
}
/// DELETE /api/machines/:agent_id/keys/:key_id — revoke a key.
///
/// Sets `revoked_at`; the key is immediately rejected by agent auth thereafter.
/// `:agent_id` (machine) scopes the revoke so a key id from another machine
/// cannot be revoked via a mismatched URL.
pub async fn revoke_key(
AdminUser(admin): AdminUser,
State(state): State<AppState>,
Path((agent_id, key_id)): Path<(String, Uuid)>,
) -> ApiResult<StatusCode> {
let db = require_db(&state)?;
let machine_id = resolve_machine_id(db, &agent_id).await?;
// Scope the revoke to the path machine so a key id from another machine
// cannot be revoked via a mismatched URL.
let owned = db::agent_keys::key_belongs_to_machine(db.pool(), key_id, machine_id)
.await
.map_err(|e| {
tracing::error!("DB error verifying key ownership: {}", e);
ApiError::new(
StatusCode::INTERNAL_SERVER_ERROR,
"INTERNAL_ERROR",
"Internal server error",
)
})?;
if !owned {
return Err(ApiError::new(
StatusCode::NOT_FOUND,
"KEY_NOT_FOUND",
"Key not found for this machine",
));
}
db::agent_keys::revoke_agent_key(db.pool(), key_id)
.await
.map_err(|e| {
tracing::error!("DB error revoking agent key: {}", e);
ApiError::new(
StatusCode::INTERNAL_SERVER_ERROR,
"INTERNAL_ERROR",
"Failed to revoke agent key",
)
})?;
tracing::info!(
"Admin {} revoked per-agent key {} for machine {}",
admin.username,
key_id,
machine_id
);
Ok(StatusCode::NO_CONTENT)
}

View File

@@ -4,7 +4,9 @@ pub mod auth;
pub mod auth_logout;
pub mod changelog;
pub mod downloads;
pub mod machine_keys;
pub mod releases;
pub mod sessions;
pub mod users;
use axum::{

106
server/src/api/sessions.rs Normal file
View File

@@ -0,0 +1,106 @@
//! Session API endpoints — viewer-token minting.
//!
//! `POST /api/sessions/:id/viewer-token` mints a short-lived, session-scoped
//! viewer JWT for an authenticated dashboard user. This replaces the v1 model
//! where any dashboard JWT could join any session at the viewer WebSocket:
//! a viewer token is now bound to one `session_id`, and the WS layer (Task 3)
//! verifies signature + expiry + blacklist + that the token's `session_id`
//! matches the requested session.
//!
//! Authorization (Phase 1): the requester must present a valid dashboard JWT
//! (the [`AuthenticatedUser`] extractor) AND the target session must exist in
//! the live session manager. Per-tenant / per-machine ACL narrowing is a
//! Phase-4 concern; the tenancy claim is already carried so the WS and future
//! phases can enforce it. Minting is itself the authorization gate — only an
//! authenticated user can obtain a token, and only for a real session.
use axum::{
extract::{Path, State},
http::StatusCode,
Json,
};
use serde::Serialize;
use uuid::Uuid;
use crate::auth::AuthenticatedUser;
use crate::db;
use crate::AppState;
use super::machine_keys::ApiError;
type ApiResult<T> = Result<T, (StatusCode, Json<ApiError>)>;
/// Response carrying a freshly minted viewer token.
#[derive(Debug, Serialize)]
pub struct ViewerTokenResponse {
/// The short-lived viewer JWT. Use as the `token` query param on
/// `/ws/viewer` for this session only.
pub token: String,
/// The session the token authorizes (echoed for client convenience).
pub session_id: String,
/// Token lifetime in seconds (for client-side refresh scheduling).
pub expires_in_secs: i64,
}
/// Build a standard error envelope (mirrors `machine_keys`).
fn err(status: StatusCode, code: &str, detail: &str) -> (StatusCode, Json<ApiError>) {
(
status,
Json(ApiError {
detail: detail.to_string(),
error_code: code.to_string(),
status_code: status.as_u16(),
}),
)
}
/// POST /api/sessions/:id/viewer-token
///
/// Mints a ~5-minute, session-scoped viewer token for the authenticated user.
pub async fn mint_viewer_token(
user: AuthenticatedUser,
State(state): State<AppState>,
Path(id): Path<String>,
) -> ApiResult<Json<ViewerTokenResponse>> {
let session_id = Uuid::parse_str(&id)
.map_err(|_| err(StatusCode::BAD_REQUEST, "INVALID_SESSION_ID", "Invalid session ID"))?;
// Authorization gate: the session must exist (live session manager is the
// source of truth for joinable sessions, matching GET /api/sessions/:id).
let session = state.sessions.get_session(session_id).await.ok_or_else(|| {
err(
StatusCode::NOT_FOUND,
"SESSION_NOT_FOUND",
"Session not found",
)
})?;
// Resolve tenancy (Phase-1: always the default tenant). Carried in the
// claim so the WS and Phase-4 isolation can enforce it.
let tenant_id = db::tenancy::current_tenant_id();
let token = state
.jwt_config
.create_viewer_token(&user.user_id, session_id, tenant_id)
.map_err(|e| {
tracing::error!("Failed to mint viewer token: {}", e);
err(
StatusCode::INTERNAL_SERVER_ERROR,
"INTERNAL_ERROR",
"Failed to mint viewer token",
)
})?;
tracing::info!(
"User {} minted a viewer token for session {} (agent {})",
user.username,
session_id,
session.agent_id
);
Ok(Json(ViewerTokenResponse {
token,
session_id: session_id.to_string(),
expires_in_secs: crate::auth::jwt::VIEWER_TOKEN_TTL_SECS,
}))
}

View File

@@ -0,0 +1,147 @@
//! Per-agent key minting, hashing, and verification (auth layer).
//!
//! This is the auth-plane counterpart to the persistence-only
//! [`crate::db::agent_keys`] module. It owns the *secret* side of the per-agent
//! key lifecycle:
//!
//! - [`generate_agent_key`] mints a high-entropy, `cak_`-prefixed plaintext key.
//! The plaintext is shown to the operator exactly once at issuance and is
//! NEVER persisted or logged.
//! - [`hash_agent_key`] computes the SHA-256 hex digest that the database
//! stores and looks up by. SHA-256 (not Argon2) is deliberate here: the key is
//! already high-entropy random, so the slow, salted password KDF buys nothing,
//! and a deterministic digest is required for the constant-shape hash-equality
//! lookup. Argon2id remains reserved for *passwords* (see
//! [`crate::auth::password`]).
//! - [`verify_agent_key`] hashes a presented key, looks it up against the
//! active (non-revoked) rows, stamps `last_used_at` on a hit, and returns the
//! owning machine id.
//!
//! This mirrors the GuruRMM `agk_` enrollment pattern
//! (`guru-rmm/server/src/ws/mod.rs`): random bytes -> prefixed token ->
//! hex SHA-256 stored hash.
//!
//! SECURITY: never log a plaintext key or its hash. Functions here return the
//! plaintext to the caller (issuance endpoint) but emit no `tracing` output
//! containing key material.
use rand::RngCore;
use ring::digest;
use sqlx::PgPool;
use uuid::Uuid;
use crate::db;
/// Prefix marking a GuruConnect per-agent key. Mirrors GuruRMM's `agk_`.
pub const AGENT_KEY_PREFIX: &str = "cak_";
/// Number of random bytes behind a per-agent key (256 bits of entropy).
const AGENT_KEY_RANDOM_BYTES: usize = 32;
/// Generate a new high-entropy, `cak_`-prefixed per-agent key (plaintext).
///
/// The returned string is the ONLY time the plaintext exists; the caller must
/// surface it to the operator once and store only [`hash_agent_key`] of it.
/// Uses the OS CSPRNG via `rand::rngs::OsRng`.
pub fn generate_agent_key() -> String {
let mut bytes = [0u8; AGENT_KEY_RANDOM_BYTES];
rand::rngs::OsRng.fill_bytes(&mut bytes);
format!("{}{}", AGENT_KEY_PREFIX, hex_encode(&bytes))
}
/// Compute the storage/lookup hash for a per-agent key.
///
/// SHA-256, hex-encoded lowercase — identical in shape to GuruRMM's
/// `hash_api_key`, so the two systems' hashed-key columns are interchangeable
/// in format. Deterministic by design: the database looks keys up by this exact
/// value.
pub fn hash_agent_key(presented: &str) -> String {
let digest = digest::digest(&digest::SHA256, presented.as_bytes());
hex_encode(digest.as_ref())
}
/// Verify a presented per-agent key.
///
/// Returns the owning `machine_id` if the key hashes to an active (non-revoked)
/// row, stamping `last_used_at` as a side effect. Returns `None` if the key is
/// unknown or revoked (`find_active_by_hash` already filters `revoked_at`).
///
/// SECURITY: only the hash touches the database; the plaintext never leaves
/// this function and is never logged.
pub async fn verify_agent_key(pool: &PgPool, presented: &str) -> Option<Uuid> {
// Cheap structural reject before hashing/DB work: a real key is prefixed.
if !presented.starts_with(AGENT_KEY_PREFIX) {
return None;
}
let key_hash = hash_agent_key(presented);
match db::agent_keys::find_active_by_hash(pool, &key_hash).await {
Ok(Some(key)) => {
// Best-effort usage stamp; a failure here must not block auth.
if let Err(e) = db::agent_keys::touch_last_used(pool, key.id).await {
tracing::warn!("Failed to stamp last_used_at for agent key: {}", e);
}
Some(key.machine_id)
}
Ok(None) => None,
Err(e) => {
tracing::error!("Database error during agent key verification: {}", e);
None
}
}
}
/// Lowercase hex encoding without pulling in the `hex` crate.
fn hex_encode(bytes: &[u8]) -> String {
use std::fmt::Write;
let mut s = String::with_capacity(bytes.len() * 2);
for b in bytes {
let _ = write!(s, "{:02x}", b);
}
s
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn generated_key_is_prefixed_and_high_entropy() {
let key = generate_agent_key();
assert!(key.starts_with(AGENT_KEY_PREFIX));
// prefix + 32 bytes * 2 hex chars
assert_eq!(key.len(), AGENT_KEY_PREFIX.len() + AGENT_KEY_RANDOM_BYTES * 2);
}
#[test]
fn generated_keys_are_unique() {
let a = generate_agent_key();
let b = generate_agent_key();
assert_ne!(a, b);
}
#[test]
fn hash_is_stable_and_hex() {
let key = "cak_deadbeef";
let h1 = hash_agent_key(key);
let h2 = hash_agent_key(key);
assert_eq!(h1, h2);
assert_eq!(h1.len(), 64); // SHA-256 -> 32 bytes -> 64 hex chars
assert!(h1.chars().all(|c| c.is_ascii_hexdigit()));
}
#[test]
fn hash_differs_per_key() {
assert_ne!(hash_agent_key("cak_aaa"), hash_agent_key("cak_bbb"));
}
#[test]
fn hash_matches_known_sha256_vector() {
// SHA-256("abc") well-known test vector.
assert_eq!(
hex_encode(digest::digest(&digest::SHA256, b"abc").as_ref()),
"ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
);
}
}

View File

@@ -44,6 +44,54 @@ impl Claims {
}
}
/// The fixed `purpose` discriminator carried by a viewer token.
///
/// A login [`Claims`] never carries this field; a [`ViewerClaims`] always does
/// and it is checked on validation. This makes the two token types
/// non-interchangeable even before structural deserialization differences are
/// considered.
pub const VIEWER_TOKEN_PURPOSE: &str = "viewer";
/// Default lifetime of a minted viewer token, in seconds (5 minutes).
pub const VIEWER_TOKEN_TTL_SECS: i64 = 300;
/// Claims for a session-scoped viewer token.
///
/// Minted by an authenticated + authorized dashboard user for ONE specific
/// session (`POST /api/sessions/:id/viewer-token`) and consumed by the viewer
/// WebSocket (Task 3), which verifies signature + expiry + blacklist + that
/// `session_id` matches the requested session before admitting the viewer.
///
/// The `purpose` field is fixed to [`VIEWER_TOKEN_PURPOSE`]; a normal login JWT
/// has no such field and cannot be deserialized into this struct, and this
/// struct cannot be deserialized into a login [`Claims`] (it lacks the
/// login-only fields). The discriminator is also re-checked explicitly on
/// validation so the two are unambiguously separate credential planes.
#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct ViewerClaims {
/// Subject — the dashboard user id that minted/owns this token.
pub sub: String,
/// The session this token authorizes viewing of. The WS layer (Task 3)
/// rejects the token if this does not equal the requested session.
pub session_id: String,
/// Tenancy scope (Phase-4-ready). Resolved via `db::tenancy` at mint time.
pub tenant_id: String,
/// Fixed discriminator — always [`VIEWER_TOKEN_PURPOSE`].
pub purpose: String,
/// Expiration time (unix timestamp).
pub exp: i64,
/// Issued at (unix timestamp).
pub iat: i64,
}
impl ViewerClaims {
/// Parse the session id claim as a UUID.
pub fn session_uuid(&self) -> Result<Uuid> {
Uuid::parse_str(&self.session_id)
.map_err(|e| anyhow!("Invalid session_id in viewer token: {}", e))
}
}
/// JWT configuration
#[derive(Clone)]
pub struct JwtConfig {
@@ -119,6 +167,74 @@ impl JwtConfig {
Ok(token_data.claims)
}
/// Mint a short-lived, session-scoped viewer token.
///
/// Signed with the same secret as login tokens but carrying a distinct
/// [`ViewerClaims`] shape and a fixed `purpose`. TTL defaults to
/// [`VIEWER_TOKEN_TTL_SECS`] (5 minutes) — short by design so a leaked
/// viewer token has a tiny window and no standing access is granted.
pub fn create_viewer_token(
&self,
user_id: &str,
session_id: Uuid,
tenant_id: Uuid,
) -> Result<String> {
let now = Utc::now();
let exp = now + Duration::seconds(VIEWER_TOKEN_TTL_SECS);
let claims = ViewerClaims {
sub: user_id.to_string(),
session_id: session_id.to_string(),
tenant_id: tenant_id.to_string(),
purpose: VIEWER_TOKEN_PURPOSE.to_string(),
exp: exp.timestamp(),
iat: now.timestamp(),
};
encode(
&Header::default(),
&claims,
&EncodingKey::from_secret(self.secret.as_bytes()),
)
.map_err(|e| anyhow!("Failed to create viewer token: {}", e))
}
/// Validate a viewer token: signature + expiry + correct `purpose`.
///
/// Rejects a login JWT presented as a viewer token (it lacks the viewer
/// claim shape, and even if forced it would fail the `purpose` check) and,
/// symmetrically, a viewer token can never satisfy [`Self::validate_token`]
/// because it lacks the login claim fields. The WS layer (Task 3)
/// additionally checks the blacklist and the `session_id` match.
#[allow(dead_code)] // Consumed by the viewer WebSocket handler in Task 3.
pub fn validate_viewer_token(&self, token: &str) -> Result<ViewerClaims> {
let mut validation = Validation::default();
validation.validate_exp = true;
validation.validate_nbf = false;
validation.leeway = 0;
// The login claim's required fields differ from ViewerClaims, so the
// default required-claims set ("exp") is the only safe intersection.
validation.set_required_spec_claims(&["exp"]);
let token_data = decode::<ViewerClaims>(
token,
&DecodingKey::from_secret(self.secret.as_bytes()),
&validation,
)
.map_err(|e| anyhow!("Invalid viewer token: {}", e))?;
if token_data.claims.purpose != VIEWER_TOKEN_PURPOSE {
return Err(anyhow!("Token is not a viewer token"));
}
let now = Utc::now().timestamp();
if token_data.claims.exp < now {
return Err(anyhow!("Viewer token has expired"));
}
Ok(token_data.claims)
}
}
// Removed insecure default_jwt_secret() function - JWT_SECRET must be set via environment variable

View File

@@ -3,11 +3,12 @@
//! Handles JWT validation for dashboard users and API key
//! validation for agents.
pub mod agent_keys;
pub mod jwt;
pub mod password;
pub mod token_blacklist;
pub use jwt::{Claims, JwtConfig};
pub use jwt::{Claims, JwtConfig, ViewerClaims};
pub use password::{generate_random_password, hash_password, verify_password};
pub use token_blacklist::TokenBlacklist;

View File

@@ -78,8 +78,47 @@ pub async fn find_active_by_hash(
.await
}
/// List all keys for a machine (newest first), including revoked ones.
///
/// Returns full rows; the API layer projects these to metadata-only responses
/// and never serializes `key_hash`.
pub async fn list_for_machine(
pool: &PgPool,
machine_id: Uuid,
) -> Result<Vec<AgentKey>, sqlx::Error> {
sqlx::query_as::<_, AgentKey>(
r#"
SELECT id, machine_id, key_hash, tenant_id, created_at, last_used_at, revoked_at
FROM connect_agent_keys
WHERE machine_id = $1
ORDER BY created_at DESC
"#,
)
.bind(machine_id)
.fetch_all(pool)
.await
}
/// Check whether a key id belongs to the given machine.
///
/// Used to scope revocation to the machine in the request path, so a key id
/// cannot be revoked via a mismatched machine URL.
pub async fn key_belongs_to_machine(
pool: &PgPool,
key_id: Uuid,
machine_id: Uuid,
) -> Result<bool, sqlx::Error> {
let exists = sqlx::query_scalar::<_, bool>(
"SELECT EXISTS(SELECT 1 FROM connect_agent_keys WHERE id = $1 AND machine_id = $2)",
)
.bind(key_id)
.bind(machine_id)
.fetch_one(pool)
.await?;
Ok(exists)
}
/// Revoke a key by id (idempotent: re-revoking keeps the original timestamp).
#[allow(dead_code)] // Wired by the key-revocation endpoint in Task 2.
pub async fn revoke_agent_key(pool: &PgPool, id: Uuid) -> Result<(), sqlx::Error> {
sqlx::query(
r#"

View File

@@ -22,6 +22,18 @@ pub struct Machine {
pub updated_at: DateTime<Utc>,
/// Tenancy-ready (Phase 4). Backfilled to the default tenant by migration 004.
pub tenant_id: Option<Uuid>,
/// Company/organization name reported by the agent (`AgentStatus.organization`).
/// Column added in migration 005; previously written by `update_machine_metadata`
/// against a non-existent column (the write silently failed). Now mapped here so
/// `SELECT *` returns it.
pub organization: Option<String>,
/// Site/location name reported by the agent (`AgentStatus.site`). Migration 005.
pub site: Option<String>,
/// Free-form tags reported by the agent (`AgentStatus.tags`). Stored as a
/// Postgres `TEXT[]`; matches the `&[String]` bound by `update_machine_metadata`.
/// Migration 005. `NULL`/absent column reads back as an empty vec via the default.
#[serde(default)]
pub tags: Vec<String>,
}
/// Get or create a machine by agent_id (upsert)

View File

@@ -312,6 +312,11 @@ async fn main() -> Result<()> {
.route("/api/sessions", get(list_sessions))
.route("/api/sessions/:id", get(get_session))
.route("/api/sessions/:id", delete(disconnect_session))
// Session-scoped viewer-token minting (dashboard JWT; bound to one session)
.route(
"/api/sessions/:id/viewer-token",
post(api::sessions::mint_viewer_token),
)
// REST API - Machines
.route("/api/machines", get(list_machines))
.route("/api/machines/:agent_id", get(get_machine))
@@ -321,6 +326,21 @@ async fn main() -> Result<()> {
"/api/machines/:agent_id/update",
post(trigger_machine_update),
)
// Per-agent key issuance (admin only). `:agent_id` matches the param
// name used by the other /api/machines/:agent_id routes — matchit 0.7
// panics if the same path position uses two different param names.
.route(
"/api/machines/:agent_id/keys",
post(api::machine_keys::create_key),
)
.route(
"/api/machines/:agent_id/keys",
get(api::machine_keys::list_keys),
)
.route(
"/api/machines/:agent_id/keys/:key_id",
delete(api::machine_keys::revoke_key),
)
// REST API - Releases and Version
.route("/api/version", get(api::releases::get_version)) // No auth - for agent polling
.route("/api/releases", get(api::releases::list_releases))

View File

@@ -169,8 +169,9 @@ pub async fn agent_ws_handler(
// Validate API key if provided (for persistent agents)
if let Some(ref key) = api_key {
// For now, we'll accept API keys that match the JWT secret or a configured agent key
// In production, this should validate against a database of registered agents
// Agent-plane auth ONLY: a per-agent `cak_` key (hash-compared against
// connect_agent_keys, rejecting revoked) or the deprecated shared
// AGENT_API_KEY fallback. A dashboard/user JWT is NEVER accepted here.
if !validate_agent_api_key(&state, key).await {
warn!(
"Agent connection rejected: {} from {} - invalid API key",
@@ -220,21 +221,45 @@ pub async fn agent_ws_handler(
}))
}
/// Validate an agent API key
/// Validate an agent key presented on the agent plane.
///
/// SECURITY (v2): a dashboard/user JWT is NEVER a valid agent credential — the
/// old `jwt_config.validate_token` branch was the relay CRITICAL and is gone.
/// Accepts, in order:
/// 1. A per-agent `cak_` key: SHA-256 hash compared against
/// `connect_agent_keys`; revoked keys are rejected (the DB query filters
/// `revoked_at IS NULL`). This is the supported path.
/// 2. The shared `AGENT_API_KEY` env value — DEPRECATED fallback, retained
/// only for not-yet-migrated agents. Its use is logged at WARNING and it
/// should be removed once all managed agents carry per-agent keys.
///
/// Never logs the presented key or any hash.
async fn validate_agent_api_key(state: &AppState, api_key: &str) -> bool {
// Check if API key is a valid JWT (allows using dashboard token for testing)
if state.jwt_config.validate_token(api_key).is_ok() {
return true;
}
// Check against configured agent API key if set
if let Some(ref configured_key) = state.agent_api_key {
if api_key == configured_key {
// 1. Per-agent key (the supported path). Requires a database; without one,
// only the deprecated shared-key fallback below can apply.
if let Some(ref db) = state.db {
if crate::auth::agent_keys::verify_agent_key(db.pool(), api_key)
.await
.is_some()
{
return true;
}
}
// 2. DEPRECATED shared-key fallback. Constant-time-ish equality is not
// critical here (the key is high-entropy and this path is sunset), but
// we still avoid logging the value.
if let Some(ref configured_key) = state.agent_api_key {
if api_key == configured_key {
warn!(
"[WARNING] Agent authenticated via the DEPRECATED shared AGENT_API_KEY \
fallback. Migrate this agent to a per-agent cak_ key; the shared key \
will be removed."
);
return true;
}
}
// In future: validate against database of registered agents
false
}

View File

@@ -1,7 +1,10 @@
# v2 Secure Session Core — Implementation Plan
> Spec created: 2026-05-29
> Status: in progress — Task 1 (schema) DONE 2026-05-29; Task 2 (auth) next
> Status: in progress — Tasks 1-2 DONE 2026-05-29; Task 3 (relay WS) next.
> CARRY-FORWARD: Task 3 MUST add a viewer-token AUTHORIZATION check (admin/permission gate) — Task 2
> fixed only the token *mechanism*; the authz gate is what actually closes audit CRITICAL #1. Policy
> (admin-only vs admin-or-view-permission) pending Mike's decision.
> Parent: `docs/specs/SPEC-002-v2-modernization-architecture.md` (Phase 1)
> Keystone: Tasks 14 are the "get-right-first" secure auth/session core — every audit CRITICAL/HIGH
> is closed there. Tasks 57 deliver the product capability on top. Do them in order.
@@ -47,7 +50,22 @@ Reference: `specs/native-remote-control/plan.md` Task 2 (`connect_agent_keys`);
---
## Task 2 (KEYSTONE): Rebuilt auth model — plane separation + session-scoped viewer tokens
## Task 2 (KEYSTONE) [DONE 2026-05-29 — code-reviewed APPROVED]: Rebuilt auth model — plane separation + session-scoped viewer tokens
> CARRY-FORWARD TO TASK 3: viewer-token minting is gated only by `AuthenticatedUser` (authentication,
> not authorization). GC has a real `admin|operator|viewer` role + permissions model, so this is intra-
> tenant privilege escalation until a permission check is added. **The mechanism is fixed here; the authz
> check in Task 3 is what closes audit CRITICAL #1.** Metadata bug todo faf39fe0 resolved (migration 005).
> [IMPLEMENTED] `auth/agent_keys.rs` [new] (cak_ mint/SHA-256 hash/verify), `auth/jwt.rs`
> (`ViewerClaims` + `create_viewer_token`/`validate_viewer_token`, 5-min TTL, `purpose:"viewer"`),
> `auth/mod.rs` (module + re-export). Deleted the JWT-as-agent-key branch in `relay/mod.rs`
> `validate_agent_api_key` — now per-agent `cak_` key OR deprecated shared `AGENT_API_KEY` (WARNING-logged),
> never a user JWT. New endpoints: `POST/GET /api/machines/:agent_id/keys`,
> `DELETE /api/machines/:agent_id/keys/:key_id` (admin), `POST /api/sessions/:id/viewer-token` (dashboard JWT).
> db helpers added: `agent_keys::{list_for_machine,key_belongs_to_machine}`. Folded in migration
> `005_machine_metadata.sql` + Machine struct org/site/tags mapping (coord todo faf39fe0). No Rust
> toolchain on this machine — self-reviewed; not yet `cargo check`-verified.
Files touched: `server/src/auth/` (`mod.rs`, `jwt.rs`, `agent_keys.rs` [new], `token_blacklist.rs`,
`password.rs`), `server/src/api/auth.rs`, `server/src/api/sessions.rs`.
@@ -76,6 +94,13 @@ Files touched: `server/src/relay/mod.rs`, `server/src/session/mod.rs`.
- **`viewer_ws_handler`** (`relay/mod.rs:242`): verify the viewer token's **signature + expiry +
blacklist + `session_id` claim == requested `session_id`** before `handle_viewer_connection`
(`relay/mod.rs:595`). Reject otherwise. (Fixes the any-JWT-joins-any-session + blacklist-bypass CRITICALs.)
- **Viewer-token AUTHORIZATION (carry-forward from Task 2 review) — this is what actually closes audit
CRITICAL #1.** The minting endpoint `POST /api/sessions/:id/viewer-token` (in `server/src/api/sessions.rs`)
must enforce a real permission predicate, not just `AuthenticatedUser`: `user.is_admin() ||
user.has_permission(<policy>)`. GC's role model (`admin|operator|viewer`) + permissions table already
exist (`server/src/auth/mod.rs`), so honoring the intra-tenant role distinction is cheap. **Policy
decision (Mike): admin-only, or admin-or-`view_sessions`-permission.** Multi-tenant client-access
isolation stays deferred to Phase 4; this is only the intra-tenant role gate.
- **`agent_ws_handler`** (`relay/mod.rs:55`): authenticate via per-agent key OR support code only
(Task 2). Persistent reattach must bind to the authenticated machine identity, not a query-string
`agent_id` alone (`session/mod.rs:98`).