claudetools/projects/msp-tools/guru-connect/SEC4_AGENT_VALIDATION_AUDIT.md

# SEC-4: Agent Connection Validation - Security Audit

**Status:** NEEDS ENHANCEMENT - Validation exists but has security gaps
**Priority:** CRITICAL
**Date:** 2026-01-17

## Audit Findings

### GOOD: Existing Validation

The agent connection handler (relay/mod.rs:54-123) has solid validation logic:

**Support Code Validation (Lines 74-87)**
```rust
if let Some(ref code) = support_code {
    let code_info = state.support_codes.get_status(code).await;
    if code_info.is_none() {
        warn!("Agent connection rejected: {} - invalid support code {}", agent_id, code);
        return Err(StatusCode::UNAUTHORIZED);  // ✓ Rejects invalid codes
    }
    let status = code_info.unwrap();
    if status != "pending" && status != "connected" {
        warn!("Agent connection rejected: {} - support code {} has status {}", agent_id, code, status);
        return Err(StatusCode::UNAUTHORIZED);  // ✓ Rejects expired/cancelled codes
    }
}
```

**API Key Validation (Lines 90-98)**
```rust
if let Some(ref key) = api_key {
    if !validate_agent_api_key(key, &state.config).await {
        warn!("Agent connection rejected: {} - invalid API key", agent_id);
        return Err(StatusCode::UNAUTHORIZED);  // ✓ Rejects invalid API keys
    }
}
```

**Continuous Cancellation Checking (Lines 266-290)**
- Background task checks for code cancellation every 2 seconds
- Immediately disconnects agent if support code is cancelled
- Sends disconnect message to agent with reason

**What's Working:**
✓ Support code status validation (pending/connected only)
✓ API key validation (JWT or shared key)
✓ Requires at least one authentication method
✓ Periodic cancellation detection
✓ Database session tracking
✓ Connection/disconnection logging to console

## SECURITY GAPS FOUND

### 1. NO IP ADDRESS LOGGING (CRITICAL)

**Problem:** All database event logging calls use `None` for IP address parameter

**Evidence:**
```rust
// relay/mod.rs:207-213 - Session started event
let _ = db::events::log_event(
    db.pool(),
    session_id,
    db::events::EventTypes::SESSION_STARTED,
    None, None, None, None,  // ← IP address is None
).await;
```

**Impact:**
- Cannot trace suspicious connection patterns
- Cannot identify brute force attempts from specific IPs
- Cannot implement IP-based blocking
- Audit log incomplete for forensics

**Fix Required:** Extract client IP from WebSocket connection and log it

### 2. NO FAILED CONNECTION LOGGING (CRITICAL)

**Problem:** Only successful connections create database audit events. Failed validation attempts are only logged to console with `warn!()`

**Evidence:**
```rust
// Lines 68, 77, 81, 94 - All failed attempts only log to console
warn!("Agent connection rejected: {} - no support code or API key", agent_id);
return Err(StatusCode::UNAUTHORIZED);  // ← No database event created
```

**Impact:**
- Cannot detect brute force attacks
- Cannot identify stolen/leaked support codes being tried
- Cannot track repeated failed attempts from same IP
- No audit trail for security incidents

**Fix Required:** Create database events for failed connection attempts with:
- Timestamp
- Agent ID
- IP address
- Failure reason (invalid code, expired code, invalid API key, no auth)

### 3. NO CONNECTION RATE LIMITING (HIGH)

**Problem:** SEC-2 rate limiting is not yet functional due to compilation errors

**Impact:**
- Attacker can try unlimited support codes per second
- API key brute forcing is possible
- No protection against DoS via connection spam

**Fix Required:** Complete SEC-2 implementation or implement custom rate limiting

### 4. NO API KEY STRENGTH VALIDATION (MEDIUM)

**Problem:** API keys are validated but not checked for minimum strength

**Current Code (relay/mod.rs:108-123)**
```rust
async fn validate_agent_api_key(api_key: &str, config: &Config) -> bool {
    // 1. Try as JWT token
    if let Ok(claims) = crate::auth::jwt::verify_token(api_key, &config.jwt_secret) {
        if claims.role == "admin" || claims.role == "agent" {
            return true;  // ✓ Valid JWT
        }
    }

    // 2. Check against configured shared key
    if let Some(ref configured_key) = config.agent_api_key {
        if api_key == configured_key {
            return true;  // ← No strength check
        }
    }

    false
}
```

**Impact:**
- Weak API keys like "12345" or "password" could be configured
- No enforcement of minimum length or complexity

**Fix Required:** Validate API key strength (minimum 32 characters, high entropy)

## Recommended Fixes

### FIX 1: Add IP Address Extraction (HIGH PRIORITY)

**Create:** `server/src/utils/ip_extract.rs`
```rust
use axum::extract::ConnectInfo;
use std::net::SocketAddr;

/// Extract IP address from Axum request
pub fn extract_ip(connect_info: Option<&ConnectInfo<SocketAddr>>) -> Option<String> {
    connect_info.map(|info| info.0.ip().to_string())
}
```

**Modify:** `server/src/relay/mod.rs` - Add ConnectInfo to handlers
```rust
use axum::extract::ConnectInfo;
use std::net::SocketAddr;

pub async fn agent_ws_handler(
    ws: WebSocketUpgrade,
    State(state): State<AppState>,
    ConnectInfo(addr): ConnectInfo<SocketAddr>,  // ← Add this
    // ... rest
) -> Result<impl IntoResponse, StatusCode> {
    let client_ip = Some(addr.ip());
    // ... use client_ip in log_event calls
}
```

**Modify:** All `log_event()` calls to include IP address

### FIX 2: Add Failed Connection Event Logging (HIGH PRIORITY)

**Add new event types to `db/events.rs`:**
```rust
impl EventTypes {
    // Existing...
    pub const CONNECTION_REJECTED_NO_AUTH: &'static str = "connection_rejected_no_auth";
    pub const CONNECTION_REJECTED_INVALID_CODE: &'static str = "connection_rejected_invalid_code";
    pub const CONNECTION_REJECTED_EXPIRED_CODE: &'static str = "connection_rejected_expired_code";
    pub const CONNECTION_REJECTED_INVALID_API_KEY: &'static str = "connection_rejected_invalid_api_key";
}
```

**Modify:** `relay/mod.rs` to log rejections to database
```rust
// Before returning Err(), log to database
if let Some(ref db) = state.db {
    let _ = db::events::log_event(
        db.pool(),
        Uuid::new_v4(),  // Create temporary UUID for failed attempt
        db::events::EventTypes::CONNECTION_REJECTED_INVALID_CODE,
        None,
        Some(&agent_id),
        Some(serde_json::json!({
            "support_code": code,
            "reason": "invalid_code"
        })),
        Some(client_ip),
    ).await;
}
```

### FIX 3: Add API Key Strength Validation (MEDIUM PRIORITY)

**Create:** `server/src/utils/validation.rs`
```rust
use anyhow::{anyhow, Result};

/// Validate API key meets minimum security requirements
pub fn validate_api_key_strength(api_key: &str) -> Result<()> {
    if api_key.len() < 32 {
        return Err(anyhow!("API key must be at least 32 characters long"));
    }

    // Check for common weak keys
    let weak_keys = ["password", "12345", "admin", "test"];
    if weak_keys.contains(&api_key.to_lowercase().as_str()) {
        return Err(anyhow!("API key is too weak"));
    }

    // Check for sufficient entropy (basic check)
    let unique_chars: std::collections::HashSet<char> = api_key.chars().collect();
    if unique_chars.len() < 10 {
        return Err(anyhow!("API key has insufficient entropy"));
    }

    Ok(())
}
```

**Modify:** Config loading to validate API key at startup

### FIX 4: Add Connection Monitoring Dashboard Query

**Create:** `server/src/db/security.rs`
```rust
/// Get failed connection attempts by IP (for monitoring)
pub async fn get_failed_attempts_by_ip(
    pool: &PgPool,
    since: DateTime<Utc>,
    limit: i64,
) -> Result<Vec<(String, i64)>, sqlx::Error> {
    sqlx::query_as::<_, (String, i64)>(
        r#"
        SELECT ip_address::text, COUNT(*) as attempt_count
        FROM connect_session_events
        WHERE event_type LIKE 'connection_rejected_%'
          AND timestamp > $1
          AND ip_address IS NOT NULL
        GROUP BY ip_address
        ORDER BY attempt_count DESC
        LIMIT $2
        "#
    )
    .bind(since)
    .bind(limit)
    .fetch_all(pool)
    .await
}
```

## Implementation Priority

**Day 1 (Immediate):**
1. FIX 1: Add IP address extraction and logging
2. FIX 2: Add failed connection event logging

**Day 2:**
3. FIX 3: Add API key strength validation
4. FIX 4: Add security monitoring queries

**Later (after SEC-2 complete):**
5. Enable rate limiting on agent connections

## Testing Checklist

After implementing fixes:
- [ ] Valid support code connects successfully (IP logged)
- [ ] Invalid support code is rejected (failed attempt logged with IP)
- [ ] Expired support code is rejected (failed attempt logged)
- [ ] Valid API key connects successfully (IP logged)
- [ ] Invalid API key is rejected (failed attempt logged with IP)
- [ ] No auth method is rejected (failed attempt logged with IP)
- [ ] Weak API key is rejected at startup
- [ ] Security monitoring query returns suspicious IPs
- [ ] Failed attempts visible in dashboard

## Current Status

**Validation Logic:** GOOD - Rejects invalid connections correctly
**Audit Logging:** INCOMPLETE - No IP addresses, no failed attempts
**Rate Limiting:** NOT IMPLEMENTED - Blocked by SEC-2
**API Key Validation:** INCOMPLETE - No strength checking

---

**Audit Completed:** 2026-01-17
**Next Action:** Implement FIX 1 and FIX 2 (IP logging + failed connection events)