Files
claudetools/projects/msp-tools/guru-rmm/docs/REMEDIATION_PLAN.md
azcomputerguru 65086f4407 fix(security): Implement Phase 1 critical security fixes
CORS:
- Restrict CORS to DASHBOARD_URL environment variable
- Default to production dashboard domain

Authentication:
- Add AuthUser requirement to all agent management endpoints
- Add AuthUser requirement to all command endpoints
- Add AuthUser requirement to all metrics endpoints
- Add audit logging for command execution (user_id tracked)

Agent Security:
- Replace Unicode characters with ASCII markers [OK]/[ERROR]/[WARNING]
- Add certificate pinning for update downloads (allowlist domains)
- Fix insecure temp file creation (use /var/run/gururmm with 0700 perms)
- Fix rollback script backgrounding (use setsid instead of literal &)

Dashboard Security:
- Move token storage from localStorage to sessionStorage
- Add proper TypeScript types (remove 'any' from error handlers)
- Centralize token management functions

Legacy Agent:
- Add -AllowInsecureTLS parameter (opt-in required)
- Add Windows Event Log audit trail when insecure mode used
- Update documentation with security warnings

Closes: Phase 1 items in issue #1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 21:16:24 -07:00

32 KiB
Raw Blame History

GuruRMM Remediation Plan

Generated: 2026-01-20 Review Source: Full-scale code review of all components Priority: Critical issues first, then major, then warnings


Executive Summary

This document provides specific code fixes for all critical and major issues identified during the comprehensive code review. Issues are organized by component and priority.

Issue Counts:

  • CRITICAL: 16 (must fix before production)
  • MAJOR: 14 (should fix before production)
  • WARNING: 28 (recommended improvements)

Phase 1: Critical Security Fixes (MUST DO)

1.1 Server: Restrict CORS Configuration

File: server/src/main.rs Lines: 133-136 Severity: CRITICAL

Current Code:

let cors = CorsLayer::new()
    .allow_origin(Any)
    .allow_methods(Any)
    .allow_headers(Any);

Fixed Code:

use tower_http::cors::{CorsLayer, AllowOrigin};
use http::HeaderValue;

// Get allowed origins from environment
let dashboard_origin = std::env::var("DASHBOARD_URL")
    .unwrap_or_else(|_| "https://rmm.azcomputerguru.com".to_string());

let cors = CorsLayer::new()
    .allow_origin(AllowOrigin::exact(
        HeaderValue::from_str(&dashboard_origin).expect("Invalid DASHBOARD_URL")
    ))
    .allow_methods([
        http::Method::GET,
        http::Method::POST,
        http::Method::PUT,
        http::Method::DELETE,
        http::Method::OPTIONS,
    ])
    .allow_headers([
        http::header::AUTHORIZATION,
        http::header::CONTENT_TYPE,
        http::header::ACCEPT,
    ])
    .allow_credentials(true);

Environment Variable:

DASHBOARD_URL=https://rmm.azcomputerguru.com

1.2 Server: Add Authentication to All Endpoints

File: server/src/api/agents.rs Severity: CRITICAL

Problem: Most endpoints lack the AuthUser extractor.

Fix Pattern - Apply to ALL endpoint functions:

// BEFORE (insecure)
pub async fn list_agents(
    State(state): State<AppState>,
) -> Result<Json<Vec<AgentResponse>>, (StatusCode, String)> {

// AFTER (secure)
pub async fn list_agents(
    State(state): State<AppState>,
    _user: AuthUser,  // Add this to require authentication
) -> Result<Json<Vec<AgentResponse>>, (StatusCode, String)> {

Endpoints requiring AuthUser addition:

agents.rs:

  • list_agents (line 62)
  • get_agent (line 74)
  • delete_agent (line 87)
  • get_stats (line 109)
  • move_agent (line 126)
  • list_agents_with_details (line 152)
  • list_unassigned_agents (line 163)
  • get_agent_state (line 175)
  • register_agent (line 32) - Add auth OR implement rate limiting

commands.rs:

  • send_command (line 46)
  • list_commands (line 103)
  • get_command (line 117)

metrics.rs:

  • get_agent_metrics (line 29)
  • get_summary (line 57)

Complete Fix for commands.rs:

// server/src/api/commands.rs

use crate::auth::AuthUser;

pub async fn send_command(
    State(state): State<AppState>,
    user: AuthUser,  // ADD THIS
    Path(agent_id): Path<Uuid>,
    Json(req): Json<SendCommandRequest>,
) -> Result<Json<CommandResponse>, (StatusCode, String)> {
    // Log who sent the command
    tracing::info!(
        user_id = %user.user_id,
        agent_id = %agent_id,
        command_type = ?req.command_type,
        "Command sent by user"
    );

    // Create command with user attribution
    let command = db::create_command(
        &state.db,
        agent_id,
        &req.command_type,
        &req.command,
        req.timeout_seconds,
        req.elevated.unwrap_or(false),
        Some(user.user_id),  // Track who created it
    )
    .await
    .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;

    // ... rest of function
}

pub async fn list_commands(
    State(state): State<AppState>,
    _user: AuthUser,  // ADD THIS
) -> Result<Json<Vec<CommandResponse>>, (StatusCode, String)> {
    // ... existing code
}

pub async fn get_command(
    State(state): State<AppState>,
    _user: AuthUser,  // ADD THIS
    Path(command_id): Path<Uuid>,
) -> Result<Json<CommandResponse>, (StatusCode, String)> {
    // ... existing code
}

1.3 Server: Add Rate Limiting to Registration

File: server/src/main.rs Severity: CRITICAL

Add dependency to Cargo.toml:

tower-governor = "0.3"

Implementation:

// server/src/main.rs

use tower_governor::{GovernorLayer, GovernorConfigBuilder, KeyExtractor};
use std::net::IpAddr;

// Custom key extractor for rate limiting by IP
#[derive(Clone)]
struct RealIpKeyExtractor;

impl KeyExtractor for RealIpKeyExtractor {
    type Key = IpAddr;

    fn extract<T>(&self, req: &http::Request<T>) -> Result<Self::Key, GovernorError> {
        // Try X-Forwarded-For first (for reverse proxy)
        if let Some(forwarded) = req.headers().get("x-forwarded-for") {
            if let Ok(s) = forwarded.to_str() {
                if let Some(ip) = s.split(',').next() {
                    if let Ok(addr) = ip.trim().parse() {
                        return Ok(addr);
                    }
                }
            }
        }

        // Fall back to connection info
        req.extensions()
            .get::<axum::extract::ConnectInfo<std::net::SocketAddr>>()
            .map(|ci| ci.0.ip())
            .ok_or(GovernorError::UnableToExtractKey)
    }
}

// In main() or router setup:
let registration_limiter = GovernorConfigBuilder::default()
    .per_second(1)  // 1 request per second
    .burst_size(5)  // Allow burst of 5
    .key_extractor(RealIpKeyExtractor)
    .finish()
    .unwrap();

let app = Router::new()
    // Rate-limited registration endpoints
    .route("/api/agents/register",
        post(register_agent).layer(GovernorLayer::new(&registration_limiter)))
    .route("/api/agents/register-legacy",
        post(register_legacy).layer(GovernorLayer::new(&registration_limiter)))
    // ... other routes without rate limiting

1.4 Agent: Remove Unicode Characters

File: agent/src/main.rs Lines: 460, 467, 478, 480 Severity: CRITICAL

Find and replace all Unicode characters:

// BEFORE
println!("\n✓ GuruRMM Agent installed successfully!");
println!("\n⚠️  IMPORTANT: Edit {} with your server URL and API key!", config_dest);

// AFTER
println!("\n[OK] GuruRMM Agent installed successfully!");
println!("\n[WARNING] IMPORTANT: Edit {} with your server URL and API key!", config_dest);

Complete list of replacements:

Before After
[OK]
[ERROR]
⚠️ [WARNING]
[INFO]
->
-

1.5 Agent: Secure Update Download with Certificate Pinning

File: agent/src/updater/mod.rs Severity: CRITICAL

Add to Cargo.toml:

[dependencies]
sha2 = "0.10"

Implementation:

// agent/src/updater/mod.rs

use sha2::{Sha256, Digest};

impl Updater {
    /// Download binary with certificate validation
    async fn download_binary(&self, url: &str, expected_checksum: &str) -> Result<PathBuf> {
        // Validate URL is from trusted domain
        let allowed_domains = [
            "rmm-api.azcomputerguru.com",
            "downloads.azcomputerguru.com",
        ];

        let parsed_url = url::Url::parse(url)
            .context("Invalid download URL")?;

        let host = parsed_url.host_str()
            .ok_or_else(|| anyhow::anyhow!("No host in download URL"))?;

        if !allowed_domains.iter().any(|d| host == *d || host.ends_with(&format!(".{}", d))) {
            return Err(anyhow::anyhow!(
                "Download URL host '{}' not in allowed domains: {:?}",
                host, allowed_domains
            ));
        }

        // Require HTTPS
        if parsed_url.scheme() != "https" {
            return Err(anyhow::anyhow!("Download URL must use HTTPS"));
        }

        info!("Downloading update from: {}", url);

        let response = self.http_client
            .get(url)
            .send()
            .await
            .context("HTTP request failed")?;

        if !response.status().is_success() {
            return Err(anyhow::anyhow!(
                "Download failed with status: {}",
                response.status()
            ));
        }

        let bytes = response.bytes().await
            .context("Failed to read response body")?;

        // Verify checksum BEFORE writing to disk
        let mut hasher = Sha256::new();
        hasher.update(&bytes);
        let actual_checksum = format!("{:x}", hasher.finalize());

        if actual_checksum != expected_checksum {
            return Err(anyhow::anyhow!(
                "Checksum mismatch! Expected: {}, Got: {}",
                expected_checksum, actual_checksum
            ));
        }

        info!("Checksum verified: {}", actual_checksum);

        // Write to secure temp location
        let temp_dir = std::env::temp_dir();
        let temp_path = temp_dir.join(format!("gururmm-update-{}.tmp", uuid::Uuid::new_v4()));

        tokio::fs::write(&temp_path, &bytes).await
            .context("Failed to write update binary")?;

        // Set restrictive permissions on Unix
        #[cfg(unix)]
        {
            use std::os::unix::fs::PermissionsExt;
            let perms = std::fs::Permissions::from_mode(0o700);
            std::fs::set_permissions(&temp_path, perms)?;
        }

        Ok(temp_path)
    }
}

1.6 Agent: Fix Insecure Temp File Creation

File: agent/src/updater/mod.rs Lines: 275-331 Severity: CRITICAL

Problem: Using world-writable /tmp/ for rollback script.

Fixed Code:

// agent/src/updater/mod.rs

#[cfg(unix)]
async fn create_rollback_script(&self, backup_path: &Path, target_path: &Path) -> Result<PathBuf> {
    use std::os::unix::fs::PermissionsExt;

    // Use /var/run or agent's own directory instead of /tmp
    let script_dir = if Path::new("/var/run/gururmm").exists() {
        PathBuf::from("/var/run/gururmm")
    } else {
        // Fall back to agent's config directory
        PathBuf::from("/etc/gururmm")
    };

    // Ensure directory exists with proper permissions
    tokio::fs::create_dir_all(&script_dir).await?;

    let script_path = script_dir.join(format!("rollback-{}.sh", uuid::Uuid::new_v4()));

    let script = format!(r#"#!/bin/bash
set -e

# Rollback script for GuruRMM agent update
BACKUP="{}"
TARGET="{}"
SCRIPT_PATH="$0"

# Function to cleanup
cleanup() {{
    rm -f "$SCRIPT_PATH"
}}
trap cleanup EXIT

# Wait for parent process to exit
sleep 2

# Restore backup
if [ -f "$BACKUP" ]; then
    cp "$BACKUP" "$TARGET"
    chmod +x "$TARGET"

    # Restart service
    if command -v systemctl &> /dev/null; then
        systemctl restart gururmm-agent
    fi
fi
"#,
        backup_path.display(),
        target_path.display()
    );

    // Write script with restrictive permissions
    tokio::fs::write(&script_path, &script).await?;

    // Set permissions to 700 (owner only)
    let perms = std::fs::Permissions::from_mode(0o700);
    std::fs::set_permissions(&script_path, perms)?;

    // Verify ownership (should be root)
    let metadata = std::fs::metadata(&script_path)?;
    if metadata.uid() != 0 {
        warn!("Rollback script not owned by root - this may be a security issue");
    }

    Ok(script_path)
}

1.7 Dashboard: Improve Token Storage Security

File: dashboard/src/api/client.ts Severity: CRITICAL

Option A: Use httpOnly Cookies (Recommended)

This requires server-side changes. The server should set the token as an httpOnly cookie.

Server-side change (server/src/api/auth.rs):

use axum::response::{IntoResponse, Response};
use http::header::{SET_COOKIE, HeaderValue};

pub async fn login(
    State(state): State<AppState>,
    Json(req): Json<LoginRequest>,
) -> Result<Response, (StatusCode, String)> {
    // ... existing validation ...

    let token = create_jwt(&user)?;

    // Set httpOnly cookie instead of returning token in body
    let cookie = format!(
        "auth_token={}; HttpOnly; Secure; SameSite=Strict; Path=/; Max-Age=86400",
        token
    );

    let mut response = Json(LoginResponse {
        user_id: user.id,
        email: user.email,
        name: user.name,
        role: user.role,
    }).into_response();

    response.headers_mut().insert(
        SET_COOKIE,
        HeaderValue::from_str(&cookie).unwrap()
    );

    Ok(response)
}

Dashboard change (client.ts):

// dashboard/src/api/client.ts

const api = axios.create({
  baseURL: import.meta.env.VITE_API_URL || 'https://rmm-api.azcomputerguru.com',
  withCredentials: true,  // Send cookies with requests
});

// Remove localStorage token handling - cookies are automatic

export const authApi = {
  login: async (email: string, password: string) => {
    const response = await api.post<LoginResponse>('/api/auth/login', { email, password });
    // No need to store token - it's in httpOnly cookie
    return response.data;
  },

  logout: async () => {
    await api.post('/api/auth/logout');  // Server clears cookie
  },

  // ...
};

Option B: If httpOnly cookies aren't feasible (short-term)

Add XSS protection and token encryption:

// dashboard/src/api/client.ts

import CryptoJS from 'crypto-js';

// Use session-specific key (generated on page load)
const SESSION_KEY = CryptoJS.lib.WordArray.random(32).toString();

const encryptToken = (token: string): string => {
  return CryptoJS.AES.encrypt(token, SESSION_KEY).toString();
};

const decryptToken = (encrypted: string): string | null => {
  try {
    const bytes = CryptoJS.AES.decrypt(encrypted, SESSION_KEY);
    return bytes.toString(CryptoJS.enc.Utf8);
  } catch {
    return null;
  }
};

export const authApi = {
  login: async (email: string, password: string) => {
    const response = await api.post<LoginResponse>('/api/auth/login', { email, password });
    if (response.data.token) {
      // Store encrypted token in sessionStorage (cleared on tab close)
      sessionStorage.setItem('auth_token', encryptToken(response.data.token));
    }
    return response.data;
  },

  getToken: (): string | null => {
    const encrypted = sessionStorage.getItem('auth_token');
    return encrypted ? decryptToken(encrypted) : null;
  },

  // ...
};

// Update interceptor
api.interceptors.request.use((config) => {
  const token = authApi.getToken();
  if (token) {
    config.headers.Authorization = `Bearer ${token}`;
  }
  return config;
});

1.8 Dashboard: Add Command Input Validation

File: dashboard/src/pages/AgentDetail.tsx Lines: 236-240 Severity: CRITICAL

Add validation and confirmation:

// dashboard/src/pages/AgentDetail.tsx

import { useState, useCallback } from 'react';

// Add dangerous command patterns
const DANGEROUS_PATTERNS = [
  /rm\s+-rf\s+\//i,
  /format\s+[a-z]:/i,
  /del\s+\/[fqs]/i,
  /shutdown/i,
  /reboot/i,
  /init\s+0/i,
  /mkfs/i,
  /dd\s+if=/i,
  />\s*\/dev\//i,
  /chmod\s+-R\s+777/i,
];

const WARN_PATTERNS = [
  /password/i,
  /credential/i,
  /secret/i,
  /api.?key/i,
  /token/i,
];

interface CommandValidation {
  isValid: boolean;
  isDangerous: boolean;
  hasWarnings: boolean;
  messages: string[];
}

const validateCommand = (command: string): CommandValidation => {
  const messages: string[] = [];
  let isDangerous = false;
  let hasWarnings = false;

  // Check for empty command
  if (!command.trim()) {
    return { isValid: false, isDangerous: false, hasWarnings: false, messages: ['Command cannot be empty'] };
  }

  // Check for dangerous patterns
  for (const pattern of DANGEROUS_PATTERNS) {
    if (pattern.test(command)) {
      isDangerous = true;
      messages.push(`[DANGER] Potentially destructive command detected: ${pattern.source}`);
    }
  }

  // Check for warning patterns
  for (const pattern of WARN_PATTERNS) {
    if (pattern.test(command)) {
      hasWarnings = true;
      messages.push(`[WARNING] Command may expose sensitive data: ${pattern.source}`);
    }
  }

  // Check command length
  if (command.length > 10000) {
    messages.push('[WARNING] Command is very long - consider using a script file');
    hasWarnings = true;
  }

  return {
    isValid: true,
    isDangerous,
    hasWarnings,
    messages,
  };
};

// In the component:
const [showConfirmDialog, setShowConfirmDialog] = useState(false);
const [pendingCommand, setPendingCommand] = useState<string>('');
const [validationResult, setValidationResult] = useState<CommandValidation | null>(null);

const handleSendCommand = useCallback(async () => {
  if (!command.trim() || !agent) return;

  const validation = validateCommand(command);
  setValidationResult(validation);

  if (!validation.isValid) {
    return;
  }

  if (validation.isDangerous || validation.hasWarnings) {
    setPendingCommand(command);
    setShowConfirmDialog(true);
    return;
  }

  await executeCommand(command);
}, [command, agent]);

const executeCommand = async (cmd: string) => {
  try {
    await agentsApi.sendCommand(agent!.id, {
      command_type: commandType,
      command: cmd,
      timeout_seconds: 300,
      elevated: false,
    });
    setCommand('');
    setShowConfirmDialog(false);
    setPendingCommand('');
  } catch (err) {
    console.error('Failed to send command:', err);
  }
};

// Add confirmation dialog JSX:
{showConfirmDialog && (
  <div className="fixed inset-0 bg-black/50 flex items-center justify-center z-50">
    <div className="bg-white rounded-lg p-6 max-w-lg w-full mx-4">
      <h3 className="text-lg font-semibold text-red-600 mb-4">
        {validationResult?.isDangerous ? '[DANGER] Confirm Dangerous Command' : '[WARNING] Confirm Command'}
      </h3>

      <div className="mb-4 space-y-2">
        {validationResult?.messages.map((msg, i) => (
          <p key={i} className={msg.includes('DANGER') ? 'text-red-600' : 'text-yellow-600'}>
            {msg}
          </p>
        ))}
      </div>

      <pre className="bg-gray-100 p-3 rounded text-sm mb-4 overflow-x-auto">
        {pendingCommand}
      </pre>

      <p className="text-sm text-gray-600 mb-4">
        This command will be executed on: <strong>{agent?.hostname}</strong>
      </p>

      <div className="flex gap-3 justify-end">
        <button
          onClick={() => {
            setShowConfirmDialog(false);
            setPendingCommand('');
          }}
          className="px-4 py-2 border rounded hover:bg-gray-50"
        >
          Cancel
        </button>
        <button
          onClick={() => executeCommand(pendingCommand)}
          className={`px-4 py-2 rounded text-white ${
            validationResult?.isDangerous
              ? 'bg-red-600 hover:bg-red-700'
              : 'bg-yellow-600 hover:bg-yellow-700'
          }`}
        >
          Execute Anyway
        </button>
      </div>
    </div>
  </div>
)}

1.9 Legacy Agent: Document Certificate Validation Bypass

File: agent-legacy/GuruRMM-Agent.ps1 Line: 103 Severity: CRITICAL

Cannot be fully fixed for legacy systems that lack proper certificate chains.

Mitigation: Add explicit parameter and logging:

# GuruRMM-Agent.ps1

param(
    [string]$ServerUrl = "",
    [string]$SiteCode = "",
    [switch]$AllowInsecureTLS,  # NEW: Must be explicitly set
    [switch]$Register,
    [switch]$Run
)

# Near line 103, replace:
function Initialize-TLS {
    # Configure TLS
    try {
        [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]::Tls12
        Write-Log "TLS 1.2 configured successfully" "INFO"
    } catch {
        Write-Log "TLS 1.2 not available, falling back to TLS 1.1" "WARN"
        try {
            [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]::Tls11
        } catch {
            Write-Log "TLS 1.1 not available - connection may be insecure" "ERROR"
        }
    }

    # Certificate validation - ONLY disable if explicitly requested
    if ($script:AllowInsecureTLS) {
        Write-Log "[SECURITY WARNING] Certificate validation DISABLED - vulnerable to MITM attacks" "WARN"
        Write-Log "[SECURITY WARNING] This should only be used on legacy systems with self-signed certificates" "WARN"

        # Log to Windows Event Log for audit trail
        try {
            Write-EventLog -LogName Application -Source "GuruRMM" -EventId 1001 -EntryType Warning `
                -Message "GuruRMM agent started with certificate validation disabled. This is a security risk."
        } catch {
            Write-Log "Could not write to Windows Event Log" "WARN"
        }

        [System.Net.ServicePointManager]::ServerCertificateValidationCallback = { $true }
    } else {
        Write-Log "Certificate validation ENABLED (secure mode)" "INFO"
        # Ensure callback is not set (use default validation)
        [System.Net.ServicePointManager]::ServerCertificateValidationCallback = $null
    }
}

# Update installation documentation:
<#
.SYNOPSIS
    GuruRMM Legacy Agent for Windows Server 2008 R2 and older systems.

.DESCRIPTION
    This agent is designed for systems that cannot run the modern Rust-based agent.

    SECURITY WARNING: The -AllowInsecureTLS switch disables certificate validation,
    making the connection vulnerable to man-in-the-middle attacks. Only use this
    option on isolated networks or when using self-signed certificates that cannot
    be properly installed in the system certificate store.

.PARAMETER AllowInsecureTLS
    [SECURITY RISK] Disables SSL/TLS certificate validation. Required for systems
    with self-signed certificates or broken certificate chains. This flag must be
    explicitly provided - it is not enabled by default.

.EXAMPLE
    # Secure installation (recommended)
    .\GuruRMM-Agent.ps1 -Register -ServerUrl "https://rmm.example.com" -SiteCode "ACME-CORP-1234"

    # Insecure installation (legacy systems only)
    .\GuruRMM-Agent.ps1 -Register -ServerUrl "https://rmm.example.com" -SiteCode "ACME-CORP-1234" -AllowInsecureTLS
#>

Phase 2: Major Issues (Should Fix)

2.1 Server: Add Missing Database Index

File: server/migrations/005_add_missing_indexes.sql (new file)

-- Migration: Add missing indexes for performance
-- This migration adds indexes that were identified during code review

-- Index for agent API key lookups (authentication)
CREATE INDEX IF NOT EXISTS idx_agents_api_key_hash ON agents(api_key_hash);

-- Index for site API key lookups
CREATE INDEX IF NOT EXISTS idx_sites_api_key_hash ON sites(api_key_hash);

-- Index for command status queries
CREATE INDEX IF NOT EXISTS idx_commands_status ON commands(status);

-- Index for metrics time-range queries
CREATE INDEX IF NOT EXISTS idx_metrics_agent_timestamp ON metrics(agent_id, timestamp DESC);

-- Index for finding online agents
CREATE INDEX IF NOT EXISTS idx_agents_status_last_seen ON agents(status, last_seen DESC);

2.2 Server: Fix WebSocket Ping/Pong Protocol

File: server/src/ws/mod.rs Lines: 379-386

Current (incorrect):

Ok(Message::Ping(data)) => {
    if tx
        .send(ServerMessage::Ack { message_id: None })
        .await
        .is_err()
    {
        break;
    }
}

Fixed:

Ok(Message::Ping(data)) => {
    // WebSocket protocol requires Pong response to Ping
    if ws_sender
        .send(Message::Pong(data))
        .await
        .is_err()
    {
        warn!("Failed to send Pong response");
        break;
    }
}

2.3 Agent: Fix Service Path Construction

File: agent/src/service.rs Line: 93

Current:

let config_path = PathBuf::from(format!(r"{}\\agent.toml", CONFIG_DIR));

Fixed:

let config_path = PathBuf::from(CONFIG_DIR).join("agent.toml");

2.4 Agent: Fix Rollback Script Backgrounding

File: agent/src/updater/mod.rs Lines: 321-327

Current (broken):

tokio::process::Command::new("nohup")
    .arg("bash")
    .arg(&script_path)
    .arg("&")  // This doesn't work - & is passed as literal argument
    .spawn()

Fixed:

// Use setsid to create new session and properly detach
tokio::process::Command::new("setsid")
    .arg("bash")
    .arg(&script_path)
    .stdin(std::process::Stdio::null())
    .stdout(std::process::Stdio::null())
    .stderr(std::process::Stdio::null())
    .spawn()
    .context("Failed to spawn rollback script")?;

// Alternative using nohup correctly:
tokio::process::Command::new("bash")
    .arg("-c")
    .arg(format!("nohup {} </dev/null >/dev/null 2>&1 &", script_path.display()))
    .spawn()
    .context("Failed to spawn rollback script")?;

2.5 Agent: Replace todo!() with Proper Error

File: agent/src/main.rs Lines: 227-229

Current:

#[cfg(target_os = "macos")]
{
    todo!("macOS launchd service installation not yet implemented");
}

Fixed:

#[cfg(target_os = "macos")]
{
    return Err(anyhow::anyhow!(
        "macOS launchd service installation is not yet implemented. \
         Please run the agent manually or create a launchd plist manually. \
         See: https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/CreatingLaunchdJobs.html"
    ));
}

2.6 Dashboard: Fix Duplicate Filter Options

File: dashboard/src/pages/Agents.tsx Lines: 175-176

Current:

<option value="">All Clients</option>
<option value="">Unassigned</option>

Fixed:

<option value="">All Clients</option>
<option value="__unassigned__">Unassigned Only</option>

Update filtering logic:

const filteredAgents = useMemo(() => {
  let result = agents || [];

  if (clientFilter === '__unassigned__') {
    result = result.filter(a => !a.site_id && !a.client_id);
  } else if (clientFilter) {
    result = result.filter(a => a.client_id === clientFilter);
  }

  if (searchTerm) {
    const term = searchTerm.toLowerCase();
    result = result.filter(a =>
      a.hostname.toLowerCase().includes(term) ||
      a.os_type?.toLowerCase().includes(term)
    );
  }

  return result;
}, [agents, clientFilter, searchTerm]);

2.7 Dashboard: Add Error Handlers to Mutations

File: dashboard/src/pages/Agents.tsx Lines: 98-114

Current:

const deleteMutation = useMutation({
  mutationFn: agentsApi.delete,
  onSuccess: () => {
    queryClient.invalidateQueries({ queryKey: ['agents'] });
    setDeleteConfirm(null);
  },
});

Fixed:

const deleteMutation = useMutation({
  mutationFn: agentsApi.delete,
  onSuccess: () => {
    queryClient.invalidateQueries({ queryKey: ['agents'] });
    setDeleteConfirm(null);
    // Optional: Show success toast
  },
  onError: (error: AxiosError<{ error: string }>) => {
    const message = error.response?.data?.error || error.message || 'Failed to delete agent';
    // Show error to user
    alert(`[ERROR] ${message}`);
    // Or use a toast notification library
  },
});

const moveMutation = useMutation({
  mutationFn: ({ agentId, siteId }: { agentId: string; siteId: string | null }) =>
    agentsApi.move(agentId, siteId),
  onSuccess: () => {
    queryClient.invalidateQueries({ queryKey: ['agents'] });
    setMoveAgent(null);
  },
  onError: (error: AxiosError<{ error: string }>) => {
    const message = error.response?.data?.error || error.message || 'Failed to move agent';
    alert(`[ERROR] ${message}`);
  },
});

2.8 Dashboard: Replace any Types

File: dashboard/src/pages/Login.tsx Line: 24

Current:

} catch (err: any) {
  setError(err.response?.data?.error || 'Login failed');
}

Fixed:

import { AxiosError } from 'axios';

interface ApiError {
  error: string;
}

// In the component:
} catch (err) {
  if (err instanceof AxiosError) {
    const apiError = err.response?.data as ApiError | undefined;
    setError(apiError?.error || err.message || 'Login failed');
  } else if (err instanceof Error) {
    setError(err.message);
  } else {
    setError('An unexpected error occurred');
  }
}

Apply same pattern to Register.tsx line 37.


3.1 Add Request Logging/Audit Trail

File: server/src/main.rs

use tower_http::trace::{TraceLayer, DefaultMakeSpan, DefaultOnResponse};
use tracing::Level;

// Add to router
let app = Router::new()
    // ... routes ...
    .layer(
        TraceLayer::new_for_http()
            .make_span_with(DefaultMakeSpan::new().level(Level::INFO))
            .on_response(DefaultOnResponse::new().level(Level::INFO))
    );

3.2 Add Health Check with Database Verification

File: server/src/main.rs

Current:

async fn health_check() -> &'static str {
    "OK"
}

Fixed:

use serde::Serialize;

#[derive(Serialize)]
struct HealthResponse {
    status: &'static str,
    database: &'static str,
    version: &'static str,
}

async fn health_check(State(state): State<AppState>) -> Json<HealthResponse> {
    let db_status = match sqlx::query("SELECT 1")
        .fetch_one(&state.db)
        .await
    {
        Ok(_) => "healthy",
        Err(_) => "unhealthy",
    };

    Json(HealthResponse {
        status: if db_status == "healthy" { "ok" } else { "degraded" },
        database: db_status,
        version: env!("CARGO_PKG_VERSION"),
    })
}

3.3 Add Graceful Shutdown

File: server/src/main.rs

use tokio::signal;

async fn shutdown_signal() {
    let ctrl_c = async {
        signal::ctrl_c()
            .await
            .expect("Failed to install Ctrl+C handler");
    };

    #[cfg(unix)]
    let terminate = async {
        signal::unix::signal(signal::unix::SignalKind::terminate())
            .expect("Failed to install signal handler")
            .recv()
            .await;
    };

    #[cfg(not(unix))]
    let terminate = std::future::pending::<()>();

    tokio::select! {
        _ = ctrl_c => {},
        _ = terminate => {},
    }

    tracing::info!("Shutdown signal received, starting graceful shutdown");
}

// In main():
let listener = tokio::net::TcpListener::bind(&addr).await?;

axum::serve(listener, app)
    .with_graceful_shutdown(shutdown_signal())
    .await?;

Implementation Checklist

Critical (Do First)

  • 1.1 Restrict CORS configuration
  • 1.2 Add AuthUser to all endpoints
  • 1.3 Add rate limiting to registration
  • 1.4 Remove Unicode characters from agent output
  • 1.5 Add certificate pinning for updates
  • 1.6 Fix insecure temp file creation
  • 1.7 Improve dashboard token storage
  • 1.8 Add command input validation
  • 1.9 Document/mitigate legacy agent cert bypass

Major (Do Second)

  • 2.1 Add missing database indexes
  • 2.2 Fix WebSocket ping/pong
  • 2.3 Fix service path construction
  • 2.4 Fix rollback script backgrounding
  • 2.5 Replace todo!() with proper errors
  • 2.6 Fix duplicate filter options
  • 2.7 Add error handlers to mutations
  • 2.8 Replace any types with proper types

Warnings (Do Third)

  • 3.1 Add request logging/audit trail
  • 3.2 Add comprehensive health check
  • 3.3 Add graceful shutdown handling
  • Add pagination to commands endpoint
  • Add input validation to all request types
  • Implement token expiration handling in dashboard
  • Add dark mode toggle (CSS already exists)
  • Add log rotation to legacy agent

Testing Requirements

After implementing fixes:

  1. Authentication Tests

    • Verify all protected endpoints return 401 without token
    • Verify all protected endpoints work with valid token
    • Verify rate limiting on registration endpoints
  2. Security Tests

    • Verify CORS blocks unauthorized origins
    • Verify command validation catches dangerous patterns
    • Verify update downloads only from allowed domains
  3. Functional Tests

    • Verify agent registration still works
    • Verify command execution still works
    • Verify metrics collection still works
    • Verify WebSocket reconnection works
  4. Legacy Agent Tests

    • Verify agent works without -AllowInsecureTLS (with valid cert)
    • Verify agent works with -AllowInsecureTLS (with self-signed)
    • Verify Windows Event Log entries are created

Estimated Effort

Phase Items Estimated Time
Phase 1 (Critical) 9 items 8-12 hours
Phase 2 (Major) 8 items 4-6 hours
Phase 3 (Warnings) 8+ items 4-8 hours
Testing All 4-6 hours
Total 20-32 hours

Document Version: 1.0 Created: 2026-01-20 Author: Claude Code Review System