Files
claudetools/projects/msp-tools/guru-rmm/docs/REMEDIATION_PLAN.md
azcomputerguru 65086f4407 fix(security): Implement Phase 1 critical security fixes
CORS:
- Restrict CORS to DASHBOARD_URL environment variable
- Default to production dashboard domain

Authentication:
- Add AuthUser requirement to all agent management endpoints
- Add AuthUser requirement to all command endpoints
- Add AuthUser requirement to all metrics endpoints
- Add audit logging for command execution (user_id tracked)

Agent Security:
- Replace Unicode characters with ASCII markers [OK]/[ERROR]/[WARNING]
- Add certificate pinning for update downloads (allowlist domains)
- Fix insecure temp file creation (use /var/run/gururmm with 0700 perms)
- Fix rollback script backgrounding (use setsid instead of literal &)

Dashboard Security:
- Move token storage from localStorage to sessionStorage
- Add proper TypeScript types (remove 'any' from error handlers)
- Centralize token management functions

Legacy Agent:
- Add -AllowInsecureTLS parameter (opt-in required)
- Add Windows Event Log audit trail when insecure mode used
- Update documentation with security warnings

Closes: Phase 1 items in issue #1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 21:16:24 -07:00

1277 lines
32 KiB
Markdown
Raw Permalink Blame History

This file contains invisible Unicode characters
This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# GuruRMM Remediation Plan
**Generated:** 2026-01-20
**Review Source:** Full-scale code review of all components
**Priority:** Critical issues first, then major, then warnings
---
## Executive Summary
This document provides specific code fixes for all critical and major issues identified during the comprehensive code review. Issues are organized by component and priority.
**Issue Counts:**
- CRITICAL: 16 (must fix before production)
- MAJOR: 14 (should fix before production)
- WARNING: 28 (recommended improvements)
---
## Phase 1: Critical Security Fixes (MUST DO)
### 1.1 Server: Restrict CORS Configuration
**File:** `server/src/main.rs`
**Lines:** 133-136
**Severity:** CRITICAL
**Current Code:**
```rust
let cors = CorsLayer::new()
.allow_origin(Any)
.allow_methods(Any)
.allow_headers(Any);
```
**Fixed Code:**
```rust
use tower_http::cors::{CorsLayer, AllowOrigin};
use http::HeaderValue;
// Get allowed origins from environment
let dashboard_origin = std::env::var("DASHBOARD_URL")
.unwrap_or_else(|_| "https://rmm.azcomputerguru.com".to_string());
let cors = CorsLayer::new()
.allow_origin(AllowOrigin::exact(
HeaderValue::from_str(&dashboard_origin).expect("Invalid DASHBOARD_URL")
))
.allow_methods([
http::Method::GET,
http::Method::POST,
http::Method::PUT,
http::Method::DELETE,
http::Method::OPTIONS,
])
.allow_headers([
http::header::AUTHORIZATION,
http::header::CONTENT_TYPE,
http::header::ACCEPT,
])
.allow_credentials(true);
```
**Environment Variable:**
```bash
DASHBOARD_URL=https://rmm.azcomputerguru.com
```
---
### 1.2 Server: Add Authentication to All Endpoints
**File:** `server/src/api/agents.rs`
**Severity:** CRITICAL
**Problem:** Most endpoints lack the `AuthUser` extractor.
**Fix Pattern - Apply to ALL endpoint functions:**
```rust
// BEFORE (insecure)
pub async fn list_agents(
State(state): State<AppState>,
) -> Result<Json<Vec<AgentResponse>>, (StatusCode, String)> {
// AFTER (secure)
pub async fn list_agents(
State(state): State<AppState>,
_user: AuthUser, // Add this to require authentication
) -> Result<Json<Vec<AgentResponse>>, (StatusCode, String)> {
```
**Endpoints requiring AuthUser addition:**
**agents.rs:**
- `list_agents` (line 62)
- `get_agent` (line 74)
- `delete_agent` (line 87)
- `get_stats` (line 109)
- `move_agent` (line 126)
- `list_agents_with_details` (line 152)
- `list_unassigned_agents` (line 163)
- `get_agent_state` (line 175)
- `register_agent` (line 32) - Add auth OR implement rate limiting
**commands.rs:**
- `send_command` (line 46)
- `list_commands` (line 103)
- `get_command` (line 117)
**metrics.rs:**
- `get_agent_metrics` (line 29)
- `get_summary` (line 57)
**Complete Fix for commands.rs:**
```rust
// server/src/api/commands.rs
use crate::auth::AuthUser;
pub async fn send_command(
State(state): State<AppState>,
user: AuthUser, // ADD THIS
Path(agent_id): Path<Uuid>,
Json(req): Json<SendCommandRequest>,
) -> Result<Json<CommandResponse>, (StatusCode, String)> {
// Log who sent the command
tracing::info!(
user_id = %user.user_id,
agent_id = %agent_id,
command_type = ?req.command_type,
"Command sent by user"
);
// Create command with user attribution
let command = db::create_command(
&state.db,
agent_id,
&req.command_type,
&req.command,
req.timeout_seconds,
req.elevated.unwrap_or(false),
Some(user.user_id), // Track who created it
)
.await
.map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;
// ... rest of function
}
pub async fn list_commands(
State(state): State<AppState>,
_user: AuthUser, // ADD THIS
) -> Result<Json<Vec<CommandResponse>>, (StatusCode, String)> {
// ... existing code
}
pub async fn get_command(
State(state): State<AppState>,
_user: AuthUser, // ADD THIS
Path(command_id): Path<Uuid>,
) -> Result<Json<CommandResponse>, (StatusCode, String)> {
// ... existing code
}
```
---
### 1.3 Server: Add Rate Limiting to Registration
**File:** `server/src/main.rs`
**Severity:** CRITICAL
**Add dependency to Cargo.toml:**
```toml
tower-governor = "0.3"
```
**Implementation:**
```rust
// server/src/main.rs
use tower_governor::{GovernorLayer, GovernorConfigBuilder, KeyExtractor};
use std::net::IpAddr;
// Custom key extractor for rate limiting by IP
#[derive(Clone)]
struct RealIpKeyExtractor;
impl KeyExtractor for RealIpKeyExtractor {
type Key = IpAddr;
fn extract<T>(&self, req: &http::Request<T>) -> Result<Self::Key, GovernorError> {
// Try X-Forwarded-For first (for reverse proxy)
if let Some(forwarded) = req.headers().get("x-forwarded-for") {
if let Ok(s) = forwarded.to_str() {
if let Some(ip) = s.split(',').next() {
if let Ok(addr) = ip.trim().parse() {
return Ok(addr);
}
}
}
}
// Fall back to connection info
req.extensions()
.get::<axum::extract::ConnectInfo<std::net::SocketAddr>>()
.map(|ci| ci.0.ip())
.ok_or(GovernorError::UnableToExtractKey)
}
}
// In main() or router setup:
let registration_limiter = GovernorConfigBuilder::default()
.per_second(1) // 1 request per second
.burst_size(5) // Allow burst of 5
.key_extractor(RealIpKeyExtractor)
.finish()
.unwrap();
let app = Router::new()
// Rate-limited registration endpoints
.route("/api/agents/register",
post(register_agent).layer(GovernorLayer::new(&registration_limiter)))
.route("/api/agents/register-legacy",
post(register_legacy).layer(GovernorLayer::new(&registration_limiter)))
// ... other routes without rate limiting
```
---
### 1.4 Agent: Remove Unicode Characters
**File:** `agent/src/main.rs`
**Lines:** 460, 467, 478, 480
**Severity:** CRITICAL
**Find and replace all Unicode characters:**
```rust
// BEFORE
println!("\n✓ GuruRMM Agent installed successfully!");
println!("\n⚠️ IMPORTANT: Edit {} with your server URL and API key!", config_dest);
// AFTER
println!("\n[OK] GuruRMM Agent installed successfully!");
println!("\n[WARNING] IMPORTANT: Edit {} with your server URL and API key!", config_dest);
```
**Complete list of replacements:**
| Before | After |
|--------|-------|
| `✓` | `[OK]` |
| `✗` | `[ERROR]` |
| `⚠️` | `[WARNING]` |
| `` | `[INFO]` |
| `→` | `->` |
| `•` | `-` |
---
### 1.5 Agent: Secure Update Download with Certificate Pinning
**File:** `agent/src/updater/mod.rs`
**Severity:** CRITICAL
**Add to Cargo.toml:**
```toml
[dependencies]
sha2 = "0.10"
```
**Implementation:**
```rust
// agent/src/updater/mod.rs
use sha2::{Sha256, Digest};
impl Updater {
/// Download binary with certificate validation
async fn download_binary(&self, url: &str, expected_checksum: &str) -> Result<PathBuf> {
// Validate URL is from trusted domain
let allowed_domains = [
"rmm-api.azcomputerguru.com",
"downloads.azcomputerguru.com",
];
let parsed_url = url::Url::parse(url)
.context("Invalid download URL")?;
let host = parsed_url.host_str()
.ok_or_else(|| anyhow::anyhow!("No host in download URL"))?;
if !allowed_domains.iter().any(|d| host == *d || host.ends_with(&format!(".{}", d))) {
return Err(anyhow::anyhow!(
"Download URL host '{}' not in allowed domains: {:?}",
host, allowed_domains
));
}
// Require HTTPS
if parsed_url.scheme() != "https" {
return Err(anyhow::anyhow!("Download URL must use HTTPS"));
}
info!("Downloading update from: {}", url);
let response = self.http_client
.get(url)
.send()
.await
.context("HTTP request failed")?;
if !response.status().is_success() {
return Err(anyhow::anyhow!(
"Download failed with status: {}",
response.status()
));
}
let bytes = response.bytes().await
.context("Failed to read response body")?;
// Verify checksum BEFORE writing to disk
let mut hasher = Sha256::new();
hasher.update(&bytes);
let actual_checksum = format!("{:x}", hasher.finalize());
if actual_checksum != expected_checksum {
return Err(anyhow::anyhow!(
"Checksum mismatch! Expected: {}, Got: {}",
expected_checksum, actual_checksum
));
}
info!("Checksum verified: {}", actual_checksum);
// Write to secure temp location
let temp_dir = std::env::temp_dir();
let temp_path = temp_dir.join(format!("gururmm-update-{}.tmp", uuid::Uuid::new_v4()));
tokio::fs::write(&temp_path, &bytes).await
.context("Failed to write update binary")?;
// Set restrictive permissions on Unix
#[cfg(unix)]
{
use std::os::unix::fs::PermissionsExt;
let perms = std::fs::Permissions::from_mode(0o700);
std::fs::set_permissions(&temp_path, perms)?;
}
Ok(temp_path)
}
}
```
---
### 1.6 Agent: Fix Insecure Temp File Creation
**File:** `agent/src/updater/mod.rs`
**Lines:** 275-331
**Severity:** CRITICAL
**Problem:** Using world-writable `/tmp/` for rollback script.
**Fixed Code:**
```rust
// agent/src/updater/mod.rs
#[cfg(unix)]
async fn create_rollback_script(&self, backup_path: &Path, target_path: &Path) -> Result<PathBuf> {
use std::os::unix::fs::PermissionsExt;
// Use /var/run or agent's own directory instead of /tmp
let script_dir = if Path::new("/var/run/gururmm").exists() {
PathBuf::from("/var/run/gururmm")
} else {
// Fall back to agent's config directory
PathBuf::from("/etc/gururmm")
};
// Ensure directory exists with proper permissions
tokio::fs::create_dir_all(&script_dir).await?;
let script_path = script_dir.join(format!("rollback-{}.sh", uuid::Uuid::new_v4()));
let script = format!(r#"#!/bin/bash
set -e
# Rollback script for GuruRMM agent update
BACKUP="{}"
TARGET="{}"
SCRIPT_PATH="$0"
# Function to cleanup
cleanup() {{
rm -f "$SCRIPT_PATH"
}}
trap cleanup EXIT
# Wait for parent process to exit
sleep 2
# Restore backup
if [ -f "$BACKUP" ]; then
cp "$BACKUP" "$TARGET"
chmod +x "$TARGET"
# Restart service
if command -v systemctl &> /dev/null; then
systemctl restart gururmm-agent
fi
fi
"#,
backup_path.display(),
target_path.display()
);
// Write script with restrictive permissions
tokio::fs::write(&script_path, &script).await?;
// Set permissions to 700 (owner only)
let perms = std::fs::Permissions::from_mode(0o700);
std::fs::set_permissions(&script_path, perms)?;
// Verify ownership (should be root)
let metadata = std::fs::metadata(&script_path)?;
if metadata.uid() != 0 {
warn!("Rollback script not owned by root - this may be a security issue");
}
Ok(script_path)
}
```
---
### 1.7 Dashboard: Improve Token Storage Security
**File:** `dashboard/src/api/client.ts`
**Severity:** CRITICAL
**Option A: Use httpOnly Cookies (Recommended)**
This requires server-side changes. The server should set the token as an httpOnly cookie.
**Server-side change (server/src/api/auth.rs):**
```rust
use axum::response::{IntoResponse, Response};
use http::header::{SET_COOKIE, HeaderValue};
pub async fn login(
State(state): State<AppState>,
Json(req): Json<LoginRequest>,
) -> Result<Response, (StatusCode, String)> {
// ... existing validation ...
let token = create_jwt(&user)?;
// Set httpOnly cookie instead of returning token in body
let cookie = format!(
"auth_token={}; HttpOnly; Secure; SameSite=Strict; Path=/; Max-Age=86400",
token
);
let mut response = Json(LoginResponse {
user_id: user.id,
email: user.email,
name: user.name,
role: user.role,
}).into_response();
response.headers_mut().insert(
SET_COOKIE,
HeaderValue::from_str(&cookie).unwrap()
);
Ok(response)
}
```
**Dashboard change (client.ts):**
```typescript
// dashboard/src/api/client.ts
const api = axios.create({
baseURL: import.meta.env.VITE_API_URL || 'https://rmm-api.azcomputerguru.com',
withCredentials: true, // Send cookies with requests
});
// Remove localStorage token handling - cookies are automatic
export const authApi = {
login: async (email: string, password: string) => {
const response = await api.post<LoginResponse>('/api/auth/login', { email, password });
// No need to store token - it's in httpOnly cookie
return response.data;
},
logout: async () => {
await api.post('/api/auth/logout'); // Server clears cookie
},
// ...
};
```
**Option B: If httpOnly cookies aren't feasible (short-term)**
Add XSS protection and token encryption:
```typescript
// dashboard/src/api/client.ts
import CryptoJS from 'crypto-js';
// Use session-specific key (generated on page load)
const SESSION_KEY = CryptoJS.lib.WordArray.random(32).toString();
const encryptToken = (token: string): string => {
return CryptoJS.AES.encrypt(token, SESSION_KEY).toString();
};
const decryptToken = (encrypted: string): string | null => {
try {
const bytes = CryptoJS.AES.decrypt(encrypted, SESSION_KEY);
return bytes.toString(CryptoJS.enc.Utf8);
} catch {
return null;
}
};
export const authApi = {
login: async (email: string, password: string) => {
const response = await api.post<LoginResponse>('/api/auth/login', { email, password });
if (response.data.token) {
// Store encrypted token in sessionStorage (cleared on tab close)
sessionStorage.setItem('auth_token', encryptToken(response.data.token));
}
return response.data;
},
getToken: (): string | null => {
const encrypted = sessionStorage.getItem('auth_token');
return encrypted ? decryptToken(encrypted) : null;
},
// ...
};
// Update interceptor
api.interceptors.request.use((config) => {
const token = authApi.getToken();
if (token) {
config.headers.Authorization = `Bearer ${token}`;
}
return config;
});
```
---
### 1.8 Dashboard: Add Command Input Validation
**File:** `dashboard/src/pages/AgentDetail.tsx`
**Lines:** 236-240
**Severity:** CRITICAL
**Add validation and confirmation:**
```typescript
// dashboard/src/pages/AgentDetail.tsx
import { useState, useCallback } from 'react';
// Add dangerous command patterns
const DANGEROUS_PATTERNS = [
/rm\s+-rf\s+\//i,
/format\s+[a-z]:/i,
/del\s+\/[fqs]/i,
/shutdown/i,
/reboot/i,
/init\s+0/i,
/mkfs/i,
/dd\s+if=/i,
/>\s*\/dev\//i,
/chmod\s+-R\s+777/i,
];
const WARN_PATTERNS = [
/password/i,
/credential/i,
/secret/i,
/api.?key/i,
/token/i,
];
interface CommandValidation {
isValid: boolean;
isDangerous: boolean;
hasWarnings: boolean;
messages: string[];
}
const validateCommand = (command: string): CommandValidation => {
const messages: string[] = [];
let isDangerous = false;
let hasWarnings = false;
// Check for empty command
if (!command.trim()) {
return { isValid: false, isDangerous: false, hasWarnings: false, messages: ['Command cannot be empty'] };
}
// Check for dangerous patterns
for (const pattern of DANGEROUS_PATTERNS) {
if (pattern.test(command)) {
isDangerous = true;
messages.push(`[DANGER] Potentially destructive command detected: ${pattern.source}`);
}
}
// Check for warning patterns
for (const pattern of WARN_PATTERNS) {
if (pattern.test(command)) {
hasWarnings = true;
messages.push(`[WARNING] Command may expose sensitive data: ${pattern.source}`);
}
}
// Check command length
if (command.length > 10000) {
messages.push('[WARNING] Command is very long - consider using a script file');
hasWarnings = true;
}
return {
isValid: true,
isDangerous,
hasWarnings,
messages,
};
};
// In the component:
const [showConfirmDialog, setShowConfirmDialog] = useState(false);
const [pendingCommand, setPendingCommand] = useState<string>('');
const [validationResult, setValidationResult] = useState<CommandValidation | null>(null);
const handleSendCommand = useCallback(async () => {
if (!command.trim() || !agent) return;
const validation = validateCommand(command);
setValidationResult(validation);
if (!validation.isValid) {
return;
}
if (validation.isDangerous || validation.hasWarnings) {
setPendingCommand(command);
setShowConfirmDialog(true);
return;
}
await executeCommand(command);
}, [command, agent]);
const executeCommand = async (cmd: string) => {
try {
await agentsApi.sendCommand(agent!.id, {
command_type: commandType,
command: cmd,
timeout_seconds: 300,
elevated: false,
});
setCommand('');
setShowConfirmDialog(false);
setPendingCommand('');
} catch (err) {
console.error('Failed to send command:', err);
}
};
// Add confirmation dialog JSX:
{showConfirmDialog && (
<div className="fixed inset-0 bg-black/50 flex items-center justify-center z-50">
<div className="bg-white rounded-lg p-6 max-w-lg w-full mx-4">
<h3 className="text-lg font-semibold text-red-600 mb-4">
{validationResult?.isDangerous ? '[DANGER] Confirm Dangerous Command' : '[WARNING] Confirm Command'}
</h3>
<div className="mb-4 space-y-2">
{validationResult?.messages.map((msg, i) => (
<p key={i} className={msg.includes('DANGER') ? 'text-red-600' : 'text-yellow-600'}>
{msg}
</p>
))}
</div>
<pre className="bg-gray-100 p-3 rounded text-sm mb-4 overflow-x-auto">
{pendingCommand}
</pre>
<p className="text-sm text-gray-600 mb-4">
This command will be executed on: <strong>{agent?.hostname}</strong>
</p>
<div className="flex gap-3 justify-end">
<button
onClick={() => {
setShowConfirmDialog(false);
setPendingCommand('');
}}
className="px-4 py-2 border rounded hover:bg-gray-50"
>
Cancel
</button>
<button
onClick={() => executeCommand(pendingCommand)}
className={`px-4 py-2 rounded text-white ${
validationResult?.isDangerous
? 'bg-red-600 hover:bg-red-700'
: 'bg-yellow-600 hover:bg-yellow-700'
}`}
>
Execute Anyway
</button>
</div>
</div>
</div>
)}
```
---
### 1.9 Legacy Agent: Document Certificate Validation Bypass
**File:** `agent-legacy/GuruRMM-Agent.ps1`
**Line:** 103
**Severity:** CRITICAL
**Cannot be fully fixed** for legacy systems that lack proper certificate chains.
**Mitigation: Add explicit parameter and logging:**
```powershell
# GuruRMM-Agent.ps1
param(
[string]$ServerUrl = "",
[string]$SiteCode = "",
[switch]$AllowInsecureTLS, # NEW: Must be explicitly set
[switch]$Register,
[switch]$Run
)
# Near line 103, replace:
function Initialize-TLS {
# Configure TLS
try {
[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]::Tls12
Write-Log "TLS 1.2 configured successfully" "INFO"
} catch {
Write-Log "TLS 1.2 not available, falling back to TLS 1.1" "WARN"
try {
[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]::Tls11
} catch {
Write-Log "TLS 1.1 not available - connection may be insecure" "ERROR"
}
}
# Certificate validation - ONLY disable if explicitly requested
if ($script:AllowInsecureTLS) {
Write-Log "[SECURITY WARNING] Certificate validation DISABLED - vulnerable to MITM attacks" "WARN"
Write-Log "[SECURITY WARNING] This should only be used on legacy systems with self-signed certificates" "WARN"
# Log to Windows Event Log for audit trail
try {
Write-EventLog -LogName Application -Source "GuruRMM" -EventId 1001 -EntryType Warning `
-Message "GuruRMM agent started with certificate validation disabled. This is a security risk."
} catch {
Write-Log "Could not write to Windows Event Log" "WARN"
}
[System.Net.ServicePointManager]::ServerCertificateValidationCallback = { $true }
} else {
Write-Log "Certificate validation ENABLED (secure mode)" "INFO"
# Ensure callback is not set (use default validation)
[System.Net.ServicePointManager]::ServerCertificateValidationCallback = $null
}
}
# Update installation documentation:
<#
.SYNOPSIS
GuruRMM Legacy Agent for Windows Server 2008 R2 and older systems.
.DESCRIPTION
This agent is designed for systems that cannot run the modern Rust-based agent.
SECURITY WARNING: The -AllowInsecureTLS switch disables certificate validation,
making the connection vulnerable to man-in-the-middle attacks. Only use this
option on isolated networks or when using self-signed certificates that cannot
be properly installed in the system certificate store.
.PARAMETER AllowInsecureTLS
[SECURITY RISK] Disables SSL/TLS certificate validation. Required for systems
with self-signed certificates or broken certificate chains. This flag must be
explicitly provided - it is not enabled by default.
.EXAMPLE
# Secure installation (recommended)
.\GuruRMM-Agent.ps1 -Register -ServerUrl "https://rmm.example.com" -SiteCode "ACME-CORP-1234"
# Insecure installation (legacy systems only)
.\GuruRMM-Agent.ps1 -Register -ServerUrl "https://rmm.example.com" -SiteCode "ACME-CORP-1234" -AllowInsecureTLS
#>
```
---
## Phase 2: Major Issues (Should Fix)
### 2.1 Server: Add Missing Database Index
**File:** `server/migrations/005_add_missing_indexes.sql` (new file)
```sql
-- Migration: Add missing indexes for performance
-- This migration adds indexes that were identified during code review
-- Index for agent API key lookups (authentication)
CREATE INDEX IF NOT EXISTS idx_agents_api_key_hash ON agents(api_key_hash);
-- Index for site API key lookups
CREATE INDEX IF NOT EXISTS idx_sites_api_key_hash ON sites(api_key_hash);
-- Index for command status queries
CREATE INDEX IF NOT EXISTS idx_commands_status ON commands(status);
-- Index for metrics time-range queries
CREATE INDEX IF NOT EXISTS idx_metrics_agent_timestamp ON metrics(agent_id, timestamp DESC);
-- Index for finding online agents
CREATE INDEX IF NOT EXISTS idx_agents_status_last_seen ON agents(status, last_seen DESC);
```
---
### 2.2 Server: Fix WebSocket Ping/Pong Protocol
**File:** `server/src/ws/mod.rs`
**Lines:** 379-386
**Current (incorrect):**
```rust
Ok(Message::Ping(data)) => {
if tx
.send(ServerMessage::Ack { message_id: None })
.await
.is_err()
{
break;
}
}
```
**Fixed:**
```rust
Ok(Message::Ping(data)) => {
// WebSocket protocol requires Pong response to Ping
if ws_sender
.send(Message::Pong(data))
.await
.is_err()
{
warn!("Failed to send Pong response");
break;
}
}
```
---
### 2.3 Agent: Fix Service Path Construction
**File:** `agent/src/service.rs`
**Line:** 93
**Current:**
```rust
let config_path = PathBuf::from(format!(r"{}\\agent.toml", CONFIG_DIR));
```
**Fixed:**
```rust
let config_path = PathBuf::from(CONFIG_DIR).join("agent.toml");
```
---
### 2.4 Agent: Fix Rollback Script Backgrounding
**File:** `agent/src/updater/mod.rs`
**Lines:** 321-327
**Current (broken):**
```rust
tokio::process::Command::new("nohup")
.arg("bash")
.arg(&script_path)
.arg("&") // This doesn't work - & is passed as literal argument
.spawn()
```
**Fixed:**
```rust
// Use setsid to create new session and properly detach
tokio::process::Command::new("setsid")
.arg("bash")
.arg(&script_path)
.stdin(std::process::Stdio::null())
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.spawn()
.context("Failed to spawn rollback script")?;
// Alternative using nohup correctly:
tokio::process::Command::new("bash")
.arg("-c")
.arg(format!("nohup {} </dev/null >/dev/null 2>&1 &", script_path.display()))
.spawn()
.context("Failed to spawn rollback script")?;
```
---
### 2.5 Agent: Replace todo!() with Proper Error
**File:** `agent/src/main.rs`
**Lines:** 227-229
**Current:**
```rust
#[cfg(target_os = "macos")]
{
todo!("macOS launchd service installation not yet implemented");
}
```
**Fixed:**
```rust
#[cfg(target_os = "macos")]
{
return Err(anyhow::anyhow!(
"macOS launchd service installation is not yet implemented. \
Please run the agent manually or create a launchd plist manually. \
See: https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/CreatingLaunchdJobs.html"
));
}
```
---
### 2.6 Dashboard: Fix Duplicate Filter Options
**File:** `dashboard/src/pages/Agents.tsx`
**Lines:** 175-176
**Current:**
```tsx
<option value="">All Clients</option>
<option value="">Unassigned</option>
```
**Fixed:**
```tsx
<option value="">All Clients</option>
<option value="__unassigned__">Unassigned Only</option>
```
**Update filtering logic:**
```typescript
const filteredAgents = useMemo(() => {
let result = agents || [];
if (clientFilter === '__unassigned__') {
result = result.filter(a => !a.site_id && !a.client_id);
} else if (clientFilter) {
result = result.filter(a => a.client_id === clientFilter);
}
if (searchTerm) {
const term = searchTerm.toLowerCase();
result = result.filter(a =>
a.hostname.toLowerCase().includes(term) ||
a.os_type?.toLowerCase().includes(term)
);
}
return result;
}, [agents, clientFilter, searchTerm]);
```
---
### 2.7 Dashboard: Add Error Handlers to Mutations
**File:** `dashboard/src/pages/Agents.tsx`
**Lines:** 98-114
**Current:**
```typescript
const deleteMutation = useMutation({
mutationFn: agentsApi.delete,
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['agents'] });
setDeleteConfirm(null);
},
});
```
**Fixed:**
```typescript
const deleteMutation = useMutation({
mutationFn: agentsApi.delete,
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['agents'] });
setDeleteConfirm(null);
// Optional: Show success toast
},
onError: (error: AxiosError<{ error: string }>) => {
const message = error.response?.data?.error || error.message || 'Failed to delete agent';
// Show error to user
alert(`[ERROR] ${message}`);
// Or use a toast notification library
},
});
const moveMutation = useMutation({
mutationFn: ({ agentId, siteId }: { agentId: string; siteId: string | null }) =>
agentsApi.move(agentId, siteId),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['agents'] });
setMoveAgent(null);
},
onError: (error: AxiosError<{ error: string }>) => {
const message = error.response?.data?.error || error.message || 'Failed to move agent';
alert(`[ERROR] ${message}`);
},
});
```
---
### 2.8 Dashboard: Replace `any` Types
**File:** `dashboard/src/pages/Login.tsx`
**Line:** 24
**Current:**
```typescript
} catch (err: any) {
setError(err.response?.data?.error || 'Login failed');
}
```
**Fixed:**
```typescript
import { AxiosError } from 'axios';
interface ApiError {
error: string;
}
// In the component:
} catch (err) {
if (err instanceof AxiosError) {
const apiError = err.response?.data as ApiError | undefined;
setError(apiError?.error || err.message || 'Login failed');
} else if (err instanceof Error) {
setError(err.message);
} else {
setError('An unexpected error occurred');
}
}
```
**Apply same pattern to Register.tsx line 37.**
---
## Phase 3: Warning Level Issues (Recommended)
### 3.1 Add Request Logging/Audit Trail
**File:** `server/src/main.rs`
```rust
use tower_http::trace::{TraceLayer, DefaultMakeSpan, DefaultOnResponse};
use tracing::Level;
// Add to router
let app = Router::new()
// ... routes ...
.layer(
TraceLayer::new_for_http()
.make_span_with(DefaultMakeSpan::new().level(Level::INFO))
.on_response(DefaultOnResponse::new().level(Level::INFO))
);
```
---
### 3.2 Add Health Check with Database Verification
**File:** `server/src/main.rs`
**Current:**
```rust
async fn health_check() -> &'static str {
"OK"
}
```
**Fixed:**
```rust
use serde::Serialize;
#[derive(Serialize)]
struct HealthResponse {
status: &'static str,
database: &'static str,
version: &'static str,
}
async fn health_check(State(state): State<AppState>) -> Json<HealthResponse> {
let db_status = match sqlx::query("SELECT 1")
.fetch_one(&state.db)
.await
{
Ok(_) => "healthy",
Err(_) => "unhealthy",
};
Json(HealthResponse {
status: if db_status == "healthy" { "ok" } else { "degraded" },
database: db_status,
version: env!("CARGO_PKG_VERSION"),
})
}
```
---
### 3.3 Add Graceful Shutdown
**File:** `server/src/main.rs`
```rust
use tokio::signal;
async fn shutdown_signal() {
let ctrl_c = async {
signal::ctrl_c()
.await
.expect("Failed to install Ctrl+C handler");
};
#[cfg(unix)]
let terminate = async {
signal::unix::signal(signal::unix::SignalKind::terminate())
.expect("Failed to install signal handler")
.recv()
.await;
};
#[cfg(not(unix))]
let terminate = std::future::pending::<()>();
tokio::select! {
_ = ctrl_c => {},
_ = terminate => {},
}
tracing::info!("Shutdown signal received, starting graceful shutdown");
}
// In main():
let listener = tokio::net::TcpListener::bind(&addr).await?;
axum::serve(listener, app)
.with_graceful_shutdown(shutdown_signal())
.await?;
```
---
## Implementation Checklist
### Critical (Do First)
- [ ] 1.1 Restrict CORS configuration
- [ ] 1.2 Add AuthUser to all endpoints
- [ ] 1.3 Add rate limiting to registration
- [ ] 1.4 Remove Unicode characters from agent output
- [ ] 1.5 Add certificate pinning for updates
- [ ] 1.6 Fix insecure temp file creation
- [ ] 1.7 Improve dashboard token storage
- [ ] 1.8 Add command input validation
- [ ] 1.9 Document/mitigate legacy agent cert bypass
### Major (Do Second)
- [ ] 2.1 Add missing database indexes
- [ ] 2.2 Fix WebSocket ping/pong
- [ ] 2.3 Fix service path construction
- [ ] 2.4 Fix rollback script backgrounding
- [ ] 2.5 Replace todo!() with proper errors
- [ ] 2.6 Fix duplicate filter options
- [ ] 2.7 Add error handlers to mutations
- [ ] 2.8 Replace `any` types with proper types
### Warnings (Do Third)
- [ ] 3.1 Add request logging/audit trail
- [ ] 3.2 Add comprehensive health check
- [ ] 3.3 Add graceful shutdown handling
- [ ] Add pagination to commands endpoint
- [ ] Add input validation to all request types
- [ ] Implement token expiration handling in dashboard
- [ ] Add dark mode toggle (CSS already exists)
- [ ] Add log rotation to legacy agent
---
## Testing Requirements
After implementing fixes:
1. **Authentication Tests**
- Verify all protected endpoints return 401 without token
- Verify all protected endpoints work with valid token
- Verify rate limiting on registration endpoints
2. **Security Tests**
- Verify CORS blocks unauthorized origins
- Verify command validation catches dangerous patterns
- Verify update downloads only from allowed domains
3. **Functional Tests**
- Verify agent registration still works
- Verify command execution still works
- Verify metrics collection still works
- Verify WebSocket reconnection works
4. **Legacy Agent Tests**
- Verify agent works without -AllowInsecureTLS (with valid cert)
- Verify agent works with -AllowInsecureTLS (with self-signed)
- Verify Windows Event Log entries are created
---
## Estimated Effort
| Phase | Items | Estimated Time |
|-------|-------|----------------|
| Phase 1 (Critical) | 9 items | 8-12 hours |
| Phase 2 (Major) | 8 items | 4-6 hours |
| Phase 3 (Warnings) | 8+ items | 4-8 hours |
| Testing | All | 4-6 hours |
| **Total** | | **20-32 hours** |
---
**Document Version:** 1.0
**Created:** 2026-01-20
**Author:** Claude Code Review System