# GuruRMM Real-Time Tunnel Implementation Plan

## Overview

Transform GuruRMM agents from periodic check-in mode (30-second heartbeats) to persistent tunnel mode, enabling Claude Code on a tech workstation to execute commands on remote machines through secure multiplexed channels.

---

## Architecture Summary

### Current State (Confirmed via exploration)

- **Server:** Axum 0.7 @ 172.16.3.30:3001, WebSocket endpoint, AgentConnections HashMap
- **Agent:** Tokio async, 30-second heartbeat confirmed, 3 concurrent tasks (metrics/network/heartbeat)
- **Protocol:** Tagged JSON enums (ServerMessage/AgentMessage) with serde

### Key Architectural Decisions

1. **Tunnel Lifecycle:** Hybrid - the WebSocket stays persistent; tunnel mode is an operational state change
   - Agent modes: Heartbeat (default) ↔ Tunnel (active session)
   - One tunnel per agent, on-demand activation, instant mode switching

2. **Channel Multiplexing:** Unified protocol with channel_id routing
   - Single WebSocket, multiple logical channels
   - Enables concurrent operations (multiple terminals, simultaneous file transfers)
   - Channel types: Terminal, FileRead, FileWrite, FileList, Registry, Services

3. **Claude Integration:** Custom MCP server
   - Tools: `gururmm_run_command`, `gururmm_read_file`, `gururmm_write_file`, `gururmm_list_directory`, `gururmm_list_agents`
   - JWT authentication via environment variable
   - Auto-manages tunnel sessions (open on first use, keep-alive, close on idle)

4. **Security:** Three-layer model
   - Layer 1: JWT authentication (24h expiration)
   - Layer 2: Session authorization (tech_sessions table, 4h inactivity timeout)
   - Layer 3: Command validation (working directory allowlist, rate limiting 100/min, audit logging)

---

## Protocol Extensions

### New Message Types

```rust
// Server → Agent
enum ServerMessage {
    // ... existing ...
    TunnelOpen { session_id: String, tech_id: i32 },
    TunnelClose { session_id: String },
    TunnelData { channel_id: String, data: TunnelDataPayload },
}

// Agent → Server
enum AgentMessage {
    // ... existing ...
    TunnelReady { session_id: String },
    TunnelData { channel_id: String, data: TunnelDataPayload },
    TunnelError { channel_id: String, error: String },
}

enum TunnelDataPayload {
    Terminal { command: String },
    TerminalOutput { stdout: String, stderr: String, exit_code: Option<i32> },
    FileRead { path: String },
    FileContent { content: Vec<u8>, mime_type: String },
    FileWrite { path: String, content: Vec<u8> },
    FileList { path: String },
    // FileEntry is a placeholder name; the entry type is not specified in this plan
    FileListResult { entries: Vec<FileEntry> },
}
```

### Agent Mode State Machine

```rust
enum AgentMode {
    Heartbeat, // Default: 30s heartbeats, metrics, network monitoring
    Tunnel {
        session_id: String,
        tech_id: i32,
        // Keyed by channel_id; ChannelHandle is a placeholder name for per-channel state
        channels: HashMap<String, ChannelHandle>,
    },
}
```

---

## Implementation Phases

### Phase 1: Core Tunnel Infrastructure (Week 1)

**Goal:** Establish tunnel mode switching and channel routing

**Server:**
- Add TunnelOpen/TunnelClose/TunnelData to ServerMessage enum
- Create tech_sessions table (id, session_id, tech_id, agent_id, opened_at, last_activity, closed_at, status)
- Implement endpoints: POST /api/v1/tunnel/open, POST /api/v1/tunnel/close, GET /api/v1/tunnel/status/:session_id
- Add channel routing in WebSocket handler (route by channel_id)
- Session validation middleware (JWT + ownership check)

**Agent:**
- Add TunnelReady/TunnelData/TunnelError to AgentMessage enum
- Implement AgentMode state machine
- Add channel manager (HashMap keyed by channel_id)
- Handle TunnelOpen → respond TunnelReady
- Handle TunnelClose → cleanup channels, return to heartbeat mode

**Critical Files:**
- `server/src/ws/mod.rs` - WebSocket handler, protocol definitions
- `server/src/routes/tunnel.rs` - NEW: Tunnel API endpoints
- `server/src/middleware/auth.rs` - Session validation
- `agent/src/transport/websocket.rs` - WebSocket client, protocol handling
- `agent/src/tunnel/mod.rs` - NEW: Tunnel mode manager
- `migrations/XXX_create_tech_sessions.sql` - NEW: Database schema

### Phase 2: Terminal Channel (Week 2)

**Goal:** Execute PowerShell/cmd/bash commands through tunnel

**Implementation:**
- Create TerminalChannel handler on agent (spawn child process, capture streams)
- Implement TunnelDataPayload::Terminal on server
- Working directory validation on agent (configurable allowlist)
- Command result streaming for long-running commands
- Endpoint: POST /api/v1/tunnel/:session_id/command

**Critical Files:**
- `agent/src/tunnel/terminal.rs` - NEW: Terminal channel handler
- `server/src/routes/tunnel.rs` - Add command execution endpoint
- `agent/config.toml` - Add allowed_paths configuration

### Phase 3: File Operations (Week 3)

**Goal:** Read, write, list files through tunnel

**Implementation:**
- Create FileChannel handler on agent
- Chunked transfer for files > 1MB (transfer_id tracking)
- Base64 encoding for binary data
- MIME type detection (magic numbers)
- Endpoints (under /api/v1/tunnel/:session_id): GET /file, PUT /file, POST /file/list

**Critical Files:**
- `agent/src/tunnel/file.rs` - NEW: File channel handler
- `server/src/routes/tunnel.rs` - Add file operation endpoints
- `common/src/transfer.rs` - NEW: Chunked transfer utilities

### Phase 4: MCP Server Integration (Week 4)

**Goal:** Expose tunnel operations as MCP tools for Claude Code

**Implementation:**
- Create new project: `gururmm-mcp-server` (Rust)
- Use `mcp-server-rs` crate
- Implement 5 core tools (run_command, read_file, write_file, list_directory, list_agents)
- JWT token from environment variable (GURURMM_AUTH_TOKEN)
- Auto-manage tunnel sessions (open on first tool use, 5min idle timeout)

**Critical Files:**
- `mcp-server/src/main.rs` - NEW: MCP server entry point
- `mcp-server/src/tools.rs` - NEW: Tool implementations
- `mcp-server/src/session.rs` - NEW: Session manager
- `mcp-server/Cargo.toml` - NEW: Dependencies

**MCP Config Example:**

```json
{
  "mcpServers": {
    "gururmm": {
      "command": "gururmm-mcp-server",
      "env": {
        "GURURMM_API_URL": "http://172.16.3.30:3001",
        "GURURMM_AUTH_TOKEN": "jwt-token-here"
      }
    }
  }
}
```

### Phase 5: Advanced Features (Week 5+)

- Registry operations (Windows winreg crate)
- Service management (sc.exe/WMI on Windows, systemctl on Linux)
- Interactive terminal with PTY (stretch goal)

---

## Database Schema

```sql
CREATE TABLE tech_sessions (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(36) UNIQUE NOT NULL,
    tech_id INTEGER NOT NULL REFERENCES techs(id),
    agent_id INTEGER NOT NULL REFERENCES agents(id),
    opened_at TIMESTAMP NOT NULL DEFAULT NOW(),
    last_activity TIMESTAMP NOT NULL DEFAULT NOW(),
    closed_at TIMESTAMP,
    status VARCHAR(20) NOT NULL DEFAULT 'active'
);

-- PostgreSQL does not accept a WHERE clause on a table-level UNIQUE constraint,
-- so the "one active session per tech and agent" rule is a partial unique index.
CREATE UNIQUE INDEX idx_tech_sessions_one_active
    ON tech_sessions(tech_id, agent_id)
    WHERE status = 'active';

CREATE TABLE tunnel_audit (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(36) NOT NULL REFERENCES tech_sessions(session_id),
    channel_id VARCHAR(36) NOT NULL,
    operation VARCHAR(50) NOT NULL,
    details JSONB,
    created_at TIMESTAMP NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_tech_sessions_tech ON tech_sessions(tech_id);
CREATE INDEX idx_tech_sessions_agent ON tech_sessions(agent_id);
CREATE INDEX idx_tunnel_audit_session ON tunnel_audit(session_id);
```

---

## API Endpoints (New)

```
POST /api/v1/tunnel/open
  Body: { "agent_id": 123 }
  Response: { "session_id": "uuid", "status": "active" }

POST /api/v1/tunnel/close
  Body: { "session_id": "uuid" }

GET /api/v1/tunnel/status/:session_id

POST /api/v1/tunnel/:session_id/command
  Body: { "command": "...", "shell": "powershell", "working_dir": "...", "timeout": 30000 }

GET  /api/v1/tunnel/:session_id/file?path=...
PUT  /api/v1/tunnel/:session_id/file?path=...
POST /api/v1/tunnel/:session_id/file/list?path=...
```

---

## MCP Tools

```
gururmm_run_command(agent_id, command, shell, working_dir, timeout)
gururmm_read_file(agent_id, path)
gururmm_write_file(agent_id, path, content)
gururmm_list_directory(agent_id, path)
gururmm_list_agents()
```

---

## Security Implementation

### Working Directory Validation

```toml
# agent/config.toml
[security]
allowed_paths = ["C:\\Shares", "C:\\Temp"]
```

The agent validates all file operations against the allowlist and rejects path traversal (`..`).

### Rate Limiting

- Server enforces: 100 commands per minute per tech per agent
- Sliding window (in-memory or Redis)
- 429 response on limit exceeded
- Violations logged to tunnel_audit

### Command Injection Prevention

- `tokio::process::Command` (no shell expansion)
- PowerShell: `-NoProfile -NonInteractive -Command`
- Input sanitization (escape quotes, reject backticks)
- Timeout enforcement

### Session Security

- JWT 24h expiration
- Sessions auto-expire after 4h of inactivity
- One tunnel per agent (prevents concurrent session conflicts)
- Admin force-close endpoint

---

## Testing Strategy

### Unit Tests

- Channel routing (correct channel receives message)
- Session validation (JWT + ownership)
- Command sanitization
- Path validation (traversal prevention)

### Integration Tests

- Full tunnel lifecycle (open → command → close)
- Concurrent sessions to different agents
- Session timeout enforcement
- Rate limiting

### End-to-End Tests

- Claude Code MCP integration
- File upload via MCP, verify on agent
- Multi-step workflow (read file → modify → write back)

---

## Rollout Plan

1. **Week 5:** Internal testing (2 agents: AD2, DESKTOP-0O8A1RL)
2. **Week 6:** Beta release (3 power user techs)
3. **Week 7:** General availability (all techs, documentation, training)

---

## Success Metrics

**Infrastructure (Phase 1-2):**
- 95% tunnel open success rate
- <500ms command response time
- Zero session conflicts

**MCP Integration (Phase 3-4):**
- 80% tech adoption within 2 weeks
- >50 tunnel sessions/day
- <5% command error rate

**Long-term:**
- 20% reduction in RDP sessions
- 90% tech satisfaction
- <1% security incidents

---

## Risks and Mitigations

| Risk | Impact | Mitigation |
|------|--------|------------|
| Command injection | Critical | Input sanitization, no shell expansion, path allowlist |
| Session hijacking | High | Short-lived JWT, session ownership validation, audit logging |
| WebSocket instability | Medium | Auto-reconnect, session recovery |
| Rate limiting too strict | Medium | Configurable per-tech limits, user feedback |

---

## Open Questions

1. Registry operations scope (full access or specific hives only)?
2. Interactive terminal priority (defer to Phase 6)?
3. Multi-tech sessions for pair programming?
4. MCP server credential manager integration (1Password)?
5. Agent-side logging requirements (compliance)?

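
The path allowlist named in the Risks table and under Working Directory Validation could be sketched roughly as below. This is a minimal illustration, not the agent's actual code: the function names `normalized_components` and `is_path_allowed` are hypothetical, the comparison is done on path strings only (no symlink or junction resolution), and lowercasing assumes Windows-style case-insensitive paths.

```rust
/// Split a path on both '\' and '/', drop empty and "." segments,
/// and lowercase each component (assumption: case-insensitive filesystem).
fn normalized_components(path: &str) -> Vec<String> {
    path.split(|c: char| c == '\\' || c == '/')
        .filter(|part| !part.is_empty() && *part != ".")
        .map(|part| part.to_ascii_lowercase())
        .collect()
}

/// Returns true when `path` sits under one of `allowed_paths`.
/// Any ".." component is rejected outright rather than resolved.
fn is_path_allowed(path: &str, allowed_paths: &[&str]) -> bool {
    let target = normalized_components(path);
    if target.iter().any(|part| part == "..") {
        return false;
    }
    allowed_paths.iter().any(|root| {
        let root = normalized_components(root);
        // Component-wise prefix match, so "C:\SharesEvil" does not match "C:\Shares".
        !root.is_empty() && target.starts_with(&root)
    })
}
```

With `allowed_paths = ["C:\\Shares", "C:\\Temp"]`, a call such as `is_path_allowed("C:\\Shares\\docs\\a.txt", &allowed)` is accepted while `"C:\\Shares\\..\\Windows"` is refused. Comparing whole components instead of raw string prefixes is the main design choice here; a production version would likely also canonicalize the path on the agent before checking it.
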

---

## Verification Plan

### Phase 1 Verification

```bash
# Tech opens tunnel session
curl -X POST http://172.16.3.30:3001/api/v1/tunnel/open \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{"agent_id": 1}'
# Response: {"session_id": "uuid", "status": "active"}

# Check agent logs - should show: "Tunnel mode activated for session uuid"

# Check database:
#   SELECT * FROM tech_sessions WHERE session_id = 'uuid';
```

### Phase 2 Verification

```bash
# Execute command via tunnel
curl -X POST http://172.16.3.30:3001/api/v1/tunnel/$SESSION_ID/command \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{"command": "Get-Date", "shell": "powershell"}'
# Response: {"stdout": "Sunday, April 13, 2026...", "exit_code": 0}
```

### Phase 4 Verification (MCP)

```bash
# Configure MCP server in Claude Code
# Test tools appear in Claude's tool list
# Execute: "List files in C:\Shares on agent ID 1"
# Claude should call gururmm_list_directory tool
# Verify output shows directory listing
```

---

## Next Steps After Approval

1. Create feature branch: `feature/real-time-tunnel`
2. Phase 1 database migrations (tech_sessions, tunnel_audit tables)
3. Update protocol enums (ServerMessage/AgentMessage)
4. Implement tunnel open/close endpoints
5. Update agent WebSocket handler for tunnel mode
6. Unit tests for session validation
7. Deploy to test environment

**Estimated Timeline:** 5 weeks to MCP integration, 7 weeks to GA

---

**Detailed plan location:** `projects/msp-tools/guru-rmm/plans/real-time-tunnel-architecture.md`
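
For step 5 of the next steps, the Heartbeat ↔ Tunnel switch described in the Agent Mode State Machine section might look something like the sketch below. It is an illustration under assumptions: the `open` and `close` methods are hypothetical names, the channel map stores a plain `String` per channel as a stand-in for real channel state, and the rule that a second TunnelOpen is refused follows the plan's "one tunnel per agent" decision.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum AgentMode {
    Heartbeat,
    Tunnel {
        session_id: String,
        tech_id: i32,
        // channel_id -> channel kind; a stand-in for real per-channel state
        channels: HashMap<String, String>,
    },
}

impl AgentMode {
    /// Handle TunnelOpen. On Ok the caller would reply TunnelReady;
    /// on Err it would report the conflict back to the server.
    fn open(&mut self, session_id: &str, tech_id: i32) -> Result<(), String> {
        if let AgentMode::Tunnel { session_id: active, .. } = &*self {
            return Err(format!("tunnel already active for session {active}"));
        }
        *self = AgentMode::Tunnel {
            session_id: session_id.to_string(),
            tech_id,
            channels: HashMap::new(),
        };
        Ok(())
    }

    /// Handle TunnelClose. Only the active session may close the tunnel;
    /// replacing the variant drops the channel map, which is the cleanup step.
    fn close(&mut self, session_id: &str) -> bool {
        let is_active = matches!(
            &*self,
            AgentMode::Tunnel { session_id: active, .. } if active.as_str() == session_id
        );
        if is_active {
            *self = AgentMode::Heartbeat;
        }
        is_active
    }
}
```

Starting from `AgentMode::Heartbeat`, `open("s1", 42)` succeeds, a second `open` fails while the session is live, and `close("s1")` returns the agent to heartbeat mode. Keeping the transitions as methods on the enum makes them straightforward to cover in the unit tests listed under Testing Strategy.
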