Comprehensive design for transforming agents from 30s heartbeat mode to persistent tunnel mode, enabling Claude Code to execute commands on remote machines through secure multiplexed WebSocket channels.

Additions:
- Complete implementation plan with 5-phase roadmap (5-7 weeks to GA)
- Detailed architecture document covering protocol, security, and MCP integration
- Database migration for tech_sessions and tunnel_audit tables

Key architectural decisions:
- Hybrid lifecycle: WebSocket persistent, tunnel is operational state
- Channel multiplexing over single WebSocket (terminal, file ops, etc.)
- Three-layer security: JWT auth, session authorization, command validation
- Custom MCP server for Claude Code integration

Next: Phase 1 implementation (tunnel open/close endpoints, agent mode state machine)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
GuruRMM Real-Time Tunnel Implementation Plan
Overview
Transform GuruRMM agents from periodic check-in mode (30-second heartbeats) to persistent tunnel mode, so that Claude Code running on a tech's workstation can execute commands on remote machines through secure multiplexed channels.
Architecture Summary
Current State (Confirmed via exploration)
- Server: Axum 0.7 @ 172.16.3.30:3001, WebSocket endpoint, AgentConnections HashMap
- Agent: Tokio async, 30-second heartbeat confirmed, 3 concurrent tasks (metrics/network/heartbeat)
- Protocol: Tagged JSON enums (ServerMessage/AgentMessage) with serde
Key Architectural Decisions
- Tunnel Lifecycle: Hybrid - WebSocket stays persistent, tunnel mode is an operational state change
  - Agent modes: Heartbeat (default) ↔ Tunnel (active session)
  - One tunnel per agent, on-demand activation, instant mode switching
- Channel Multiplexing: Unified protocol with channel_id routing
  - Single WebSocket, multiple logical channels
  - Enables concurrent operations (multiple terminals, simultaneous file transfers)
  - Channel types: Terminal, FileRead, FileWrite, FileList, Registry, Services
- Claude Integration: Custom MCP server
  - Tools: gururmm_run_command, gururmm_read_file, gururmm_write_file, gururmm_list_directory, gururmm_list_agents
  - JWT authentication via environment variable
  - Auto-manages tunnel sessions (open on first use, keep-alive, close on idle)
- Security: Three-layer model
  - Layer 1: JWT authentication (24h expiration)
  - Layer 2: Session authorization (tech_sessions table, 4h inactivity timeout)
  - Layer 3: Command validation (working directory allowlist, rate limiting 100/min, audit logging)
Protocol Extensions
New Message Types
// Server → Agent
enum ServerMessage {
// ... existing ...
TunnelOpen { session_id: String, tech_id: i32 },
TunnelClose { session_id: String },
TunnelData { channel_id: String, data: TunnelDataPayload },
}
// Agent → Server
enum AgentMessage {
// ... existing ...
TunnelReady { session_id: String },
TunnelData { channel_id: String, data: TunnelDataPayload },
TunnelError { channel_id: String, error: String },
}
enum TunnelDataPayload {
Terminal { command: String },
TerminalOutput { stdout: String, stderr: String, exit_code: Option<i32> },
FileRead { path: String },
FileContent { content: Vec<u8>, mime_type: String },
FileWrite { path: String, content: Vec<u8> },
FileList { path: String },
FileListResult { entries: Vec<FileEntry> },
}
Agent Mode State Machine
enum AgentMode {
Heartbeat, // Default: 30s heartbeats, metrics, network monitoring
Tunnel {
session_id: String,
tech_id: i32,
channels: HashMap<String, ChannelType>,
},
}
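The mode switch described above can be sketched with the standard library alone. This is a minimal illustration, not the agent's actual code: the `ModeError` type and the `open`, `add_channel`, and `close` methods are hypothetical names, and `ChannelType` is trimmed to two variants.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum ChannelType {
    Terminal,
    FileRead,
}

#[derive(Debug, PartialEq)]
enum AgentMode {
    Heartbeat,
    Tunnel {
        session_id: String,
        tech_id: i32,
        channels: HashMap<String, ChannelType>,
    },
}

// Hypothetical error type for rejected transitions.
#[derive(Debug, PartialEq)]
enum ModeError {
    AlreadyInTunnel,
    SessionMismatch,
    NotInTunnel,
}

impl AgentMode {
    /// TunnelOpen: Heartbeat -> Tunnel. Rejected while a tunnel is active
    /// (one tunnel per agent).
    fn open(&mut self, session_id: &str, tech_id: i32) -> Result<(), ModeError> {
        if matches!(self, AgentMode::Tunnel { .. }) {
            return Err(ModeError::AlreadyInTunnel);
        }
        *self = AgentMode::Tunnel {
            session_id: session_id.to_string(),
            tech_id,
            channels: HashMap::new(),
        };
        Ok(())
    }

    /// Register a logical channel; only valid in tunnel mode.
    fn add_channel(&mut self, channel_id: &str, kind: ChannelType) -> Result<(), ModeError> {
        match self {
            AgentMode::Tunnel { channels, .. } => {
                channels.insert(channel_id.to_string(), kind);
                Ok(())
            }
            AgentMode::Heartbeat => Err(ModeError::NotInTunnel),
        }
    }

    /// TunnelClose: Tunnel -> Heartbeat. Returns how many channels were dropped.
    fn close(&mut self, session_id: &str) -> Result<usize, ModeError> {
        let open_channels = match self {
            AgentMode::Tunnel { session_id: active, channels, .. }
                if active.as_str() == session_id => channels.len(),
            AgentMode::Tunnel { .. } => return Err(ModeError::SessionMismatch),
            AgentMode::Heartbeat => return Err(ModeError::NotInTunnel),
        };
        *self = AgentMode::Heartbeat;
        Ok(open_channels)
    }
}
```

Starting from `AgentMode::Heartbeat`, `open("s1", 7)` succeeds, a second `open` fails with `AlreadyInTunnel`, and `close("s1")` returns the agent to heartbeat mode. Rejecting a close with the wrong session_id is an assumption; the plan does not say how mismatched closes should be handled.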
Implementation Phases
Phase 1: Core Tunnel Infrastructure (Week 1)
Goal: Establish tunnel mode switching and channel routing
Server:
- Add TunnelOpen/TunnelClose/TunnelData to ServerMessage enum
- Create tech_sessions table (id, session_id, tech_id, agent_id, opened_at, last_activity, status)
- Implement endpoints: POST /api/v1/tunnel/open, POST /api/v1/tunnel/close, GET /api/v1/tunnel/status/:session_id
- Add channel routing in WebSocket handler (route by channel_id)
- Session validation middleware (JWT + ownership check)
Agent:
- Add TunnelReady/TunnelData/TunnelError to AgentMessage enum
- Implement AgentMode state machine
- Add channel manager (HashMap<channel_id, ChannelHandler>)
- Handle TunnelOpen → respond TunnelReady
- Handle TunnelClose → cleanup channels, return to heartbeat mode
Critical Files:
- server/src/ws/mod.rs - WebSocket handler, protocol definitions
- server/src/routes/tunnel.rs - NEW: Tunnel API endpoints
- server/src/middleware/auth.rs - Session validation
- agent/src/transport/websocket.rs - WebSocket client, protocol handling
- agent/src/tunnel/mod.rs - NEW: Tunnel mode manager
- migrations/XXX_create_tech_sessions.sql - NEW: Database schema
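The channel manager from the agent task list (a map from channel_id to handler) might look roughly like the following. It is a sketch under assumptions: the `ChannelHandler` trait, the `EchoChannel` test handler, and the `Reply` enum are hypothetical stand-ins, and payloads are plain strings rather than `TunnelDataPayload`.

```rust
use std::collections::HashMap;

/// Hypothetical handler interface; the real one would take TunnelDataPayload.
trait ChannelHandler {
    fn handle(&mut self, payload: &str) -> Result<String, String>;
}

/// Trivial handler used only to exercise the routing logic.
struct EchoChannel;

impl ChannelHandler for EchoChannel {
    fn handle(&mut self, payload: &str) -> Result<String, String> {
        Ok(format!("echo: {}", payload))
    }
}

/// Stand-in for the AgentMessage::TunnelData / TunnelError replies.
#[derive(Debug, PartialEq)]
enum Reply {
    Data { channel_id: String, data: String },
    Error { channel_id: String, error: String },
}

#[derive(Default)]
struct ChannelManager {
    channels: HashMap<String, Box<dyn ChannelHandler>>,
}

impl ChannelManager {
    fn register(&mut self, channel_id: &str, handler: Box<dyn ChannelHandler>) {
        self.channels.insert(channel_id.to_string(), handler);
    }

    /// Route one message to the handler that owns `channel_id`.
    fn route(&mut self, channel_id: &str, payload: &str) -> Reply {
        match self.channels.get_mut(channel_id) {
            Some(handler) => match handler.handle(payload) {
                Ok(data) => Reply::Data { channel_id: channel_id.to_string(), data },
                Err(error) => Reply::Error { channel_id: channel_id.to_string(), error },
            },
            None => Reply::Error {
                channel_id: channel_id.to_string(),
                error: format!("unknown channel {}", channel_id),
            },
        }
    }

    /// Called on TunnelClose: drop every handler, report how many were open.
    fn close_all(&mut self) -> usize {
        let open = self.channels.len();
        self.channels.clear();
        open
    }
}
```

Answering an unknown channel_id with an error instead of silently dropping the message keeps misrouted traffic visible to the server, which matches the plan's `TunnelError { channel_id, error }` message.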
Phase 2: Terminal Channel (Week 2)
Goal: Execute PowerShell/cmd/bash commands through tunnel
Implementation:
- Create TerminalChannel handler on agent (spawn child process, capture streams)
- Implement TunnelDataPayload::Terminal on server
- Working directory validation on agent (configurable allowlist)
- Command result streaming for long-running commands
- Endpoint: POST /api/v1/tunnel/:session_id/command
Critical Files:
- agent/src/tunnel/terminal.rs - NEW: Terminal channel handler
- server/src/routes/tunnel.rs - Add command execution endpoint
- agent/config.toml - Add allowed_paths configuration
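As a rough picture of what the terminal handler does, the sketch below spawns a child process and collects its streams into a `TerminalOutput`-shaped struct. It uses the blocking `std::process::Command` as a stand-in for the `tokio::process::Command` the plan calls for, and it buffers all output, so it does not cover the streaming requirement for long-running commands. The `run_capture` name and signature are hypothetical.

```rust
use std::io;
use std::process::{Command, Stdio};

/// Mirrors TunnelDataPayload::TerminalOutput from the protocol section.
#[derive(Debug, PartialEq)]
struct TerminalOutput {
    stdout: String,
    stderr: String,
    exit_code: Option<i32>,
}

/// Run `program` with `args`, optionally in `working_dir`, and capture the result.
/// Arguments are passed as a vector, so no extra shell parses them.
fn run_capture(
    program: &str,
    args: &[&str],
    working_dir: Option<&str>,
) -> io::Result<TerminalOutput> {
    let mut cmd = Command::new(program);
    cmd.args(args).stdin(Stdio::null());
    if let Some(dir) = working_dir {
        cmd.current_dir(dir);
    }
    // output() waits for the child and captures stdout and stderr.
    let out = cmd.output()?;
    Ok(TerminalOutput {
        stdout: String::from_utf8_lossy(&out.stdout).into_owned(),
        stderr: String::from_utf8_lossy(&out.stderr).into_owned(),
        // None when the process was terminated by a signal (Unix).
        exit_code: out.status.code(),
    })
}
```

For example, on a Unix host `run_capture("sh", &["-c", "echo hello"], None)` yields the command's stdout and an exit code of 0. The working directory would be checked against the allowlist before this function is called.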
Phase 3: File Operations (Week 3)
Goal: Read, write, list files through tunnel
Implementation:
- Create FileChannel handler on agent
- Chunked transfer for files > 1MB (transfer_id tracking)
- Base64 encoding for binary data
- MIME type detection (magic numbers)
- Endpoints (under /api/v1/tunnel/:session_id): GET /file, PUT /file, POST /file/list
Critical Files:
- agent/src/tunnel/file.rs - NEW: File channel handler
- server/src/routes/tunnel.rs - Add file operation endpoints
- common/src/transfer.rs - NEW: Chunked transfer utilities
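One possible shape for the chunked-transfer utilities is sketched below. The `Chunk` struct, its `index` and `total` fields, and both function names are assumptions; the plan only specifies chunking above 1MB with transfer_id tracking. Base64 encoding of each chunk's bytes is left out.

```rust
/// One piece of a larger transfer. Field names other than transfer_id are hypothetical.
#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    transfer_id: String,
    index: usize,
    total: usize,
    data: Vec<u8>,
}

/// 1 MiB, taken here as the plan's "1MB" threshold.
const CHUNK_SIZE: usize = 1024 * 1024;

/// Split `content` into chunks of at most `chunk_size` bytes.
fn split_into_chunks(transfer_id: &str, content: &[u8], chunk_size: usize) -> Vec<Chunk> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    let total = (content.len() + chunk_size - 1) / chunk_size;
    content
        .chunks(chunk_size)
        .enumerate()
        .map(|(index, data)| Chunk {
            transfer_id: transfer_id.to_string(),
            index,
            total,
            data: data.to_vec(),
        })
        .collect()
}

/// Reassemble chunks that may have arrived out of order.
/// Returns None if a chunk is missing, duplicated, or belongs to another transfer.
fn reassemble(mut chunks: Vec<Chunk>) -> Option<Vec<u8>> {
    chunks.sort_by_key(|c| c.index);
    let total = chunks.first()?.total;
    let transfer_id = chunks.first()?.transfer_id.clone();
    if chunks.len() != total {
        return None;
    }
    let mut out = Vec::new();
    for (expected, chunk) in chunks.into_iter().enumerate() {
        if chunk.index != expected || chunk.total != total || chunk.transfer_id != transfer_id {
            return None;
        }
        out.extend_from_slice(&chunk.data);
    }
    Some(out)
}
```

Carrying `total` in every chunk lets the receiver detect a missing piece without a separate header message, at the cost of a few bytes per chunk.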
Phase 4: MCP Server Integration (Week 4)
Goal: Expose tunnel operations as MCP tools for Claude Code
Implementation:
- Create new project: gururmm-mcp-server (Rust)
- Use mcp-server-rs crate
- Implement 5 core tools (run_command, read_file, write_file, list_dir, list_agents)
- JWT token from environment variable (GURURMM_AUTH_TOKEN)
- Auto-manage tunnel sessions (open on first tool use, 5min idle timeout)
Critical Files:
- mcp-server/src/main.rs - NEW: MCP server entry point
- mcp-server/src/tools.rs - NEW: Tool implementations
- mcp-server/src/session.rs - NEW: Session manager
- mcp-server/Cargo.toml - NEW: Dependencies
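The auto-managed sessions (open on first tool use, close after 5 minutes idle) could be tracked as in the sketch below. The MCP tools take an agent_id while the REST endpoints take a session_id, so the manager keeps one session per agent. `SessionManager`, `session_for`, and `reap_idle` are hypothetical names, and the `open` closure stands in for the call to POST /api/v1/tunnel/open.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct Session {
    session_id: String,
    last_used: Instant,
}

/// Hypothetical MCP-side cache: one tunnel session per agent_id.
struct SessionManager {
    idle_timeout: Duration,
    sessions: HashMap<i32, Session>,
}

impl SessionManager {
    fn new(idle_timeout: Duration) -> Self {
        SessionManager { idle_timeout, sessions: HashMap::new() }
    }

    /// Return the session for `agent_id`, calling `open` only on first use.
    /// Every call refreshes the idle timer.
    fn session_for<F>(&mut self, agent_id: i32, now: Instant, open: F) -> Result<String, String>
    where
        F: FnOnce(i32) -> Result<String, String>,
    {
        if let Some(session) = self.sessions.get_mut(&agent_id) {
            session.last_used = now;
            return Ok(session.session_id.clone());
        }
        let session_id = open(agent_id)?;
        self.sessions.insert(
            agent_id,
            Session { session_id: session_id.clone(), last_used: now },
        );
        Ok(session_id)
    }

    /// Forget sessions idle for at least `idle_timeout`; the caller closes the returned ids.
    fn reap_idle(&mut self, now: Instant) -> Vec<String> {
        let timeout = self.idle_timeout;
        let mut to_close = Vec::new();
        self.sessions.retain(|_, session| {
            let idle = now.saturating_duration_since(session.last_used) >= timeout;
            if idle {
                to_close.push(session.session_id.clone());
            }
            !idle
        });
        to_close
    }
}
```

Passing `now` in explicitly keeps the timeout logic testable without sleeping; a real server would call `reap_idle(Instant::now())` from a periodic task and issue POST /api/v1/tunnel/close for each returned id.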
MCP Config Example:
{
"mcpServers": {
"gururmm": {
"command": "gururmm-mcp-server",
"env": {
"GURURMM_API_URL": "http://172.16.3.30:3001",
"GURURMM_AUTH_TOKEN": "jwt-token-here"
}
}
}
}
Phase 5: Advanced Features (Week 5+)
- Registry operations (Windows winreg crate)
- Service management (sc.exe/WMI on Windows, systemctl on Linux)
- Interactive terminal with PTY (stretch goal)
Database Schema
CREATE TABLE tech_sessions (
id SERIAL PRIMARY KEY,
session_id VARCHAR(36) UNIQUE NOT NULL,
tech_id INTEGER NOT NULL REFERENCES techs(id),
agent_id INTEGER NOT NULL REFERENCES agents(id),
opened_at TIMESTAMP NOT NULL DEFAULT NOW(),
last_activity TIMESTAMP NOT NULL DEFAULT NOW(),
closed_at TIMESTAMP,
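status VARCHAR(20) NOT NULL DEFAULT 'active'
);
-- PostgreSQL does not accept a WHERE clause on a table-level UNIQUE constraint;
-- a partial unique index enforces at most one active session per (tech_id, agent_id).
CREATE UNIQUE INDEX idx_tech_sessions_active ON tech_sessions(tech_id, agent_id) WHERE status = 'active';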
CREATE TABLE tunnel_audit (
id SERIAL PRIMARY KEY,
session_id VARCHAR(36) NOT NULL REFERENCES tech_sessions(session_id),
channel_id VARCHAR(36) NOT NULL,
operation VARCHAR(50) NOT NULL,
details JSONB,
created_at TIMESTAMP NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_tech_sessions_tech ON tech_sessions(tech_id);
CREATE INDEX idx_tech_sessions_agent ON tech_sessions(agent_id);
CREATE INDEX idx_tunnel_audit_session ON tunnel_audit(session_id);
API Endpoints (New)
POST /api/v1/tunnel/open
Body: { "agent_id": 123 }
Response: { "session_id": "uuid", "status": "active" }
POST /api/v1/tunnel/close
Body: { "session_id": "uuid" }
GET /api/v1/tunnel/status/:session_id
POST /api/v1/tunnel/:session_id/command
Body: { "command": "...", "shell": "powershell", "working_dir": "...", "timeout": 30000 }
GET /api/v1/tunnel/:session_id/file?path=...
PUT /api/v1/tunnel/:session_id/file?path=...
POST /api/v1/tunnel/:session_id/file/list?path=...
MCP Tools
gururmm_run_command(agent_id, command, shell, working_dir, timeout)
gururmm_read_file(agent_id, path)
gururmm_write_file(agent_id, path, content)
gururmm_list_directory(agent_id, path)
gururmm_list_agents()
Security Implementation
Working Directory Validation
# agent/config.toml
[security]
allowed_paths = ["C:\\Shares", "C:\\Temp"]
The agent validates all file operations against the allowlist and rejects any path containing a traversal segment (..).
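A purely lexical version of this check is sketched below. It is an illustration under assumptions: `is_path_allowed` and `segments` are hypothetical names, comparison is case-insensitive because the configured roots are Windows paths, and both `\` and `/` are accepted as separators. It does not resolve symlinks or junctions, so a production check would likely also canonicalize the path on the agent.

```rust
/// Split a path on either separator, dropping empty and "." segments.
/// Segments are lowercased because Windows paths compare case-insensitively.
fn segments(path: &str) -> Vec<String> {
    path.split(|c: char| c == '/' || c == '\\')
        .filter(|s| !s.is_empty() && *s != ".")
        .map(|s| s.to_ascii_lowercase())
        .collect()
}

/// True if `path` has no ".." segment and sits at or below one of `allowed_roots`.
fn is_path_allowed(path: &str, allowed_roots: &[&str]) -> bool {
    let path_segs = segments(path);
    // Reject traversal outright instead of trying to normalize it away.
    if path_segs.iter().any(|s| s == "..") {
        return false;
    }
    allowed_roots.iter().any(|root| {
        let root_segs = segments(root);
        // Compare whole segments so "C:\SharesEvil" does not match "C:\Shares".
        !root_segs.is_empty()
            && path_segs.len() >= root_segs.len()
            && path_segs[..root_segs.len()] == root_segs[..]
    })
}
```

With the roots from the config above, `C:\Shares\docs\report.txt` is accepted, while `C:\Shares\..\Windows\win.ini` and `C:\SharesEvil\x` are rejected. Matching on whole segments rather than a string prefix is what blocks the second case.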
Rate Limiting
- Server enforces: 100 commands per minute per tech per agent
- Sliding window (in-memory or Redis)
- 429 response on limit exceeded
- Violations logged to tunnel_audit
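An in-memory sliding window keyed by (tech_id, agent_id) could be as small as the sketch below. `RateLimiter` and `check` are hypothetical names; the plan leaves the choice between in-memory state and Redis open, and this covers only the in-memory case.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Sliding-window limiter: at most `limit` accepted calls per `window`
/// for each (tech_id, agent_id) pair.
struct RateLimiter {
    limit: usize,
    window: Duration,
    hits: HashMap<(i32, i32), VecDeque<Instant>>,
}

impl RateLimiter {
    fn new(limit: usize, window: Duration) -> Self {
        RateLimiter { limit, window, hits: HashMap::new() }
    }

    /// Returns true if the command may run; false maps to an HTTP 429.
    fn check(&mut self, tech_id: i32, agent_id: i32, now: Instant) -> bool {
        let queue = self.hits.entry((tech_id, agent_id)).or_default();
        // Drop timestamps that have slid out of the window.
        while let Some(&oldest) = queue.front() {
            if now.saturating_duration_since(oldest) >= self.window {
                queue.pop_front();
            } else {
                break;
            }
        }
        if queue.len() < self.limit {
            queue.push_back(now);
            true
        } else {
            false
        }
    }
}
```

The plan's setting would be `RateLimiter::new(100, Duration::from_secs(60))`. Rejected calls are not recorded in the window, so a client that keeps hammering after a 429 does not extend its own lockout; whether that is the desired policy is a design choice the plan does not settle.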
Command Injection Prevention
- tokio::process::Command (no shell expansion)
- PowerShell: -NoProfile -NonInteractive -Command
- Input sanitization (escape quotes, reject backticks)
- Timeout enforcement
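The argument-vector approach can be illustrated as follows. The sketch only builds the program name and arguments that would be handed to `tokio::process::Command`; it does not spawn anything. `Shell`, `Invocation`, and `build_invocation` are hypothetical names, and the `cmd /C` and `bash -c` forms are assumptions, since the plan spells out flags for PowerShell only.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Shell {
    PowerShell,
    Cmd,
    Bash,
}

/// Program plus argument vector, ready for Command::new(program).args(args).
#[derive(Debug, PartialEq)]
struct Invocation {
    program: &'static str,
    args: Vec<String>,
}

fn to_args(parts: &[&str]) -> Vec<String> {
    parts.iter().map(|p| p.to_string()).collect()
}

/// Validate `command` and wrap it for the requested shell.
/// The command travels as a single argv entry, so the agent never
/// concatenates it into a larger command line of its own.
fn build_invocation(shell: Shell, command: &str) -> Result<Invocation, String> {
    if command.trim().is_empty() {
        return Err("empty command".to_string());
    }
    if command.contains('\0') {
        return Err("NUL byte in command".to_string());
    }
    // The plan calls for rejecting backticks (PowerShell escape character,
    // command substitution in bash).
    if command.contains('`') {
        return Err("backticks are not allowed".to_string());
    }
    let invocation = match shell {
        Shell::PowerShell => Invocation {
            program: "powershell",
            args: to_args(&["-NoProfile", "-NonInteractive", "-Command", command]),
        },
        Shell::Cmd => Invocation { program: "cmd", args: to_args(&["/C", command]) },
        Shell::Bash => Invocation { program: "bash", args: to_args(&["-c", command]) },
    };
    Ok(invocation)
}
```

Note that the target shell still interprets the command string itself, so this removes one layer of parsing, not all of it. The path allowlist, rate limit, and audit log remain the controls that bound what an authorized tech can run.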
Session Security
- JWT 24h expiration
- Sessions auto-expire after 4h of inactivity
- One tunnel per agent (prevents concurrent session conflicts)
- Admin force-close endpoint
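The one-tunnel-per-agent rule and the inactivity timeout interact: an expired session should not block a new one. The sketch below shows that logic in memory. `TunnelRegistry`, `try_open`, and `authorize` are hypothetical names, and in the actual design this state lives in the tech_sessions table.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum OpenError {
    /// Another tech holds a live tunnel to this agent.
    AgentBusy { held_by_tech: i32 },
}

struct ActiveSession {
    tech_id: i32,
    last_activity: Instant,
}

/// Hypothetical in-memory view of active tunnels, keyed by agent_id.
struct TunnelRegistry {
    inactivity_timeout: Duration,
    by_agent: HashMap<i32, ActiveSession>,
}

impl TunnelRegistry {
    fn new(inactivity_timeout: Duration) -> Self {
        TunnelRegistry { inactivity_timeout, by_agent: HashMap::new() }
    }

    /// Open a tunnel unless a non-expired one already exists for the agent.
    fn try_open(&mut self, agent_id: i32, tech_id: i32, now: Instant) -> Result<(), OpenError> {
        if let Some(existing) = self.by_agent.get(&agent_id) {
            let idle = now.saturating_duration_since(existing.last_activity);
            if idle < self.inactivity_timeout {
                return Err(OpenError::AgentBusy { held_by_tech: existing.tech_id });
            }
        }
        // No session, or the old one expired: the new session replaces it.
        self.by_agent.insert(agent_id, ActiveSession { tech_id, last_activity: now });
        Ok(())
    }

    /// Ownership and expiry check for each request; refreshes last_activity on success.
    fn authorize(&mut self, agent_id: i32, tech_id: i32, now: Instant) -> bool {
        let timeout = self.inactivity_timeout;
        match self.by_agent.get_mut(&agent_id) {
            Some(s)
                if s.tech_id == tech_id
                    && now.saturating_duration_since(s.last_activity) < timeout =>
            {
                s.last_activity = now;
                true
            }
            _ => false,
        }
    }
}
```

With a 4h timeout (`Duration::from_secs(4 * 3600)`), a second tech is refused while the first is active and can take over once the first session has been idle for 4 hours.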
Testing Strategy
Unit Tests
- Channel routing (correct channel receives message)
- Session validation (JWT + ownership)
- Command sanitization
- Path validation (traversal prevention)
Integration Tests
- Full tunnel lifecycle (open → command → close)
- Concurrent sessions to different agents
- Session timeout enforcement
- Rate limiting
End-to-End Tests
- Claude Code MCP integration
- File upload via MCP, verify on agent
- Multi-step workflow (read file → modify → write back)
Rollout Plan
- Week 5: Internal testing (2 agents: AD2, DESKTOP-0O8A1RL)
- Week 6: Beta release (3 power user techs)
- Week 7: General availability (all techs, documentation, training)
Success Metrics
Infrastructure (Phase 1-2):
- 95% tunnel open success rate
- <500ms command response time
- Zero session conflicts
MCP Integration (Phase 3-4):
- 80% tech adoption within 2 weeks
- 50 tunnel sessions/day
- <5% command error rate
Long-term:
- 20% reduction in RDP sessions
- 90% tech satisfaction
- <1% security incidents
Risks and Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| Command injection | Critical | Input sanitization, no shell expansion, path allowlist |
| Session hijacking | High | Short-lived JWT, session ownership validation, audit logging |
| WebSocket instability | Medium | Auto-reconnect, session recovery |
| Rate limiting too strict | Medium | Configurable per-tech limits, user feedback |
Open Questions
- Registry operations scope (full access or specific hives only)?
- Interactive terminal priority (defer to Phase 6)?
- Multi-tech sessions for pair programming?
- MCP server credential manager integration (1Password)?
- Agent-side logging requirements (compliance)?
Verification Plan
Phase 1 Verification
# Tech opens tunnel session
curl -X POST http://172.16.3.30:3001/api/v1/tunnel/open \
-H "Authorization: Bearer $JWT" \
-H "Content-Type: application/json" \
-d '{"agent_id": 1}'
# Response: {"session_id": "uuid", "status": "active"}
# Check agent logs - should show: "Tunnel mode activated for session uuid"
# Check database: SELECT * FROM tech_sessions WHERE session_id = 'uuid';
Phase 2 Verification
# Execute command via tunnel
curl -X POST http://172.16.3.30:3001/api/v1/tunnel/$SESSION_ID/command \
-H "Authorization: Bearer $JWT" \
-H "Content-Type: application/json" \
-d '{"command": "Get-Date", "shell": "powershell"}'
# Response: {"stdout": "Sunday, April 13, 2026...", "exit_code": 0}
Phase 4 Verification (MCP)
# Configure MCP server in Claude Code
# Test tools appear in Claude's tool list
# Execute: "List files in C:\Shares on agent ID 1"
# Claude should call gururmm_list_directory tool
# Verify output shows directory listing
Next Steps After Approval
- Create feature branch: feature/real-time-tunnel
- Phase 1 database migrations (tech_sessions, tunnel_audit tables)
- Update protocol enums (ServerMessage/AgentMessage)
- Implement tunnel open/close endpoints
- Update agent WebSocket handler for tunnel mode
- Unit tests for session validation
- Deploy to test environment
Estimated Timeline: 5 weeks to MCP integration, 7 weeks to GA
Detailed plan location: projects/msp-tools/guru-rmm/plans/real-time-tunnel-architecture.md