azcomputerguru 9940faf34a Add GuruRMM real-time tunnel architecture and planning

Comprehensive design for transforming agents from 30s heartbeat mode to
persistent tunnel mode, enabling Claude Code to execute commands on remote
machines through secure multiplexed WebSocket channels.

Additions:
- Complete implementation plan with 5-phase roadmap (5-7 weeks to GA)
- Detailed architecture document covering protocol, security, and MCP integration
- Database migration for tech_sessions and tunnel_audit tables

Key architectural decisions:
- Hybrid lifecycle: WebSocket persistent, tunnel is operational state
- Channel multiplexing over single WebSocket (terminal, file ops, etc.)
- Three-layer security: JWT auth, session authorization, command validation
- Custom MCP server for Claude Code integration

Next: Phase 1 implementation (tunnel open/close endpoints, agent mode state machine)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-04-14 06:32:16 -07:00

GuruRMM Real-Time Tunnel Implementation Plan

Overview

Transform GuruRMM agents from periodic check-in mode (30-second heartbeats) to persistent tunnel mode, so that Claude Code running on a technician's workstation can execute commands on remote machines through secure, multiplexed channels.

Architecture Summary

Current State (Confirmed via exploration)

Server: Axum 0.7 @ 172.16.3.30:3001, WebSocket endpoint, AgentConnections HashMap
Agent: Tokio async, 30-second heartbeat confirmed, 3 concurrent tasks (metrics/network/heartbeat)
Protocol: Tagged JSON enums (ServerMessage/AgentMessage) with serde

Key Architectural Decisions

Tunnel Lifecycle: Hybrid - WebSocket stays persistent, tunnel mode is operational state change
- Agent modes: Heartbeat (default) ↔ Tunnel (active session)
- One tunnel per agent, on-demand activation, instant mode switching
Channel Multiplexing: Unified protocol with channel_id routing
- Single WebSocket, multiple logical channels
- Enables concurrent operations (multiple terminals, simultaneous file transfers)
- Channel types: Terminal, FileRead, FileWrite, FileList, Registry, Services
Claude Integration: Custom MCP server
- Tools: gururmm_run_command, gururmm_read_file, gururmm_write_file, gururmm_list_directory, gururmm_list_agents
- JWT authentication via environment variable
- Auto-manages tunnel sessions (open on first use, keep-alive, close on idle)
Security: Three-layer model
- Layer 1: JWT authentication (24h expiration)
- Layer 2: Session authorization (tech_sessions table, 4h inactivity timeout)
- Layer 3: Command validation (working directory allowlist, rate limiting 100/min, audit logging)

Protocol Extensions

New Message Types

// Server → Agent
enum ServerMessage {
    // ... existing ...
    TunnelOpen { session_id: String, tech_id: i32 },
    TunnelClose { session_id: String },
    TunnelData { channel_id: String, data: TunnelDataPayload },
}

// Agent → Server
enum AgentMessage {
    // ... existing ...
    TunnelReady { session_id: String },
    TunnelData { channel_id: String, data: TunnelDataPayload },
    TunnelError { channel_id: String, error: String },
}

enum TunnelDataPayload {
    Terminal { command: String },
    TerminalOutput { stdout: String, stderr: String, exit_code: Option<i32> },
    FileRead { path: String },
    FileContent { content: Vec<u8>, mime_type: String },
    FileWrite { path: String, content: Vec<u8> },
    FileList { path: String },
    FileListResult { entries: Vec<FileEntry> },
}

Agent Mode State Machine

enum AgentMode {
    Heartbeat,  // Default: 30s heartbeats, metrics, network monitoring
    Tunnel {
        session_id: String,
        tech_id: i32,
        channels: HashMap<String, ChannelType>,
    },
}
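
The transitions implied by this enum can be sketched as below. This is a minimal illustration, not the planned implementation: the open and close methods, the ModeError type, and the reduced ChannelType enum are assumptions introduced here, and the only rules encoded are the ones stated in this plan (one tunnel per agent, TunnelClose returns the agent to heartbeat mode).

```rust
use std::collections::HashMap;

/// Hypothetical subset of the channel types listed in the plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChannelType {
    Terminal,
    FileRead,
    FileWrite,
    FileList,
}

#[derive(Debug, PartialEq)]
pub enum AgentMode {
    Heartbeat,
    Tunnel {
        session_id: String,
        tech_id: i32,
        channels: HashMap<String, ChannelType>,
    },
}

/// Hypothetical error type for rejected transitions.
#[derive(Debug, PartialEq)]
pub enum ModeError {
    /// A tunnel is already active (one tunnel per agent).
    TunnelAlreadyOpen,
    /// The close request does not name the active session.
    SessionMismatch,
}

impl AgentMode {
    /// Handle TunnelOpen: only valid while in Heartbeat mode.
    pub fn open(&mut self, session_id: &str, tech_id: i32) -> Result<(), ModeError> {
        match self {
            AgentMode::Heartbeat => {
                *self = AgentMode::Tunnel {
                    session_id: session_id.to_string(),
                    tech_id,
                    channels: HashMap::new(),
                };
                Ok(())
            }
            AgentMode::Tunnel { .. } => Err(ModeError::TunnelAlreadyOpen),
        }
    }

    /// Handle TunnelClose: drop all channels and return to Heartbeat mode.
    pub fn close(&mut self, session_id: &str) -> Result<(), ModeError> {
        let is_active = matches!(
            &*self,
            AgentMode::Tunnel { session_id: active, .. } if active.as_str() == session_id
        );
        if is_active {
            *self = AgentMode::Heartbeat;
            Ok(())
        } else {
            Err(ModeError::SessionMismatch)
        }
    }
}
```

Replacing the whole enum value on each transition means the channel map is dropped together with the tunnel, so no channel state can leak into the next session.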

Implementation Phases

Phase 1: Core Tunnel Infrastructure (Week 1)

Goal: Establish tunnel mode switching and channel routing

Server:

Add TunnelOpen/TunnelClose/TunnelData to ServerMessage enum
Create tech_sessions table (id, session_id, tech_id, agent_id, opened_at, last_activity, status)
Implement endpoints: POST /api/v1/tunnel/open, POST /api/v1/tunnel/close, GET /api/v1/tunnel/status/:session_id
Add channel routing in WebSocket handler (route by channel_id)
Session validation middleware (JWT + ownership check)

Agent:

Add TunnelReady/TunnelData/TunnelError to AgentMessage enum
Implement AgentMode state machine
Add channel manager (HashMap<channel_id, ChannelHandler>)
Handle TunnelOpen → respond TunnelReady
Handle TunnelClose → cleanup channels, return to heartbeat mode
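
One possible shape for the agent-side channel manager is sketched below. The ChannelHandler trait, the EchoTerminal handler, and the reduced Payload enum (a stand-in for TunnelDataPayload) are hypothetical names used only for illustration. The sketch shows routing by channel_id and how an unknown channel could surface as a TunnelError-style result.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the plan's TunnelDataPayload (two variants only).
#[derive(Debug, Clone, PartialEq)]
pub enum Payload {
    Terminal { command: String },
    FileRead { path: String },
}

/// Hypothetical handler trait: one implementation per channel type.
pub trait ChannelHandler {
    fn handle(&mut self, payload: Payload) -> Result<String, String>;
}

/// Toy terminal handler that only reports what it would run.
pub struct EchoTerminal;

impl ChannelHandler for EchoTerminal {
    fn handle(&mut self, payload: Payload) -> Result<String, String> {
        match payload {
            Payload::Terminal { command } => Ok(format!("would run: {}", command)),
            other => Err(format!("terminal channel cannot handle {:?}", other)),
        }
    }
}

#[derive(Default)]
pub struct ChannelManager {
    channels: HashMap<String, Box<dyn ChannelHandler>>,
}

impl ChannelManager {
    /// Register a handler under a channel_id.
    pub fn open(&mut self, channel_id: &str, handler: Box<dyn ChannelHandler>) {
        self.channels.insert(channel_id.to_string(), handler);
    }

    /// Route a TunnelData message to the handler that owns its channel_id.
    /// An unknown channel yields an Err, which would map to TunnelError.
    pub fn route(&mut self, channel_id: &str, payload: Payload) -> Result<String, String> {
        match self.channels.get_mut(channel_id) {
            Some(handler) => handler.handle(payload),
            None => Err(format!("unknown channel: {}", channel_id)),
        }
    }

    /// Called on TunnelClose: drop every channel handler.
    pub fn close_all(&mut self) {
        self.channels.clear();
    }
}
```

Boxed trait objects keep the map homogeneous while letting terminal and file channels carry different state.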

Critical Files:

server/src/ws/mod.rs - WebSocket handler, protocol definitions
server/src/routes/tunnel.rs - NEW: Tunnel API endpoints
server/src/middleware/auth.rs - Session validation
agent/src/transport/websocket.rs - WebSocket client, protocol handling
agent/src/tunnel/mod.rs - NEW: Tunnel mode manager
migrations/XXX_create_tech_sessions.sql - NEW: Database schema

Phase 2: Terminal Channel (Week 2)

Goal: Execute PowerShell/cmd/bash commands through tunnel

Implementation:

Create TerminalChannel handler on agent (spawn child process, capture streams)
Implement TunnelDataPayload::Terminal on server
Working directory validation on agent (configurable allowlist)
Command result streaming for long-running commands
Endpoint: POST /api/v1/tunnel/:session_id/command

Critical Files:

agent/src/tunnel/terminal.rs - NEW: Terminal channel handler
server/src/routes/tunnel.rs - Add command execution endpoint
agent/config.toml - Add allowed_paths configuration

Phase 3: File Operations (Week 3)

Goal: Read, write, list files through tunnel

Implementation:

Create FileChannel handler on agent
Chunked transfer for files > 1MB (transfer_id tracking)
Base64 encoding for binary data
MIME type detection (magic numbers)
Endpoints (under /api/v1/tunnel/:session_id): GET /file, PUT /file, POST /file/list
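
The chunked transfer with transfer_id tracking could look roughly like the sketch below. The Chunk struct, the split and reassemble functions, and the exact chunk size are assumptions (the plan only says that files over 1MB are chunked). Base64 encoding and MIME detection are left out.

```rust
/// Chunk size assumed for this sketch.
pub const CHUNK_SIZE: usize = 1024 * 1024;

/// Hypothetical wire unit for one piece of a transfer.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub transfer_id: String,
    pub index: usize,
    pub total: usize,
    pub data: Vec<u8>,
}

/// Split `content` into ordered chunks of at most `chunk_size` bytes.
/// An empty file still yields one empty chunk so the receiver sees the transfer.
pub fn split(transfer_id: &str, content: &[u8], chunk_size: usize) -> Vec<Chunk> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    if content.is_empty() {
        return vec![Chunk {
            transfer_id: transfer_id.to_string(),
            index: 0,
            total: 1,
            data: Vec::new(),
        }];
    }
    let total = (content.len() + chunk_size - 1) / chunk_size;
    content
        .chunks(chunk_size)
        .enumerate()
        .map(|(index, part)| Chunk {
            transfer_id: transfer_id.to_string(),
            index,
            total,
            data: part.to_vec(),
        })
        .collect()
}

/// Reassemble chunks that may have arrived out of order.
/// Returns None if chunks are missing, duplicated, or from different transfers.
pub fn reassemble(mut chunks: Vec<Chunk>) -> Option<Vec<u8>> {
    let first = chunks.first()?;
    let (id, total) = (first.transfer_id.clone(), first.total);
    if chunks.len() != total
        || chunks.iter().any(|c| c.transfer_id != id || c.total != total)
    {
        return None;
    }
    chunks.sort_by_key(|c| c.index);
    if chunks.iter().enumerate().any(|(i, c)| c.index != i) {
        return None;
    }
    Some(chunks.into_iter().flat_map(|c| c.data).collect())
}
```

Carrying both index and total in every chunk lets the receiver detect an incomplete transfer without a separate end-of-transfer message.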

Critical Files:

agent/src/tunnel/file.rs - NEW: File channel handler
server/src/routes/tunnel.rs - Add file operation endpoints
common/src/transfer.rs - NEW: Chunked transfer utilities

Phase 4: MCP Server Integration (Week 4)

Goal: Expose tunnel operations as MCP tools for Claude Code

Implementation:

Create new project: gururmm-mcp-server (Rust)
Use mcp-server-rs crate
Implement 5 core tools (run_command, read_file, write_file, list_dir, list_agents)
JWT token from environment variable (GURURMM_AUTH_TOKEN)
Auto-manage tunnel sessions (open on first tool use, 5min idle timeout)
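
The auto-managed session behaviour (open on first tool use, close after 5 minutes idle) might be structured as in the sketch below. SessionManager, session_for, and reap_idle are hypothetical names. The open_tunnel closure stands in for the call to POST /api/v1/tunnel/open, and timestamps are passed in explicitly so the logic can be tested without waiting.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Idle timeout taken from the plan (5 minutes).
pub const IDLE_TIMEOUT: Duration = Duration::from_secs(5 * 60);

struct Session {
    session_id: String,
    last_used: Instant,
}

/// Hypothetical session manager; `open_tunnel` abstracts the HTTP call
/// that opens a tunnel and returns its session_id.
pub struct SessionManager<F: FnMut(i32) -> String> {
    open_tunnel: F,
    sessions: HashMap<i32, Session>,
}

impl<F: FnMut(i32) -> String> SessionManager<F> {
    pub fn new(open_tunnel: F) -> Self {
        SessionManager { open_tunnel, sessions: HashMap::new() }
    }

    /// Return the session for `agent_id`, opening a tunnel on first use.
    /// Every call refreshes the idle timer.
    pub fn session_for(&mut self, agent_id: i32, now: Instant) -> String {
        if !self.sessions.contains_key(&agent_id) {
            let session_id = (self.open_tunnel)(agent_id);
            self.sessions.insert(agent_id, Session { session_id, last_used: now });
        }
        let session = self.sessions.get_mut(&agent_id).expect("inserted above");
        session.last_used = now;
        session.session_id.clone()
    }

    /// Remove sessions idle for at least IDLE_TIMEOUT and return their IDs,
    /// which the caller would then pass to POST /api/v1/tunnel/close.
    pub fn reap_idle(&mut self, now: Instant) -> Vec<String> {
        let mut closed = Vec::new();
        self.sessions.retain(|_, s| {
            let keep = now.duration_since(s.last_used) < IDLE_TIMEOUT;
            if !keep {
                closed.push(s.session_id.clone());
            }
            keep
        });
        closed
    }
}
```

In this sketch reap_idle is expected to run periodically (for example from a background timer); a tool call that arrives after a session has been reaped simply opens a new one.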

Critical Files:

mcp-server/src/main.rs - NEW: MCP server entry point
mcp-server/src/tools.rs - NEW: Tool implementations
mcp-server/src/session.rs - NEW: Session manager
mcp-server/Cargo.toml - NEW: Dependencies

MCP Config Example:

{
  "mcpServers": {
    "gururmm": {
      "command": "gururmm-mcp-server",
      "env": {
        "GURURMM_API_URL": "http://172.16.3.30:3001",
        "GURURMM_AUTH_TOKEN": "jwt-token-here"
      }
    }
  }
}

Phase 5: Advanced Features (Week 5+)

Registry operations (Windows winreg crate)
Service management (sc.exe/WMI on Windows, systemctl on Linux)
Interactive terminal with PTY (stretch goal)

Database Schema

CREATE TABLE tech_sessions (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(36) UNIQUE NOT NULL,
    tech_id INTEGER NOT NULL REFERENCES techs(id),
    agent_id INTEGER NOT NULL REFERENCES agents(id),
    opened_at TIMESTAMP NOT NULL DEFAULT NOW(),
    last_activity TIMESTAMP NOT NULL DEFAULT NOW(),
    closed_at TIMESTAMP,
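    status VARCHAR(20) NOT NULL DEFAULT 'active'
);

-- PostgreSQL does not accept a WHERE clause on an inline UNIQUE constraint,
-- so the "one active session per tech and agent" rule is a partial unique index.
-- Enforcing one tunnel per agent regardless of tech would index agent_id alone.
CREATE UNIQUE INDEX idx_tech_sessions_active
    ON tech_sessions(tech_id, agent_id)
    WHERE status = 'active';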

CREATE TABLE tunnel_audit (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(36) NOT NULL REFERENCES tech_sessions(session_id),
    channel_id VARCHAR(36) NOT NULL,
    operation VARCHAR(50) NOT NULL,
    details JSONB,
    created_at TIMESTAMP NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_tech_sessions_tech ON tech_sessions(tech_id);
CREATE INDEX idx_tech_sessions_agent ON tech_sessions(agent_id);
CREATE INDEX idx_tunnel_audit_session ON tunnel_audit(session_id);

API Endpoints (New)

POST   /api/v1/tunnel/open
       Body: { "agent_id": 123 }
       Response: { "session_id": "uuid", "status": "active" }

POST   /api/v1/tunnel/close
       Body: { "session_id": "uuid" }

GET    /api/v1/tunnel/status/:session_id

POST   /api/v1/tunnel/:session_id/command
       Body: { "command": "...", "shell": "powershell", "working_dir": "...", "timeout": 30000 }

GET    /api/v1/tunnel/:session_id/file?path=...

PUT    /api/v1/tunnel/:session_id/file?path=...

POST   /api/v1/tunnel/:session_id/file/list?path=...

MCP Tools

gururmm_run_command(agent_id, command, shell, working_dir, timeout)
gururmm_read_file(agent_id, path)
gururmm_write_file(agent_id, path, content)
gururmm_list_directory(agent_id, path)
gururmm_list_agents()

Security Implementation

Working Directory Validation

# agent/config.toml
[security]
allowed_paths = ["C:\\Shares", "C:\\Temp"]

The agent validates every file operation and working directory against the allowlist and rejects any path that contains a traversal component (..).
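
A purely lexical version of this check is sketched below. The function names are hypothetical, and the sketch makes several assumptions: paths are absolute, comparison is case-insensitive (as on Windows), and both / and \ count as separators. It does not resolve symlinks or junctions, so a real agent would probably also canonicalize the path before comparing.

```rust
/// Split a path on '/' and '\\', lower-casing each component.
/// Returns None if any component is "..". A leading empty component
/// (a rooted path such as "/tmp") is kept so relative paths never match it.
fn components(path: &str) -> Option<Vec<String>> {
    let mut parts = Vec::new();
    for (i, part) in path.split(|c: char| c == '/' || c == '\\').enumerate() {
        match part {
            ".." => return None,
            "." => continue,
            "" if i > 0 => continue,
            other => parts.push(other.to_lowercase()),
        }
    }
    Some(parts)
}

/// True if `path` lies inside one of `allowed_paths`, compared component by
/// component so that "C:\\SharesEvil" does not match the root "C:\\Shares".
pub fn is_allowed(path: &str, allowed_paths: &[&str]) -> bool {
    let requested = match components(path) {
        Some(parts) => parts,
        None => return false,
    };
    allowed_paths.iter().any(|root| match components(root) {
        Some(root_parts) => !root_parts.is_empty() && requested.starts_with(&root_parts),
        None => false,
    })
}
```

Comparing whole components rather than string prefixes is the important design choice here: a plain starts_with on the raw string would accept sibling directories that merely share a prefix with an allowed root.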

Rate Limiting

Server enforces: 100 commands per minute per tech per agent
Sliding window (in-memory or Redis)
429 response on limit exceeded
Violations logged to tunnel_audit
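
An in-memory sliding window for this limit could be sketched as follows. RateLimiter and check are hypothetical names, and the key is assumed to be the (tech_id, agent_id) pair. A Redis-backed variant and the audit-log write are not shown.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Hypothetical in-memory sliding-window limiter.
pub struct RateLimiter {
    limit: usize,
    window: Duration,
    /// Timestamps of accepted commands, keyed by (tech_id, agent_id).
    hits: HashMap<(i32, i32), VecDeque<Instant>>,
}

impl RateLimiter {
    pub fn new(limit: usize, window: Duration) -> Self {
        RateLimiter { limit, window, hits: HashMap::new() }
    }

    /// Returns true if the command is allowed. A false result would map to
    /// an HTTP 429 response and an entry in tunnel_audit.
    pub fn check(&mut self, tech_id: i32, agent_id: i32, now: Instant) -> bool {
        let hits = self.hits.entry((tech_id, agent_id)).or_default();
        // Drop timestamps that have slid out of the window.
        while let Some(&oldest) = hits.front() {
            if now.duration_since(oldest) >= self.window {
                hits.pop_front();
            } else {
                break;
            }
        }
        if hits.len() >= self.limit {
            return false;
        }
        hits.push_back(now);
        true
    }
}
```

With RateLimiter::new(100, Duration::from_secs(60)) this matches the stated limit. Rejected requests are not recorded, so a client that keeps retrying while throttled does not extend its own lockout.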

Command Injection Prevention

Spawn child processes with tokio::process::Command, passing arguments as a vector (no intermediate shell expansion)
PowerShell: invoke with -NoProfile -NonInteractive -Command
Input sanitization (escape quotes, reject backticks)
Timeout enforcement
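
The argument-vector approach can be illustrated with the sketch below. It uses std::process::Command as a stand-in because tokio::process::Command exposes the same builder methods. The Shell enum and build_command function are hypothetical, and the cmd.exe and bash flags are assumptions that the plan does not specify. Timeout enforcement is not shown.

```rust
use std::process::Command;

/// Hypothetical shell selector matching the "shell" field of the command API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Shell {
    PowerShell,
    Cmd,
    Bash,
}

/// Build (but do not spawn) the child process for a tunnel command.
/// The script is passed as a single argv element, so no intermediate
/// shell re-parses it before it reaches the target interpreter.
pub fn build_command(shell: Shell, script: &str, working_dir: &str) -> Command {
    let mut cmd = match shell {
        Shell::PowerShell => {
            let mut c = Command::new("powershell.exe");
            c.args(["-NoProfile", "-NonInteractive", "-Command", script]);
            c
        }
        Shell::Cmd => {
            let mut c = Command::new("cmd.exe");
            c.args(["/C", script]);
            c
        }
        Shell::Bash => {
            let mut c = Command::new("bash");
            c.args(["-c", script]);
            c
        }
    };
    // The working directory should already have passed the allowlist check.
    cmd.current_dir(working_dir);
    cmd
}
```

This only prevents injection into the launcher's own command line. The script is still interpreted by PowerShell, cmd, or bash, so the allowlist, rate limiting, and audit logging remain the main controls over what a command can do.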

Session Security

JWT 24h expiration
Sessions auto-expire after 4h of inactivity
One tunnel per agent (prevents concurrent session conflicts)
Admin force-close endpoint

Testing Strategy

Unit Tests

Channel routing (correct channel receives message)
Session validation (JWT + ownership)
Command sanitization
Path validation (traversal prevention)

Integration Tests

Full tunnel lifecycle (open → command → close)
Concurrent sessions to different agents
Session timeout enforcement
Rate limiting

End-to-End Tests

Claude Code MCP integration
File upload via MCP, verify on agent
Multi-step workflow (read file → modify → write back)

Rollout Plan

Week 5: Internal testing (2 agents: AD2, DESKTOP-0O8A1RL)
Week 6: Beta release (3 power user techs)
Week 7: General availability (all techs, documentation, training)

Success Metrics

Infrastructure (Phase 1-2):

95% tunnel open success rate
<500ms command response time
Zero session conflicts

MCP Integration (Phase 3-4):

80% tech adoption within 2 weeks
50 tunnel sessions/day
<5% command error rate

Long-term:

20% reduction in RDP sessions
90% tech satisfaction
<1% security incidents

Risks and Mitigations

Risk	Impact	Mitigation
Command injection	Critical	Input sanitization, no shell expansion, path allowlist
Session hijacking	High	Short-lived JWT, session ownership validation, audit logging
WebSocket instability	Medium	Auto-reconnect, session recovery
Rate limiting too strict	Medium	Configurable per-tech limits, user feedback

Open Questions

Registry operations scope (full access or specific hives only)?
Interactive terminal priority (defer to Phase 6)?
Multi-tech sessions for pair programming?
MCP server credential manager integration (1Password)?
Agent-side logging requirements (compliance)?

Verification Plan

Phase 1 Verification

# Tech opens tunnel session
curl -X POST http://172.16.3.30:3001/api/v1/tunnel/open \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{"agent_id": 1}'
# Response: {"session_id": "uuid", "status": "active"}

# Check agent logs - should show: "Tunnel mode activated for session uuid"
# Check database: SELECT * FROM tech_sessions WHERE session_id = 'uuid';

Phase 2 Verification

# Execute command via tunnel
curl -X POST http://172.16.3.30:3001/api/v1/tunnel/$SESSION_ID/command \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{"command": "Get-Date", "shell": "powershell"}'
# Response: {"stdout": "Monday, April 13, 2026...", "exit_code": 0}

Phase 4 Verification (MCP)

# Configure MCP server in Claude Code
# Test tools appear in Claude's tool list
# Execute: "List files in C:\Shares on agent ID 1"
# Claude should call gururmm_list_directory tool
# Verify output shows directory listing

Next Steps After Approval

Create feature branch: feature/real-time-tunnel
Phase 1 database migrations (tech_sessions, tunnel_audit tables)
Update protocol enums (ServerMessage/AgentMessage)
Implement tunnel open/close endpoints
Update agent WebSocket handler for tunnel mode
Unit tests for session validation
Deploy to test environment

Estimated Timeline: 5 weeks to MCP integration, 7 weeks to GA

Detailed plan location: projects/msp-tools/guru-rmm/plans/real-time-tunnel-architecture.md
