Comprehensive design for transforming agents from 30s heartbeat mode to persistent tunnel mode, enabling Claude Code to execute commands on remote machines through secure multiplexed WebSocket channels.

Additions:
- Complete implementation plan with 5-phase roadmap (5-7 weeks to GA)
- Detailed architecture document covering protocol, security, and MCP integration
- Database migration for tech_sessions and tunnel_audit tables

Key architectural decisions:
- Hybrid lifecycle: WebSocket persistent, tunnel is operational state
- Channel multiplexing over single WebSocket (terminal, file ops, etc.)
- Three-layer security: JWT auth, session authorization, command validation
- Custom MCP server for Claude Code integration

Next: Phase 1 implementation (tunnel open/close endpoints, agent mode state machine)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

# GuruRMM Real-Time Tunnel Implementation Plan

## Overview

Transform GuruRMM agents from periodic check-in mode (30-second heartbeats) to persistent tunnel mode, enabling Claude Code on a tech's workstation to execute commands on remote machines through secure multiplexed channels.

---

## Architecture Summary

### Current State (Confirmed via exploration)

- **Server:** Axum 0.7 @ 172.16.3.30:3001, WebSocket endpoint, AgentConnections HashMap
- **Agent:** Tokio async, 30-second heartbeat confirmed, 3 concurrent tasks (metrics/network/heartbeat)
- **Protocol:** Tagged JSON enums (ServerMessage/AgentMessage) with serde

### Key Architectural Decisions

1. **Tunnel Lifecycle:** Hybrid - WebSocket stays persistent, tunnel mode is operational state change
   - Agent modes: Heartbeat (default) ↔ Tunnel (active session)
   - One tunnel per agent, on-demand activation, instant mode switching

2. **Channel Multiplexing:** Unified protocol with channel_id routing
   - Single WebSocket, multiple logical channels
   - Enables concurrent operations (multiple terminals, simultaneous file transfers)
   - Channel types: Terminal, FileRead, FileWrite, FileList, Registry, Services

3. **Claude Integration:** Custom MCP server
   - Tools: `gururmm_run_command`, `gururmm_read_file`, `gururmm_write_file`, `gururmm_list_directory`, `gururmm_list_agents`
   - JWT authentication via environment variable
   - Auto-manages tunnel sessions (open on first use, keep-alive, close on idle)

4. **Security:** Three-layer model
   - Layer 1: JWT authentication (24h expiration)
   - Layer 2: Session authorization (tech_sessions table, 4h inactivity timeout)
   - Layer 3: Command validation (working directory allowlist, rate limiting 100/min, audit logging)

---

## Protocol Extensions

### New Message Types

```rust
// Server → Agent
enum ServerMessage {
    // ... existing ...
    TunnelOpen { session_id: String, tech_id: i32 },
    TunnelClose { session_id: String },
    TunnelData { channel_id: String, data: TunnelDataPayload },
}

// Agent → Server
enum AgentMessage {
    // ... existing ...
    TunnelReady { session_id: String },
    TunnelData { channel_id: String, data: TunnelDataPayload },
    TunnelError { channel_id: String, error: String },
}

enum TunnelDataPayload {
    Terminal { command: String },
    TerminalOutput { stdout: String, stderr: String, exit_code: Option<i32> },
    FileRead { path: String },
    FileContent { content: Vec<u8>, mime_type: String },
    FileWrite { path: String, content: Vec<u8> },
    FileList { path: String },
    FileListResult { entries: Vec<FileEntry> },
}
```

### Agent Mode State Machine

```rust
enum AgentMode {
    Heartbeat, // Default: 30s heartbeats, metrics, network monitoring
    Tunnel {
        session_id: String,
        tech_id: i32,
        channels: HashMap<String, ChannelType>,
    },
}
```

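
The mode switch described above can be sketched as two guarded transitions. This is a minimal, std-only illustration rather than the agent's actual code: the `open_tunnel` and `close_tunnel` methods, the `ModeError` type, and the trimmed `ChannelType` variants are hypothetical names introduced here; only `AgentMode` and its fields come from the plan. Opening is rejected while a tunnel is already active (one tunnel per agent), and closing returns the agent to `Heartbeat` only when the session ID matches the active one.

```rust
use std::collections::HashMap;

// Trimmed, illustrative subset of the channel types listed in the plan.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ChannelType {
    Terminal,
    FileRead,
    FileWrite,
    FileList,
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum AgentMode {
    Heartbeat,
    Tunnel {
        session_id: String,
        tech_id: i32,
        channels: HashMap<String, ChannelType>,
    },
}

// Hypothetical error type for rejected transitions.
#[derive(Debug, PartialEq)]
enum ModeError {
    AlreadyInTunnel,
    NoSuchSession,
}

impl AgentMode {
    /// TunnelOpen: only valid from Heartbeat, which enforces one tunnel per agent.
    fn open_tunnel(&mut self, session_id: &str, tech_id: i32) -> Result<(), ModeError> {
        if matches!(&*self, AgentMode::Tunnel { .. }) {
            return Err(ModeError::AlreadyInTunnel);
        }
        *self = AgentMode::Tunnel {
            session_id: session_id.to_string(),
            tech_id,
            channels: HashMap::new(),
        };
        Ok(())
    }

    /// TunnelClose: drops all channels and returns to Heartbeat,
    /// but only if the request names the active session.
    fn close_tunnel(&mut self, session_id: &str) -> Result<(), ModeError> {
        let is_active_session = match &*self {
            AgentMode::Tunnel { session_id: active, .. } => active.as_str() == session_id,
            AgentMode::Heartbeat => false,
        };
        if !is_active_session {
            return Err(ModeError::NoSuchSession);
        }
        *self = AgentMode::Heartbeat;
        Ok(())
    }
}
```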
---

## Implementation Phases

### Phase 1: Core Tunnel Infrastructure (Week 1)
**Goal:** Establish tunnel mode switching and channel routing

**Server:**
- Add TunnelOpen/TunnelClose/TunnelData to ServerMessage enum
- Create tech_sessions table (id, session_id, tech_id, agent_id, opened_at, last_activity, status)
- Implement endpoints: POST /api/v1/tunnel/open, POST /api/v1/tunnel/close, GET /api/v1/tunnel/status/:session_id
- Add channel routing in WebSocket handler (route by channel_id)
- Session validation middleware (JWT + ownership check)

**Agent:**
- Add TunnelReady/TunnelData/TunnelError to AgentMessage enum
- Implement AgentMode state machine
- Add channel manager (`HashMap<channel_id, ChannelHandler>`)
- Handle TunnelOpen → respond TunnelReady
- Handle TunnelClose → cleanup channels, return to heartbeat mode

**Critical Files:**
- `server/src/ws/mod.rs` - WebSocket handler, protocol definitions
- `server/src/routes/tunnel.rs` - NEW: Tunnel API endpoints
- `server/src/middleware/auth.rs` - Session validation
- `agent/src/transport/websocket.rs` - WebSocket client, protocol handling
- `agent/src/tunnel/mod.rs` - NEW: Tunnel mode manager
- `migrations/XXX_create_tech_sessions.sql` - NEW: Database schema

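
As a rough illustration of the agent-side channel manager in Phase 1, the sketch below routes payloads by `channel_id` to a per-channel handler. It assumes a simple synchronous handler trait; the `ChannelHandler` trait shape, the `EchoHandler` stand-in, and the `ChannelManager` methods are hypothetical and not taken from the codebase. An unknown `channel_id` yields an error, which the agent could report as a `TunnelError` message.

```rust
use std::collections::HashMap;

/// Hypothetical handler interface; the plan only names a `ChannelHandler`.
trait ChannelHandler {
    fn handle(&mut self, data: &str) -> String;
}

/// Stand-in handler that echoes its input and counts messages seen.
struct EchoHandler {
    seen: usize,
}

impl ChannelHandler for EchoHandler {
    fn handle(&mut self, data: &str) -> String {
        self.seen += 1;
        format!("echo#{}: {}", self.seen, data)
    }
}

#[derive(Default)]
struct ChannelManager {
    channels: HashMap<String, Box<dyn ChannelHandler>>,
}

impl ChannelManager {
    /// Register a handler for a new logical channel.
    fn open(&mut self, channel_id: &str, handler: Box<dyn ChannelHandler>) {
        self.channels.insert(channel_id.to_string(), handler);
    }

    /// Route a payload to the handler that owns `channel_id`.
    fn route(&mut self, channel_id: &str, data: &str) -> Result<String, String> {
        match self.channels.get_mut(channel_id) {
            Some(handler) => Ok(handler.handle(data)),
            None => Err(format!("unknown channel: {}", channel_id)),
        }
    }

    /// TunnelClose cleanup: drop every channel and report how many were open.
    fn close_all(&mut self) -> usize {
        let open = self.channels.len();
        self.channels.clear();
        open
    }
}
```

Because each channel owns its handler state, two terminals on the same WebSocket keep independent counters here, which mirrors the "multiple logical channels" goal.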
### Phase 2: Terminal Channel (Week 2)
**Goal:** Execute PowerShell/cmd/bash commands through tunnel

**Implementation:**
- Create TerminalChannel handler on agent (spawn child process, capture streams)
- Implement TunnelDataPayload::Terminal on server
- Working directory validation on agent (configurable allowlist)
- Command result streaming for long-running commands
- Endpoint: POST /api/v1/tunnel/:session_id/command

**Critical Files:**
- `agent/src/tunnel/terminal.rs` - NEW: Terminal channel handler
- `server/src/routes/tunnel.rs` - Add command execution endpoint
- `agent/config.toml` - Add allowed_paths configuration

### Phase 3: File Operations (Week 3)
**Goal:** Read, write, list files through tunnel

**Implementation:**
- Create FileChannel handler on agent
- Chunked transfer for files > 1MB (transfer_id tracking)
- Base64 encoding for binary data
- MIME type detection (magic numbers)
- Endpoints (under /api/v1/tunnel/:session_id): GET /file, PUT /file, POST /file/list

**Critical Files:**
- `agent/src/tunnel/file.rs` - NEW: File channel handler
- `server/src/routes/tunnel.rs` - Add file operation endpoints
- `common/src/transfer.rs` - NEW: Chunked transfer utilities

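
The chunked transfer item above could look roughly like the following. This is a sketch under stated assumptions, not the contents of `common/src/transfer.rs`: the `Chunk` struct, `needs_chunking`, `split_into_chunks`, and `reassemble` are hypothetical names, and the 1 MiB constant is one reading of the "> 1MB" threshold. Each chunk carries the `transfer_id`, its index, and the total count so the receiver can detect missing or duplicated pieces before reassembly.

```rust
/// Assumed threshold and chunk size: 1 MiB.
const CHUNK_SIZE: usize = 1024 * 1024;

#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    transfer_id: String,
    index: usize,
    total: usize,
    data: Vec<u8>,
}

/// Files larger than the threshold go through the chunked path.
fn needs_chunking(len: usize) -> bool {
    len > CHUNK_SIZE
}

/// Split `content` into chunks of at most `chunk_size` bytes.
/// Empty content still produces one empty chunk so the receiver sees completion.
fn split_into_chunks(transfer_id: &str, content: &[u8], chunk_size: usize) -> Vec<Chunk> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    if content.is_empty() {
        return vec![Chunk {
            transfer_id: transfer_id.to_string(),
            index: 0,
            total: 1,
            data: Vec::new(),
        }];
    }
    let total = (content.len() + chunk_size - 1) / chunk_size;
    content
        .chunks(chunk_size)
        .enumerate()
        .map(|(index, part)| Chunk {
            transfer_id: transfer_id.to_string(),
            index,
            total,
            data: part.to_vec(),
        })
        .collect()
}

/// Reassemble chunks that may have arrived out of order.
fn reassemble(mut chunks: Vec<Chunk>) -> Result<Vec<u8>, String> {
    let first = chunks.first().ok_or("no chunks received")?;
    let (id, total) = (first.transfer_id.clone(), first.total);
    if chunks.len() != total
        || chunks.iter().any(|c| c.transfer_id != id || c.total != total)
    {
        return Err("missing or mismatched chunks".to_string());
    }
    chunks.sort_by_key(|c| c.index);
    // After sorting, indices must be exactly 0..total (no gaps, no duplicates).
    if chunks.iter().enumerate().any(|(i, c)| c.index != i) {
        return Err("duplicate or out-of-range chunk index".to_string());
    }
    Ok(chunks.into_iter().flat_map(|c| c.data).collect())
}
```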
### Phase 4: MCP Server Integration (Week 4)
**Goal:** Expose tunnel operations as MCP tools for Claude Code

**Implementation:**
- Create new project: `gururmm-mcp-server` (Rust)
- Use `mcp-server-rs` crate
- Implement 5 core tools (run_command, read_file, write_file, list_directory, list_agents)
- JWT token from environment variable (GURURMM_AUTH_TOKEN)
- Auto-manage tunnel sessions (open on first tool use, 5min idle timeout)

**Critical Files:**
- `mcp-server/src/main.rs` - NEW: MCP server entry point
- `mcp-server/src/tools.rs` - NEW: Tool implementations
- `mcp-server/src/session.rs` - NEW: Session manager
- `mcp-server/Cargo.toml` - NEW: Dependencies

**MCP Config Example:**
```json
{
  "mcpServers": {
    "gururmm": {
      "command": "gururmm-mcp-server",
      "env": {
        "GURURMM_API_URL": "http://172.16.3.30:3001",
        "GURURMM_AUTH_TOKEN": "jwt-token-here"
      }
    }
  }
}
```

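
The "auto-manage tunnel sessions" behaviour in this phase might be sketched as a small per-agent session cache. This is an illustration only: `SessionManager`, `session_for`, and `close_idle` are hypothetical names, the `session-N` IDs are placeholders for whatever `POST /api/v1/tunnel/open` returns, and no HTTP calls are made here. A session is opened on first tool use for an agent, refreshed on each later use, and reported for closing once it has been idle for the 5-minute timeout.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Idle timeout from the plan: 5 minutes.
const IDLE_TIMEOUT: Duration = Duration::from_secs(5 * 60);

struct Session {
    session_id: String,
    last_used: Instant,
}

#[derive(Default)]
struct SessionManager {
    sessions: HashMap<i32, Session>, // keyed by agent_id
    opened: u32,                     // counter used to mint placeholder IDs
}

impl SessionManager {
    /// Return the session for `agent_id`, opening one on first use
    /// or when the cached session has passed the idle timeout.
    fn session_for(&mut self, agent_id: i32, now: Instant) -> String {
        let expired = match self.sessions.get(&agent_id) {
            Some(s) => now.duration_since(s.last_used) >= IDLE_TIMEOUT,
            None => true,
        };
        if expired {
            // A real server would call POST /api/v1/tunnel/open here.
            self.opened += 1;
            let session_id = format!("session-{}", self.opened);
            self.sessions.insert(agent_id, Session { session_id, last_used: now });
        }
        let session = self.sessions.get_mut(&agent_id).expect("session exists");
        session.last_used = now; // every tool call counts as activity
        session.session_id.clone()
    }

    /// Remove idle sessions and return their IDs so the caller can close them.
    fn close_idle(&mut self, now: Instant) -> Vec<String> {
        let mut closed = Vec::new();
        self.sessions.retain(|_, s| {
            let idle = now.duration_since(s.last_used) >= IDLE_TIMEOUT;
            if idle {
                closed.push(s.session_id.clone());
            }
            !idle
        });
        closed
    }
}
```

Passing `now` in explicitly keeps the timeout logic testable without sleeping; a periodic task could call `close_idle(Instant::now())` and issue the close requests.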
### Phase 5: Advanced Features (Week 5+)
- Registry operations (Windows winreg crate)
- Service management (sc.exe/WMI on Windows, systemctl on Linux)
- Interactive terminal with PTY (stretch goal)

---

## Database Schema

```sql
CREATE TABLE tech_sessions (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(36) UNIQUE NOT NULL,
    tech_id INTEGER NOT NULL REFERENCES techs(id),
    agent_id INTEGER NOT NULL REFERENCES agents(id),
    opened_at TIMESTAMP NOT NULL DEFAULT NOW(),
    last_activity TIMESTAMP NOT NULL DEFAULT NOW(),
    closed_at TIMESTAMP,
    status VARCHAR(20) NOT NULL DEFAULT 'active'
);

-- A table-level UNIQUE constraint cannot take a WHERE clause in PostgreSQL,
-- so the "one active session per tech/agent pair" rule is a partial unique index.
CREATE UNIQUE INDEX idx_tech_sessions_active
    ON tech_sessions(tech_id, agent_id)
    WHERE status = 'active';

CREATE TABLE tunnel_audit (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(36) NOT NULL REFERENCES tech_sessions(session_id),
    channel_id VARCHAR(36) NOT NULL,
    operation VARCHAR(50) NOT NULL,
    details JSONB,
    created_at TIMESTAMP NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_tech_sessions_tech ON tech_sessions(tech_id);
CREATE INDEX idx_tech_sessions_agent ON tech_sessions(agent_id);
CREATE INDEX idx_tunnel_audit_session ON tunnel_audit(session_id);
```

---

## API Endpoints (New)

```
POST /api/v1/tunnel/open
  Body: { "agent_id": 123 }
  Response: { "session_id": "uuid", "status": "active" }

POST /api/v1/tunnel/close
  Body: { "session_id": "uuid" }

GET /api/v1/tunnel/status/:session_id

POST /api/v1/tunnel/:session_id/command
  Body: { "command": "...", "shell": "powershell", "working_dir": "...", "timeout": 30000 }

GET /api/v1/tunnel/:session_id/file?path=...

PUT /api/v1/tunnel/:session_id/file?path=...

POST /api/v1/tunnel/:session_id/file/list?path=...
```

---

## MCP Tools

```
gururmm_run_command(agent_id, command, shell, working_dir, timeout)
gururmm_read_file(agent_id, path)
gururmm_write_file(agent_id, path, content)
gururmm_list_directory(agent_id, path)
gururmm_list_agents()
```

---

## Security Implementation

### Working Directory Validation
```toml
# agent/config.toml
[security]
allowed_paths = ["C:\\Shares", "C:\\Temp"]
```

Agent validates all file operations against allowlist, rejects path traversal (`..`).

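
A minimal sketch of that allowlist check follows, assuming purely lexical validation on path strings. The `normalize` and `is_path_allowed` functions are hypothetical, the case-insensitive comparison is an assumption made for Windows paths, and symlinks or junctions are not resolved, so a real agent would likely need an additional canonicalization step. Paths are compared component by component, so `C:\SharesEvil` does not pass as a child of `C:\Shares`.

```rust
/// Split a Windows- or Unix-style path into lowercase components.
/// Returns None if any component is `..` (path traversal).
fn normalize(path: &str) -> Option<Vec<String>> {
    let mut parts = Vec::new();
    for part in path.split(|c: char| c == '\\' || c == '/') {
        match part {
            "" | "." => continue,
            ".." => return None,
            other => parts.push(other.to_lowercase()),
        }
    }
    Some(parts)
}

/// True if `path` sits at or below one of the allowed roots.
fn is_path_allowed(path: &str, allowed_paths: &[&str]) -> bool {
    let target = match normalize(path) {
        Some(t) => t,
        None => return false, // traversal attempt
    };
    allowed_paths.iter().any(|root| match normalize(root) {
        // An empty root would match everything, so it is never accepted.
        Some(root) => !root.is_empty() && target.starts_with(&root),
        None => false,
    })
}
```

With `allowed_paths = ["C:\\Shares", "C:\\Temp"]`, a request for `C:\Shares\docs\report.txt` passes, while `C:\Shares\..\Windows` is rejected before any file is touched.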
### Rate Limiting
- Server enforces: 100 commands per minute per tech per agent
- Sliding window (in-memory or Redis)
- 429 response on limit exceeded
- Violations logged to tunnel_audit

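
An in-memory version of that sliding window might look like the sketch below. It is an assumption-laden illustration: `RateLimiter` and `allow` are hypothetical names, timestamps are passed in as milliseconds so the logic stays deterministic, and nothing here covers the Redis variant or audit logging. Each `(tech_id, agent_id)` pair keeps a queue of recent command timestamps; entries older than 60 seconds are evicted before the count is checked.

```rust
use std::collections::{HashMap, VecDeque};

const WINDOW_MS: u64 = 60_000; // one minute
const MAX_COMMANDS: usize = 100; // limit from the plan

#[derive(Default)]
struct RateLimiter {
    // Timestamps (ms) of accepted commands per (tech_id, agent_id).
    hits: HashMap<(i32, i32), VecDeque<u64>>,
}

impl RateLimiter {
    /// Returns true if the command is allowed; false maps to a 429 response.
    fn allow(&mut self, tech_id: i32, agent_id: i32, now_ms: u64) -> bool {
        let window = self.hits.entry((tech_id, agent_id)).or_default();
        // Evict timestamps that have slid out of the window.
        while let Some(oldest) = window.front().copied() {
            if now_ms.saturating_sub(oldest) >= WINDOW_MS {
                window.pop_front();
            } else {
                break;
            }
        }
        if window.len() >= MAX_COMMANDS {
            return false;
        }
        window.push_back(now_ms);
        true
    }
}
```

Rejected commands are not recorded in the window, so a client that keeps retrying while limited does not extend its own lockout.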
### Command Injection Prevention
- tokio::process::Command (no shell expansion)
- PowerShell: `-NoProfile -NonInteractive -Command`
- Input sanitization (escape quotes, reject backticks)
- Timeout enforcement

### Session Security
- JWT 24h expiration
- Sessions auto-expire after 4h of inactivity
- One tunnel per agent (prevents concurrent session conflicts)
- Admin force-close endpoint

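
The session checks listed above (ownership, status, and inactivity expiry) can be sketched as one authorization function that runs after JWT verification. The `TechSession` struct below is a trimmed, hypothetical mirror of the `tech_sessions` row, `authorize` and `SessionError` are names introduced for illustration, and timestamps are plain Unix seconds rather than database types.

```rust
/// Inactivity timeout from the plan: 4 hours, in seconds.
const INACTIVITY_TIMEOUT_SECS: u64 = 4 * 60 * 60;

#[derive(Debug, PartialEq)]
enum SessionError {
    NotOwner,
    Closed,
    Expired,
}

/// Trimmed view of a tech_sessions row (hypothetical field types).
struct TechSession {
    tech_id: i32,
    last_activity: u64, // Unix seconds
    status: &'static str,
}

/// Layer 2 check: the caller must own the session, and the session
/// must be active and used within the inactivity timeout.
fn authorize(session: &TechSession, caller_tech_id: i32, now: u64) -> Result<(), SessionError> {
    if session.tech_id != caller_tech_id {
        return Err(SessionError::NotOwner);
    }
    if session.status != "active" {
        return Err(SessionError::Closed);
    }
    if now.saturating_sub(session.last_activity) >= INACTIVITY_TIMEOUT_SECS {
        return Err(SessionError::Expired);
    }
    Ok(())
}
```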
---

## Testing Strategy

### Unit Tests
- Channel routing (correct channel receives message)
- Session validation (JWT + ownership)
- Command sanitization
- Path validation (traversal prevention)

### Integration Tests
- Full tunnel lifecycle (open → command → close)
- Concurrent sessions to different agents
- Session timeout enforcement
- Rate limiting

### End-to-End Tests
- Claude Code MCP integration
- File upload via MCP, verify on agent
- Multi-step workflow (read file → modify → write back)

---

## Rollout Plan

1. **Week 5:** Internal testing (2 agents: AD2, DESKTOP-0O8A1RL)
2. **Week 6:** Beta release (3 power user techs)
3. **Week 7:** General availability (all techs, documentation, training)

---

## Success Metrics

**Infrastructure (Phase 1-2):**
- 95% tunnel open success rate
- Under 500ms command response time
- Zero session conflicts

**MCP Integration (Phase 3-4):**
- 80% tech adoption within 2 weeks
- Over 50 tunnel sessions/day
- Under 5% command error rate

**Long-term:**
- 20% reduction in RDP sessions
- 90% tech satisfaction
- Under 1% security incidents

---

## Risks and Mitigations

| Risk | Impact | Mitigation |
|------|--------|------------|
| Command injection | Critical | Input sanitization, no shell expansion, path allowlist |
| Session hijacking | High | Short-lived JWT, session ownership validation, audit logging |
| WebSocket instability | Medium | Auto-reconnect, session recovery |
| Rate limiting too strict | Medium | Configurable per-tech limits, user feedback |

---

## Open Questions

1. Registry operations scope (full access or specific hives only)?
2. Interactive terminal priority (defer to Phase 6)?
3. Multi-tech sessions for pair programming?
4. MCP server credential manager integration (1Password)?
5. Agent-side logging requirements (compliance)?

---

## Verification Plan

### Phase 1 Verification
```bash
# Tech opens tunnel session
curl -X POST http://172.16.3.30:3001/api/v1/tunnel/open \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{"agent_id": 1}'
# Response: {"session_id": "uuid", "status": "active"}

# Check agent logs - should show: "Tunnel mode activated for session uuid"
# Check database: SELECT * FROM tech_sessions WHERE session_id = 'uuid';
```

### Phase 2 Verification
```bash
# Execute command via tunnel
curl -X POST http://172.16.3.30:3001/api/v1/tunnel/$SESSION_ID/command \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{"command": "Get-Date", "shell": "powershell"}'
# Response: {"stdout": "Sunday, April 13, 2026...", "exit_code": 0}
```

### Phase 4 Verification (MCP)
```bash
# Configure MCP server in Claude Code
# Test tools appear in Claude's tool list
# Execute: "List files in C:\Shares on agent ID 1"
# Claude should call gururmm_list_directory tool
# Verify output shows directory listing
```

---

## Next Steps After Approval

1. Create feature branch: `feature/real-time-tunnel`
2. Phase 1 database migrations (tech_sessions, tunnel_audit tables)
3. Update protocol enums (ServerMessage/AgentMessage)
4. Implement tunnel open/close endpoints
5. Update agent WebSocket handler for tunnel mode
6. Unit tests for session validation
7. Deploy to test environment

**Estimated Timeline:** 5 weeks to MCP integration, 7 weeks to GA

---

**Detailed plan location:** `projects/msp-tools/guru-rmm/plans/real-time-tunnel-architecture.md`