guru-connect/docs/specs/SPEC-012-headless-linux-tty.md
azcomputerguru a062a825ea spec: add SPEC-012 Headless Linux Mode (Direct TTY Access)
Comprehensive specification for terminal-based remote access to headless
Linux servers (no X11/Wayland GUI):

Core Capabilities:
- PTY spawn via openpty() + fork/exec shell (/bin/bash or $SHELL)
- Terminal I/O: PTY output → TerminalData protobuf → WebSocket relay
- Input: keyboard → TerminalInput protobuf → PTY master write
- Resize: SIGWINCH on terminal window resize, TIOCSWINSZ ioctl
- Auto-detection: agent detects headless environment (no DISPLAY) at runtime

Viewer:
- xterm.js-based web terminal (80x24 default, resizable)
- Full ANSI/VT100 support (colors, cursor control, vim/nano/htop)
- Same protobuf-over-WSS protocol, support-code/agent-key auth
- Dashboard shows "Terminal" badge, routes to terminal viewer

Use Cases:
- Server management (headless Ubuntu Server, VMs, containers)
- Emergency recovery (systemd rescue mode, single-user mode)
- Container debugging (exec into running containers)
- SSH replacement with centralized audit logging

Protobuf Extensions:
- TerminalData, TerminalInput, TerminalResize messages
- AgentStatus.terminal_mode flag

Security:
- Run agent as unprivileged user + sudo for privileged commands
- Session recording to terminal_recordings table (asciicast format)
- Same auth model as GUI agents (support-code / per-agent key)

Estimated effort: Medium (4-6 weeks)
Priority: P2 (server management is market-critical)

Extends SPEC-010 Linux agent with PTY alternative to screen capture.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-30 18:28:34 -07:00


SPEC-012: Headless Linux Mode (Direct TTY Access)

Status: Proposed
Priority: P2
Requested By: Mike Swanson (2026-05-30)
Estimated Effort: Medium (4-6 weeks)

Overview

Enable GuruConnect agent support for headless Linux servers (no X11/Wayland GUI) by providing direct terminal (TTY) access instead of screen capture. This addresses a critical server management use case: remote terminal access to Linux servers, VMs, and containers that run without a graphical desktop environment. Unlike SSH, this integrates with the GuruConnect dashboard for centralized access, audit logging, and support-code workflows.

The viewer displays a terminal emulator (an xterm.js-based web viewer in v1; a native terminal in the desktop viewer is deferred to Phase 2) connected to a pseudo-TTY (PTY) on the target server.

Success criteria: a technician can manage a headless Ubuntu Server 22.04 VM via the GuruConnect dashboard with the same authentication and session model as GUI agents, full terminal capabilities (colors, cursor control, vim/nano editing), and zero X11/Wayland dependencies on the target.

Use Cases:

  • Remote terminal access to headless Linux servers (web hosting, databases, Docker hosts)
  • Container debugging (exec into running containers via GuruConnect)
  • Emergency server recovery (systemd rescue mode, single-user mode)
  • MSP consolidation: one tool for both desktop support (GUI) and server management (terminal)

Success Criteria:

  • GuruConnect agent runs on Ubuntu Server 22.04 minimal install (no desktop packages)
  • Viewer sees full-color, interactive terminal (80x24 or larger, resizable)
  • Full terminal capabilities: ANSI colors, cursor positioning, vim/nano/htop work correctly
  • Same protobuf-over-WSS transport, support-code and persistent-agent authentication
  • Audit logging: session recording (terminal output captured to events table or file)

Scope

Included in v1

Headless Agent Mode:

  • Detect headless environment (no DISPLAY, no X11/Wayland libraries) at runtime
  • Spawn pseudo-TTY (PTY) via openpty() + fork/exec shell (/bin/bash -l or user's $SHELL)
  • Terminal I/O: read PTY output → encode as protobuf TerminalData → send via WebSocket
  • Input: receive protobuf TerminalInput → write to PTY master
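  • Terminal resize: handle TerminalResize message → set the new size with the TIOCSWINSZ ioctl on the PTY master; the kernel then delivers SIGWINCH to the PTY's foreground process group
  • Fallback shell selection: $SHELL env var → /bin/bash → /bin/sh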
  • Same agent binary as GUI mode: guruconnect detects headless and switches mode automatically
  • Graceful PTY cleanup on session end (send exit command, wait for shell exit, close PTY)
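
The fallback shell selection above can be sketched as a small pure function. This is a minimal illustration, not the agent's actual code: the names `select_shell_with` and `select_shell` are hypothetical, and the existence check is injected as a closure only so the ordering logic can be tested without touching the filesystem.

```rust
use std::path::Path;

/// Return the first usable shell in the order: `$SHELL`, /bin/bash, /bin/sh.
/// `exists` is an injected predicate (hypothetical design choice for testability).
pub fn select_shell_with(
    env_shell: Option<&str>,
    exists: impl Fn(&str) -> bool,
) -> Option<String> {
    env_shell
        .into_iter()
        .filter(|s| !s.is_empty()) // treat an empty $SHELL as unset
        .chain(["/bin/bash", "/bin/sh"])
        .find(|p| exists(*p))
        .map(str::to_string)
}

/// Production wrapper: reads `$SHELL` and checks the real filesystem.
pub fn select_shell() -> Option<String> {
    let env_shell = std::env::var("SHELL").ok();
    select_shell_with(env_shell.as_deref(), |p| Path::new(p).is_file())
}
```

Returning `None` when no candidate exists lets the caller fail the session with a clear error instead of passing a missing path to exec.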

Viewer (Web Viewer):

  • xterm.js-based terminal emulator embedded in viewer.html
  • Connects to same /ws/viewer endpoint with session JWT
  • Relay server detects TerminalData frames (not FrameData) and routes accordingly
  • Terminal controls: resize on window resize, copy/paste support, configurable font size
  • Session toolbar: connection status, terminal size (e.g., "80x24"), reconnect button

Viewer (Native Desktop Viewer - optional Phase 2):

  • Defer native viewer terminal support to Phase 2
  • v1: web viewer only for terminal sessions (show "Open in browser" prompt if launched via guruconnect://)

Protobuf Protocol:

  • New message types: TerminalData (PTY output), TerminalInput (keyboard input), TerminalResize (window size)
  • AgentStatus includes terminal_mode: bool flag (true for headless agents)
  • Dashboard shows terminal icon for headless agents, camera icon for GUI agents

Dashboard:

  • Detect terminal_mode: true in agent status
  • "Connect" button opens web viewer in terminal mode (not screen capture mode)
  • Agent list shows "Terminal" badge for headless agents

Session Recording (Audit):

  • Log all terminal I/O to events table or separate terminal_recordings table (see Database Schema)
  • Playback: recorded session can be replayed as "terminal recording" (asciicast format or raw PTY dump)
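
If asciicast is chosen, each PTY output chunk becomes one line of the recording. The sketch below shows the asciicast v2 layout (a JSON header line followed by `[elapsed, "o", data]` event lines) using only the standard library. The function names are hypothetical, and the lossy UTF-8 conversion is a simplification: a chunk that ends in the middle of a multi-byte character would need buffering in a real recorder.

```rust
use std::fmt::Write;

/// Encode a string as a JSON string literal, quotes included.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters (ESC, BEL, ...) need \uXXXX escapes.
            c if (c as u32) < 0x20 => {
                let _ = write!(out, "\\u{:04x}", c as u32);
            }
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// First line of an asciicast v2 file.
pub fn asciicast_header(cols: u16, rows: u16, unix_ts: u64) -> String {
    format!(
        "{{\"version\": 2, \"width\": {}, \"height\": {}, \"timestamp\": {}}}",
        cols, rows, unix_ts
    )
}

/// One output event: seconds since session start, event type "o", and the data.
pub fn asciicast_output_event(elapsed_secs: f64, data: &[u8]) -> String {
    let text = String::from_utf8_lossy(data); // simplification, see lead-in
    format!("[{:.6}, \"o\", {}]", elapsed_secs, json_string(&text))
}
```

A recorder would write the header once, then append one event line per `TerminalData` message, which keeps the recording append-only while the session is live.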

Explicitly out of scope

  • GUI mode on headless agents — v1 is terminal-only; no attempt to start Xvfb or launch GUI apps
  • SSH key management — agent uses GuruConnect auth (support code / agent key), not SSH keys
  • File transfer via terminal — defer to SPEC (file transfer is a separate roadmap item for all agent types)
  • Multi-user terminal sessions — v1 is single-session PTY; no tmux/screen built-in sharing
  • Windows terminal mode — defer; Windows Server typically has GUI (RDP) or SSH (OpenSSH)
  • macOS terminal mode — defer; macOS servers are rare and typically have GUI access

Architecture

Agent PTY Handling

Headless detection:

// agent/src/platform/linux/headless.rs
pub fn is_headless() -> bool {
    // Check if DISPLAY is unset and no X11/Wayland session detected
    std::env::var("DISPLAY").is_err() &&
    std::env::var("WAYLAND_DISPLAY").is_err() &&
    !std::path::Path::new("/tmp/.X11-unix").exists()
}

PTY spawn:

// agent/src/platform/linux/pty.rs
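use anyhow::{anyhow, Result};
use libc::{openpty, fork, execvp, dup2, STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO, winsize, TIOCSWINSZ};
use std::ffi::CString;
use std::os::unix::io::RawFd;

pub struct PtySession {
    pub(crate) master_fd: RawFd, // read by the session loop
    child_pid: libc::pid_t,
    cols: u16,
    rows: u16,
}

impl PtySession {
    pub fn spawn(shell: &str, cols: u16, rows: u16) -> Result<Self> {
        let mut master_fd: RawFd = -1;
        let mut slave_fd: RawFd = -1;
        let ws = winsize {
            ws_row: rows,
            ws_col: cols,
            ws_xpixel: 0,
            ws_ypixel: 0,
        };

        // Build the exec arguments before fork(): allocating in the child of a
        // multi-threaded process is not async-signal-safe, and `?` must not
        // return into a forked copy of the agent.
        let shell_cstr = CString::new(shell)?;
        let args = [shell_cstr.as_ptr(), std::ptr::null()];

        unsafe {
            if openpty(&mut master_fd, &mut slave_fd, std::ptr::null_mut(),
                       std::ptr::null(), &ws) != 0 {
                return Err(anyhow!("openpty failed: {}", std::io::Error::last_os_error()));
            }

            let pid = fork();
            if pid < 0 {
                let err = std::io::Error::last_os_error();
                libc::close(master_fd);
                libc::close(slave_fd);
                return Err(anyhow!("fork failed: {}", err));
            }
            if pid == 0 {
                // Child: start a new session and make the PTY slave its
                // controlling terminal (needed for job control and Ctrl+C).
                libc::setsid();
                libc::ioctl(slave_fd, libc::TIOCSCTTY, 0);
                dup2(slave_fd, STDIN_FILENO);
                dup2(slave_fd, STDOUT_FILENO);
                dup2(slave_fd, STDERR_FILENO);
                libc::close(master_fd);
                if slave_fd > STDERR_FILENO {
                    libc::close(slave_fd);
                }

                execvp(shell_cstr.as_ptr(), args.as_ptr());
                libc::_exit(127); // exec failed; skip the parent's atexit handlers
            }

            // Parent process: close slave, return master FD
            libc::close(slave_fd);
            Ok(PtySession {
                master_fd,
                child_pid: pid,
                cols,
                rows,
            })
        }
    }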

    pub fn read(&self, buf: &mut [u8]) -> Result<usize> {
        unsafe {
            let n = libc::read(self.master_fd, buf.as_mut_ptr() as *mut _, buf.len());
            if n < 0 {
                Err(anyhow!("PTY read failed"))
            } else {
                Ok(n as usize)
            }
        }
    }

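    pub fn write(&self, data: &[u8]) -> Result<()> {
        // write() may accept fewer bytes than requested (e.g. on a large paste),
        // so loop until the whole buffer has been written.
        let mut written = 0;
        while written < data.len() {
            let rest = &data[written..];
            let n = unsafe { libc::write(self.master_fd, rest.as_ptr() as *const _, rest.len()) };
            if n < 0 {
                let err = std::io::Error::last_os_error();
                if err.kind() == std::io::ErrorKind::Interrupted {
                    continue; // EINTR: retry
                }
                return Err(anyhow!("PTY write failed: {}", err));
            }
            written += n as usize;
        }
        Ok(())
    }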

    pub fn resize(&mut self, cols: u16, rows: u16) -> Result<()> {
        self.cols = cols;
        self.rows = rows;
        let winsize = winsize {
            ws_row: rows,
            ws_col: cols,
            ws_xpixel: 0,
            ws_ypixel: 0,
        };
        unsafe {
            if libc::ioctl(self.master_fd, TIOCSWINSZ, &winsize as *const _) != 0 {
                return Err(anyhow!("TIOCSWINSZ failed"));
            }
        }
        // No explicit kill() is needed: TIOCSWINSZ on the master makes the kernel
        // deliver SIGWINCH to the PTY's foreground process group (e.g. vim or htop).
        Ok(())
    }
}

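impl Drop for PtySession {
    fn drop(&mut self) {
        unsafe {
            // Closing the master hangs up the terminal. If the shell holds the PTY
            // as its controlling terminal, the kernel also sends it SIGHUP.
            libc::close(self.master_fd);
            // Interactive shells commonly ignore SIGTERM, so escalate if needed.
            libc::kill(self.child_pid, libc::SIGTERM);
            std::thread::sleep(std::time::Duration::from_millis(500));
            // With WNOHANG, waitpid returns 0 while the child is still running.
            if libc::waitpid(self.child_pid, std::ptr::null_mut(), libc::WNOHANG) == 0 {
                libc::kill(self.child_pid, libc::SIGKILL);
                // Blocking wait reaps the child so it does not linger as a zombie.
                libc::waitpid(self.child_pid, std::ptr::null_mut(), 0);
            }
        }
    }
}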

Agent session loop:

// agent/src/session/terminal.rs
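pub async fn run_terminal_session(mut ws: WebSocketClient, _support_code: String) -> Result<()> {
    let shell = std::env::var("SHELL").unwrap_or_else(|_| "/bin/bash".to_string());
    let mut pty = PtySession::spawn(&shell, 80, 24)?;

    // Status update: terminal mode
    ws.send(AgentStatus {
        terminal_mode: Some(true), // `optional bool` in the proto
        os: "Linux".to_string(),
        // ...
    }).await?;

    // A blocking read() cannot be cancelled by select!, so one dedicated blocking
    // task owns the read loop and forwards chunks over a bounded channel.
    let (tx, mut rx) = tokio::sync::mpsc::channel::<Vec<u8>>(64);
    let master = pty.master_fd;
    tokio::task::spawn_blocking(move || {
        let mut buf = [0u8; 4096];
        loop {
            let n = unsafe { libc::read(master, buf.as_mut_ptr() as *mut _, buf.len()) };
            if n < 0 && std::io::Error::last_os_error().kind() == std::io::ErrorKind::Interrupted {
                continue; // EINTR: retry
            }
            if n <= 0 {
                break; // 0 = EOF, <0 = error (EIO once the shell has exited)
            }
            if tx.blocking_send(buf[..n as usize].to_vec()).is_err() {
                break; // session loop has ended
            }
        }
    });

    loop {
        tokio::select! {
            // PTY output → relay
            chunk = rx.recv() => match chunk {
                Some(data) => ws.send(TerminalData { data }).await?,
                None => break, // reader stopped: the shell exited
            },

            // Relay input → PTY
            msg = ws.recv() => match msg {
                Some(Message::TerminalInput(input)) => pty.write(&input.data)?,
                Some(Message::TerminalResize(resize)) => {
                    // Proto fields are uint32; the PTY API takes u16.
                    let cols = resize.cols.min(u16::MAX as u32) as u16;
                    let rows = resize.rows.min(u16::MAX as u32) as u16;
                    pty.resize(cols, rows)?;
                }
                Some(Message::Disconnect) | None => break,
                Some(_) => {}
            },
        }
    }

    Ok(())
}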

Protobuf Protocol Extensions

// proto/guruconnect.proto

message AgentStatus {
  // Existing fields...
  optional bool terminal_mode = 21;  // true for headless agents
}

message TerminalData {
  bytes data = 1;  // PTY raw output (may include ANSI escape sequences)
}

message TerminalInput {
  bytes data = 1;  // Keyboard input from viewer (UTF-8 encoded)
}

message TerminalResize {
  uint32 cols = 1;  // Terminal width (characters)
  uint32 rows = 2;  // Terminal height (lines)
}

// Update AgentMessage and ViewerMessage unions
message AgentMessage {
  oneof message {
    AgentStatus status = 1;
    FrameData frame = 2;
    TerminalData terminal_data = 10;  // NEW
  }
}

message ViewerMessage {
  oneof message {
    InputEvent input = 1;
    TerminalInput terminal_input = 10;  // NEW
    TerminalResize terminal_resize = 11;  // NEW
  }
}
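
Because `TerminalResize` carries `uint32` fields while the kernel's `winsize` struct uses 16-bit values, the agent has to narrow the dimensions before applying them. A minimal sketch of that validation follows. The bounds are hypothetical (the spec defines no limits) and the function name `sanitize_resize` is illustrative only.

```rust
/// Hypothetical limits; the spec does not define minimum or maximum sizes.
pub const MIN_COLS: u16 = 1;
pub const MAX_COLS: u16 = 1000;
pub const MIN_ROWS: u16 = 1;
pub const MAX_ROWS: u16 = 500;

/// Clamp viewer-supplied dimensions into a range that fits `winsize`
/// and rejects degenerate values such as 0x0.
pub fn sanitize_resize(cols: u32, rows: u32) -> (u16, u16) {
    let clamp = |v: u32, lo: u16, hi: u16| v.clamp(lo as u32, hi as u32) as u16;
    (
        clamp(cols, MIN_COLS, MAX_COLS),
        clamp(rows, MIN_ROWS, MAX_ROWS),
    )
}
```

Clamping rather than rejecting keeps the session usable if a viewer reports an odd size during a window animation. Rejecting with an error would be an equally valid choice.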

Relay Server Changes

Route terminal vs. screen capture sessions:

// server/src/relay/mod.rs
use std::sync::atomic::Ordering;

async fn handle_agent_message(msg: AgentMessage, session: &Session) -> Result<()> {
    match msg.message {
        Some(agent_message::Message::Status(status)) => {
            // Assumes Session::terminal_mode is an AtomicBool, because `session`
            // is a shared reference and cannot be assigned to directly.
            session.terminal_mode.store(status.terminal_mode.unwrap_or(false), Ordering::Relaxed);
            // Store in DB: UPDATE connect_sessions SET terminal_mode = ? WHERE id = ?
        }
        Some(agent_message::Message::TerminalData(data)) => {
            // Forward to the viewer in the AgentMessage envelope it arrived in;
            // ViewerMessage (viewer → agent) has no terminal_data field.
            if let Some(viewer_ws) = session.viewer_ws.lock().await.as_mut() {
                viewer_ws.send(AgentMessage {
                    message: Some(agent_message::Message::TerminalData(data)),
                }).await?;
            }
            // Optional: append to terminal_recording buffer for audit
        }
        Some(agent_message::Message::Frame(frame)) => {
            // Existing screen capture logic...
        }
        _ => {}
    }
    Ok(())
}

async fn handle_viewer_message(msg: ViewerMessage, session: &Session) -> Result<()> {
    match msg.message {
        Some(viewer_message::Message::TerminalInput(input)) => {
            // Forward to the agent in the ViewerMessage envelope it arrived in;
            // AgentMessage (agent → viewer) has no terminal_input field.
            if let Some(agent_ws) = session.agent_ws.lock().await.as_mut() {
                agent_ws.send(ViewerMessage {
                    message: Some(viewer_message::Message::TerminalInput(input)),
                }).await?;
            }
        }
        Some(viewer_message::Message::TerminalResize(resize)) => {
            // Forward resize to agent
            if let Some(agent_ws) = session.agent_ws.lock().await.as_mut() {
                agent_ws.send(ViewerMessage {
                    message: Some(viewer_message::Message::TerminalResize(resize)),
                }).await?;
            }
        }
        Some(viewer_message::Message::Input(input)) => {
            // Existing GUI input logic...
        }
        _ => {}
    }
    Ok(())
}

Web Viewer (xterm.js)

HTML template:

<!-- server/static/viewer-terminal.html -->
<!DOCTYPE html>
<html>
<head>
    <title>GuruConnect Terminal</title>
    <link rel="stylesheet" href="/vendor/xterm/xterm.css" />
    <script src="/vendor/xterm/xterm.js"></script>
    <script src="/vendor/xterm/xterm-addon-fit.js"></script>
    <style>
        #terminal { height: 100vh; }
        .toolbar { background: #333; color: #fff; padding: 8px; }
    </style>
</head>
<body>
    <div class="toolbar">
        <span id="status">Connecting...</span>
        <span id="size" style="float: right;">80x24</span>
    </div>
    <div id="terminal"></div>
    <script>
        const term = new Terminal({
            cursorBlink: true,
            fontSize: 14,
            fontFamily: 'Consolas, "Courier New", monospace',
            theme: {
                background: '#1e1e1e',
                foreground: '#d4d4d4',
            }
        });
        const fitAddon = new FitAddon.FitAddon();
        term.loadAddon(fitAddon);
        term.open(document.getElementById('terminal'));
        fitAddon.fit();

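        // The dashboard passes the session ID and JWT as query parameters.
        const params = new URLSearchParams(location.search);
        const SESSION_ID = params.get('session');
        const TOKEN = params.get('token');

        const ws = new WebSocket(`wss://${location.host}/ws/viewer?token=${encodeURIComponent(TOKEN)}&session=${encodeURIComponent(SESSION_ID)}`);
        ws.binaryType = 'arraybuffer'; // protobuf frames are binary; the default type is Blob

        ws.onopen = () => {
            document.getElementById('status').textContent = 'Connected';
            // Send initial terminal size
            ws.send(encodeTerminalResize(term.cols, term.rows));
        };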

        ws.onmessage = (event) => {
            const msg = decodeProtobuf(event.data);
            if (msg.terminal_data) {
                term.write(new Uint8Array(msg.terminal_data.data));
            }
        };

        term.onData((data) => {
            ws.send(encodeTerminalInput(data));
        });

        term.onResize(({ cols, rows }) => {
            document.getElementById('size').textContent = `${cols}x${rows}`;
            ws.send(encodeTerminalResize(cols, rows));
        });

        window.addEventListener('resize', () => fitAddon.fit());
    </script>
</body>
</html>

Dashboard Detection

// server/static/dashboard.js
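// Agent-reported fields are untrusted: escape them before building HTML.
function escapeHtml(value) {
    return String(value).replace(/[&<>"']/g, (c) => ({
        '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;',
    }[c]));
}

function renderAgentRow(agent) {
    const icon = agent.terminal_mode
        ? '<i class="icon-terminal"></i> Terminal'
        : '<i class="icon-screen"></i> Screen';

    // Pass values through data attributes instead of splicing them into inline JS.
    const connectButton = agent.online
        ? `<button data-agent-id="${escapeHtml(agent.id)}" data-terminal="${Boolean(agent.terminal_mode)}"
            onclick="connectToAgent(this.dataset.agentId, this.dataset.terminal === 'true')">Connect</button>`
        : '<span>Offline</span>';

    return `<tr>
        <td>${escapeHtml(agent.name)}</td>
        <td>${icon}</td>
        <td>${escapeHtml(agent.os)} ${escapeHtml(agent.os_version)}</td>
        <td>${connectButton}</td>
    </tr>`;
}

function connectToAgent(agentId, terminalMode) {
    // JWT is the viewer token already held by the dashboard (defined elsewhere).
    const page = terminalMode ? '/viewer-terminal.html' : '/viewer.html';
    const query = `session=${encodeURIComponent(agentId)}&token=${encodeURIComponent(JWT)}`;
    window.open(`${page}?${query}`, '_blank');
}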

Database Schema

Minor addition to sessions table:

-- migrations/012_terminal_mode.sql
ALTER TABLE connect_sessions ADD COLUMN terminal_mode BOOLEAN DEFAULT FALSE;

-- Optional: separate table for terminal recordings
CREATE TABLE IF NOT EXISTS terminal_recordings (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    session_id UUID REFERENCES connect_sessions(id) ON DELETE CASCADE,
    started_at TIMESTAMPTZ DEFAULT NOW(),
    ended_at TIMESTAMPTZ,
    recording_data BYTEA,  -- asciicast JSON or raw PTY dump (compressed)
    size_bytes BIGINT
);

-- PostgreSQL does not accept an inline INDEX clause inside CREATE TABLE
CREATE INDEX IF NOT EXISTS idx_terminal_recordings_session
    ON terminal_recordings (session_id);

Implementation Details

Files to Create

Agent (Linux-specific):

  • agent/src/platform/linux/pty.rs — PTY spawn, I/O, resize (openpty, fork, exec)
  • agent/src/platform/linux/headless.rs — Headless detection logic
  • agent/src/session/terminal.rs — Terminal session loop (PTY ↔ WebSocket)

Server:

  • server/src/relay/terminal.rs — Terminal message routing (TerminalData/Input/Resize)
  • server/static/viewer-terminal.html — xterm.js-based web terminal viewer
  • server/static/vendor/xterm/ — xterm.js library files (CDN or bundled)
  • server/migrations/012_terminal_mode.sql — Schema update

Protobuf:

  • proto/guruconnect.proto — Add TerminalData, TerminalInput, TerminalResize messages

Dashboard:

  • server/static/dashboard.js — Detect terminal_mode, render terminal icon, route to terminal viewer

Key Dependencies

# agent/Cargo.toml (Linux-specific)
[target.'cfg(target_os = "linux")'.dependencies]
libc = "0.2"  # openpty, fork, exec, ioctl
nix = "0.27"  # Safe wrappers for POSIX APIs

xterm.js (web viewer):

  • Version: 5.3.0+ (latest stable)
  • Addons: xterm-addon-fit (auto-resize)
  • Delivery: CDN link or bundled in server/static/vendor/xterm/

Security Considerations

Shell Access Risk

  • Privilege escalation: PTY spawns shell as the agent's user (typically root if agent runs as systemd service)
  • Mitigation 1: Run agent as unprivileged user (guruconnect service user), use sudo for privileged commands
  • Mitigation 2: Add allowed_commands whitelist (optional Phase 2 feature) — restrict to specific binaries
  • Mitigation 3: Audit logging: record all terminal I/O for compliance review

Authentication

Same as GUI agents:

  • Support-code for ad-hoc sessions (6-digit, time-limited)
  • Persistent agent key for managed servers (per-agent cak_* key from SPEC-004)
  • Viewer JWT token required for WebSocket connection

Session Recording (Compliance)

  • Optional toggle: dashboard setting "Record terminal sessions" (default: ON for compliance)
  • Storage: terminal_recordings table (BYTEA column, compressed)
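  • Playback: Admin dashboard replays recorded sessions by feeding the asciicast events into an xterm.js terminal with their recorded timing (xterm.js ships no asciicast player, so this needs a small replay loop or a dedicated player such as asciinema-player)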
  • Retention: configurable (default: 90 days, auto-purge older recordings)

Input Sanitization

  • No sanitization needed: PTY handles raw bytes; ANSI escape sequences are terminal-native
  • DoS risk: Malicious viewer could spam resize events; rate-limit TerminalResize (max 10/sec)
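
The resize rate limit could be enforced in the relay with a small counter per session. The sketch below uses a fixed-window counter, which is one simple option among several (a token bucket would smooth bursts better). The type name `ResizeLimiter` is hypothetical, and the clock is passed in explicitly so the behavior can be tested deterministically.

```rust
use std::time::{Duration, Instant};

/// Fixed-window counter: at most `max_per_window` events per `window`.
pub struct ResizeLimiter {
    max_per_window: u32,
    window: Duration,
    window_start: Instant,
    count: u32,
}

impl ResizeLimiter {
    pub fn new(max_per_window: u32, window: Duration, now: Instant) -> Self {
        Self { max_per_window, window, window_start: now, count: 0 }
    }

    /// Returns true if the event at time `now` may be forwarded to the agent.
    pub fn allow(&mut self, now: Instant) -> bool {
        // Start a fresh window once the current one has fully elapsed.
        if now.saturating_duration_since(self.window_start) >= self.window {
            self.window_start = now;
            self.count = 0;
        }
        if self.count < self.max_per_window {
            self.count += 1;
            true
        } else {
            false
        }
    }
}
```

With `ResizeLimiter::new(10, Duration::from_secs(1), Instant::now())`, the eleventh resize inside one second is dropped. A production version would probably remember the last dropped size and apply it when the window reopens, so the PTY does not end up with a stale size.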

Testing Strategy

Unit Tests

  • PTY spawn/cleanup: verify openpty() success, shell exec, FD management
  • Terminal I/O: mock PTY master FD, test read/write buffers
  • Protobuf serialization: TerminalData/Input/Resize round-trip

Integration Tests

  • Headless VM: Ubuntu Server 22.04 minimal (no desktop packages)
  • Agent install: guruconnect binary, systemd service, no X11 deps
  • Connect flow: Dashboard → "Connect" → xterm.js viewer → type ls, verify output
  • Resize: Browser window resize → agent applies TIOCSWINSZ to the PTY → foreground process receives SIGWINCH → htop redraws correctly
  • Session cleanup: Close viewer → PTY process exits gracefully

Manual Testing Scenarios

  1. Basic shell interaction:

    • Connect to headless agent via dashboard
    • Type ls -la, verify colorized output
    • Run vim test.txt, verify cursor movement, editing, save/quit
    • Run htop, verify full-screen TUI app renders correctly
  2. Terminal resize:

    • Start session at default 80x24
    • Resize browser window to 120x40
    • Run tput cols; tput lines → verify output matches
    • Run htop → verify UI scales to new dimensions
  3. Multi-line output:

    • Run dmesg | head -100 → verify scrollback works
    • Run journalctl -f → verify live log streaming
  4. Session recording playback:

    • Perform session actions (ls, vim, htop)
    • End session
    • Admin dashboard → "View Recording" → verify asciicast playback
  5. Privilege escalation (sudo):

    • Agent runs as guruconnect user (non-root)
    • Connect via terminal
    • Run sudo apt update → enter sudo password → verify command executes
    • Run sudo whoami → verify it prints root (plain whoami still prints guruconnect)

Performance

  • Latency target: <100ms round-trip for input (same as GUI mode)
  • Bandwidth: ~1-5 KB/sec for typical terminal I/O (much lower than screen capture)
  • Stress test: Run yes command (infinite output) → verify relay doesn't OOM, rate-limit applied
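
One way to keep the `yes` stress test from exhausting memory is backpressure through a bounded queue: when the consumer falls behind, the producer blocks instead of buffering without limit. The sketch below demonstrates the idea with the standard library's `sync_channel`. The function name `pump_with_backpressure` is hypothetical, and a real agent or relay would use its async runtime's bounded channel instead.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Push `chunks` through a queue holding at most `capacity` chunks and
/// return the total number of bytes the consumer received.
pub fn pump_with_backpressure(chunks: Vec<Vec<u8>>, capacity: usize) -> usize {
    let (tx, rx) = sync_channel::<Vec<u8>>(capacity);

    // Producer: stands in for the PTY reader.
    let producer = thread::spawn(move || {
        for chunk in chunks {
            // send() blocks while the queue is full, so queued memory stays
            // bounded by `capacity` chunks no matter how fast output arrives.
            if tx.send(chunk).is_err() {
                break; // consumer went away
            }
        }
        // tx is dropped here, which ends the consumer loop below.
    });

    // Consumer: stands in for the WebSocket sender.
    let mut total = 0;
    for chunk in rx {
        total += chunk.len();
    }
    producer.join().expect("producer thread panicked");
    total
}
```

Because the blocked reader stops draining the PTY, the kernel's PTY buffer fills and the noisy process itself blocks on write. No output is dropped, which matters because discarding bytes mid-stream can corrupt terminal state.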

Effort Estimate & Dependencies

Size: Medium (4-6 weeks, 1 developer)

Breakdown (the items below sum to 7 weeks, which exceeds the 4-6 week estimate; reconcile before scheduling):

  • PTY implementation (Linux agent): 1.5 weeks
  • Protobuf protocol updates: 0.5 weeks
  • Relay server terminal routing: 1 week
  • xterm.js web viewer integration: 1 week
  • Dashboard terminal mode detection + routing: 0.5 weeks
  • Session recording + playback: 1 week
  • Testing, edge cases, systemd integration: 1 week
  • Documentation: 0.5 weeks

Dependencies:

  • SPEC-010 Linux agent base — PTY mode extends the Linux agent; can be implemented in parallel with SPEC-010's GUI capture
  • xterm.js library — mature, well-tested (used by VS Code, Jupyter, many commercial products)
  • libc/nix crates — standard Rust POSIX bindings
  • SPEC-004 per-agent keys — already shipped for persistent agent auth

Unblocks:

  • Server management use case (Linux VMs, containers, bare metal)
  • SSH replacement with centralized audit logging
  • Emergency recovery (single-user mode, systemd rescue shell)
  • Container debugging (exec into running containers via GuruConnect)

Open Questions

  1. Run agent as root or unprivileged user? — Recommend unprivileged guruconnect service user + sudo whitelist. Security-sensitive orgs may require root; make configurable.

  2. Shell selection? — v1: $SHELL env var → /bin/bash → /bin/sh. Phase 2: dashboard setting to override shell per agent (/bin/zsh, /bin/fish).

  3. Concurrent PTY sessions? — v1: one PTY per agent connection (like SSH). Phase 2: tmux/screen integration for multi-viewer session sharing.

  4. Terminal recording format? — Asciicast (JSON-based asciinema format, widely supported; events can be replayed into xterm.js) vs. raw PTY dump (more compact, custom playback). Recommend asciicast for v1.

  5. Command whitelisting? — Optional Phase 2 feature. v1 is unrestricted shell access (same as SSH). Add allowed_commands array to agent config if compliance requires it.

  6. Windows/macOS terminal mode? — Defer. Windows Server typically uses RDP or SSH (OpenSSH built-in since Server 2019). macOS servers are rare. Linux headless servers are the primary use case.

  7. File upload/download via terminal? — v1: use standard tools (scp, rsync, wget). Phase 2: integrate with SPEC (file transfer) for dashboard-native upload/download.


Cross-references:

  • SPEC-010: Cross-platform agents (macOS/Linux GUI) — headless mode extends Linux agent with PTY alternative
  • SPEC-004: Stable machine identity — headless agents use same deterministic machine_uid (/etc/machine-id)
  • ADR-001: GuruConnect is standalone — headless mode doesn't require GuruRMM integration
  • Future: File transfer spec (roadmap item) — will integrate with terminal mode for scp-like functionality