SPEC-012: Headless Linux Mode (Direct TTY Access)
Status: Proposed
Priority: P2
Requested By: Mike Swanson (2026-05-30)
Estimated Effort: Medium (4-6 weeks)
Overview
Enable GuruConnect agent support for headless Linux servers (no X11/Wayland GUI) by providing direct terminal (TTY) access instead of screen capture. This addresses a critical server management use case: remote terminal access to Linux servers, VMs, and containers that run without a graphical desktop environment. Unlike SSH, this integrates with the GuruConnect dashboard for centralized access, audit logging, and support-code workflows. The viewer displays a terminal emulator (the xterm.js-based web viewer in v1, a native terminal in the desktop viewer in a later phase) connected to a pseudo-TTY (PTY) on the target server. Success criteria: a technician can manage a headless Ubuntu Server 22.04 VM via the GuruConnect dashboard with the same authentication and session model as GUI agents, full terminal capabilities (colors, cursor control, vim/nano editing), and zero X11/Wayland dependencies on the target.
Use Cases:
- Remote terminal access to headless Linux servers (web hosting, databases, Docker hosts)
- Container debugging (exec into running containers via GuruConnect)
- Emergency server recovery (systemd rescue mode, single-user mode)
- MSP consolidation: one tool for both desktop support (GUI) and server management (terminal)
Success Criteria:
- GuruConnect agent runs on Ubuntu Server 22.04 minimal install (no desktop packages)
- Viewer sees full-color, interactive terminal (80x24 or larger, resizable)
- Full terminal capabilities: ANSI colors, cursor positioning, vim/nano/htop work correctly
- Same protobuf-over-WSS transport, support-code and persistent-agent authentication
- Audit logging: session recording (terminal output captured to the `events` table or a file)
Scope
Included in v1
Headless Agent Mode:
- Detect headless environment (no DISPLAY, no X11/Wayland libraries) at runtime
- Spawn pseudo-TTY (PTY) via `openpty()` + fork/exec shell (`/bin/bash -l` or user's `$SHELL`)
- Terminal I/O: read PTY output → encode as protobuf `TerminalData` → send via WebSocket
- Input: receive protobuf `TerminalInput` → write to PTY master
- Terminal resize: handle `TerminalResize` message → set the PTY window size (`TIOCSWINSZ`) so the foreground program receives `SIGWINCH`
- Fallback shell selection: `$SHELL` env var → `/bin/bash` → `/bin/sh`
- Same agent binary as GUI mode: `guruconnect` detects headless and switches mode automatically
- Graceful PTY cleanup on session end (send exit command, wait for shell exit, close PTY)
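The fallback order for shell selection can be sketched as a small pure function. This is a minimal sketch, not the agent's actual code: `pick_shell` and `pick_shell_from_env` are hypothetical names, and the existence check is injected as a closure only so the ordering can be unit-tested without touching the filesystem.

```rust
use std::path::Path;

/// Choose the shell to exec, trying `$SHELL`, then /bin/bash, then /bin/sh.
/// `exists` reports whether a candidate path is present on the target.
/// Returns None when no candidate exists.
pub fn pick_shell(env_shell: Option<&str>, exists: impl Fn(&str) -> bool) -> Option<String> {
    let mut candidates: Vec<&str> = Vec::new();
    if let Some(shell) = env_shell {
        // An empty $SHELL is treated as unset (an assumption of this sketch).
        if !shell.is_empty() {
            candidates.push(shell);
        }
    }
    candidates.extend(["/bin/bash", "/bin/sh"]);
    candidates
        .into_iter()
        .find(|c| exists(*c))
        .map(|c| c.to_string())
}

/// Convenience wrapper that reads the real environment and filesystem.
pub fn pick_shell_from_env() -> Option<String> {
    let env_shell = std::env::var("SHELL").ok();
    pick_shell(env_shell.as_deref(), |p| Path::new(p).exists())
}
```

Checking that the candidate exists before `fork()` keeps a bad `$SHELL` value from producing a session whose child exits immediately after a failed exec.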
Viewer (Web Viewer):
- xterm.js-based terminal emulator embedded in `viewer.html`
- Connects to same `/ws/viewer` endpoint with session JWT
- Relay server detects `TerminalData` frames (not `FrameData`) and routes accordingly
- Terminal controls: resize on window resize, copy/paste support, configurable font size
- Session toolbar: connection status, terminal size (e.g., "80x24"), reconnect button
Viewer (Native Desktop Viewer - optional Phase 2):
- Defer native viewer terminal support to Phase 2
- v1: web viewer only for terminal sessions (show "Open in browser" prompt if launched via `guruconnect://`)
Protobuf Protocol:
- New message types: `TerminalData` (PTY output), `TerminalInput` (keyboard input), `TerminalResize` (window size)
- `AgentStatus` includes `terminal_mode: bool` flag (true for headless agents)
- Dashboard shows terminal icon for headless agents, camera icon for GUI agents
Dashboard:
- Detect `terminal_mode: true` in agent status
- "Connect" button opens web viewer in terminal mode (not screen capture mode)
- Agent list shows "Terminal" badge for headless agents
Session Recording (Audit):
- Log all terminal I/O to the `events` table or a separate `terminal_recordings` table (see Database Schema)
- Playback: recorded session can be replayed as "terminal recording" (asciicast format or raw PTY dump)
Explicitly out of scope
- GUI mode on headless agents — v1 is terminal-only; no attempt to start Xvfb or launch GUI apps
- SSH key management — agent uses GuruConnect auth (support code / agent key), not SSH keys
- File transfer via terminal — defer to SPEC (file transfer is a separate roadmap item for all agent types)
- Multi-user terminal sessions — v1 is single-session PTY; no tmux/screen built-in sharing
- Windows terminal mode — defer; Windows Server typically has GUI (RDP) or SSH (OpenSSH)
- macOS terminal mode — defer; macOS servers are rare and typically have GUI access
Architecture
Agent PTY Handling
Headless detection:
// agent/src/platform/linux/headless.rs
pub fn is_headless() -> bool {
    // Check if DISPLAY is unset and no X11/Wayland session detected
    std::env::var("DISPLAY").is_err()
        && std::env::var("WAYLAND_DISPLAY").is_err()
        && !std::path::Path::new("/tmp/.X11-unix").exists()
}
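For unit testing, the decision itself can be separated from the environment and filesystem lookups. The sketch below is one possible shape, with hypothetical names (`AgentMode`, `select_mode`); unlike `is_headless()` above it treats an empty variable as unset, which is an assumption of this sketch rather than the spec's stated behaviour.

```rust
/// Which session type the agent should offer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentMode {
    Terminal,
    ScreenCapture,
}

/// Pure decision function: the caller supplies the observed environment.
/// A systemd service usually has no DISPLAY even on a desktop machine,
/// which is why the X11 socket directory is consulted as well.
pub fn select_mode(
    display: Option<&str>,
    wayland_display: Option<&str>,
    x11_socket_dir_exists: bool,
) -> AgentMode {
    let has_display = display.map_or(false, |v| !v.is_empty());
    let has_wayland = wayland_display.map_or(false, |v| !v.is_empty());
    if has_display || has_wayland || x11_socket_dir_exists {
        AgentMode::ScreenCapture
    } else {
        AgentMode::Terminal
    }
}
```

A caller would pass `std::env::var("DISPLAY").ok().as_deref()` and the result of the `/tmp/.X11-unix` existence check, so tests can cover every combination without modifying the process environment.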
PTY spawn:
// agent/src/platform/linux/pty.rs
use anyhow::{anyhow, Result};
use libc::{dup2, execvp, fork, openpty, winsize, STDERR_FILENO, STDIN_FILENO, STDOUT_FILENO, TIOCSCTTY, TIOCSWINSZ};
use std::ffi::CString;
use std::os::unix::io::RawFd;

pub struct PtySession {
    pub(crate) master_fd: RawFd, // read directly by the session loop
    child_pid: libc::pid_t,
    cols: u16,
    rows: u16,
}

impl PtySession {
    pub fn spawn(shell: &str, cols: u16, rows: u16) -> Result<Self> {
        let mut master_fd: RawFd = -1;
        let mut slave_fd: RawFd = -1;
        let ws = winsize { ws_row: rows, ws_col: cols, ws_xpixel: 0, ws_ypixel: 0 };
        // Build argv before fork(): allocating in the child of a multi-threaded process is not safe.
        let shell_cstr = CString::new(shell)?;
        let args = [shell_cstr.as_ptr(), std::ptr::null()];

        unsafe {
            if openpty(&mut master_fd, &mut slave_fd, std::ptr::null_mut(), std::ptr::null(), &ws) != 0 {
                return Err(anyhow!("openpty failed: {}", std::io::Error::last_os_error()));
            }
            let pid = fork();
            if pid < 0 {
                let err = std::io::Error::last_os_error();
                libc::close(master_fd);
                libc::close(slave_fd);
                return Err(anyhow!("fork failed: {}", err));
            }
            if pid == 0 {
                // Child: start a new session and make the PTY slave its controlling terminal,
                // so job control and Ctrl-C work and the shell leads its own process group.
                libc::setsid();
                libc::ioctl(slave_fd, TIOCSCTTY, 0);
                dup2(slave_fd, STDIN_FILENO);
                dup2(slave_fd, STDOUT_FILENO);
                dup2(slave_fd, STDERR_FILENO);
                libc::close(master_fd);
                if slave_fd > STDERR_FILENO {
                    libc::close(slave_fd);
                }
                execvp(shell_cstr.as_ptr(), args.as_ptr());
                libc::_exit(127); // exec failed; _exit skips handlers inherited from the parent
            }
            // Parent process: close slave, keep master FD
            libc::close(slave_fd);
            Ok(PtySession { master_fd, child_pid: pid, cols, rows })
        }
    }

    pub fn read(&self, buf: &mut [u8]) -> Result<usize> {
        let n = unsafe { libc::read(self.master_fd, buf.as_mut_ptr() as *mut _, buf.len()) };
        if n < 0 {
            Err(anyhow!("PTY read failed: {}", std::io::Error::last_os_error()))
        } else {
            Ok(n as usize)
        }
    }

    pub fn write(&self, mut data: &[u8]) -> Result<()> {
        // write() may accept only part of the buffer; loop until everything is written.
        while !data.is_empty() {
            let n = unsafe { libc::write(self.master_fd, data.as_ptr() as *const _, data.len()) };
            if n < 0 {
                return Err(anyhow!("PTY write failed: {}", std::io::Error::last_os_error()));
            }
            data = &data[n as usize..];
        }
        Ok(())
    }

    pub fn resize(&mut self, cols: u16, rows: u16) -> Result<()> {
        self.cols = cols;
        self.rows = rows;
        let ws = winsize { ws_row: rows, ws_col: cols, ws_xpixel: 0, ws_ypixel: 0 };
        unsafe {
            if libc::ioctl(self.master_fd, TIOCSWINSZ, &ws as *const winsize) != 0 {
                return Err(anyhow!("TIOCSWINSZ failed: {}", std::io::Error::last_os_error()));
            }
        }
        // When the size changes, the kernel delivers SIGWINCH to the PTY's foreground
        // process group, so no explicit kill() is needed here.
        Ok(())
    }
}

impl Drop for PtySession {
    fn drop(&mut self) {
        unsafe {
            // Closing the master hangs up the terminal; SIGTERM is a second nudge.
            libc::close(self.master_fd);
            libc::kill(self.child_pid, libc::SIGTERM);
            std::thread::sleep(std::time::Duration::from_millis(500));
            // Escalate to SIGKILL if the shell is still alive, then reap it so no zombie is left.
            if libc::waitpid(self.child_pid, std::ptr::null_mut(), libc::WNOHANG) == 0 {
                libc::kill(self.child_pid, libc::SIGKILL);
                libc::waitpid(self.child_pid, std::ptr::null_mut(), 0);
            }
        }
    }
}
Agent session loop:
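// agent/src/session/terminal.rs
pub async fn run_terminal_session(mut ws: WebSocketClient, support_code: String) -> Result<()> {
    let shell = std::env::var("SHELL").unwrap_or_else(|_| "/bin/bash".to_string());
    let mut pty = PtySession::spawn(&shell, 80, 24)?;

    // Status update: terminal mode
    ws.send(AgentStatus {
        terminal_mode: true,
        os: "Linux".to_string(),
        // ...
    }).await?;

    // One long-lived blocking reader feeds a bounded channel. Spawning a fresh blocking
    // read on every select! iteration would lose output whenever the other branch wins,
    // because a spawn_blocking task cannot be cancelled.
    let (tx, mut rx) = tokio::sync::mpsc::channel::<Vec<u8>>(64);
    let master = pty.master_fd;
    tokio::task::spawn_blocking(move || {
        let mut buf = [0u8; 4096];
        loop {
            let n = unsafe { libc::read(master, buf.as_mut_ptr() as *mut _, buf.len()) };
            if n <= 0 {
                break; // EOF or error (EIO once the shell has exited)
            }
            if tx.blocking_send(buf[..n as usize].to_vec()).is_err() {
                break; // session loop has ended
            }
        }
    });

    loop {
        tokio::select! {
            // PTY output → relay
            chunk = rx.recv() => match chunk {
                Some(data) => { ws.send(TerminalData { data }).await?; }
                None => break, // shell exited and the reader closed the channel
            },
            // Receive input from relay, write to PTY
            Some(msg) = ws.recv() => {
                match msg {
                    Message::TerminalInput(input) => {
                        pty.write(&input.data)?;
                    }
                    Message::TerminalResize(resize) => {
                        // Protobuf carries uint32; winsize fields are u16.
                        pty.resize(resize.cols as u16, resize.rows as u16)?;
                    }
                    Message::Disconnect => break,
                    _ => {}
                }
            }
        }
    }
    Ok(())
}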
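`TerminalResize` carries `uint32` values while the kernel's `winsize` fields are `u16`, so the agent needs a narrowing step before calling `resize`. The following is a minimal sketch of one way to validate the dimensions; `narrow_size` and the `MAX_COLS`/`MAX_ROWS` limits are hypothetical and not part of the spec.

```rust
use std::convert::TryFrom;

/// Upper bounds chosen for illustration only; the spec does not define limits.
pub const MAX_COLS: u16 = 500;
pub const MAX_ROWS: u16 = 200;

/// Convert the protobuf dimensions to the `u16` pair expected by `winsize`.
/// Returns None for zero-sized or implausibly large terminals instead of
/// silently truncating the values.
pub fn narrow_size(cols: u32, rows: u32) -> Option<(u16, u16)> {
    let cols = u16::try_from(cols).ok()?;
    let rows = u16::try_from(rows).ok()?;
    if cols == 0 || rows == 0 || cols > MAX_COLS || rows > MAX_ROWS {
        return None;
    }
    Some((cols, rows))
}
```

In the session loop a rejected size would simply be ignored (or logged), leaving the PTY at its previous dimensions.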
Protobuf Protocol Extensions
// proto/guruconnect.proto
message AgentStatus {
  // Existing fields...
  optional bool terminal_mode = 21; // true for headless agents
}

message TerminalData {
  bytes data = 1; // PTY raw output (may include ANSI escape sequences)
}

message TerminalInput {
  bytes data = 1; // Keyboard input from viewer (UTF-8 encoded)
}

message TerminalResize {
  uint32 cols = 1; // Terminal width (characters)
  uint32 rows = 2; // Terminal height (lines)
}

// Update AgentMessage and ViewerMessage unions
message AgentMessage {
  oneof message {
    AgentStatus status = 1;
    FrameData frame = 2;
    TerminalData terminal_data = 10; // NEW
  }
}

message ViewerMessage {
  oneof message {
    InputEvent input = 1;
    TerminalInput terminal_input = 10; // NEW
    TerminalResize terminal_resize = 11; // NEW
  }
}
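To make the wire cost of these messages concrete, the sketch below hand-encodes the inner `TerminalResize` message (without the `ViewerMessage` envelope) using standard protobuf varint encoding. It is an illustration only: the real agent and relay would presumably use generated bindings, and `encode_terminal_resize` / `decode_terminal_resize` are hypothetical helper names.

```rust
use std::convert::TryFrom;

/// Append `v` as a protobuf base-128 varint.
fn put_varint(out: &mut Vec<u8>, mut v: u64) {
    while v >= 0x80 {
        out.push(((v as u8) & 0x7f) | 0x80);
        v >>= 7;
    }
    out.push(v as u8);
}

/// Read one varint starting at `*pos`, advancing `*pos` past it.
fn get_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut v = 0u64;
    for shift in (0..64).step_by(7) {
        let b = *buf.get(*pos)?;
        *pos += 1;
        v |= u64::from(b & 0x7f) << shift;
        if (b & 0x80) == 0 {
            return Some(v);
        }
    }
    None // more than 10 bytes: malformed
}

/// Encode TerminalResize { cols = 1, rows = 2 }. proto3 omits zero values.
pub fn encode_terminal_resize(cols: u32, rows: u32) -> Vec<u8> {
    let mut out = Vec::new();
    for (field, value) in [(1u64, cols), (2u64, rows)] {
        if value != 0 {
            put_varint(&mut out, field << 3); // key: field number, wire type 0 (varint)
            put_varint(&mut out, u64::from(value));
        }
    }
    out
}

/// Decode a TerminalResize payload; unknown varint fields are skipped.
pub fn decode_terminal_resize(buf: &[u8]) -> Option<(u32, u32)> {
    let (mut cols, mut rows, mut pos) = (0u32, 0u32, 0usize);
    while pos < buf.len() {
        let key = get_varint(buf, &mut pos)?;
        if (key & 7) != 0 {
            return None; // this message only contains varint fields
        }
        let value = u32::try_from(get_varint(buf, &mut pos)?).ok()?;
        match key >> 3 {
            1 => cols = value,
            2 => rows = value,
            _ => {}
        }
    }
    Some((cols, rows))
}
```

An 80x24 resize encodes to four bytes (`08 50 10 18`), which is why resize and keystroke traffic is negligible next to screen-capture frames.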
Relay Server Changes
Route terminal vs. screen capture sessions:
// server/src/relay/mod.rs
// Forwarded frames keep the envelope of their sender: the viewer decodes AgentMessage and
// the agent decodes ViewerMessage, matching the oneof definitions in guruconnect.proto.
async fn handle_agent_message(msg: AgentMessage, session: &mut Session) -> Result<()> {
    match msg.message {
        Some(agent_message::Message::Status(status)) => {
            session.terminal_mode = status.terminal_mode.unwrap_or(false);
            // Store in DB: UPDATE sessions SET terminal_mode = ? WHERE id = ?
        }
        Some(agent_message::Message::TerminalData(data)) => {
            // Forward to viewer WebSocket
            if let Some(viewer_ws) = session.viewer_ws.lock().await.as_mut() {
                viewer_ws.send(AgentMessage {
                    message: Some(agent_message::Message::TerminalData(data)),
                }).await?;
            }
            // Optional: append to terminal_recording buffer for audit
        }
        Some(agent_message::Message::Frame(frame)) => {
            // Existing screen capture logic...
        }
        _ => {}
    }
    Ok(())
}

async fn handle_viewer_message(msg: ViewerMessage, session: &mut Session) -> Result<()> {
    match msg.message {
        Some(viewer_message::Message::TerminalInput(input)) => {
            // Forward to agent WebSocket
            if let Some(agent_ws) = session.agent_ws.lock().await.as_mut() {
                agent_ws.send(ViewerMessage {
                    message: Some(viewer_message::Message::TerminalInput(input)),
                }).await?;
            }
        }
        Some(viewer_message::Message::TerminalResize(resize)) => {
            // Forward resize to agent
            if let Some(agent_ws) = session.agent_ws.lock().await.as_mut() {
                agent_ws.send(ViewerMessage {
                    message: Some(viewer_message::Message::TerminalResize(resize)),
                }).await?;
            }
        }
        Some(viewer_message::Message::Input(input)) => {
            // Existing GUI input logic...
        }
        _ => {}
    }
    Ok(())
}
Web Viewer (xterm.js)
HTML template:
<!-- server/static/viewer-terminal.html -->
<!DOCTYPE html>
<html>
<head>
  <title>GuruConnect Terminal</title>
  <link rel="stylesheet" href="/vendor/xterm/xterm.css" />
  <script src="/vendor/xterm/xterm.js"></script>
  <script src="/vendor/xterm/xterm-addon-fit.js"></script>
  <style>
    /* Toolbar plus terminal fill the viewport without overflowing it */
    body { margin: 0; display: flex; flex-direction: column; height: 100vh; }
    #terminal { flex: 1; min-height: 0; }
    .toolbar { background: #333; color: #fff; padding: 8px; }
  </style>
</head>
<body>
  <div class="toolbar">
    <span id="status">Connecting...</span>
    <span id="size" style="float: right;">80x24</span>
  </div>
  <div id="terminal"></div>
  <script>
    // encodeTerminalResize, encodeTerminalInput and decodeProtobuf are assumed to be
    // provided by the generated protobuf bindings (not shown here).
    const params = new URLSearchParams(location.search);
    const TOKEN = params.get('token');
    const SESSION_ID = params.get('session');

    const term = new Terminal({
      cursorBlink: true,
      fontSize: 14,
      fontFamily: 'Consolas, "Courier New", monospace',
      theme: {
        background: '#1e1e1e',
        foreground: '#d4d4d4',
      }
    });
    const fitAddon = new FitAddon.FitAddon();
    term.loadAddon(fitAddon);
    term.open(document.getElementById('terminal'));
    fitAddon.fit();

    const ws = new WebSocket(`wss://${location.host}/ws/viewer?token=${encodeURIComponent(TOKEN)}&session=${encodeURIComponent(SESSION_ID)}`);
    ws.binaryType = 'arraybuffer'; // protobuf frames are binary
    ws.onopen = () => {
      document.getElementById('status').textContent = 'Connected';
      // Send initial terminal size
      ws.send(encodeTerminalResize(term.cols, term.rows));
    };
    ws.onclose = () => {
      document.getElementById('status').textContent = 'Disconnected';
    };
    ws.onmessage = (event) => {
      const msg = decodeProtobuf(event.data);
      if (msg.terminal_data) {
        term.write(new Uint8Array(msg.terminal_data.data));
      }
    };
    term.onData((data) => {
      ws.send(encodeTerminalInput(data));
    });
    term.onResize(({ cols, rows }) => {
      document.getElementById('size').textContent = `${cols}x${rows}`;
      // A resize can fire before the socket is open; onopen sends the size in that case.
      if (ws.readyState === WebSocket.OPEN) {
        ws.send(encodeTerminalResize(cols, rows));
      }
    });
    window.addEventListener('resize', () => fitAddon.fit());
  </script>
</body>
</html>
Dashboard Detection
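// server/static/dashboard.js
// Escape agent-supplied strings before interpolating them into HTML.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (c) => ({
    '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;',
  }[c]));
}

function renderAgentRow(agent) {
  const icon = agent.terminal_mode
    ? '<i class="icon-terminal"></i> Terminal'
    : '<i class="icon-screen"></i> Screen';
  // agent.id is assumed to be a server-generated identifier (no quotes or markup).
  const connectButton = agent.online
    ? `<button onclick="connectToAgent('${agent.id}', ${Boolean(agent.terminal_mode)})">Connect</button>`
    : '<span>Offline</span>';
  return `<tr>
    <td>${escapeHtml(agent.name)}</td>
    <td>${icon}</td>
    <td>${escapeHtml(agent.os)} ${escapeHtml(agent.os_version)}</td>
    <td>${connectButton}</td>
  </tr>`;
}

// JWT is assumed to be a page-level variable holding the viewer token.
function connectToAgent(agentId, terminalMode) {
  const query = `session=${encodeURIComponent(agentId)}&token=${encodeURIComponent(JWT)}`;
  if (terminalMode) {
    window.open(`/viewer-terminal.html?${query}`, '_blank');
  } else {
    window.open(`/viewer.html?${query}`, '_blank');
  }
}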
Database Schema
Minor addition to sessions table:
-- migrations/012_terminal_mode.sql
ALTER TABLE connect_sessions ADD COLUMN terminal_mode BOOLEAN DEFAULT FALSE;

-- Optional: separate table for terminal recordings
CREATE TABLE IF NOT EXISTS terminal_recordings (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    session_id UUID REFERENCES connect_sessions(id) ON DELETE CASCADE,
    started_at TIMESTAMPTZ DEFAULT NOW(),
    ended_at TIMESTAMPTZ,
    recording_data BYTEA, -- asciicast JSON or raw PTY dump (compressed)
    size_bytes BIGINT
);

-- PostgreSQL does not accept an inline INDEX clause in CREATE TABLE; create the index separately.
CREATE INDEX IF NOT EXISTS idx_terminal_recordings_session
    ON terminal_recordings (session_id);
Implementation Details
Files to Create
Agent (Linux-specific):
- `agent/src/platform/linux/pty.rs`: PTY spawn, I/O, resize (openpty, fork, exec)
- `agent/src/platform/linux/headless.rs`: Headless detection logic
- `agent/src/session/terminal.rs`: Terminal session loop (PTY ↔ WebSocket)
Server:
- `server/src/relay/terminal.rs`: Terminal message routing (TerminalData/Input/Resize)
- `server/static/viewer-terminal.html`: xterm.js-based web terminal viewer
- `server/static/vendor/xterm/`: xterm.js library files (CDN or bundled)
- `server/migrations/012_terminal_mode.sql`: Schema update
Protobuf:
- `proto/guruconnect.proto`: Add TerminalData, TerminalInput, TerminalResize messages
Dashboard:
- `server/static/dashboard.js`: Detect `terminal_mode`, render terminal icon, route to terminal viewer
Key Dependencies
# agent/Cargo.toml (Linux-specific)
[target.'cfg(target_os = "linux")'.dependencies]
libc = "0.2" # openpty, fork, exec, ioctl
nix = "0.27" # Safe wrappers for POSIX APIs
xterm.js (web viewer):
- Version: 5.3.0+ (latest stable)
- Addons: `xterm-addon-fit` (auto-resize)
- Delivery: CDN link or bundled in `server/static/vendor/xterm/`
Security Considerations
Shell Access Risk
- Privilege escalation: PTY spawns shell as the agent's user (typically `root` if agent runs as systemd service)
- Mitigation 1: Run agent as unprivileged user (`guruconnect` service user), use `sudo` for privileged commands
- Mitigation 2: Add `allowed_commands` whitelist (optional Phase 2 feature) to restrict sessions to specific binaries
- Mitigation 3: Audit logging: record all terminal I/O for compliance review
Authentication
Same as GUI agents:
- Support-code for ad-hoc sessions (6-digit, time-limited)
- Persistent agent key for managed servers (per-agent `cak_*` key from SPEC-004)
- Viewer JWT token required for WebSocket connection
Session Recording (Compliance)
- Optional toggle: dashboard setting "Record terminal sessions" (default: ON for compliance)
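- Storage: `terminal_recordings` table (BYTEA column, compressed)
- Playback: Admin dashboard replays a recording by writing its asciicast output events into an xterm.js terminal at their recorded timestamps (xterm.js itself does not ship an asciicast player, so this needs a small playback loop or a dedicated player component)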
- Retention: configurable (default: 90 days, auto-purge older recordings)
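A recording in asciicast v2 form is newline-delimited JSON: one header object followed by one `[time, "o", data]` array per output chunk. The sketch below shows how the relay could serialize PTY output into that shape using only the standard library. The function names are hypothetical, and a production recorder would also need to carry incomplete UTF-8 sequences over chunk boundaries, which `from_utf8_lossy` does not do.

```rust
use std::fmt::Write as _;

/// Quote and escape a string as a JSON string literal.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for ch in s.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters (including ESC) use the \u00XX form.
            c if (c as u32) < 0x20 => {
                let _ = write!(out, "\\u{:04x}", c as u32);
            }
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// First line of the recording: format version and terminal size.
pub fn asciicast_header(cols: u16, rows: u16, unix_time: u64) -> String {
    format!(
        "{{\"version\": 2, \"width\": {}, \"height\": {}, \"timestamp\": {}}}",
        cols, rows, unix_time
    )
}

/// One output event: seconds since session start, event type "o", and the data.
pub fn asciicast_output_event(elapsed_secs: f64, pty_bytes: &[u8]) -> String {
    let text = String::from_utf8_lossy(pty_bytes);
    format!("[{:.6}, \"o\", {}]", elapsed_secs, json_string(&text))
}
```

Each returned string is one line of the file; the relay would append them (newline-separated) to a buffer and compress the result before storing it in `recording_data`.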
Input Sanitization
- No sanitization needed: PTY handles raw bytes; ANSI escape sequences are terminal-native
- DoS risk: Malicious viewer could spam resize events; rate-limit `TerminalResize` (max 10/sec)
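The 10-per-second cap on `TerminalResize` could be enforced with a fixed-window counter on the relay or agent. This is a minimal sketch under that assumption; `ResizeLimiter` is a hypothetical type, and the current time is passed in explicitly so the behaviour can be tested deterministically.

```rust
use std::time::{Duration, Instant};

/// Fixed-window rate limiter: at most `max_per_window` events per `window`.
pub struct ResizeLimiter {
    max_per_window: u32,
    window: Duration,
    window_start: Instant,
    count: u32,
}

impl ResizeLimiter {
    pub fn new(max_per_window: u32, window: Duration, now: Instant) -> Self {
        ResizeLimiter { max_per_window, window, window_start: now, count: 0 }
    }

    /// Returns true if a resize arriving at `now` should be applied.
    pub fn allow(&mut self, now: Instant) -> bool {
        // Start a fresh window once the current one has elapsed.
        if now.duration_since(self.window_start) >= self.window {
            self.window_start = now;
            self.count = 0;
        }
        if self.count < self.max_per_window {
            self.count += 1;
            true
        } else {
            false
        }
    }
}
```

A caller would construct it as `ResizeLimiter::new(10, Duration::from_secs(1), Instant::now())`. Because a dropped resize can leave the PTY at a stale size, it may be worth remembering the most recent rejected dimensions and applying them when the next window opens.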
Testing Strategy
Unit Tests
- PTY spawn/cleanup: verify `openpty()` success, shell exec, FD management
- Terminal I/O: mock PTY master FD, test read/write buffers
- Protobuf serialization: TerminalData/Input/Resize round-trip
Integration Tests
- Headless VM: Ubuntu Server 22.04 minimal (no desktop packages)
- Agent install: `guruconnect` binary, systemd service, no X11 deps
- Connect flow: Dashboard → "Connect" → xterm.js viewer → type `ls`, verify output
- Resize: Browser window resize → PTY receives SIGWINCH → `htop` redraws correctly
- Session cleanup: Close viewer → PTY process exits gracefully
Manual Testing Scenarios
- Basic shell interaction:
  - Connect to headless agent via dashboard
  - Type `ls -la`, verify colorized output
  - Run `vim test.txt`, verify cursor movement, editing, save/quit
  - Run `htop`, verify full-screen TUI app renders correctly
- Terminal resize:
  - Start session at default 80x24
  - Resize browser window to 120x40
  - Run `tput cols; tput lines` → verify output is 120 and 40
  - Run `htop` → verify UI scales to new dimensions
- Multi-line output:
  - Run `dmesg | head -100` → verify scrollback works
  - Run `journalctl -f` → verify live log streaming
- Session recording playback:
  - Perform session actions (ls, vim, htop)
  - End session
  - Admin dashboard → "View Recording" → verify asciicast playback
- Privilege escalation (sudo):
  - Agent runs as `guruconnect` user (non-root)
  - Connect via terminal
  - Run `sudo apt update` → enter sudo password → verify command executes
  - Run `sudo whoami` → verify it shows `root`
Performance
- Latency target: <100ms round-trip for input (same as GUI mode)
- Bandwidth: ~1-5 KB/sec for typical terminal I/O (much lower than screen capture)
- Stress test: Run `yes` command (infinite output) → verify relay doesn't OOM, rate-limit applied
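One way to keep a runaway producer such as `yes` from exhausting memory is a bounded queue between the PTY reader and the WebSocket sender: when the queue is full the reader stops reading, the kernel's PTY buffer fills, and the producing process blocks on its own writes. The sketch below illustrates the bounded-queue part with the standard library; `output_queue` and `try_enqueue` are hypothetical names, and the real agent would more likely use an async channel.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Create a queue that holds at most `capacity` pending output chunks.
pub fn output_queue(capacity: usize) -> (SyncSender<Vec<u8>>, Receiver<Vec<u8>>) {
    sync_channel(capacity)
}

/// Try to queue a chunk without blocking.
/// On a full (or closed) queue the chunk is handed back so the caller can
/// pause PTY reads and retry later instead of buffering without limit.
pub fn try_enqueue(tx: &SyncSender<Vec<u8>>, chunk: Vec<u8>) -> Result<(), Vec<u8>> {
    match tx.try_send(chunk) {
        Ok(()) => Ok(()),
        Err(TrySendError::Full(c)) | Err(TrySendError::Disconnected(c)) => Err(c),
    }
}
```

A dedicated reader thread can use the blocking `send` instead, which gives the same backpressure with less code; the memory bound is then roughly `capacity` times the read buffer size.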
Effort Estimate & Dependencies
Size: Medium (4-6 weeks, 1 developer)
Breakdown (the items below total 7 weeks, which exceeds the 4-6 week estimate; reconcile before scheduling):
- PTY implementation (Linux agent): 1.5 weeks
- Protobuf protocol updates: 0.5 weeks
- Relay server terminal routing: 1 week
- xterm.js web viewer integration: 1 week
- Dashboard terminal mode detection + routing: 0.5 weeks
- Session recording + playback: 1 week
- Testing, edge cases, systemd integration: 1 week
- Documentation: 0.5 weeks
Dependencies:
- SPEC-010 Linux agent base — PTY mode extends the Linux agent; can be implemented in parallel with SPEC-010's GUI capture
- xterm.js library — mature, well-tested (used by VS Code, Jupyter, many commercial products)
- libc/nix crates — standard Rust POSIX bindings
- SPEC-004 per-agent keys — already shipped for persistent agent auth
Unblocks:
- Server management use case (Linux VMs, containers, bare metal)
- SSH replacement with centralized audit logging
- Emergency recovery (single-user mode, systemd rescue shell)
- Container debugging (exec into running containers via GuruConnect)
Open Questions
- Run agent as root or unprivileged user? Recommend unprivileged `guruconnect` service user + sudo whitelist. Security-sensitive orgs may require root; make configurable.
- Shell selection? v1: `$SHELL` env var → `/bin/bash` → `/bin/sh`. Phase 2: dashboard setting to override shell per agent (`/bin/zsh`, `/bin/fish`).
- Concurrent PTY sessions? v1: one PTY per agent connection (like SSH). Phase 2: tmux/screen integration for multi-viewer session sharing.
- Terminal recording format? Asciicast (newline-delimited JSON, the asciinema recording format, replayable into an xterm.js terminal) vs. raw PTY dump (more compact, custom playback). Recommend asciicast for v1.
- Command whitelisting? Optional Phase 2 feature. v1 is unrestricted shell access (same as SSH). Add `allowed_commands` array to agent config if compliance requires it.
- Windows/macOS terminal mode? Defer. Windows Server typically uses RDP or SSH (OpenSSH built-in since Server 2019). macOS servers are rare. Linux headless servers are the primary use case.
- File upload/download via terminal? v1: use standard tools (`scp`, `rsync`, `wget`). Phase 2: integrate with SPEC (file transfer) for dashboard-native upload/download.
Cross-references:
- SPEC-010: Cross-platform agents (macOS/Linux GUI); headless mode extends the Linux agent with a PTY alternative
- SPEC-004: Stable machine identity; headless agents use the same deterministic `machine_uid` (`/etc/machine-id`)
- ADR-001: GuruConnect is standalone; headless mode doesn't require GuruRMM integration
- Future: File transfer spec (roadmap item); will integrate with terminal mode for `scp`-like functionality