diff --git a/docs/specs/SPEC-012-headless-linux-tty.md b/docs/specs/SPEC-012-headless-linux-tty.md index 7fe013a..9bb366b 100644 --- a/docs/specs/SPEC-012-headless-linux-tty.md +++ b/docs/specs/SPEC-012-headless-linux-tty.md @@ -1,40 +1,63 @@ -# SPEC-012: Headless Linux Mode (Direct TTY Access) +# SPEC-012: Headless Linux Mode (Serial Console + PTY Shell Access) **Status:** Proposed **Priority:** P2 **Requested By:** Mike Swanson (2026-05-30) -**Estimated Effort:** Medium (4-6 weeks) +**Estimated Effort:** Medium (5-7 weeks) ## Overview -Enable GuruConnect agent support for headless Linux servers (no X11/Wayland GUI) by providing direct terminal (TTY) access instead of screen capture. This addresses a critical server management use case: remote terminal access to Linux servers, VMs, and containers that run without a graphical desktop environment. Unlike SSH, this integrates with the GuruConnect dashboard for centralized access, audit logging, and support-code workflows. The viewer displays a terminal emulator (xterm.js-based web viewer or native terminal in the desktop viewer) connected to a pseudo-TTY (PTY) on the target server. Success criteria: technician can manage a headless Ubuntu Server 22.04 VM via GuruConnect dashboard with same authentication and session model as GUI agents, full terminal capabilities (colors, cursor control, vim/nano editing), and zero X11/Wayland dependencies on the target. +Enable GuruConnect agent support for headless Linux servers (no X11/Wayland GUI) by providing two modes of terminal access: **Serial Console Mode** for boot-level access (GRUB, kernel messages, panics) and **PTY Shell Mode** for normal server management. This addresses critical server management use cases—from emergency recovery to routine administration—without requiring SSH. The viewer displays a terminal emulator (xterm.js web viewer) connected to either the system serial console (`/dev/ttyS0`) or a pseudo-TTY shell session. 
Serial Console Mode provides true "remote console" access like KVM-over-IP or IPMI Serial-over-LAN, seeing everything the physical monitor would show. PTY Shell Mode provides an interactive shell for normal management tasks. Success criteria: technician can access GRUB bootloader, view kernel boot messages, handle kernel panics, AND perform routine server management—all via GuruConnect dashboard with centralized authentication and audit logging. **Use Cases:** -- Remote terminal access to headless Linux servers (web hosting, databases, Docker hosts) -- Container debugging (exec into running containers via GuruConnect) -- Emergency server recovery (systemd rescue mode, single-user mode) -- MSP consolidation: one tool for both desktop support (GUI) and server management (terminal) +- **Boot-level access:** GRUB menu selection, kernel parameter editing, single-user mode +- **Emergency recovery:** Kernel panic diagnosis, filesystem repair, systemd rescue shell +- **Server management:** Package updates, configuration changes, log review (normal shell access) +- **Container debugging:** Exec into running containers via GuruConnect +- **MSP consolidation:** One tool for desktop support (GUI), server boot recovery (console), and server management (shell) **Success Criteria:** +- **Serial Console Mode:** View GRUB bootloader, kernel boot messages, kernel panics, login prompts—as if sitting at physical console +- **PTY Shell Mode:** Interactive shell (bash/zsh) with full ANSI color, cursor control, vim/nano/htop support - GuruConnect agent runs on Ubuntu Server 22.04 minimal install (no desktop packages) -- Viewer sees full-color, interactive terminal (80x24 or larger, resizable) -- Full terminal capabilities: ANSI colors, cursor positioning, vim/nano/htop work correctly +- Dashboard mode selector: "Console" vs. 
"Shell" per agent (user chooses at connection time) - Same protobuf-over-WSS transport, support-code and persistent-agent authentication -- Audit logging: session recording (terminal output captured to `events` table or file) +- Audit logging: session recording for both console and shell modes ## Scope ### Included in v1 -**Headless Agent Mode:** +**Mode 1: Serial Console Mode (True Remote Console)** +- Open system serial console device (`/dev/ttyS0` or `/dev/console`) for raw I/O +- Relay all bytes bidirectionally: console output → `TerminalData` → viewer; viewer input → `TerminalInput` → console +- **Sees everything:** GRUB bootloader menu, kernel boot messages, systemd startup, login prompts, kernel panics +- **Boot-time interaction:** Select GRUB entries, edit kernel parameters, boot into single-user mode +- Requires root privileges (serial console access restricted to root) +- Requires serial console enabled on target server (GRUB + kernel parameters configured) +- No PTY spawning—direct device I/O, like `screen /dev/ttyS0 115200` +- Agent config flag: `console_mode: true` + `console_device: "/dev/ttyS0"` + +**Mode 2: PTY Shell Mode (Interactive Shell)** - Detect headless environment (no DISPLAY, no X11/Wayland libraries) at runtime - Spawn pseudo-TTY (PTY) via `openpty()` + fork/exec shell (`/bin/bash -l` or user's `$SHELL`) - Terminal I/O: read PTY output → encode as protobuf `TerminalData` → send via WebSocket - Input: receive protobuf `TerminalInput` → write to PTY master - Terminal resize: handle `TerminalResize` message → send `SIGWINCH` to PTY - Fallback shell selection: `$SHELL` env var → `/bin/bash` → `/bin/sh` -- Same agent binary as GUI mode: `guruconnect` detects headless and switches mode automatically - Graceful PTY cleanup on session end (send exit command, wait for shell exit, close PTY) +- Standard user privileges (runs as agent service user) + +**Mode Selection:** +- Dashboard shows mode selector when connecting to headless agent: "Console" 
vs. "Shell" +- "Console" button: viewer sends `mode: console` in connection request → agent opens `/dev/ttyS0` +- "Shell" button: viewer sends `mode: shell` in connection request → agent spawns PTY +- Agent config specifies default mode if serial console unavailable +- If serial console device doesn't exist or permission denied, fall back to PTY shell mode with warning + +**Both Modes Share:** +- Same agent binary: `guruconnect` detects headless and offers both modes +- Same xterm.js viewer (handles both serial console and PTY identically) **Viewer (Web Viewer):** - xterm.js-based terminal emulator embedded in `viewer.html` @@ -66,13 +89,226 @@ Enable GuruConnect agent support for headless Linux servers (no X11/Wayland GUI) - **GUI mode on headless agents** — v1 is terminal-only; no attempt to start Xvfb or launch GUI apps - **SSH key management** — agent uses GuruConnect auth (support code / agent key), not SSH keys - **File transfer via terminal** — defer to SPEC (file transfer is a separate roadmap item for all agent types) -- **Multi-user terminal sessions** — v1 is single-session PTY; no tmux/screen built-in sharing +- **Multi-user terminal sessions** — v1 is single-session console/PTY; no tmux/screen built-in sharing - **Windows terminal mode** — defer; Windows Server typically has GUI (RDP) or SSH (OpenSSH) - **macOS terminal mode** — defer; macOS servers are rare and typically have GUI access +- **Framebuffer capture (`/dev/fb0`)** — defer; serial console is more reliable and doesn't require framebuffer device + +### Serial Console Setup Requirements (Mode 1) + +To use Serial Console Mode, the target Linux server must be configured to output to serial console. 
This is a **one-time setup per server** (typically done during provisioning): + +**Step 1: Configure GRUB** + +Edit `/etc/default/grub`: +```bash +# Enable serial console output at 115200 baud +GRUB_TERMINAL="serial console" +GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1" + +# Kernel console output to both VGA (tty0) and serial (ttyS0) +GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8" +``` + +Update GRUB and reboot: +```bash +sudo update-grub # Debian/Ubuntu +# OR +sudo grub2-mkconfig -o /boot/grub2/grub.cfg # RHEL/CentOS +sudo reboot +``` + +**Step 2: Enable getty on Serial Console** + +Ensure a login prompt appears on serial console after boot: +```bash +sudo systemctl enable serial-getty@ttyS0.service +sudo systemctl start serial-getty@ttyS0.service +``` + +**Step 3: Verify** + +Test serial console locally before configuring GuruConnect: +```bash +sudo screen /dev/ttyS0 115200 +# Should see kernel messages, login prompt +``` + +**What This Provides:** +- ✓ GRUB bootloader menu visible via serial console +- ✓ Kernel boot messages stream to serial console +- ✓ Login prompt on `/dev/ttyS0` after boot +- ✓ Kernel panics output to serial console +- ✓ systemd rescue shell accessible via serial console + +**Compatibility:** +- Physical servers: Uses hardware serial port (COM1 = ttyS0) +- Virtual machines: VMware/Proxmox/KVM expose virtual serial port; configure VM to attach serial port +- Cloud VMs: AWS, GCP, Azure offer "Serial Console" feature (already configured); GuruConnect agent can relay it ## Architecture -### Agent PTY Handling +### Agent Mode Selection + +**Connection request handling:** + +```rust +// agent/src/session/terminal.rs +pub async fn handle_terminal_session( + ws: WebSocketClient, + mode: TerminalMode, // Console or Shell + support_code: String +) -> Result<()> { + match mode { + TerminalMode::Console => run_console_session(ws, support_code).await, + TerminalMode::Shell => run_shell_session(ws, 
support_code).await, + } +} + +pub enum TerminalMode { + Console, // Serial console (/dev/ttyS0) + Shell, // PTY shell session +} +``` + +### Agent Serial Console Handling (Mode 1) + +**Serial device open:** + +```rust +// agent/src/platform/linux/console.rs +use anyhow::{anyhow, Context, Result}; +use std::fs::OpenOptions; +use std::os::unix::io::{IntoRawFd, RawFd}; + +pub struct ConsoleSession { + device_fd: RawFd, + device_path: String, // "/dev/ttyS0" or "/dev/console" +} + +impl ConsoleSession { + pub fn open(device_path: &str) -> Result<Self> { + // Open serial console device for read/write + // Requires root privileges + let file = OpenOptions::new() + .read(true) + .write(true) + .open(device_path) + .context("Failed to open serial console - requires root")?; + + // into_raw_fd() keeps the descriptor open after `file` is dropped; building the session first lets Drop close it on every error path + let session = ConsoleSession { device_fd: file.into_raw_fd(), device_path: device_path.to_string() }; + let device_fd = session.device_fd; + + // Configure terminal settings (115200 baud, 8N1) + unsafe { + let mut termios: libc::termios = std::mem::zeroed(); + if libc::tcgetattr(device_fd, &mut termios) != 0 { + return Err(anyhow!("tcgetattr failed")); + } + + // Set baud rate to 115200 + libc::cfsetispeed(&mut termios, libc::B115200); + libc::cfsetospeed(&mut termios, libc::B115200); + + // 8N1 (8 data bits, no parity, 1 stop bit) + termios.c_cflag &= !libc::CSIZE; + termios.c_cflag |= libc::CS8; + termios.c_cflag &= !(libc::PARENB | libc::PARODD); + termios.c_cflag &= !libc::CSTOPB; + + // Raw mode (no line buffering, no echo) + libc::cfmakeraw(&mut termios); + + if libc::tcsetattr(device_fd, libc::TCSANOW, &termios) != 0 { + return Err(anyhow!("tcsetattr failed")); + } + } + + Ok(session) + } + + pub fn read(&self, buf: &mut [u8]) -> Result<usize> { + unsafe { + let n = libc::read(self.device_fd, buf.as_mut_ptr() as *mut _, buf.len()); + if n < 0 { + Err(anyhow!("Console read failed")) + } else { + Ok(n as usize) + } + } + } + + pub fn write(&self, data: &[u8]) -> Result<()> { + unsafe { + let n = libc::write(self.device_fd, data.as_ptr() as *const _, data.len()); + if n < 0 { + 
Err(anyhow!("Console write failed")) + } else { + Ok(()) + } + } + } +} + +impl Drop for ConsoleSession { + fn drop(&mut self) { + unsafe { libc::close(self.device_fd); } + } +} +``` + +**Console session loop:** + +```rust +// agent/src/session/console.rs +use std::sync::Arc; + +pub async fn run_console_session(ws: WebSocketClient, support_code: String) -> Result<()> { + // Try /dev/ttyS0 first, fall back to /dev/console + let console = Arc::new( + ConsoleSession::open("/dev/ttyS0") + .or_else(|_| ConsoleSession::open("/dev/console"))?, + ); + + // Status update: terminal mode, console + ws.send(AgentStatus { + terminal_mode: true, + console_mode: true, // NEW flag + os: "Linux".to_string(), + // ... + }).await?; + + // Blocking reads run on a dedicated thread and are forwarded over a channel, + // so an in-flight read is never cancelled (and its bytes lost) when viewer input arrives first. + // The thread holds its own Arc and exits on the next console byte after the session ends. + let (tx, mut rx) = tokio::sync::mpsc::channel::<Vec<u8>>(32); + let reader = Arc::clone(&console); + std::thread::spawn(move || { + let mut buf = [0u8; 4096]; + while let Ok(n) = reader.read(&mut buf) { + // n == 0 means hangup; a send error means the session loop has ended + if n == 0 || tx.blocking_send(buf[..n].to_vec()).is_err() { + break; + } + } + }); + + loop { + tokio::select! { + // Console output from the reader thread, sent to relay + Some(data) = rx.recv() => { + ws.send(TerminalData { data }).await?; + } + + // Receive input from relay, write to console + Some(msg) = ws.recv() => { + match msg { + Message::TerminalInput(input) => { + console.write(&input.data)?; + } + Message::Disconnect => break, + // Note: Resize ignored for serial console (not applicable) + _ => {} + } + } + + // Both the reader thread and the relay connection are gone + else => break, + } + } + + Ok(()) +} +``` + +### Agent PTY Handling (Mode 2) **Headless detection:** @@ -255,11 +491,12 @@ pub async fn run_terminal_session(ws: WebSocketClient, support_code: String) -> message AgentStatus { // Existing fields... 
- optional bool terminal_mode = 21; // true for headless agents + optional bool terminal_mode = 21; // true for headless agents + optional bool console_mode = 22; // true for serial console mode, false for PTY shell mode } message TerminalData { - bytes data = 1; // PTY raw output (may include ANSI escape sequences) + bytes data = 1; // Raw terminal output (PTY or serial console, includes ANSI escape sequences) } message TerminalInput { @@ -269,6 +506,17 @@ message TerminalInput { message TerminalResize { uint32 cols = 1; // Terminal width (characters) uint32 rows = 2; // Terminal height (lines) + // Note: Resize only applies to PTY shell mode; serial console ignores this +} + +enum TerminalModeRequest { + SHELL = 0; // Request PTY shell session + CONSOLE = 1; // Request serial console session +} + +message SessionRequest { + // Existing fields... + optional TerminalModeRequest terminal_mode_request = 10; // NEW: viewer specifies console vs. shell } // Update AgentMessage and ViewerMessage unions @@ -412,7 +660,7 @@ async fn handle_viewer_message(msg: ViewerMessage, session: &Session) { ``` -### Dashboard Detection +### Dashboard Mode Selector ```javascript // server/static/dashboard.js @@ -421,27 +669,54 @@ function renderAgentRow(agent) { ? ' Terminal' : ' Screen'; - const connectButton = agent.online - ? `` - : 'Offline'; + // For headless agents, show mode selector (Console vs. Shell) + let connectButtons; + if (agent.terminal_mode && agent.online) { + connectButtons = ` +
+ + +
+ `; + } else if (!agent.terminal_mode && agent.online) { + // GUI agent + connectButtons = ``; + } else { + connectButtons = 'Offline'; + } return ` ${agent.name} ${icon} ${agent.os} ${agent.os_version} - ${connectButton} + ${connectButtons} `; } -function connectToAgent(agentId, terminalMode) { - if (terminalMode) { - window.open(`/viewer-terminal.html?session=${agentId}&token=${JWT}`, '_blank'); - } else { - window.open(`/viewer.html?session=${agentId}&token=${JWT}`, '_blank'); - } +function connectToAgent(agentId) { + // GUI agent connection + window.open(`/viewer.html?session=${agentId}&token=${JWT}`, '_blank'); +} + +function connectToTerminal(agentId, mode) { + // Terminal agent connection with mode parameter + window.open(`/viewer-terminal.html?session=${agentId}&token=${JWT}&mode=${mode}`, '_blank'); } ``` +**Dashboard UI for headless agents:** +- Shows two buttons: "Console" and "Shell" +- "Console" button: opens serial console session (GRUB, boot messages, panics) +- "Shell" button: opens PTY shell session (normal server management) +- Tooltip on hover explains each mode +- Mode parameter passed to viewer via URL query string + ### Database Schema **Minor addition to `sessions` table:** @@ -467,9 +742,11 @@ CREATE TABLE IF NOT EXISTS terminal_recordings ( ### Files to Create **Agent (Linux-specific):** +- `agent/src/platform/linux/console.rs` — NEW: Serial console device I/O (`/dev/ttyS0`, termios config) +- `agent/src/session/console.rs` — NEW: Console session loop (serial device ↔ WebSocket) - `agent/src/platform/linux/pty.rs` — PTY spawn, I/O, resize (openpty, fork, exec) - `agent/src/platform/linux/headless.rs` — Headless detection logic -- `agent/src/session/terminal.rs` — Terminal session loop (PTY ↔ WebSocket) +- `agent/src/session/terminal.rs` — Mode dispatcher (console vs. 
shell), shell session loop **Server:** - `server/src/relay/terminal.rs` — Terminal message routing (TerminalData/Input/Resize) @@ -499,10 +776,25 @@ nix = "0.27" # Safe wrappers for POSIX APIs ## Security Considerations -### Shell Access Risk +### Serial Console Access (Mode 1) -- **Privilege escalation:** PTY spawns shell as the agent's user (typically `root` if agent runs as systemd service) -- **Mitigation 1:** Run agent as unprivileged user (`guruconnect` service user), use `sudo` for privileged commands +- **Requires elevated access:** Opening `/dev/ttyS0` or `/dev/console` requires root or read/write permission on the device node (on Debian/Ubuntu, `/dev/ttyS0` is commonly owned by `root:dialout`) +- **Implication:** Agent must run as root for console mode, OR be granted access to the device another way; note that `CAP_SYS_TTY_CONFIG` covers `vhangup(2)` and privileged tty ioctls rather than permission checks on `open(2)`, so it is unlikely to be sufficient on its own +- **Boot-level control:** Serial console grants full boot-time control (GRUB menu, kernel parameters, single-user mode) +- **Risk:** Attacker with console access can modify bootloader, disable security features, boot into recovery +- **Mitigation 1:** Restrict console mode to authorized users only (dashboard RBAC: "console_access" permission) +- **Mitigation 2:** Require MFA for console mode sessions (stronger auth than shell mode) +- **Mitigation 3:** Audit logging: record ALL console I/O with immutable timestamps +- **Mitigation 4:** Alert on console mode connections (notify admin when console session starts) + +**Recommended deployment:** +- Run agent as unprivileged user for shell mode (default) +- For console mode: either run agent as root OR add the agent user to the group that owns the serial device (typically `dialout`); verify on the target distribution before relying on `CAP_SYS_TTY_CONFIG` via systemd unit + +### Shell Access Risk (Mode 2) + +- **Privilege escalation:** PTY spawns shell as the agent's user (typically unprivileged `guruconnect` service user) +- **Mitigation 1:** Run agent as unprivileged user, use `sudo` for privileged commands - **Mitigation 2:** Add `allowed_commands` whitelist (optional Phase 2 feature) — restrict to specific binaries - **Mitigation 3:** Audit logging: record all terminal I/O for compliance review @@ -578,17 +870,19 @@ nix = "0.27" # 
Safe wrappers for POSIX APIs ## Effort Estimate & Dependencies -**Size:** Medium (4-6 weeks, 1 developer) +**Size:** Medium (5-7 weeks, 1 developer) **Breakdown:** +- Serial console implementation (Linux agent): 1.5 weeks - PTY implementation (Linux agent): 1.5 weeks -- Protobuf protocol updates: 0.5 weeks +- Mode selection + dispatcher: 0.5 weeks +- Protobuf protocol updates (mode enum, console_mode flag): 0.5 weeks - Relay server terminal routing: 1 week - xterm.js web viewer integration: 1 week -- Dashboard terminal mode detection + routing: 0.5 weeks -- Session recording + playback: 1 week -- Testing, edge cases, systemd integration: 1 week -- Documentation: 0.5 weeks +- Dashboard mode selector UI + routing: 0.5 weeks +- Session recording + playback (both modes): 1 week +- Testing (console + shell modes), edge cases, systemd integration: 1.5 weeks +- Documentation (setup guide for serial console): 0.5 weeks **Dependencies:** - **SPEC-010 Linux agent base** — PTY mode extends the Linux agent; can be implemented in parallel with SPEC-010's GUI capture @@ -597,26 +891,36 @@ nix = "0.27" # Safe wrappers for POSIX APIs - **SPEC-004 per-agent keys** — already shipped for persistent agent auth **Unblocks:** -- Server management use case (Linux VMs, containers, bare metal) -- SSH replacement with centralized audit logging -- Emergency recovery (single-user mode, systemd rescue shell) -- Container debugging (exec into running containers via GuruConnect) +- **Boot-level access** (GRUB menu, kernel parameters, single-user mode) via serial console mode +- **Emergency recovery** (kernel panics, filesystem repair, systemd rescue shell) via serial console +- **Server management** (Linux VMs, containers, bare metal) via shell mode +- **SSH replacement** with centralized audit logging and GuruConnect auth +- **Container debugging** (exec into running containers via GuruConnect) +- **KVM-over-IP alternative** (serial console provides text-mode equivalent to IPMI Serial-over-LAN) 
## Open Questions -1. **Run agent as root or unprivileged user?** — Recommend unprivileged `guruconnect` service user + sudo whitelist. Security-sensitive orgs may require root; make configurable. +1. **Serial console permissions - root vs. capabilities?** — Opening `/dev/ttyS0` is governed by the device node's file permissions (commonly `root:dialout` on Debian/Ubuntu), so root is sufficient but not always required. Options: (a) run agent as root for console mode, (b) use Linux capabilities (`CAP_SYS_TTY_CONFIG`), (c) add agent user to `dialout` group (may not work for `/dev/console`). Caveat for (b): `CAP_SYS_TTY_CONFIG` covers `vhangup(2)` and privileged tty ioctls, not permission checks on `open(2)`, so `AmbientCapabilities=CAP_SYS_TTY_CONFIG` alone is unlikely to grant access. Recommend (c) for `/dev/ttyS0`, with (a) only where `/dev/console` is required; verify on the target distribution. -2. **Shell selection?** — v1: `$SHELL` env var → `/bin/bash` → `/bin/sh`. Phase 2: dashboard setting to override shell per agent (`/bin/zsh`, `/bin/fish`). +2. **Default mode if serial console unavailable?** — If `/dev/ttyS0` doesn't exist or permission is denied, fall back to shell mode automatically or show an error? Recommend auto-fallback with warning message in viewer. -3. **Concurrent PTY sessions?** — v1: one PTY per agent connection (like SSH). Phase 2: tmux/screen integration for multi-viewer session sharing. +3. **Serial console baud rate?** — v1 hardcodes 115200 (industry standard). Phase 2: make configurable if slower links needed (9600, 38400). -4. **Terminal recording format?** — Asciicast (JSON, industry standard, xterm.js playback support) vs. raw PTY dump (more compact, custom playback). Recommend asciicast for v1. +4. **Shell selection (PTY mode)?** — v1: `$SHELL` env var → `/bin/bash` → `/bin/sh`. Phase 2: dashboard setting to override shell per agent (`/bin/zsh`, `/bin/fish`). -5. **Command whitelisting?** — Optional Phase 2 feature. v1 is unrestricted shell access (same as SSH). Add `allowed_commands` array to agent config if compliance requires it. +5. **Concurrent sessions?** — v1: one console/shell session per agent connection (like SSH). Phase 2: tmux/screen integration for multi-viewer session sharing. -6. **Windows/macOS terminal mode?** — Defer. 
Windows Server typically uses RDP or SSH (OpenSSH built-in since Server 2019). macOS servers are rare. Linux headless servers are the primary use case. +6. **Terminal recording format?** — Asciicast (JSON, industry standard, xterm.js playback support) vs. raw dump (more compact, custom playback). Recommend asciicast for v1. -7. **File upload/download via terminal?** — v1: use standard tools (`scp`, `rsync`, `wget`). Phase 2: integrate with SPEC (file transfer) for dashboard-native upload/download. +7. **Command whitelisting (shell mode)?** — Optional Phase 2 feature. v1 is unrestricted shell access (same as SSH). Add `allowed_commands` array to agent config if compliance requires it. + +8. **RBAC for console vs. shell access?** — Should some users only have shell access (not console, which grants boot-level control)? Recommend yes: add `console_access` permission, separate from `shell_access`. + +9. **MFA for console mode?** — Given boot-level control risk, require MFA for console mode sessions? Defer to Phase 2 (MFA is a broader GuruConnect feature). + +10. **Windows/macOS terminal mode?** — Defer. Windows Server typically uses RDP or SSH (OpenSSH built-in since Server 2019). macOS servers are rare. Linux headless servers are the primary use case. + +11. **File upload/download via terminal?** — v1: use standard tools (`scp`, `rsync`, `wget`). Phase 2: integrate with SPEC (file transfer) for dashboard-native upload/download. ---
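
The shell fallback order in the "Shell selection (PTY mode)" question (`$SHELL` → `/bin/bash` → `/bin/sh`) can be sketched as below. This is a minimal illustration, not the agent's actual code: the function names `select_shell` and `select_shell_from_env` are hypothetical, and the existence check is injected as a closure purely so the ordering can be tested without touching the filesystem.

```rust
use std::path::Path;

/// Return the first candidate shell for which `exists` is true.
/// Order: the `$SHELL` value (if set and non-empty), then /bin/bash, then /bin/sh.
/// Hypothetical helper; the name and signature are assumptions for illustration.
pub fn select_shell<F>(env_shell: Option<&str>, exists: F) -> Option<String>
where
    F: Fn(&str) -> bool,
{
    let candidates = env_shell
        .into_iter()
        .filter(|s| !s.is_empty())
        .chain(["/bin/bash", "/bin/sh"]);
    for candidate in candidates {
        if exists(candidate) {
            return Some(candidate.to_string());
        }
    }
    None
}

/// Convenience wrapper: read `$SHELL` from the environment and check real files.
pub fn select_shell_from_env() -> Option<String> {
    let env_shell = std::env::var("SHELL").ok();
    select_shell(env_shell.as_deref(), |path| Path::new(path).is_file())
}
```

Returning `None` when no candidate exists leaves the error handling (refuse the session, report to the viewer) to the caller, which seems preferable to silently spawning a shell that is not there.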
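
For the recording-format question, the asciicast option can be illustrated with a small encoder. The sketch below assumes asciicast v2 (a JSON header line followed by one `[time, "o", data]` line per output chunk); the function names are hypothetical, and it decodes each chunk with lossy UTF-8, so a multi-byte character split across two reads would be mangled. A real recorder would need to buffer incomplete sequences.

```rust
use std::fmt::Write;

/// Escape text for use inside a JSON string literal.
fn json_escape(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters (ESC included) use the 4-hex-digit unicode form.
            c if (c as u32) < 0x20 => {
                let _ = write!(out, "\\u{:04x}", c as u32);
            }
            c => out.push(c),
        }
    }
    out
}

/// First line of an asciicast v2 file (hypothetical helper name).
pub fn asciicast_header(cols: u32, rows: u32, unix_timestamp: u64) -> String {
    format!(
        "{{\"version\": 2, \"width\": {}, \"height\": {}, \"timestamp\": {}}}",
        cols, rows, unix_timestamp
    )
}

/// One output event line: seconds since session start, event type "o", escaped data.
pub fn asciicast_output_event(elapsed_secs: f64, data: &[u8]) -> String {
    // Lossy decoding is an assumption of this sketch; see the note in the lead-in.
    let text = String::from_utf8_lossy(data);
    format!("[{:.6}, \"o\", \"{}\"]", elapsed_secs, json_escape(&text))
}
```

Because each event is a self-contained line, the recorder can append to the file as `TerminalData` chunks arrive, and the same encoder serves both console and shell sessions.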