# SPEC-010: Cross-Platform Agent Support (macOS and Linux) **Status:** Proposed **Priority:** P2 **Requested By:** Mike Swanson (2026-05-30) **Estimated Effort:** X-Large ## Overview Expand GuruConnect's agent support beyond Windows to include macOS and Linux platforms, enabling remote control and support across the full spectrum of enterprise and home computing environments. This cross-platform expansion addresses a critical market gap—ScreenConnect, Splashtop, and AnyDesk all support macOS/Linux, and GuruConnect's Windows-only limitation blocks adoption by multi-platform organizations. The primary technical challenge is abstracting platform-specific screen capture, input injection, and video encoding while maintaining the core protobuf-over-WSS transport and session model. Success criteria: feature parity with the Windows agent (capture, input, tray presence, persistent/support-code modes, H.264 encoding) on macOS 12+ and Ubuntu 22.04+ LTS. ## Scope ### Included in v1 **macOS Agent (Priority 1):** - Screen capture via ScreenCaptureKit API (macOS 13+) with AVFoundation fallback (macOS 12) - Input injection via CGEvent - H.264 encoding via VideoToolbox - Menu bar icon (NSStatusItem) with status and exit option - `guruconnect://` protocol handler registration - .app bundle packaging and code signing - Screen Recording permission prompt and validation - Universal binary (x86_64 + arm64) **Linux Agent (Priority 2):** - Screen capture via X11 (XShm) with Wayland pipewire fallback detection - Input injection via XTest (X11) or uinput (Wayland) - H.264 encoding via VA-API (hardware) or software fallback (libx264) - System tray icon via StatusNotifier/AppIndicator - `guruconnect://` protocol handler (.desktop file) - .deb packaging (Ubuntu/Debian) and .rpm (Fedora/RHEL) - x86_64 binary **Shared Cross-Platform:** - Unified agent codebase with platform-specific modules behind traits - Same protobuf protocol (no wire format changes) - Same persistent/support-code authentication modes - Same AgentStatus metadata reporting (OS, architecture, uptime, logged-on user) - Viewer compatibility: existing native Windows viewer and web viewer work unchanged ### Explicitly out of scope - **FreeBSD/BSD support** — defer; assess demand post-launch - **Wayland-native screen capture** — v1 uses X11 compatibility layer; native Wayland portal support is Phase 2 - **Multi-monitor switching on macOS/Linux** — already deferred on Windows (P2 roadmap item); cross-platform parity follows - **macOS < 12** — 12.x is the minimum (released Oct 2021, 95%+ adoption) - **Linux distros outside Ubuntu 22.04+ / Debian 11+ / Fedora 38+ / RHEL 9+** — official support limited; community builds acceptable ## Architecture ### Agent Refactor: Platform Abstraction Layer Current Windows agent (`agent/src/`) tightly couples platform APIs. Refactor into: ``` agent/src/ ├── main.rs # CLI entry + platform dispatch ├── session/ # Session logic (platform-agnostic) ├── transport/ # WebSocket (unchanged) ├── platform/ # NEW: trait definitions │ ├── mod.rs # PlatformCapture, PlatformInput, PlatformTray traits │ ├── windows/ # Windows impl (existing code refactored) │ │ ├── capture.rs # DXGI + GDI behind PlatformCapture │ │ ├── input.rs # Win32 SendInput behind PlatformInput │ │ ├── encoder.rs # Media Foundation H.264 │ │ └── tray.rs # Win32 shell tray │ ├── macos/ # NEW: macOS impl │ │ ├── capture.rs # ScreenCaptureKit/AVFoundation │ │ ├── input.rs # CGEvent │ │ ├── encoder.rs # VideoToolbox H.264 │ │ └── tray.rs # NSStatusItem (Objective-C via objc2 crate) │ └── linux/ # NEW: Linux impl │ ├── capture.rs # X11 XShm / pipewire detection │ ├── input.rs # XTest / uinput │ ├── encoder.rs # VA-API / libx264 │ └── tray.rs # StatusNotifier ├── encoder/ # Color conversion, raw fallback (platform-agnostic) ├── viewer/ # Native viewer (winit cross-platform, unchanged) ├── config.rs # Config (unchanged) └── install.rs # Installation (platform-specific #[cfg] blocks) ``` **Key traits:** ```rust // agent/src/platform/mod.rs pub trait PlatformCapture: Send { fn capture_frame(&mut self) -> Result; fn get_dimensions(&self) -> (u32, u32); } pub trait PlatformInput: Send { fn inject_mouse(&mut self, event: MouseEvent) -> Result<()>; fn inject_keyboard(&mut self, event: KeyboardEvent) -> Result<()>; } pub trait PlatformEncoder: Send { fn encode(&mut self, frame: &[u8], width: u32, height: u32) -> Result>; fn set_bitrate(&mut self, kbps: u32); } pub trait PlatformTray: Send { fn show(&mut self, status: &str) -> Result<()>; fn poll_events(&mut self) -> Vec; } ``` ### Protobuf Changes **None.** The existing `AgentStatus` message already carries OS/arch metadata; viewer consumes frames opaquely. Protocol remains unchanged. ### Database Schema **No migration required.** The `connect_machines` table's `os` and `architecture` fields (part of SPEC-003 inventory) already accommodate macOS/Linux values. ### Agent Build & Packaging **Cargo.toml adjustments:** ```toml [target.'cfg(target_os = "macos")'.dependencies] core-graphics = "0.23" core-foundation = "0.9" objc2 = "0.5" objc2-foundation = "0.2" objc2-app-kit = "0.2" security-framework = "2.9" [target.'cfg(target_os = "linux")'.dependencies] x11 = { version = "2.21", features = ["xlib", "xtest", "xrandr"] } libva = "0.4" ``` **macOS packaging:** - `.app` bundle structure: `GuruConnect.app/Contents/MacOS/guruconnect`, `Info.plist` with protocol handler - Code signing: `codesign --sign "Developer ID Application: Arizona Computer Guru LLC"` - Notarization via `notarytool` (Apple requirement for macOS 10.15+) - Universal binary: `cargo build --release --target x86_64-apple-darwin && cargo build --release --target aarch64-apple-darwin && lipo -create ...` **Linux packaging:** - `.deb` via `cargo-deb`: ```toml [package.metadata.deb] maintainer = "AZ Computer Guru " depends = "libx11-6, libxext6, libxtst6" section = "net" assets = [ ["target/release/guruconnect", "usr/bin/", "755"], ["deploy/linux/guruconnect.desktop", "usr/share/applications/", "644"], ] ``` - `.rpm` via `cargo-generate-rpm` ### Distribution - **Server downloads:** `/downloads/guruconnect-macos.dmg`, `/downloads/guruconnect-linux.deb`, `/downloads/guruconnect-linux.rpm` - **Dashboard detection:** browser User-Agent → suggest correct download - **Auto-update (out of scope for v1):** defer cross-platform updater to Phase 2 ## Implementation Details ### Files to Create **macOS:** - `agent/src/platform/macos/capture.rs` — ScreenCaptureKit stream + AVFoundation fallback - `agent/src/platform/macos/input.rs` — CGEventCreateMouseEvent, CGEventCreateKeyboardEvent, CGEventPost - `agent/src/platform/macos/encoder.rs` — VTCompressionSession (VideoToolbox) - `agent/src/platform/macos/tray.rs` — NSStatusBar + NSMenu via objc2 - `agent/Info.plist.template` — bundle metadata, protocol handler (`CFBundleURLSchemes`) **Linux:** - `agent/src/platform/linux/capture.rs` — X11Display::open, XShmGetImage; pipewire detection fallback (log warning) - `agent/src/platform/linux/input.rs` — XTestFakeMotionEvent, XTestFakeButtonEvent, XTestFakeKeyEvent - `agent/src/platform/linux/encoder.rs` — libva VAConfigAttrib + VABufferType; software fallback via openh264 - `agent/src/platform/linux/tray.rs` — dbus org.kde.StatusNotifierItem protocol - `deploy/linux/guruconnect.desktop` — .desktop file with `MimeType=x-scheme-handler/guruconnect` **Shared:** - `agent/src/platform/mod.rs` — trait definitions + platform factory `fn create_platform() -> Box` - Refactor `agent/src/capture/{dxgi,gdi}.rs` → `agent/src/platform/windows/capture.rs` - Refactor `agent/src/input/{mouse,keyboard}.rs` → `agent/src/platform/windows/input.rs` - Refactor `agent/src/encoder/h264.rs` → `agent/src/platform/windows/encoder.rs` (Media Foundation) ### Key Logic **macOS screen capture (ScreenCaptureKit, macOS 13+):** ```rust // agent/src/platform/macos/capture.rs use core_graphics::display::CGDisplay; use objc2::rc::Id; use objc2_foundation::NSArray; use objc2_screen_capture_kit::{SCStreamConfiguration, SCStream, SCContentFilter, SCStreamOutputType}; pub struct MacOSCapture { stream: Id, latest_frame: Arc>>>, } impl PlatformCapture for MacOSCapture { fn capture_frame(&mut self) -> Result { // ScreenCaptureKit delivers frames to delegate callback let frame = self.latest_frame.lock().unwrap().clone() .ok_or(anyhow!("No frame available"))?; Ok(CapturedFrame { data: frame, width: self.width, height: self.height }) } } ``` **macOS input injection:** ```rust // agent/src/platform/macos/input.rs use core_graphics::event::{CGEvent, CGEventType, CGMouseButton, EventField}; impl PlatformInput for MacOSInput { fn inject_mouse(&mut self, event: MouseEvent) -> Result<()> { let cg_event = match event.button { MouseButton::Left => CGEvent::new_mouse_event( None, CGEventType::LeftMouseDown, CGPoint::new(event.x as f64, event.y as f64), CGMouseButton::Left )?, // ... }; cg_event.post(CGEventTapLocation::HID); Ok(()) } } ``` **Linux X11 capture:** ```rust // agent/src/platform/linux/capture.rs use x11::xlib::{XOpenDisplay, XDefaultRootWindow, XGetImage, ZPixmap}; use x11::xshm::{XShmAttach, XShmGetImage}; pub struct LinuxCapture { display: *mut Display, root: Window, shm_info: XShmSegmentInfo, } impl PlatformCapture for LinuxCapture { fn capture_frame(&mut self) -> Result { let img = unsafe { XShmGetImage(self.display, self.root, ...) }; // Convert XImage BGRA → frame buffer Ok(CapturedFrame { data: converted, width, height }) } } ``` **Permission handling (macOS Screen Recording):** ```rust // agent/src/platform/macos/permissions.rs use objc2_av_foundation::AVCaptureDevice; pub fn check_screen_recording_permission() -> bool { // macOS 10.15+ requires explicit Screen Recording permission // Trigger prompt on first run; subsequent checks use TCC database status AVCaptureDevice::authorizationStatusForMediaType(AVMediaTypeVideo) == AVAuthorizationStatusAuthorized } ``` ## Security Considerations ### macOS Security - **Screen Recording permission:** User must grant permission in System Settings → Privacy & Security → Screen Recording. Agent detects denial and logs instructions. - **Gatekeeper:** Code signing + notarization required to avoid "unidentified developer" block. - **Hardened runtime:** Enable hardened runtime entitlements (`com.apple.security.cs.allow-unsigned-executable-memory` for JIT, if needed). - **Input injection:** No additional permission for CGEvent (runs as the logged-on user). ### Linux Security - **X11 security:** XTest extension enabled by default; uinput requires `/dev/uinput` write access (typically granted to `input` group). - **Wayland sandboxing:** v1 runs under XWayland (X11 compatibility); native Wayland requires PipeWire portal (user consent per-session). - **AppArmor/SELinux:** Agent may need profile exemptions for screen capture on hardened distros. ### Authentication **No protocol changes.** macOS/Linux agents authenticate with the same support-code or persistent agent key (`AGENT_API_KEY` or per-agent `cak_*` key from SPEC-004). Relay server validates identically. ### Audit Events Existing `events` table logs agent connections with `os` and `architecture` fields. No schema change needed. ## Testing Strategy ### Unit Tests - Mock platform traits for capture/input/encoder modules - Test frame conversion (BGRA → RGB, stride alignment) - Test protobuf message serialization (unchanged, but validate on new platforms) ### Integration Tests - **macOS test rig:** macOS 13 VM (UTM or Parallels), run agent, verify viewer connects and displays screen - **Linux test rig:** Ubuntu 22.04 LXC container (X11 forwarded), verify capture + input - **Cross-platform viewer:** existing Windows native viewer connects to macOS/Linux agents ### Manual Testing Scenarios 1. **macOS install:** Download `.app`, drag to Applications, launch, grant Screen Recording permission, enter support code, verify viewer connects. 2. **Linux install:** `sudo dpkg -i guruconnect.deb`, run `guruconnect --support-code ABC123`, verify tray icon, test remote control. 3. **Encoding quality:** Compare H.264 output from VideoToolbox (macOS), VA-API (Linux), Media Foundation (Windows) — validate frame rate and bitrate parity. 4. **Input fidelity:** Test full keyboard (including modifiers, function keys) and mouse (click, drag, scroll) on all three platforms. 5. **Protocol handler:** Click `guruconnect://connect?session=...` link, verify viewer launches with session ID pre-filled. ### CI/CD Additions - **macOS build:** Gitea Actions runner on macOS (M1 or x86_64), build universal binary, sign, notarize - **Linux build:** Existing `ubuntu-latest` runner, build `.deb` and `.rpm` via cargo-deb/cargo-generate-rpm - **Artifact upload:** Upload binaries to `server/static/downloads/` on release tag ## Effort Estimate & Dependencies **Size:** X-Large (12-16 weeks, 1 developer) **Breakdown:** - Platform abstraction refactor (Windows code): 2 weeks - macOS implementation (capture, input, encoder, tray): 4 weeks - Linux implementation (X11 capture, input, VA-API encoder, tray): 3 weeks - Packaging and code signing (macOS notarization, .deb/.rpm): 2 weeks - Testing, documentation, CI/CD integration: 2 weeks - Buffer for platform-specific edge cases: 1-3 weeks **Dependencies:** - **SPEC-002 v2 Phase 1 completion** — per-agent keys and secure session core must be stable (already shipped) - **SPEC-004 deterministic machine identity** — macOS/Linux need stable `machine_uid` derivation (macOS: `IOPlatformUUID`, Linux: `/etc/machine-id`) - **Code signing infrastructure:** Apple Developer Program account + Azure Trusted Signing already in place (for Windows); reuse for macOS - **Test infrastructure:** macOS VM or bare-metal test host, Linux LXC container with X11 **Unblocks:** - Multi-platform MSP adoption (organizations with mixed Windows/macOS/Linux fleets) - Education/developer market (high macOS/Linux penetration) - SPEC-003 machine inventory (macOS/Linux populate same fields) ## Open Questions 1. **Wayland native support timeline?** — v1 uses X11 compatibility (XWayland); when do we commit to PipeWire portal? Defer to Phase 2 pending Wayland adoption metrics. 2. **ARM Linux (Raspberry Pi, ARM servers)?** — Ubuntu 22.04 arm64 is viable; add as a build target? Defer to community request. 3. **macOS Ventura (13.0) vs. Monterey (12.0) adoption split?** — ScreenCaptureKit (13.0+) is preferred; fallback to AVFoundation for 12.x. Monitor 12.x usage after 6 months; deprecate if <5%. 4. **Linux H.264 licensing?** — VA-API uses hardware codecs (no license issue); software fallback via openh264 (Cisco BSD license, royalty-free). Confirm distro packaging allows bundling. 5. **System tray on headless Linux servers?** — Agent can run headless (tray is optional); `--no-tray` flag for server deployments. --- **Cross-references:** - ADR-001: GuruConnect is a standalone product (no RMM coupling) - SPEC-002: v2 modernization architecture (agent key model) - SPEC-003: Machine inventory (OS/arch fields) - SPEC-004: Stable machine identity (macOS IOPlatformUUID, Linux /etc/machine-id)