diff --git a/docs/FEATURE_ROADMAP.md b/docs/FEATURE_ROADMAP.md index ae8e4d8..477df4b 100644 --- a/docs/FEATURE_ROADMAP.md +++ b/docs/FEATURE_ROADMAP.md @@ -93,5 +93,5 @@ Bringing GC to parity with GuruRMM's release engineering. Full plan: [SPEC-001]( ## Future Considerations -- [ ] macOS / Linux remote-control agents — P3 +- [ ] **Cross-platform agent support (macOS and Linux)** — P2 — Enable remote control beyond Windows with native agents for macOS 12+ and Ubuntu 22.04+ LTS. Platform abstraction layer for capture/input/encoding, VideoToolbox (macOS) and VA-API (Linux) H.264 encoding, .app/.deb/.rpm packaging. Unblocks multi-platform MSP adoption. ([SPEC-010](specs/SPEC-010-cross-platform-agents.md)) - [ ] Auto-update for the agent — P3 diff --git a/docs/specs/SPEC-010-cross-platform-agents.md b/docs/specs/SPEC-010-cross-platform-agents.md new file mode 100644 index 0000000..c31e5f6 --- /dev/null +++ b/docs/specs/SPEC-010-cross-platform-agents.md @@ -0,0 +1,358 @@ +# SPEC-010: Cross-Platform Agent Support (macOS and Linux) + +**Status:** Proposed +**Priority:** P2 +**Requested By:** Mike Swanson (2026-05-30) +**Estimated Effort:** X-Large + +## Overview + +Expand GuruConnect's agent support beyond Windows to include macOS and Linux platforms, enabling remote control and support across the full spectrum of enterprise and home computing environments. This cross-platform expansion addresses a critical market gap—ScreenConnect, Splashtop, and AnyDesk all support macOS/Linux, and GuruConnect's Windows-only limitation blocks adoption by multi-platform organizations. The primary technical challenge is abstracting platform-specific screen capture, input injection, and video encoding while maintaining the core protobuf-over-WSS transport and session model. Success criteria: feature parity with the Windows agent (capture, input, tray presence, persistent/support-code modes, H.264 encoding) on macOS 12+ and Ubuntu 22.04+ LTS. + +## Scope + +### Included in v1 + +**macOS Agent (Priority 1):** +- Screen capture via ScreenCaptureKit API (macOS 13+) with AVFoundation fallback (macOS 12) +- Input injection via CGEvent +- H.264 encoding via VideoToolbox +- Menu bar icon (NSStatusItem) with status and exit option +- `guruconnect://` protocol handler registration +- .app bundle packaging and code signing +- Screen Recording permission prompt and validation +- Universal binary (x86_64 + arm64) + +**Linux Agent (Priority 2):** +- Screen capture via X11 (XShm) with Wayland pipewire fallback detection +- Input injection via XTest (X11) or uinput (Wayland) +- H.264 encoding via VA-API (hardware) or software fallback (libx264) +- System tray icon via StatusNotifier/AppIndicator +- `guruconnect://` protocol handler (.desktop file) +- .deb packaging (Ubuntu/Debian) and .rpm (Fedora/RHEL) +- x86_64 binary + +**Shared Cross-Platform:** +- Unified agent codebase with platform-specific modules behind traits +- Same protobuf protocol (no wire format changes) +- Same persistent/support-code authentication modes +- Same AgentStatus metadata reporting (OS, architecture, uptime, logged-on user) +- Viewer compatibility: existing native Windows viewer and web viewer work unchanged + +### Explicitly out of scope + +- **FreeBSD/BSD support** — defer; assess demand post-launch +- **Wayland-native screen capture** — v1 uses X11 compatibility layer; native Wayland portal support is Phase 2 +- **Multi-monitor switching on macOS/Linux** — already deferred on Windows (P2 roadmap item); cross-platform parity follows +- **macOS < 12** — 12.x is the minimum (released Oct 2021, 95%+ adoption) +- **Linux distros outside Ubuntu 22.04+ / Debian 11+ / Fedora 38+ / RHEL 9+** — official support limited; community builds acceptable + +## Architecture + +### Agent Refactor: Platform Abstraction Layer + +Current Windows agent (`agent/src/`) tightly couples platform APIs. Refactor into: + +``` +agent/src/ +├── main.rs # CLI entry + platform dispatch +├── session/ # Session logic (platform-agnostic) +├── transport/ # WebSocket (unchanged) +├── platform/ # NEW: trait definitions +│ ├── mod.rs # PlatformCapture, PlatformInput, PlatformTray traits +│ ├── windows/ # Windows impl (existing code refactored) +│ │ ├── capture.rs # DXGI + GDI behind PlatformCapture +│ │ ├── input.rs # Win32 SendInput behind PlatformInput +│ │ ├── encoder.rs # Media Foundation H.264 +│ │ └── tray.rs # Win32 shell tray +│ ├── macos/ # NEW: macOS impl +│ │ ├── capture.rs # ScreenCaptureKit/AVFoundation +│ │ ├── input.rs # CGEvent +│ │ ├── encoder.rs # VideoToolbox H.264 +│ │ └── tray.rs # NSStatusItem (Objective-C via objc2 crate) +│ └── linux/ # NEW: Linux impl +│ ├── capture.rs # X11 XShm / pipewire detection +│ ├── input.rs # XTest / uinput +│ ├── encoder.rs # VA-API / libx264 +│ └── tray.rs # StatusNotifier +├── encoder/ # Color conversion, raw fallback (platform-agnostic) +├── viewer/ # Native viewer (winit cross-platform, unchanged) +├── config.rs # Config (unchanged) +└── install.rs # Installation (platform-specific #[cfg] blocks) +``` + +**Key traits:** + +```rust +// agent/src/platform/mod.rs +pub trait PlatformCapture: Send { + fn capture_frame(&mut self) -> Result; + fn get_dimensions(&self) -> (u32, u32); +} + +pub trait PlatformInput: Send { + fn inject_mouse(&mut self, event: MouseEvent) -> Result<()>; + fn inject_keyboard(&mut self, event: KeyboardEvent) -> Result<()>; +} + +pub trait PlatformEncoder: Send { + fn encode(&mut self, frame: &[u8], width: u32, height: u32) -> Result>; + fn set_bitrate(&mut self, kbps: u32); +} + +pub trait PlatformTray: Send { + fn show(&mut self, status: &str) -> Result<()>; + fn poll_events(&mut self) -> Vec; +} +``` + +### Protobuf Changes + +**None.** The existing `AgentStatus` message already carries OS/arch metadata; viewer consumes frames opaquely. Protocol remains unchanged. + +### Database Schema + +**No migration required.** The `connect_machines` table's `os` and `architecture` fields (part of SPEC-003 inventory) already accommodate macOS/Linux values. + +### Agent Build & Packaging + +**Cargo.toml adjustments:** + +```toml +[target.'cfg(target_os = "macos")'.dependencies] +core-graphics = "0.23" +core-foundation = "0.9" +objc2 = "0.5" +objc2-foundation = "0.2" +objc2-app-kit = "0.2" +security-framework = "2.9" + +[target.'cfg(target_os = "linux")'.dependencies] +x11 = { version = "2.21", features = ["xlib", "xtest", "xrandr"] } +libva = "0.4" +``` + +**macOS packaging:** + +- `.app` bundle structure: `GuruConnect.app/Contents/MacOS/guruconnect`, `Info.plist` with protocol handler +- Code signing: `codesign --sign "Developer ID Application: Arizona Computer Guru LLC"` +- Notarization via `notarytool` (Apple requirement for macOS 10.15+) +- Universal binary: `cargo build --release --target x86_64-apple-darwin && cargo build --release --target aarch64-apple-darwin && lipo -create ...` + +**Linux packaging:** + +- `.deb` via `cargo-deb`: + ```toml + [package.metadata.deb] + maintainer = "AZ Computer Guru " + depends = "libx11-6, libxext6, libxtst6" + section = "net" + assets = [ + ["target/release/guruconnect", "usr/bin/", "755"], + ["deploy/linux/guruconnect.desktop", "usr/share/applications/", "644"], + ] + ``` +- `.rpm` via `cargo-generate-rpm` + +### Distribution + +- **Server downloads:** `/downloads/guruconnect-macos.dmg`, `/downloads/guruconnect-linux.deb`, `/downloads/guruconnect-linux.rpm` +- **Dashboard detection:** browser User-Agent → suggest correct download +- **Auto-update (out of scope for v1):** defer cross-platform updater to Phase 2 + +## Implementation Details + +### Files to Create + +**macOS:** +- `agent/src/platform/macos/capture.rs` — ScreenCaptureKit stream + AVFoundation fallback +- `agent/src/platform/macos/input.rs` — CGEventCreateMouseEvent, CGEventCreateKeyboardEvent, CGEventPost +- `agent/src/platform/macos/encoder.rs` — VTCompressionSession (VideoToolbox) +- `agent/src/platform/macos/tray.rs` — NSStatusBar + NSMenu via objc2 +- `agent/Info.plist.template` — bundle metadata, protocol handler (`CFBundleURLSchemes`) + +**Linux:** +- `agent/src/platform/linux/capture.rs` — X11Display::open, XShmGetImage; pipewire detection fallback (log warning) +- `agent/src/platform/linux/input.rs` — XTestFakeMotionEvent, XTestFakeButtonEvent, XTestFakeKeyEvent +- `agent/src/platform/linux/encoder.rs` — libva VAConfigAttrib + VABufferType; software fallback via openh264 +- `agent/src/platform/linux/tray.rs` — dbus org.kde.StatusNotifierItem protocol +- `deploy/linux/guruconnect.desktop` — .desktop file with `MimeType=x-scheme-handler/guruconnect` + +**Shared:** +- `agent/src/platform/mod.rs` — trait definitions + platform factory `fn create_platform() -> Box` +- Refactor `agent/src/capture/{dxgi,gdi}.rs` → `agent/src/platform/windows/capture.rs` +- Refactor `agent/src/input/{mouse,keyboard}.rs` → `agent/src/platform/windows/input.rs` +- Refactor `agent/src/encoder/h264.rs` → `agent/src/platform/windows/encoder.rs` (Media Foundation) + +### Key Logic + +**macOS screen capture (ScreenCaptureKit, macOS 13+):** + +```rust +// agent/src/platform/macos/capture.rs +use core_graphics::display::CGDisplay; +use objc2::rc::Id; +use objc2_foundation::NSArray; +use objc2_screen_capture_kit::{SCStreamConfiguration, SCStream, SCContentFilter, SCStreamOutputType}; + +pub struct MacOSCapture { + stream: Id, + latest_frame: Arc>>>, +} + +impl PlatformCapture for MacOSCapture { + fn capture_frame(&mut self) -> Result { + // ScreenCaptureKit delivers frames to delegate callback + let frame = self.latest_frame.lock().unwrap().clone() + .ok_or(anyhow!("No frame available"))?; + Ok(CapturedFrame { data: frame, width: self.width, height: self.height }) + } +} +``` + +**macOS input injection:** + +```rust +// agent/src/platform/macos/input.rs +use core_graphics::event::{CGEvent, CGEventType, CGMouseButton, EventField}; + +impl PlatformInput for MacOSInput { + fn inject_mouse(&mut self, event: MouseEvent) -> Result<()> { + let cg_event = match event.button { + MouseButton::Left => CGEvent::new_mouse_event( + None, CGEventType::LeftMouseDown, + CGPoint::new(event.x as f64, event.y as f64), + CGMouseButton::Left + )?, + // ... + }; + cg_event.post(CGEventTapLocation::HID); + Ok(()) + } +} +``` + +**Linux X11 capture:** + +```rust +// agent/src/platform/linux/capture.rs +use x11::xlib::{XOpenDisplay, XDefaultRootWindow, XGetImage, ZPixmap}; +use x11::xshm::{XShmAttach, XShmGetImage}; + +pub struct LinuxCapture { + display: *mut Display, + root: Window, + shm_info: XShmSegmentInfo, +} + +impl PlatformCapture for LinuxCapture { + fn capture_frame(&mut self) -> Result { + let img = unsafe { XShmGetImage(self.display, self.root, ...) }; + // Convert XImage BGRA → frame buffer + Ok(CapturedFrame { data: converted, width, height }) + } +} +``` + +**Permission handling (macOS Screen Recording):** + +```rust +// agent/src/platform/macos/permissions.rs +use objc2_av_foundation::AVCaptureDevice; + +pub fn check_screen_recording_permission() -> bool { + // macOS 10.15+ requires explicit Screen Recording permission + // Trigger prompt on first run; subsequent checks use TCC database status + AVCaptureDevice::authorizationStatusForMediaType(AVMediaTypeVideo) == AVAuthorizationStatusAuthorized +} +``` + +## Security Considerations + +### macOS Security + +- **Screen Recording permission:** User must grant permission in System Settings → Privacy & Security → Screen Recording. Agent detects denial and logs instructions. +- **Gatekeeper:** Code signing + notarization required to avoid "unidentified developer" block. +- **Hardened runtime:** Enable hardened runtime entitlements (`com.apple.security.cs.allow-unsigned-executable-memory` for JIT, if needed). +- **Input injection:** No additional permission for CGEvent (runs as the logged-on user). + +### Linux Security + +- **X11 security:** XTest extension enabled by default; uinput requires `/dev/uinput` write access (typically granted to `input` group). +- **Wayland sandboxing:** v1 runs under XWayland (X11 compatibility); native Wayland requires PipeWire portal (user consent per-session). +- **AppArmor/SELinux:** Agent may need profile exemptions for screen capture on hardened distros. + +### Authentication + +**No protocol changes.** macOS/Linux agents authenticate with the same support-code or persistent agent key (`AGENT_API_KEY` or per-agent `cak_*` key from SPEC-004). Relay server validates identically. + +### Audit Events + +Existing `events` table logs agent connections with `os` and `architecture` fields. No schema change needed. + +## Testing Strategy + +### Unit Tests + +- Mock platform traits for capture/input/encoder modules +- Test frame conversion (BGRA → RGB, stride alignment) +- Test protobuf message serialization (unchanged, but validate on new platforms) + +### Integration Tests + +- **macOS test rig:** macOS 13 VM (UTM or Parallels), run agent, verify viewer connects and displays screen +- **Linux test rig:** Ubuntu 22.04 LXC container (X11 forwarded), verify capture + input +- **Cross-platform viewer:** existing Windows native viewer connects to macOS/Linux agents + +### Manual Testing Scenarios + +1. **macOS install:** Download `.app`, drag to Applications, launch, grant Screen Recording permission, enter support code, verify viewer connects. +2. **Linux install:** `sudo dpkg -i guruconnect.deb`, run `guruconnect --support-code ABC123`, verify tray icon, test remote control. +3. **Encoding quality:** Compare H.264 output from VideoToolbox (macOS), VA-API (Linux), Media Foundation (Windows) — validate frame rate and bitrate parity. +4. **Input fidelity:** Test full keyboard (including modifiers, function keys) and mouse (click, drag, scroll) on all three platforms. +5. **Protocol handler:** Click `guruconnect://connect?session=...` link, verify viewer launches with session ID pre-filled. + +### CI/CD Additions + +- **macOS build:** Gitea Actions runner on macOS (M1 or x86_64), build universal binary, sign, notarize +- **Linux build:** Existing `ubuntu-latest` runner, build `.deb` and `.rpm` via cargo-deb/cargo-generate-rpm +- **Artifact upload:** Upload binaries to `server/static/downloads/` on release tag + +## Effort Estimate & Dependencies + +**Size:** X-Large (12-16 weeks, 1 developer) + +**Breakdown:** +- Platform abstraction refactor (Windows code): 2 weeks +- macOS implementation (capture, input, encoder, tray): 4 weeks +- Linux implementation (X11 capture, input, VA-API encoder, tray): 3 weeks +- Packaging and code signing (macOS notarization, .deb/.rpm): 2 weeks +- Testing, documentation, CI/CD integration: 2 weeks +- Buffer for platform-specific edge cases: 1-3 weeks + +**Dependencies:** +- **SPEC-002 v2 Phase 1 completion** — per-agent keys and secure session core must be stable (already shipped) +- **SPEC-004 deterministic machine identity** — macOS/Linux need stable `machine_uid` derivation (macOS: `IOPlatformUUID`, Linux: `/etc/machine-id`) +- **Code signing infrastructure:** Apple Developer Program account + Azure Trusted Signing already in place (for Windows); reuse for macOS +- **Test infrastructure:** macOS VM or bare-metal test host, Linux LXC container with X11 + +**Unblocks:** +- Multi-platform MSP adoption (organizations with mixed Windows/macOS/Linux fleets) +- Education/developer market (high macOS/Linux penetration) +- SPEC-003 machine inventory (macOS/Linux populate same fields) + +## Open Questions + +1. **Wayland native support timeline?** — v1 uses X11 compatibility (XWayland); when do we commit to PipeWire portal? Defer to Phase 2 pending Wayland adoption metrics. +2. **ARM Linux (Raspberry Pi, ARM servers)?** — Ubuntu 22.04 arm64 is viable; add as a build target? Defer to community request. +3. **macOS Ventura (13.0) vs. Monterey (12.0) adoption split?** — ScreenCaptureKit (13.0+) is preferred; fallback to AVFoundation for 12.x. Monitor 12.x usage after 6 months; deprecate if <5%. +4. **Linux H.264 licensing?** — VA-API uses hardware codecs (no license issue); software fallback via openh264 (Cisco BSD license, royalty-free). Confirm distro packaging allows bundling. +5. **System tray on headless Linux servers?** — Agent can run headless (tray is optional); `--no-tray` flag for server deployments. + +--- + +**Cross-references:** +- ADR-001: GuruConnect is a standalone product (no RMM coupling) +- SPEC-002: v2 modernization architecture (agent key model) +- SPEC-003: Machine inventory (OS/arch fields) +- SPEC-004: Stable machine identity (macOS IOPlatformUUID, Linux /etc/machine-id)