# Session Log - 2026-05-28 - GuruRMM Discovery Testing + Bug Fixes

## User

- **User:** Howard Enos (howard)
- **Machine:** Howard-Home
- **Role:** tech

---

## Session Summary

Howard installed the GuruRMM agent on WIN-TG2STMODJG8 at site eeb5f001-447b-4c1e-adc8-e18db2be9b5b and wanted to test the network discovery feature, specifically whether it could find devices on the network and auto-install the agent on them. Research confirmed discovery is partially implemented: TCP connect probing + reverse DNS + ARP lookup shipped; ICMP/ARP/SNMP scanning and scheduled scans are not yet implemented (roadmap P2). Auto-installing the agent on discovered devices is not built: the "deploying" status is a label only, with no actual push-install mechanism behind it.

WIN-TG2STMODJG8 (agent ID eee9f26d-0dbc-4b8e-8e42-3a901b4ff73a) was configured as the discovery node for the site via API, with suggested subnet 172.16.0.0/23 auto-populated from the agent's network interface data. A scan was triggered and completed in ~16 seconds, finding 4 devices, all marked unmanaged:

- 172.16.1.6 (TGC-SERVER, Windows, port 3389)
- 172.16.1.46 (WIN-TG2STMODJG8 itself, ports 135/445/5985)
- 172.16.1.136 (Windows, port 3389)
- 172.16.1.15 (Linux, port 22)

Howard noticed the device count growing with each scan and asked about a timeout. Investigation confirmed two bugs: (1) no scan timeout, so if the agent disconnects mid-scan, the scan record stays status=running forever; (2) no guard against triggering a second scan while one is running. Fixed in server/src/api/discovery.rs and server/src/db/discovery.rs: `expire_stale_scans()` marks any running scan older than 10 minutes as failed, and `has_running_scan()` blocks new triggers with HTTP 409 while a scan is active. Committed as c6f1f73.

The growing device count question was also resolved: the discovered_devices table is cumulative (unique constraint on site_id+ip_address, ON CONFLICT UPDATE).
The apparent growth was because early scans were finding different IPs as the ARP cache populated; once the results were stable, `new_devices: 0` confirmed no duplicates were being created.

Howard then reported that a machine on the network could not be pinged from GuruRMM and also was not found by the scanner. Root cause: the ping check uses the system `ping` command (ICMP), which Windows Firewall blocks by default. The discovery scanner was TCP-only, so a host with all ports firewalled and ICMP blocked would be invisible to both. Fixed in agent/src/discovery/mod.rs: added `ping_host()` as an ICMP fallback after TCP probing. If no TCP ports respond, the scanner runs `ping -n 1 -w 500` (Windows) or `ping -c 1 -W 1` (Linux) against the target address. Hosts that respond to ICMP but have no open TCP ports now appear in discovery with `open_ports: []` and `os_hint: unknown`. Committed as fcf5833.

---

## Key Decisions

- **Discovery does NOT auto-install agents:** the "deploying" status flag exists in the UI and DB but there is no actual push-install mechanism. This is a future P2 feature. This was clearly communicated to Howard.
- **Scheduled scans not implemented:** the UI shows Daily/Weekly options but the backend scheduler is not wired up. On-demand only. Roadmap P2.
- **Discovery is not AI-driven:** once the node is configured (one-time setup through the dashboard UI), scans are triggered by button click or (future) schedule. No AI involvement at runtime. Howard confirmed this was his expectation.
- **ICMP fallback uses shell ping, not raw sockets:** raw ICMP sockets require elevated privileges and are blocked on Windows without manifest changes. The shell `ping` binary approach matches the existing checks.rs pattern and works within the agent's current privilege level.
- **Stale scan timeout set to 10 minutes:** conservative enough not to expire legitimate scans on large subnets, aggressive enough to clean up disconnected-agent orphans before the next triggered scan.
- **HTTP 409 for concurrent scan guard:** standard REST conflict code; the dashboard's toast error handling will display the message to the user.

---

## Problems Encountered

- **Push rejected (twice):** Mike had pushed between commit and push during both fixes. Resolved both times with `git pull --rebase origin main && git push origin main`.

---

## Configuration Changes

- `projects/msp-tools/guru-rmm/server/src/db/discovery.rs`: added `expire_stale_scans()` and `has_running_scan()` functions
- `projects/msp-tools/guru-rmm/server/src/api/discovery.rs`: wired both into `trigger_scan()`: expire stale scans before each trigger, block if already running (HTTP 409)
- `projects/msp-tools/guru-rmm/agent/src/discovery/mod.rs`: added `ping_host()` ICMP fallback; updated comment in run_scan to reflect fallback logic
- `projects/msp-tools/guru-rmm` (submodule): advanced to fcf5833

---

## Credentials & Secrets

None created or modified this session.

---

## Infrastructure & Servers

- **GuruRMM dashboard:** https://rmm.azcomputerguru.com
- **GuruRMM server:** 172.16.3.30:3001
- **Discovery test site:** eeb5f001-447b-4c1e-adc8-e18db2be9b5b
- **Discovery node agent:** WIN-TG2STMODJG8 (eee9f26d-0dbc-4b8e-8e42-3a901b4ff73a), online
- **Subnet scanned:** 172.16.0.0/23

---

## Commands & Outputs

```bash
# Authenticate
POST /api/auth/login { email, password } → JWT token

# Get agents at site
GET /api/agents?site_id=eeb5f001-447b-4c1e-adc8-e18db2be9b5b
# → 10 agents; WIN-TG2STMODJG8 online

# Get suggested subnets
GET /api/agents/eee9f26d.../discovery/subnets
# → ["172.16.0.0/23"]

# Configure discovery node
POST /api/agents/eee9f26d.../discovery { site_id, ip_ranges: ["172.16.0.0/23"], ... }
# → node created

# Trigger scan
POST /api/sites/eeb5f001.../discovery/scan
# → scan_id: 6c25d374-..., status: initiated
# Completed in ~16s, devices_found: 4, new_devices: 4

# Second scan (dedup confirmation)
# → devices_found: 9, new_devices: 0 (stable, no duplicates)

# gururmm commits
c6f1f73 fix(discovery): add scan timeout and in-progress guard
fcf5833 fix(discovery): add ICMP ping fallback for TCP-silent hosts
```

---

## Pending / Incomplete Tasks

- **Discovery auto-deploy (P2):** Not built. Needs a mechanism to push the agent installer to a discovered Windows machine, likely SMB + PSExec or WMI with tech-provided credentials. Would be a significant new feature.
- **Discovery scheduling (P2):** UI has Daily/Weekly options but backend scheduler not implemented. Needs a background task on the server.
- **New agent build + deploy needed:** Both discovery fixes are in the codebase but won't take effect on the agent (ICMP fallback) or server (timeout + concurrency guard) until the next build is deployed to 172.16.3.30.
- **SPEC-012 implementation:** Sortable table headers, 4h estimate, no blockers.
- **SPEC-013 (P3):** Deferred; revisit after file transfer (P2) ships.
- **SPEC-014 follow-up (Mike's):** Policy tab UI for watch rules; push rules to agent on connect.
- **Cascades pending migration:** Ashley Jensen folder redirect, RECEPTIONIST-PC drives, NURSESTATION-PC HIPAA GPO, Nurses credential vault, Phase 3 domain joins, Entra Connect OU expansion, M365 relicensing (time-sensitive).
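
For whoever verifies the pending agent build, the ICMP fallback recorded in this log (commit fcf5833) can be sketched roughly as follows. This is a minimal sketch and not the code in agent/src/discovery/mod.rs: `ping_args`, `icmp_only_record`, and the `Discovered` struct are hypothetical names introduced for illustration, and this `ping_host` body is a guess at the shape. Only the ping flags, the `ping_host()` name, and the `open_ports: []` / `os_hint: unknown` result come from the notes above.

```rust
use std::net::Ipv4Addr;
use std::process::{Command, Stdio};

/// Hypothetical record shape; the real agent struct may differ.
#[derive(Debug, PartialEq)]
struct Discovered {
    ip: Ipv4Addr,
    open_ports: Vec<u16>,
    os_hint: String,
}

/// Build the argument list for a single echo request.
/// Windows: -n 1 (one request), -w 500 (timeout in milliseconds).
/// Linux:   -c 1 (one request), -W 1 (timeout in seconds).
fn ping_args(ip: Ipv4Addr, windows: bool) -> Vec<String> {
    let flags: [&str; 4] = if windows {
        ["-n", "1", "-w", "500"]
    } else {
        ["-c", "1", "-W", "1"]
    };
    let mut args: Vec<String> = flags.iter().map(|s| s.to_string()).collect();
    args.push(ip.to_string());
    args
}

/// Shell out to the system ping binary; true when it exits successfully.
/// Assumption: exit status alone is treated as "host answered".
fn ping_host(ip: Ipv4Addr) -> bool {
    Command::new("ping")
        .args(ping_args(ip, cfg!(windows)))
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}

/// Called only when TCP probing found no open ports.
/// `probe` is injected so the decision logic can be exercised without a network.
fn icmp_only_record(ip: Ipv4Addr, probe: impl FnOnce(Ipv4Addr) -> bool) -> Option<Discovered> {
    if probe(ip) {
        Some(Discovered {
            ip,
            open_ports: Vec::new(),
            os_hint: "unknown".to_string(),
        })
    } else {
        None
    }
}
```

In a scan loop, `icmp_only_record(ip, ping_host)` would run only for addresses where the TCP probe came back empty, so hosts with open ports never pay the extra ping cost.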
---

## Reference Information

- **Discovery node agent:** WIN-TG2STMODJG8 (eee9f26d-0dbc-4b8e-8e42-3a901b4ff73a)
- **Discovery test site ID:** eeb5f001-447b-4c1e-adc8-e18db2be9b5b
- **Scan timeout fix commit:** c6f1f73 (server: expire_stale_scans 10min + 409 on concurrent)
- **ICMP fallback commit:** fcf5833 (agent: ping_host() fallback in discovery/mod.rs)
- **Default scan ports:** 22, 80, 135, 443, 445, 3389, 5985, 9100, 161
- **Scan concurrency:** 50 (configurable), timeout_ms: 500 per connection
- **discovery_nodes table:** agent_id PK, scan_config JSONB
- **discovered_devices table:** UNIQUE (site_id, ip_address); cumulative across scans
- **discovery_scans table:** status IN ('running','completed','failed')
- **Syncro note:** POST /tickets/{id}/comments and POST /tickets/{id}/invoice return 404 for large-format ticket IDs. Workaround: POST /invoices (top-level) works. Comments require GUI.
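
The scan timeout and in-progress guard from commit c6f1f73 can be modelled in memory as below. This is a sketch under stated assumptions, not the server code: the real functions live in server/src/db/discovery.rs and operate on the discovery_scans table, whereas here scans are a plain `Vec`, `site_id` is a `u32` rather than a UUID, the guard is assumed to be per site, and `trigger_scan` returns a bare status code. The function names, the 10-minute threshold, the 409 response, and the three status values are taken from this log.

```rust
use std::time::{Duration, SystemTime};

/// Running scans older than this are treated as orphaned (10 minutes per the log).
const STALE_AFTER: Duration = Duration::from_secs(10 * 60);

#[derive(Debug, Clone, Copy, PartialEq)]
enum ScanStatus {
    Running,
    Completed, // unused in this sketch; listed to mirror the status column
    Failed,
}

#[derive(Debug)]
struct Scan {
    site_id: u32, // assumption: stand-in for the site UUID
    status: ScanStatus,
    started_at: SystemTime,
}

/// Mark every running scan older than STALE_AFTER as failed.
/// Returns how many scans were expired.
fn expire_stale_scans(scans: &mut [Scan], now: SystemTime) -> usize {
    let mut expired = 0;
    for scan in scans.iter_mut() {
        // A start time in the future counts as age zero.
        let age = now
            .duration_since(scan.started_at)
            .unwrap_or(Duration::ZERO);
        if scan.status == ScanStatus::Running && age > STALE_AFTER {
            scan.status = ScanStatus::Failed;
            expired += 1;
        }
    }
    expired
}

/// True when the site still has a scan in the running state.
fn has_running_scan(scans: &[Scan], site_id: u32) -> bool {
    scans
        .iter()
        .any(|s| s.site_id == site_id && s.status == ScanStatus::Running)
}

/// Expire orphans first, then refuse with 409 if a live scan remains.
fn trigger_scan(scans: &mut Vec<Scan>, site_id: u32, now: SystemTime) -> Result<(), u16> {
    expire_stale_scans(scans, now);
    if has_running_scan(scans, site_id) {
        return Err(409);
    }
    scans.push(Scan {
        site_id,
        status: ScanStatus::Running,
        started_at: now,
    });
    Ok(())
}
```

Running the expiry before the guard is what keeps a disconnected agent from blocking the site indefinitely: the orphaned record is failed first, so the next trigger after the 10-minute window goes through instead of hitting the 409.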