Session Log — 2026-05-28 — GuruRMM Discovery Testing + Bug Fixes
User
- User: Howard Enos (howard)
- Machine: Howard-Home
- Role: tech
Session Summary
Howard installed the GuruRMM agent on WIN-TG2STMODJG8 at site eeb5f001-447b-4c1e-adc8-e18db2be9b5b and wanted to test the network discovery feature — specifically whether it could find devices on the network and auto-install the agent on them.
Research confirmed discovery is partially implemented: TCP connect probing, reverse DNS, and ARP lookup have shipped; active ICMP/ARP/SNMP scanning and scheduled scans are not yet implemented (roadmap P2). Auto-installing the agent on discovered devices is not built. The "deploying" status is a label only, with no actual push-install mechanism behind it.
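For reference, a TCP connect probe of the kind described above can be sketched with the standard library alone. This is a minimal illustration, not the agent's actual code: port_is_open and open_ports are hypothetical names, the port list and 500 ms timeout are taken from the Reference Information section, and the sketch probes sequentially whereas the log lists a scan concurrency of 50.

```rust
use std::net::{IpAddr, SocketAddr, TcpStream};
use std::time::Duration;

/// Default scan ports as listed under Reference Information in this log.
const DEFAULT_PORTS: [u16; 9] = [22, 80, 135, 443, 445, 3389, 5985, 9100, 161];

/// Returns true if a TCP connection to `ip:port` completes within `timeout`.
/// (Hypothetical helper name, for illustration only.)
fn port_is_open(ip: IpAddr, port: u16, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&SocketAddr::new(ip, port), timeout).is_ok()
}

/// Probes each port in turn and returns those that accepted a connection.
/// Sequential for clarity; a real scanner would run probes concurrently.
fn open_ports(ip: IpAddr, ports: &[u16], timeout: Duration) -> Vec<u16> {
    ports
        .iter()
        .copied()
        .filter(|&port| port_is_open(ip, port, timeout))
        .collect()
}
```

Calling open_ports(ip, &DEFAULT_PORTS, Duration::from_millis(500)) would return the subset of default ports that answered. A host that drops every connection attempt yields an empty list, which is why a TCP-only scanner cannot see fully firewalled machines.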
WIN-TG2STMODJG8 (agent ID eee9f26d-0dbc-4b8e-8e42-3a901b4ff73a) was configured as the discovery node for the site via API, with suggested subnet 172.16.0.0/23 auto-populated from the agent's network interface data. A scan was triggered and completed in ~16 seconds, finding 4 devices: 172.16.1.6 (TGC-SERVER, Windows, port 3389), 172.16.1.46 (WIN-TG2STMODJG8 itself, ports 135/445/5985), 172.16.1.136 (Windows, port 3389), 172.16.1.15 (Linux, port 22). All marked unmanaged.
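The suggested subnet is consistent with masking the agent's interface address down to its network address. The sketch below shows that arithmetic under the assumption that the interface reports 172.16.1.46 with a 23-bit prefix; suggested_subnet is a hypothetical name, and how the server actually derives the suggestion is not shown in this log.

```rust
use std::net::Ipv4Addr;

/// Returns the network in CIDR notation for an interface address and prefix
/// length, or None if the prefix length is out of range.
/// (Hypothetical helper, for illustration only.)
fn suggested_subnet(ip: Ipv4Addr, prefix_len: u8) -> Option<String> {
    if prefix_len > 32 {
        return None;
    }
    // A /0 mask is all zeros; shifting a u32 by 32 bits would overflow,
    // so that case is handled explicitly.
    let mask: u32 = if prefix_len == 0 {
        0
    } else {
        u32::MAX << (32 - u32::from(prefix_len))
    };
    // Clearing the host bits leaves the network address.
    let network = Ipv4Addr::from(u32::from(ip) & mask);
    Some(format!("{network}/{prefix_len}"))
}
```

With the assumed inputs, suggested_subnet(Ipv4Addr::new(172, 16, 1, 46), 23) evaluates to Some("172.16.0.0/23"), matching the value the API returned.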
Howard noticed the device count growing with each scan and asked about a timeout. Investigation confirmed two bugs: (1) no scan timeout — if the agent disconnects mid-scan, the scan record stays status=running forever; (2) no guard against triggering a second scan while one is running. Fixed in server/src/api/discovery.rs and server/src/db/discovery.rs: expire_stale_scans() marks any running scan older than 10 minutes as failed, and has_running_scan() blocks new triggers with HTTP 409 while a scan is active. Committed as c6f1f73.
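The two guards can be illustrated with an in-memory model. This is only a sketch of the logic as described: the real functions operate on the discovery_scans table and presumably filter by site, and the Scan struct, the Result-based return type, and the trigger_scan signature here are assumptions made for the example.

```rust
use std::time::{Duration, SystemTime};

#[derive(Debug, Clone, Copy, PartialEq)]
enum ScanStatus {
    Running,
    Completed,
    Failed,
}

/// In-memory stand-in for one row of discovery_scans (assumed shape).
#[derive(Debug)]
struct Scan {
    status: ScanStatus,
    started_at: SystemTime,
}

/// Running scans older than this are treated as orphaned.
const STALE_AFTER: Duration = Duration::from_secs(10 * 60);

/// Marks running scans older than STALE_AFTER as failed; returns how many expired.
fn expire_stale_scans(scans: &mut [Scan], now: SystemTime) -> usize {
    let mut expired = 0;
    for scan in scans.iter_mut() {
        // A start time in the future is treated as age zero.
        let age = now.duration_since(scan.started_at).unwrap_or(Duration::ZERO);
        if scan.status == ScanStatus::Running && age > STALE_AFTER {
            scan.status = ScanStatus::Failed;
            expired += 1;
        }
    }
    expired
}

/// Returns true if any scan is still marked as running.
fn has_running_scan(scans: &[Scan]) -> bool {
    scans.iter().any(|s| s.status == ScanStatus::Running)
}

/// Expires stale scans first, then either rejects with HTTP 409 or records a new scan.
fn trigger_scan(scans: &mut Vec<Scan>, now: SystemTime) -> Result<(), u16> {
    expire_stale_scans(scans, now);
    if has_running_scan(scans) {
        return Err(409);
    }
    scans.push(Scan { status: ScanStatus::Running, started_at: now });
    Ok(())
}
```

Running the expiry before the check matters: an orphaned scan from a disconnected agent would otherwise block every later trigger with a 409.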
The growing device count question was also resolved: the discovered_devices table is cumulative (unique constraint on site_id+ip_address, with ON CONFLICT DO UPDATE). The apparent growth came from early scans finding different IPs as the ARP cache populated; once results stabilized, new_devices: 0 confirmed that no duplicates were being created.
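The cumulative behavior can be modeled with a map keyed by (site_id, ip_address), which plays the role of the unique constraint. This is an in-memory analogy rather than the server's SQL; Device, DeviceTable, and upsert_scan_results are hypothetical names, and the site identifier in the usage example is a placeholder.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a discovered_devices row (assumed fields).
#[derive(Debug, Clone, PartialEq)]
struct Device {
    open_ports: Vec<u16>,
    os_hint: String,
}

/// Keyed by (site_id, ip_address), mirroring the table's unique constraint.
type DeviceTable = HashMap<(String, String), Device>;

/// Inserts new devices and overwrites existing ones; returns the number of new rows.
fn upsert_scan_results(
    table: &mut DeviceTable,
    site_id: &str,
    found: Vec<(String, Device)>,
) -> usize {
    let mut new_devices = 0;
    for (ip, device) in found {
        // insert() returns None only when the key was absent, i.e. a new device.
        if table.insert((site_id.to_string(), ip), device).is_none() {
            new_devices += 1;
        }
    }
    new_devices
}
```

Feeding the same scan results in twice returns a nonzero count the first time and 0 the second, while the table size stays the same. That is the pattern seen in the scan output once the set of reachable IPs had stabilized.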
Howard then reported that a machine on the network could not be pinged from GuruRMM and also was not found by the scanner. Root cause: the ping check uses the system ping command (ICMP), which Windows Firewall blocks by default. The discovery scanner was TCP-only — a host with all ports firewalled and ICMP blocked would be invisible to both. Fixed in agent/src/discovery/mod.rs: added ping_host() as an ICMP fallback after TCP probing. If no TCP ports respond, the scanner runs ping -n 1 -w 500 <ip> (Windows) or ping -c 1 -W 1 <ip> (Linux). Hosts that respond to ICMP but have no open TCP ports now appear in discovery with open_ports: [] and os_hint: unknown. Committed as fcf5833.
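A shell-based fallback along these lines might look like the following. It is a sketch built from the flags quoted above, not a copy of the committed code: ping_args is a hypothetical helper, and the exact signature and error handling of the real ping_host() are assumptions.

```rust
use std::net::IpAddr;
use std::process::{Command, Stdio};

/// Builds the platform-specific argument list for a single echo request.
/// (Hypothetical helper, split out so the flags can be checked without a network.)
fn ping_args(ip: IpAddr) -> Vec<String> {
    let ip = ip.to_string();
    if cfg!(windows) {
        // Windows: -n is the request count, -w the timeout in milliseconds.
        vec!["-n".into(), "1".into(), "-w".into(), "500".into(), ip]
    } else {
        // Linux: -c is the request count, -W the timeout in seconds.
        vec!["-c".into(), "1".into(), "-W".into(), "1".into(), ip]
    }
}

/// Returns true if the system `ping` binary exits successfully for `ip`.
/// Any failure to launch the process is treated as "host did not respond".
fn ping_host(ip: IpAddr) -> bool {
    Command::new("ping")
        .args(ping_args(ip))
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}
```

In the scanner this would be called only when the TCP probe returns no open ports, so a responding host is recorded with open_ports: [] and os_hint: unknown. One caveat worth verifying on Windows: the exit code alone may not always distinguish a real echo reply from a "destination host unreachable" response, so checking the output text could be more robust.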
Key Decisions
- Discovery does NOT auto-install agents — the "deploying" status flag exists in the UI and DB but there is no actual push-install mechanism. This is a future P2 feature. Clearly communicated to Howard.
- Scheduled scans not implemented — the UI shows Daily/Weekly options but the backend scheduler is not wired up. On-demand only. Roadmap P2.
- Discovery is not AI-driven — once the node is configured (one-time setup through the dashboard UI), scans are triggered by button click or (future) schedule. No AI involvement at runtime. Howard confirmed this was his expectation.
- ICMP fallback uses shell ping, not raw sockets — raw ICMP sockets require elevated privileges and are blocked on Windows without manifest changes. Shell ping binary approach matches the existing checks.rs pattern and works within the agent's current privilege level.
- Stale scan timeout set to 10 minutes — conservative enough to not expire legitimate scans on large subnets, aggressive enough to clean up disconnected-agent orphans before the next triggered scan.
- HTTP 409 for concurrent scan guard — standard REST conflict code; the dashboard's toast error handling will display the message to the user.
Problems Encountered
- Push rejected (twice) — Mike had pushed between commit and push during both fixes. Resolved both times with git pull --rebase origin main && git push origin main.
Configuration Changes
- projects/msp-tools/guru-rmm/server/src/db/discovery.rs — added expire_stale_scans() and has_running_scan() functions
- projects/msp-tools/guru-rmm/server/src/api/discovery.rs — wired both into trigger_scan(): expire stale scans before each trigger, block if already running (HTTP 409)
- projects/msp-tools/guru-rmm/agent/src/discovery/mod.rs — added ping_host() ICMP fallback; updated comment in run_scan to reflect fallback logic
- projects/msp-tools/guru-rmm (submodule) — advanced to fcf5833
Credentials & Secrets
None created or modified this session.
Infrastructure & Servers
- GuruRMM dashboard: https://rmm.azcomputerguru.com
- GuruRMM server: 172.16.3.30:3001
- Discovery test site: eeb5f001-447b-4c1e-adc8-e18db2be9b5b
- Discovery node agent: WIN-TG2STMODJG8 (eee9f26d-0dbc-4b8e-8e42-3a901b4ff73a), online
- Subnet scanned: 172.16.0.0/23
Commands & Outputs
# Authenticate
POST /api/auth/login { email, password } → JWT token
# Get agents at site
GET /api/agents?site_id=eeb5f001-447b-4c1e-adc8-e18db2be9b5b
# → 10 agents; WIN-TG2STMODJG8 online
# Get suggested subnets
GET /api/agents/eee9f26d.../discovery/subnets
# → ["172.16.0.0/23"]
# Configure discovery node
POST /api/agents/eee9f26d.../discovery { site_id, ip_ranges: ["172.16.0.0/23"], ... }
# → node created
# Trigger scan
POST /api/sites/eeb5f001.../discovery/scan
# → scan_id: 6c25d374-..., status: initiated
# Completed in ~16s, devices_found: 4, new_devices: 4
# Second scan (dedup confirmation)
# → devices_found: 9, new_devices: 0 (stable, no duplicates)
# gururmm commits
c6f1f73 fix(discovery): add scan timeout and in-progress guard
fcf5833 fix(discovery): add ICMP ping fallback for TCP-silent hosts
Pending / Incomplete Tasks
- Discovery auto-deploy (P2): Not built. Needs a mechanism to push the agent installer to a discovered Windows machine — likely SMB + PSExec or WMI with tech-provided credentials. Would be a significant new feature.
- Discovery scheduling (P2): UI has Daily/Weekly options but backend scheduler not implemented. Needs a background task on the server.
- New agent build + deploy needed: Both discovery fixes are in the codebase but won't take effect on the agent (ICMP fallback) or server (timeout + concurrency guard) until the next build is deployed to 172.16.3.30.
- SPEC-012 implementation: Sortable table headers, 4h estimate, no blockers.
- SPEC-013 (P3): Deferred — revisit after file transfer (P2) ships.
- SPEC-014 follow-up (Mike's): Policy tab UI for watch rules; push rules to agent on connect.
- Cascades pending migration: Ashley Jensen folder redirect, RECEPTIONIST-PC drives, NURSESTATION-PC HIPAA GPO, Nurses credential vault, Phase 3 domain joins, Entra Connect OU expansion, M365 relicensing (time-sensitive).
Reference Information
- Discovery node agent: WIN-TG2STMODJG8 — eee9f26d-0dbc-4b8e-8e42-3a901b4ff73a
- Discovery test site ID: eeb5f001-447b-4c1e-adc8-e18db2be9b5b
- Scan timeout fix commit: c6f1f73 (server: expire_stale_scans 10min + 409 on concurrent)
- ICMP fallback commit: fcf5833 (agent: ping_host() fallback in discovery/mod.rs)
- Default scan ports: 22, 80, 135, 443, 445, 3389, 5985, 9100, 161
- Scan concurrency: 50 (configurable), timeout_ms: 500 per connection
- discovery_nodes table: agent_id PK, scan_config JSONB
- discovered_devices table: UNIQUE (site_id, ip_address) — cumulative across scans
- discovery_scans table: status IN ('running','completed','failed')
- Syncro note: POST /tickets/{id}/comments and POST /tickets/{id}/invoice return 404 for large-format ticket IDs. Workaround: POST /invoices (top-level) works. Comments require GUI.