Session log: GuruRMM 4-bug fix + MSP360 backup integration 2026-05-19
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
54
session-logs/2026-05-19-gururmm-backup-fixes.md
Normal file
@@ -0,0 +1,54 @@
# Session Log: GuruRMM — Bug Fixes & MSP360 Backup Integration
**Date:** 2026-05-19
**Duration:** ~3 hours
## User
- **User:** Mike Swanson (mike)
- **Machine:** DESKTOP-0O8A1RL
- **Role:** admin
## Summary
Two areas of work: fixed 4 agent/server bugs identified from AD2's crash loop, then diagnosed and fixed the MSP360 backup integration, which had never been configured.
## Part 1: 4-Bug Fix (v0.6.25)
Investigated why AD2's RMM agent was crash-looping and why the watchdog never fired. Root cause: agent 0.6.22/0.6.23 sent `user_inventory_report` WS messages the server couldn't deserialize. Also found a 48-minute update gap where the 30s grace period was too short for a Windows Defender scan of the new binary.
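A minimal sketch of the timing problem, not the server's actual code: with a flat 30-second grace period, a 48-minute silence during an update always reads as offline, while a longer window applied only while an update is in progress does not. The function name, the constants, and the boolean flag are assumptions for illustration; the two-hour ceiling is the value the fix adopted.

```rust
use std::time::Duration;

// Hypothetical constants: 30 s is the old grace period, 2 h the new ceiling.
const BASE_GRACE: Duration = Duration::from_secs(30);
const UPDATE_GRACE: Duration = Duration::from_secs(2 * 60 * 60);

/// Decide whether an agent that has been silent for `silent_for`
/// should be marked offline. `update_in_progress` stands in for
/// whatever the server learns by polling `agent_updates`.
fn should_mark_offline(silent_for: Duration, update_in_progress: bool) -> bool {
    let limit = if update_in_progress {
        UPDATE_GRACE
    } else {
        BASE_GRACE
    };
    silent_for > limit
}
```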
### Bugs fixed (commits 56723b1, 2a7b74b):
1. **Grace period too short during updates** — extended to poll `agent_updates` for up to 2 hours before marking agent offline
2. **AgentMessage unknown variant crash** — silently skips unknown WS message types (forward-compat); previously crashed the WS handler
3. **WatchdogEvent not persisted** — WatchdogEvent messages now written to `watchdog_events` DB table
4. **Watchdog never started** — `ensure_watchdog_running()` was implemented but never called from `run_agent()`; `agent-id.txt` sidecar (required by `post_watchdog_alert`) was never written after WS auth
5. **Reviewer notes** (commit 2a7b74b): `has_in_progress_update` NULL gap fixed; `warn!` on WatchdogEvent DB insert failure
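The forward-compat behaviour in bug 2 can be sketched as follows. This is an illustration only: the `type:payload` frame format, the enum variants, and both function names are hypothetical, and the real handler presumably deserializes structured WS messages rather than splitting strings. The point is that an unrecognized message type yields `None` and is skipped instead of aborting the handler.

```rust
#[derive(Debug, PartialEq)]
enum AgentMessage {
    Heartbeat,
    WatchdogEvent { detail: String },
}

/// Parse one frame in the hypothetical `type:payload` format.
/// Unknown types return `None` rather than an error.
fn parse_message(frame: &str) -> Option<AgentMessage> {
    let (kind, payload) = frame.split_once(':').unwrap_or((frame, ""));
    match kind {
        "heartbeat" => Some(AgentMessage::Heartbeat),
        "watchdog_event" => Some(AgentMessage::WatchdogEvent {
            detail: payload.to_string(),
        }),
        // A newer agent may send types this server does not know yet.
        _ => None,
    }
}

/// Keep the messages the server understands and drop the rest.
fn handle_frames(frames: &[&str]) -> Vec<AgentMessage> {
    frames.iter().filter_map(|f| parse_message(f)).collect()
}
```

With this shape, a `user_inventory_report` frame from a newer agent is dropped and the frames around it are still processed.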
## Part 2: MSP360 Backup Integration
Backup tab on AD2 showed nothing. Root cause chain:
1. `mspbackups_config` was empty — API credentials never configured. Fixed: loaded credentials from vault (`msp-tools/msp360-api.sops.yaml`), configured via API.
2. `POST /api/mspbackups/config` failed with `partner_id` NOT NULL violation — handler was passing `None`. Fixed in commit `3b29acc`.
3. Build pipeline only builds agents, not server. Discovered `build-server.sh` at `/opt/gururmm/build-server.sh`.
4. SOPS vault file had unquoted YAML timestamp (`created: 2026-05-18T00:00:00Z`) causing `time.Time` walk error. Fixed by quoting it in the raw YAML.
5. MSP360 `/api/Monitoring` returns `null` for `LastStart`/`NextStart` on 14 records — struct had `String` not `Option<String>`. Fixed in commit `91630cb`.
6. Hostname match picked offline phantom AD2 agent (f6a99fe7, crash-loop duplicate) instead of online agent (49c66d8b). Fixed in commit `86e7ade`: `find_agent_by_hostname_ci` now orders by `status='online'` first.
7. `last_backup_at`/`next_backup_at` stored as NULL — MSP360 dates lack timezone (`2026-05-19T07:00:04`, not RFC3339). Fixed in commit `f146bd9`: fallback parser treats naive timestamps as UTC.
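The online-first preference from item 6 can be illustrated in memory. This is a sketch under assumptions: the `Agent` struct and `pick_agent_for_hostname` are hypothetical stand-ins, and the real `find_agent_by_hostname_ci` presumably does the equivalent ordering in its database query. The two agent IDs are the ones from this session.

```rust
#[derive(Debug)]
struct Agent {
    id: String,
    hostname: String,
    online: bool,
}

/// Case-insensitive hostname match that prefers an online agent
/// when several agents share the same hostname.
fn pick_agent_for_hostname<'a>(agents: &'a [Agent], hostname: &str) -> Option<&'a Agent> {
    agents
        .iter()
        .filter(|a| a.hostname.eq_ignore_ascii_case(hostname))
        // `false < true`, so negating `online` ranks online agents first.
        .min_by_key(|a| !a.online)
}
```

Given the phantom `f6a99fe7` (offline) and the live `49c66d8b` (online), both named AD2, the lookup returns `49c66d8b` regardless of their order in the slice.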
### Result
AD2 backup tab now shows: `status: success`, last backup `2026-05-19T07:00:04Z`, next `2026-05-20T07:00:00Z`, plan `AD2 Image`, 6 files, ~355 GB. Syncs every 15 minutes.
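The `Z`-suffixed times in the result come from treating MSP360's zone-less timestamps as UTC (item 7). A std-only sketch of that fallback idea, assuming a hypothetical helper name; it only checks for a zone designator and does not validate the date itself, and the actual fix may well use a date-time library instead.

```rust
/// Return the timestamp with a UTC designator, appending `Z` when the
/// input carries no zone information. Returns `None` if there is no
/// `T` separating the date from the time.
fn normalize_naive_as_utc(raw: &str) -> Option<String> {
    let raw = raw.trim();
    let (_date, time) = raw.split_once('T')?;
    // A zone shows up in the time part as `Z`, `+hh:mm`, or `-hh:mm`.
    let has_zone = time.ends_with('Z')
        || time.ends_with('z')
        || time.contains('+')
        || time.contains('-');
    if has_zone {
        Some(raw.to_string())
    } else {
        Some(format!("{}Z", raw))
    }
}
```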
## Server Builds (manual — not triggered by agent pipeline)
- `sudo /opt/gururmm/build-server.sh` — used for all server-only deploys
- Server binary at `/opt/gururmm/gururmm-server`, service: `gururmm-server`
## Commits (gururmm repo)
- `56723b1` — fix: 4-bug fix (grace period, AgentMessage forward-compat, WatchdogEvent, watchdog start)
- `2a7b74b` — fix: reviewer notes (NULL gap, warn! on watchdog event)
- `3b29acc` — fix: mspbackups config partner_id lookup
- `91630cb` — fix: handle null LastStart/NextStart in MSP360 BackupPlan
- `86e7ade` — fix: prefer online agent in MSP360 hostname match
- `f146bd9` — fix: parse MSP360 no-timezone dates as UTC
## Anti-Pattern Added
`build-server.sh` is separate from `build-agents.sh`. Server code changes require a manual `sudo /opt/gururmm/build-server.sh` after pushing to Gitea.