sync: auto-sync from GURU-5070 at 2026-05-25 09:33:09

Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-05-25 09:33:09
2026-05-25 09:33:13 -07:00
parent f2ece8eccf
commit 3de9c16743

@@ -584,6 +584,102 @@ git push origin main
---
## Update: 09:32 PT — SPEC-007 production deployment
### User
- **User:** Mike Swanson (mike)
- **Machine:** GURU-5070
- **Role:** admin
- **Session span:** 2026-05-25 ~09:15-09:32 PT
### Session Summary
Deployed SPEC-007 (OS recognition) to production. Before executing, read the build-server.sh script from the server to understand the deployment procedure. The script header notes that new migrations require `cargo sqlx prepare` to be run and committed before building, since SQLX_OFFLINE=true is used. Checked whether the coding agent had updated the `.sqlx` offline cache — it had not.
SSHed to 172.16.3.30 to assess actual state. Discovered that migration 045 was already applied (installed_on: 2026-05-25 15:46 UTC) and the server binary had already been rebuilt and deployed (v0.3.12, binary modified at 16:17 UTC). Confirmed via build log: `build-server.sh` had run and succeeded with "Server build complete: v0.3.12" at 16:17 UTC. This happened because the Gitea webhook triggered the build pipeline on our push, and the pipeline rebuilt the server (not just the agents) — and since the new queries in `inventory.rs` used `sqlx::query()` (not `sqlx::query!()` compile-time macros), SQLX_OFFLINE=true did not cause a compile failure. The server auto-runs `sqlx::migrate!()` on startup, which applied migration 045 cleanly.
Verified the API was returning `os_name` correctly by authenticating via vault credentials and calling `GET /api/agents`. Results showed proper friendly names: "Windows Server 2022 Datacenter" (NEPTUNE), "Windows Server 2019 Standard" (PLUTO), "Windows 11 Pro" (GURU-5070), "Ubuntu 22.04.5 LTS" (gururmm), "Debian GNU/Linux 12 (bookworm)" (Jupiter), "CloudLinux 9.7 (Pavel Popovich)" (ix.azcomputerguru.com). Built and deployed the dashboard: `npm run build` on the server (11.57s), then `rsync` to `/var/www/gururmm/dashboard/`. Dashboard nginx confirmed serving new build (assets timestamped 16:24 UTC). Final fleet check: 38/57 agents with `os_name` populated; 19 remain null pending their next inventory cycle (dashboard falls back to `os_type` for those).
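The fleet check above can be repeated from a saved `GET /api/agents` response. The following is a minimal sketch, assuming the response is JSON in which unset values are serialized as `"os_name":null` (rather than omitted); `os_name_coverage` is a hypothetical helper name, and a real JSON parser would be more robust than pattern matching.

```bash
# Read an /api/agents JSON response on stdin and report how many agents
# have os_name populated versus null.
# Assumption: missing values appear as "os_name":null, not as an absent key.
os_name_coverage() {
  local body populated missing
  body=$(cat)
  # grep -o prints one match per line; "|| true" keeps a zero-match case
  # from failing the pipeline under "set -o pipefail".
  populated=$(printf '%s\n' "$body" | { grep -Eo '"os_name"[[:space:]]*:[[:space:]]*"' || true; } | wc -l)
  missing=$(printf '%s\n' "$body" | { grep -Eo '"os_name"[[:space:]]*:[[:space:]]*null' || true; } | wc -l)
  echo "populated=$((populated)) null=$((missing)) total=$((populated + missing))"
}

# Usage (TOKEN obtained as in the session):
# curl -s http://172.16.3.30:3001/api/agents -H "Authorization: Bearer $TOKEN" | os_name_coverage
```

Note that an empty string (`"os_name":""`) would be counted as populated by this sketch.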
### Key Decisions
- **Did not re-run cargo sqlx prepare:** The coding agent used `sqlx::query()` (not `sqlx::query!()`) for the new UPDATE — no compile-time validation needed, SQLX_OFFLINE=true was not an issue. Verified by confirming the build succeeded.
- **Did not apply migration manually:** Server auto-runs `sqlx::migrate!()` on startup (line 118 of main.rs). Migration 045 was applied by the build pipeline's server restart at 15:46 UTC. No manual psql intervention needed.
- **Did not run build-server.sh manually:** It had already run via the webhook pipeline. Running it again would have been redundant and caused unnecessary downtime.
- **Confirmed working before dashboard deploy:** Verified API response included `os_name` field with correct values before touching the dashboard, to confirm the server layer was solid.
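The first decision above depends on whether the changed source uses sqlx's compile-time macros. A rough way to check before deciding on `cargo sqlx prepare` might look like the sketch below. It is a text heuristic only, and `uses_sqlx_macros` is a hypothetical helper name; it also matches macro calls that appear inside comments.

```bash
# Return 0 if the given Rust file appears to call sqlx compile-time macros
# (query!, query_as!, query_scalar!, query_file!, ...). Those are the calls
# that need the .sqlx offline cache when SQLX_OFFLINE=true.
# A plain sqlx::query("...") call has no "!" and does not match.
uses_sqlx_macros() {
  grep -Eq '(^|[^A-Za-z0-9_])query(_[a-z]+)*![[:space:]]*\(' "$1"
}

# Usage (the path is illustrative; the log only names inventory.rs):
# if uses_sqlx_macros path/to/inventory.rs; then
#   echo "compile-time macros found: run cargo sqlx prepare and commit .sqlx"
# fi
```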
### Problems Encountered
- **`psql` peer auth failure:** Running `psql -U gururmm -d gururmm` on the server fails with "Peer authentication failed" — must use full connection string `psql postgres://gururmm:PASSWORD@localhost:5432/gururmm`. Not a new issue; connection string approach worked.
- **Dashboard HTTPS 403 from server-side curl:** `curl https://rmm.azcomputerguru.com/` from the server returns 403 — Cloudflare bot protection blocks server-side curl. Not a real error; `curl http://localhost/dashboard/` returned 200 and confirmed correct assets.
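On the `psql` item above: peer authentication applies only to Unix-socket connections, and the connection string works because it names `localhost` and therefore connects over TCP with password authentication. A small sketch of building that URI without typing the password inline follows; `pg_uri` and the `GURURMM_DB_PASSWORD` variable are hypothetical names, not something the session used.

```bash
# Build a libpq connection URI from its parts. The password is read from
# the GURURMM_DB_PASSWORD shell variable (hypothetical name), so it does
# not need to appear in the command itself.
pg_uri() {
  local user="$1" host="$2" port="$3" db="$4"
  printf 'postgres://%s:%s@%s:%s/%s\n' \
    "$user" "${GURURMM_DB_PASSWORD:?set GURURMM_DB_PASSWORD first}" \
    "$host" "$port" "$db"
}

# Usage:
# psql "$(pg_uri gururmm localhost 5432 gururmm)" -c "SELECT 1"
```

A password containing characters such as `@`, `:` or `/` would need percent-encoding before being placed in the URI; this sketch does not handle that.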
### Configuration Changes
No new files created this session. Changes were deployed to production:
- `/opt/gururmm/gururmm-server` — rebuilt binary (v0.3.12, 13.4 MB)
- `/var/www/gururmm/dashboard/assets/index-BbCznyHt.js` — new dashboard build
- `/var/www/gururmm/dashboard/assets/index-BPcJRrHX.css` — new dashboard build
- PostgreSQL `agents` table — column `os_name TEXT` added (migration 045)
- PostgreSQL `_sqlx_migrations` — row inserted for version 45
### Credentials & Secrets
Used (not newly created):
- GuruRMM API admin: `claude-api@azcomputerguru.com` + password from vault at `infrastructure/gururmm-server.sops.yaml` → `credentials.gururmm-api.admin-email` / `credentials.gururmm-api.admin-password`
- PostgreSQL gururmm: `gururmm:43617ebf7eb242e814ca9988cc4df5ad@localhost:5432/gururmm` (in CONTEXT.md and wiki)
### Infrastructure & Servers
**172.16.3.30 (gururmm-build VM):**
- Service: `gururmm-server` — active (running) since 2026-05-25 16:17:20 UTC
- Binary: `/opt/gururmm/gururmm-server` — v0.3.12, rebuilt 16:17 UTC
- Dashboard: `/var/www/gururmm/dashboard/` — deployed 16:24 UTC
- PostgreSQL `gururmm` DB: migration 045 applied 15:46 UTC
### Commands & Outputs
```bash
# Check server status + binary age
ssh guru@172.16.3.30 "stat /opt/gururmm/gururmm-server | grep Modify && systemctl status gururmm-server"
# Binary: Modify: 2026-05-25 16:17:20, Active: running since 16:17:20
# Check migration state
psql postgres://gururmm:43617ebf7eb242e814ca9988cc4df5ad@localhost:5432/gururmm \
  -c "SELECT version, description, installed_on, success FROM _sqlx_migrations ORDER BY version DESC LIMIT 5"
# version=45, description="agents os name", installed_on=2026-05-25 15:46:59 UTC, success=t
# Verify API response includes os_name
curl -s http://172.16.3.30:3001/api/agents -H "Authorization: Bearer $TOKEN"
# Sample: {"hostname":"NEPTUNE","os_type":"windows","os_name":"Windows Server 2022 Datacenter",...}
# Build dashboard
ssh guru@172.16.3.30 "cd /home/guru/gururmm/dashboard && sudo -u guru npm run build"
# built in 11.57s — dist/assets/index-BbCznyHt.js (1,267 kB)
# Deploy dashboard
ssh guru@172.16.3.30 "sudo rsync -av --delete /home/guru/gururmm/dashboard/dist/ /var/www/gururmm/dashboard/"
# sent 1,342,246 bytes at 2.6 MB/s
```
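After an `rsync --delete` deploy like the one above, one possible sanity check is that every hashed asset referenced by `index.html` actually exists in the deployed directory. This is a sketch under the assumption that the build emits an `index.html` referencing files under `assets/` (as the `dist/assets/index-BbCznyHt.js` output suggests); `verify_dashboard_assets` is a hypothetical helper name.

```bash
# Check that every assets/... file referenced by index.html exists under
# the given directory. Prints each missing path and returns non-zero if
# anything is missing.
verify_dashboard_assets() {
  local root="$1" missing=0 ref
  if [ ! -f "$root/index.html" ]; then
    echo "missing: index.html"
    return 1
  fi
  for ref in $(grep -Eo 'assets/[A-Za-z0-9._-]+' "$root/index.html" | sort -u); do
    if [ ! -f "$root/$ref" ]; then
      echo "missing: $ref"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on the server:
# verify_dashboard_assets /var/www/gururmm/dashboard && echo "assets OK"
```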
### Pending / Incomplete Tasks
- 19/57 agents have `os_name = NULL` — will populate on next inventory report cycle (no action needed)
- URGENT: Neptune SSL cert expires 2026-05-31 (6 days remaining)
- URGENT: Western Tire SSL — verify AutoSSL on IX cPanel
- HIGH: Kittle WS2025 EVAL license, no backup, no firewall
- HIGH: Kittle-Design Ken inbox rule (potential active compromise)
- MEDIUM: Seed wiki/systems/neptune.md, wiki/systems/beast.md
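The "6 days remaining" figure for the Neptune certificate is the difference between the expiry date and the session date. A minimal sketch of that arithmetic, assuming GNU `date` (its `-d` option is not available in BSD `date`); `days_between` is a hypothetical helper name.

```bash
# Whole days between two calendar dates (YYYY-MM-DD), computed in UTC.
# Requires GNU date for the -d option.
days_between() {
  local from="$1" to="$2"
  echo $(( ( $(date -u -d "$to" +%s) - $(date -u -d "$from" +%s) ) / 86400 ))
}

# Usage:
# days_between 2026-05-25 2026-05-31
```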
### Reference Information
- Server version: v0.3.12 (Cargo.toml)
- Migration: 045_agents_os_name.sql (applied 2026-05-25 15:46 UTC)
- Fleet state: 57 agents total, 40 online, 38 with os_name populated
- GuruRMM dashboard: https://rmm.azcomputerguru.com
- Build log: /var/log/gururmm-build.log (on 172.16.3.30)
- Deployment SHAs: spec=80c6b34, implementation=1c05222, rebased on 7374e8a
---
## Update: 09:20 PT — GuruRMM Ollama log analysis: socat relay + findings deserialization fix
### User