# Session Log — 2026-05-24 (GURU-KALI, continued)

## User

- **User:** Mike Swanson (mike)
- **Machine:** GURU-KALI
- **Role:** admin

> Continues `2026-05-24-session.md`. Namespaced per-machine: the shared dated log
> conflicted repeatedly today across GURU-KALI / GURU-5070 / MacBook (concurrent
> EOF appends), so this machine's later updates live here to avoid merge conflicts.
## Update: 12:16 MST — Merges live, fleet auto-update confirmed, ProtectSystem bugs, repo hygiene, straggler fix

### Session Summary

Merged the three GuruRMM PRs to production main, confirmed the fleet auto-update path end to end, fixed two ProtectSystem=strict bugs the hardened unit introduced, hardened ClaudeTools against recurring garbled-filename cruft, and cleared a stuck agent.

PRs #13 (Linux tray IPC+GTK), #14 (Phase 4 peer-cred authz + real actions), and #21 (self-update ReadWritePaths) were all merged to `azcomputerguru/gururmm` main. SSH access to Saturn (172.16.3.30) was established from GURU-KALI by generating an ed25519 key and authorizing it via the vaulted SSH password, enabling direct build-server diagnostics. The pipeline published 0.6.37 (from #13/#14) and GURU-KALI auto-updated 0.6.29 -> 0.6.37 (130 MB dev build replaced by the 4 MB signed release), confirming the full chain.

Two ProtectSystem=strict bugs surfaced on GURU-KALI's recently hardened unit: (1) the IPC socket needed RuntimeDirectory=gururmm (fixed in #13), and (2) the self-updater could not write the backup (/etc/gururmm) or the replaced binary (/usr/local/bin), failing with "Failed to backup current binary". Fixed by widening ReadWritePaths to `/var/log /usr/local/bin /etc/gururmm` (GURU-KALI unit patched directly; template fix = Issue #20 + PR #21, merged).
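The two unit changes described above can be captured as a systemd drop-in. What follows is a minimal sketch, not the template from PR #21: the helper name `write_gururmm_dropin`, the file name `10-self-update.conf`, and the use of a drop-in instead of editing the unit in place are assumptions made for illustration. Only the two directive values come from this log.

```bash
# Hypothetical helper: writes a drop-in holding the two directives from this log.
# Usage: write_gururmm_dropin <drop-in-dir>
write_gururmm_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/10-self-update.conf" <<'EOF'
[Service]
# IPC socket directory under /run (the fix from PR #13)
RuntimeDirectory=gururmm
# The updater backs up into /etc/gururmm and replaces the binary in /usr/local/bin
ReadWritePaths=/var/log /usr/local/bin /etc/gururmm
EOF
}

# Example (unit name taken from this log; needs root):
#   write_gururmm_dropin /etc/systemd/system/gururmm-agent.service.d
#   systemctl daemon-reload && systemctl restart gururmm-agent
```

Listing both `/usr/local/bin` and `/etc/gururmm` matters because the backup and the binary replacement touch different directories, which is why the first attempt with `/usr/local/bin` alone still failed.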
In ClaudeTools, removed 5 garbled tracked files (Windows paths stored with Unicode PUA substitutes) and added a purge guard to sync.sh, plus a PreToolUse(Bash) hook (block-backslash-winpath.sh) that blocks redirects to backslashed Windows paths. Those redirects are the root cause: Git Bash strips the backslashes and PUA-substitutes ':'. jq was installed on GURU-KALI (the hooks need it).

The "fleet remediation" turned out to be unnecessary: only GURU-KALI had the hardened unit; Saturn/ix/Jupiter run old ProtectSystem=false units and updated fine. SL-SERVER (Scileppi, dead 5 months, now a Synology) was deleted from the RMM. The #21 merge build was debounced (a concurrent auto-bump build held the lock), so an empty commit (fee5d7e) re-fired the webhook -> building 0.6.38 with the fix. Finally, the one online straggler RECEPTIONIST-PC (Cascades, stuck on 0.6.29 due to a flaky WebSocket that misses auto-dispatch windows) was force-updated via the API to 0.6.37.
### Key Decisions

- **Fixed my own SSH access** (per user instruction) via ed25519 key + vaulted password rather than asking for credentials — gives durable key auth to Saturn for build diagnostics.
- **ReadWritePaths needs BOTH /usr/local/bin AND /etc/gururmm** — verified from the updater code (backup -> config_dir=/etc/gururmm, replace -> /usr/local/bin); the /usr/local/bin-only first attempt still failed.
- **Did NOT push a fleet remediation** — investigation showed only GURU-KALI had the broken hardened unit; the others (ProtectSystem=false) self-update fine. Pushing a needless restart to client servers was avoided.
- **Backslash hook ignores quoted strings** — it quote-strips the command first so the pattern inside commit messages / echoes doesn't false-trigger (the hook caught its own commit message during testing); dropped a second check that re-introduced the false positive.
- **Empty commit to re-trigger the debounced build** (vs manual build-agents.sh) — keeps it pipeline-driven and traceable; auto-bump then correctly produced 0.6.38.
- **Force-triggered RECEPTIONIST-PC** rather than waiting — confirmed the update mechanism works for it and that the real issue is WS instability, not policy/server.
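The quote-stripping decision for the backslash hook can be illustrated with a small check. This is a sketch under assumptions, not the contents of `block-backslash-winpath.sh`: the function name `has_backslash_winpath_redirect` and the exact regular expression are hypothetical, and how the real hook receives the command and signals a block is not recorded in this log.

```bash
# Hypothetical check: returns 0 (flag it) when an unquoted redirect targets a
# backslashed Windows path such as  > C:\dir\file
has_backslash_winpath_redirect() {
    # Drop '...' and "..." spans first, so quoted text (commit messages,
    # echo arguments) cannot trigger the match.
    stripped=$(printf '%s' "$1" | sed -e "s/'[^']*'//g" -e 's/"[^"]*"//g')
    # Match > or >> followed by optional whitespace, a drive letter, ':' and '\'.
    printf '%s' "$stripped" | grep -Eq '>{1,2}[[:space:]]*[A-Za-z]:\\'
}

# Flagged: has_backslash_winpath_redirect 'echo hi > C:\ProgramData\gp_user.txt'
# Ignored: has_backslash_winpath_redirect 'git commit -m "block > C:\path redirects"'
```

Stripping quoted spans before matching trades some coverage for fewer false positives: a redirect whose target is itself quoted would pass this check, which is one reason to treat it as a sketch only.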
### Problems Encountered

- **Session log had committed merge-conflict markers** (multiple machines appended 2026-05-24-session.md). Resolved by removing the 3 marker lines (union of both sides kept).
- **kill -0 false "zombie lock"** — checking a root-owned build PID as `guru` returned permission-denied, misread as dead. The build was alive (waiting on Pluto). Corrected; no lock was cleared.
- **GURU-KALI auto-update failed twice** ("Failed to backup current binary") under ProtectSystem=strict — fixed by widening ReadWritePaths (both paths).
- **#21 merge produced no rollout** — its webhook was debounced by a concurrent build; the published 0.6.37 lacked the fix (confirmed via `strings` on the binary). Re-triggered with an empty commit -> 0.6.38.
- **pkill aborting compound bash commands** (exit 144) repeatedly — ran affected steps separately.
- **Foreground `sleep` blocked by the harness** mid-poll — switched to immediate checks / background watchers.
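The `kill -0` misread above comes from the fact that `kill -0` fails both when the process does not exist and when the caller lacks permission to signal it. A minimal Linux-only sketch of a liveness check that does not depend on ownership follows; the function name `pid_alive` is hypothetical and the `/proc` fallback is an assumption about the host, not something the build script is known to do.

```bash
# Returns 0 if the PID exists, even when it belongs to another user.
pid_alive() {
    # Succeeds only if the process exists AND we are allowed to signal it.
    kill -0 "$1" 2>/dev/null && return 0
    # Ownership-independent fallback: every live PID has a /proc entry on Linux.
    [ -d "/proc/$1" ]
}

# As a non-root user, `kill -0 1` fails with "Operation not permitted",
# yet `pid_alive 1` still reports the process as alive.
```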
### Configuration Changes (this update)

GuruRMM (`azcomputerguru/gururmm`):
- PR #21 merged: `agent/src/main.rs` systemd unit template ReadWritePaths -> `/var/log /usr/local/bin /etc/gururmm` (merge 175b7f5).
- Empty commit `fee5d7e` to re-trigger the build (0.6.38).

ClaudeTools (committed a3c7064, 6d065cf):
- Removed 5 garbled tracked files (C:\ProgramData\gp_user.txt, D:\...\current-mode, 3 script fragments).
- `.claude/scripts/sync.sh` — added `purge_garbled_paths()` before each `git add -A`.
- `.claude/hooks/block-backslash-winpath.sh` (new) — PreToolUse(Bash) guard.
- `.claude/settings.json` — wired the PreToolUse hook (matcher Bash).
- `.claude/CLAUDE.md` — Windows mode-write note.
- `.claude/machines/guru-kali.md` — Rust, GTK build libs, passwordless sudo, gururmm clone, jq, enrolled-agent note.

Machine-level (GURU-KALI):
- `~/.ssh/id_ed25519` (new keypair, no passphrase) authorized on guru@172.16.3.30.
- jq 1.8.1 installed (apt).
- `/etc/systemd/system/gururmm-agent.service` — ReadWritePaths widened + RuntimeDirectory (applied directly); agent now on 0.6.37.

RMM data:
- Deleted agent SL-SERVER (id 2585f6d5-3887-412e-a586-1dec030f0a40).
- Force-update RECEPTIONIST-PC (9c91d324...) 0.6.29 -> 0.6.37.
### Credentials & Secrets (this update)

- **GURU-KALI SSH key**: `~/.ssh/id_ed25519` (ed25519, no passphrase). Public key:
  `ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIHd5ZblkziRIOI+57C4y7OkV3DvxlqmAe7VyBgPIYsyy guru@GURU-KALI`
  Authorized in `guru@172.16.3.30:~/.ssh/authorized_keys`.
- All other creds used are vaulted (paths only, no values transcribed):
  - Saturn SSH/sudo password: `infrastructure/gururmm-server.sops.yaml` -> `credentials.password`.
  - RMM API admin: same file -> `credentials.gururmm-api.admin-email` (claude-api@azcomputerguru.com) / `credentials.gururmm-api.admin-password`.
  - Gitea API token: `services/gitea.sops.yaml` -> `credentials.api.api-token`.
### Infrastructure & Servers (this update)

- **Saturn** `172.16.3.30` = hostname `gururmm`, Ubuntu 22.04 (kernel 5.15) — build server + RMM API (`:3001`) + PostgreSQL `gururmm` + ClaudeTools MariaDB + coord API (`:8001`). SSH key auth (guru) now works.
- **Pluto** `172.16.3.36` (Administrator) — Windows build + Authenticode signing; build-agents.sh SSHes to it (5 cargo targets + WiX MSI). Reachable.
- **Build pipeline**: webhook (`172.16.3.30:9000` / `/webhook/build`) -> `/opt/gururmm/build-agents.sh`; lock `/var/run/gururmm-build.lock`; log `/var/log/gururmm-build.log`; artifacts `/var/www/gururmm/downloads/` (e.g. `gururmm-agent-linux-amd64-<ver>`, `-latest` symlink). Per-component auto-version bump.
- **Auto-update**: server scanner every 300s; dispatched over WebSocket on heartbeat; gated on effective_policy `auto_update` (default on when policy null). Linux config_dir `/etc/gururmm` (backup + pending-update.json); binary `/usr/local/bin/gururmm-agent`; temp download via PrivateTmp.
- **RECEPTIONIST-PC** (Cascades of Tucson / site CascadesTucson): agent `9c91d324-1073-449c-8cc0-45c5bccfc218`, Windows 11 (22631) amd64. Flaky WS ("Connection reset without closing handshake"), ~3 connects/6h.
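The debounce seen with the #21 merge (a second trigger arriving while another build held the lock) can be reproduced with a non-blocking file lock. This is a sketch only: whether `build-agents.sh` uses `flock` is not recorded in this log, and the function name `run_exclusive` and exit code 75 are assumptions chosen for illustration.

```bash
# Runs a command only if the lock file is free; otherwise reports and returns 75.
# Usage: run_exclusive <lock-file> <command> [args...]
run_exclusive() {
    lock="$1"; shift
    (
        # Non-blocking exclusive lock on fd 9; released when the subshell exits.
        flock -n 9 || { echo "lock busy: $lock" >&2; exit 75; }
        "$@"
    ) 9>"$lock"
}

# Example: a trigger that arrives mid-build is skipped instead of queued.
#   run_exclusive /tmp/demo-build.lock sleep 30 &
#   run_exclusive /tmp/demo-build.lock echo "second build"   # -> lock busy
```

A skipped trigger leaves no build behind it, which matches the log: the merge produced no rollout until the empty commit fired the webhook again.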
### Commands & Outputs (this update)

```bash
# SSH access self-fix
ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519 -C "guru@GURU-KALI"
SSHPASS="$(vault get-field infrastructure/gururmm-server.sops.yaml credentials.password)" \
  sshpass -e ssh-copy-id -i ~/.ssh/id_ed25519.pub guru@172.16.3.30   # Number of key(s) added: 1

# RMM API (auth + enumerate + force update)
TOKEN=$(curl -s -X POST $API/auth/login -d '{"email":..,"password":..}' | jq -r .token)
curl -s $API/agents -H "Authorization: Bearer $TOKEN" | jq ...        # 56 -> 55 agents
curl -s -X DELETE $API/agents/2585f6d5-... -H "Authorization: Bearer $TOKEN"   # HTTP 204 (SL-SERVER)
curl -s -X POST $API/agents/9c91d324-.../update -H "Authorization: Bearer $TOKEN"
#   -> {"success":true,"target_version":"0.6.37","message":"Update triggered: 0.6.29 -> 0.6.37"}
#   server: reconnected after update: 0.6.29 -> 0.6.37 (landed in ~37s despite a WS reset)

# Build re-trigger (debounced merge)
git commit --allow-empty -m "chore: re-trigger agent build ..." && git push origin main   # fee5d7e
#   build log: "Agent: 0.6.37 -> 0.6.38" / "Building version: 0.6.38"

# Session-log conflict cleanup
sed -i '228d;451d;534d' session-logs/2026-05-24-session.md   # removed <<<<<<< ======= >>>>>>>
```
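The conflict cleanup above deleted markers by line number, which only works for that one revision of the file. A pattern-based variant is sketched below as an alternative; the function name `strip_conflict_markers` is hypothetical, and it assumes the default 7-character markers and that keeping both sides (the union) is the desired resolution, as it was here.

```bash
# Prints the file with Git conflict-marker lines removed, keeping both sides.
# Caveat: a line consisting of exactly "=======" is treated as a marker.
strip_conflict_markers() {
    sed -e '/^<<<<<<< /d' -e '/^=======$/d' -e '/^>>>>>>> /d' "$1"
}

# Example: review the result before replacing the original.
#   strip_conflict_markers session-logs/2026-05-24-session.md > /tmp/cleaned.md
```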
### Pending / Incomplete Tasks (this update)

- **Fleet watcher running** (`/tmp/gururmm-fleet-watch.sh`, ~60 min): waits for 0.6.38 to publish, then confirms fleet convergence; will report + flag laggards.
- **RECEPTIONIST-PC WS instability** (Cascades site) — durable fix pending; it will likely lag 0.6.38 too unless its WS is up. Force-trigger again as needed, or investigate the site firewall/NAT killing the long-lived WebSocket.
- **Open GuruRMM issues**: #15 pipeline tray build, #16 Windows IPC peer authz, #17 logind console user, #18 macOS tray, #19 subscriber broadcast. (#20 closed by #21.)
- **Session-log multi-machine conflicts**: 2026-05-24-session.md conflicted across machines; consider per-machine namespacing for same-day logs.
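Deciding whether a laggard needs another force-trigger comes down to comparing its reported version with the latest published one. A minimal sketch follows, assuming GNU `sort -V` is available; the function name `version_lt` is hypothetical, and the commented `curl` line simply repeats the force-update call recorded earlier in this log.

```bash
# Returns 0 when version $1 is older than version $2 (dotted numeric versions).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example: force-update only when the agent is behind the latest release.
#   if version_lt "$agent_version" "$latest_version"; then
#       curl -s -X POST $API/agents/<id>/update -H "Authorization: Bearer $TOKEN"
#   fi
```

Version sort is used instead of plain string comparison because a string comparison would rank 0.6.9 above 0.6.37.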
### Reference Information (this update)

- GuruRMM merges: PR #13, #14, #21 (#21 commit 72b8510, merge 175b7f5; closed Issue #20). Re-trigger empty commit fee5d7e. Building 0.6.38.
- Versions: 0.6.37 published/fleet; 0.6.38 building.
- Agent ids: GURU-KALI `a73ba38e-cd02-4331-b8bf-474cd899ec22`, Saturn `8cd0440f-a65c-4ed2-9fa8-9c6de83492a4`, RECEPTIONIST-PC `9c91d324-1073-449c-8cc0-45c5bccfc218`. Deleted SL-SERVER `2585f6d5-3887-412e-a586-1dec030f0a40`.
- ClaudeTools commits: a3c7064 (garbled cleanup + sync.sh guard), 6d065cf (backslash hook + CLAUDE.md note).
- Issues/PRs: https://git.azcomputerguru.com/azcomputerguru/gururmm/issues/{15..19}, /pulls/{13,14,21}.

---
## Update: 12:33 MST — fleet converged 0.6.38; laggards noted in coord

The re-triggered build published 0.6.38 and the fleet converged to 37/39 online on
0.6.38 within ~10 min (1 -> 32 -> 36 -> 37). Two laggards remain one version back on
0.6.37 (flaky WebSockets that miss the auto-update dispatch window; mechanism works,
force-update lands when WS is up):
- BB-SERVER (BirthBiologic) id 6c02baa7-0f1c-4990-b466-c9ab9eaefd3b
- RECEPTIONIST-PC (Cascades of Tucson) id 9c91d324-1073-449c-8cc0-45c5bccfc218

Noted for future rechecks in the coord messenger (to=mike, project=gururmm, msg id
a254202a-aa33-4736-ba9c-cd5678dbef58): recheck their versions; if still behind latest,
force-update via POST /api/agents/{id}/update; durable fix = investigate the client-site
firewall/NAT resetting the long-lived WebSocket. All watchers finished; none running.

---
## Update: 17:55 MST — Xfce lock-screen disabled + machine-doc note + community how-to published

### Session Summary

Disabled the lock screen on idle/screen-timeout on GURU-KALI (Kali/Xfce) per user request,
documented it in the machine doc, and published the fix as a how-to on the community site.

The lock came from `xfce4-screensaver` (the only locker installed — no light-locker /
xscreensaver / gnome-screensaver; `xfce4-power-manager` already had
`lock-screen-suspend-hibernate=false`). The catch: the screensaver's lock keys were NOT
present in xfconf, so the daemon used its compiled default of lock = on. Fixed by creating
them set to false (`-n` creates a missing property). Change is live (the running daemon,
PID 2566, watches xfconf) and persisted to the per-user XML, surviving reboot. Screen blanking
was left intact — only the password prompt is gone.

Recorded the change in `.claude/machines/guru-kali.md` Notes (flagged "INTENTIONALLY DISABLED —
do NOT re-enable" with the exact keys) so a future session does not undo it; committed/pushed
(e5c31f8). Then turned it into a public how-to via the `/forum-post` skill — drafted from this
session, previewed, and on confirmation inserted into Flarum on IX via paramiko → PHP/PDO.
Live as discussion #12.
### Key Decisions

- **Disabled the lock only, not screen blanking** — user asked only about the lock prompt; the
  saver/DPMS still blank the screen on idle.
- **Created the xfconf keys with `-n`** rather than expecting them to exist — they default to
  lock-on when absent, which is why GUI toggles appeared to do nothing.
- **Flagged the change "do NOT re-enable" in the machine doc** — it is an intentional deviation a
  future session/agent might otherwise "fix."
- **Published via the forum-post DB-insert path** (paramiko → PHP/PDO on IX), per the skill —
  external HTTP is Cloudflare-blocked and localhost curl has a redirect issue.
### Configuration Changes (this update)

Machine-local (GURU-KALI, not in repo):
- `xfce4-screensaver` xfconf: `/lock/enabled=false`, `/lock/saver-activation/enabled=false`,
  `/lock/sleep-activation=false`. Persisted in
  `~/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-screensaver.xml`.
- `xfce4-power-manager`: `lock-screen-suspend-hibernate=false` (confirmed, already set).

ClaudeTools (committed e5c31f8):
- `.claude/machines/guru-kali.md` — Notes entry: lock-on-timeout intentionally disabled, with keys + do-not-re-enable.

Community site:
- Flarum discussion #12 / post #11 inserted (tag 7, How-Tos & Tips).
### Credentials & Secrets (this update)

- No new secrets. Used vaulted creds only: IX SSH password `infrastructure/ix-server.sops.yaml`
  -> `credentials.password`; Flarum DB password is embedded in the forum-post skill
  (`Fl@rum2026!CGS`, db azcompu_flarum). The local publish script (which embedded the DB
  password) was deleted from /tmp after posting.
### Infrastructure & Servers (this update)

- **IX** `172.16.3.10` (root) — hosts the Flarum community forum (`community.azcomputerguru.com`),
  MySQL db `azcompu_flarum` (user `azcompu_flarum`), admin user_id 1 (MikeSwanson). Posts inserted
  via paramiko SSH + PHP/PDO (s9e TextFormatter XML), tags table; default tag 7 = How-Tos & Tips.
### Commands & Outputs (this update)

```bash
# disable the lock (created the keys; they were absent -> defaulted to on)
DISPLAY=:0.0 DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus \
  xfconf-query -c xfce4-screensaver -p /lock/enabled -n -t bool -s false
#   (+ /lock/saver-activation/enabled, /lock/sleep-activation = false)
xfconf-query -c xfce4-screensaver -p /lock/enabled    # -> false (daemon picked it up live)

# publish: /forum-post skill -> rc=0
#   Discussion ID: 12 / Post ID: 11
```
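The block above records the command for one key and notes the other two in a comment. A sketch that applies the same flags to all three properties is shown below; the function name `disable_xfce_lock` and its command-prefix argument are hypothetical conveniences, while the channel, property names, and `xfconf-query` flags are the ones recorded in this log.

```bash
# Sets the three lock properties to false, creating them if they are absent.
# Any arguments act as a command prefix, so `disable_xfce_lock echo` prints
# the commands instead of running them.
disable_xfce_lock() {
    for prop in /lock/enabled /lock/saver-activation/enabled /lock/sleep-activation; do
        "$@" xfconf-query -c xfce4-screensaver -p "$prop" -n -t bool -s false
    done
}

# Dry run:  disable_xfce_lock echo
# Apply:    disable_xfce_lock      (run inside the user's desktop session)
```

When run from SSH or another shell outside the desktop session, the `DISPLAY` and `DBUS_SESSION_BUS_ADDRESS` values shown in the original command would still need to be exported first.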
### Pending / Incomplete Tasks (this update)

- None for this segment. (Carried over: laggards BB-SERVER + RECEPTIONIST-PC on 0.6.37 noted in
  coord msg a254202a; GuruRMM issues #15-19 open.)
### Reference Information (this update)

- Forum post: https://community.azcomputerguru.com/d/12-disable-the-xfce-screen-lock-on-idle-timeout-without-losing-the-screensaver (discussion 12, post 11, tag 7)
- ClaudeTools commit: e5c31f8 (machine-doc note)
- xfconf XML: `~/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-screensaver.xml`