diff --git a/clients/birth-biologic/session-logs/2026-06/2026-06-26-mike-birthbio-mail-migration-and-datto-vm.md b/clients/birth-biologic/session-logs/2026-06/2026-06-26-mike-birthbio-mail-migration-and-datto-vm.md index 9ad72152..2cdc22af 100644 --- a/clients/birth-biologic/session-logs/2026-06/2026-06-26-mike-birthbio-mail-migration-and-datto-vm.md +++ b/clients/birth-biologic/session-logs/2026-06/2026-06-26-mike-birthbio-mail-migration-and-datto-vm.md @@ -191,3 +191,96 @@ mailboxes, mail + calendar + contacts) was created and auto-started — Status: - RMM install one-liner (BirthBio site): `irm https://rmm.azcomputerguru.com/install/BRIGHT-PEAK-5980/windows | iex`. - Discord DMs to Mike: message_id 1520034139900739627 (initial DWD), 1520055625302675537 (corrected 5-scope). - Vault enrollment key: `clients/birth-biologic/gururmm-site-main` (site BRIGHT-PEAK-5980, id 3b20ef97-…). + +--- + +## Update: 04:42 PT (2026-06-27) — Datto->SharePoint delta completion, Quality recovery, April-vs-now reconcile, ticket #32187 billed + +Continuation of the same session. Covers the SharePoint side: completing the additive delta, recovering an +accidentally-deleted Quality site, reconciling SharePoint to match Datto (source of truth), freezing the +Datto source, and updating/billing the Datto migration ticket. + +### Session Summary (update) + +After ACG-DWP-X-BB finished re-syncing with Datto cloud, ran the **Datto -> SharePoint delta**. The April +SPMT run was additive and never re-synchronized, so the delta only needed to add files that had never +transferred. Built `delta-recon-v2.ps1` (sanitize-aware reconcile: matches on both raw and sanitized +paths to find GENUINELY_MISSING files) and `delta-upload-v3.ps1` (simple-PUT for <=244MB auto-creating +parent folders, EnsureFolder + chunked for larger, FileShare.ReadWrite shared reads for Datto-locked +files, long-path `\\?\` for [IO.File] reads, SanRemote trim of leading/trailing spaces + trailing dots). 
+Reconciled to **0 missing** across Supply Management, Admin, Birth Biologic Activity Reports, Donor +Services (107 GB / ~57K files), and Quality. Renamed 19 Datto source files to match SharePoint (stripped +leading/trailing spaces + trailing dots). + +**Quality site recovery.** The Quality Department SharePoint site was deleted 6/26. Unified audit log +showed `operations@` deleted the connected M365 Group, which cascaded (AAD -> SharePoint sync) to remove +the site. Restored from the SharePoint deleted-site recycle bin (cert-based app token; SP REST rejects +app-only tokens). Since Quality is being reorganized into the **Quality Systems Department (QSD)** site, +relocated the migrated Quality content there via server-side copy, then filled 44 missing + 3 file-lock +stragglers. Old `/sites/QualityDepartment` auto-purges ~7/26. + +**April-vs-now divergence + mirror.** Because the April push was additive (not a sync), anything deleted, +moved, or changed in Datto after April was stale in SharePoint. Treating Datto as source of truth, built +a consolidated change-list (`consolidated_changelist.csv`): **1,583 deleted/moved + 161 modified (~1,744 +differences)**. Cross-checked the SP unified audit log to find files users had created/edited directly in +SharePoint (operations@, ksteen, jbeck, etc. on live sites) and **flagged 11 to protect**. Ran +`mirror-execute.ps1` (re-validates each row against a frozen Datto set, DELETEs stale by path-addressed +Graph DELETE -> recycle bin, refreshes modified via PUT, skips protected): **deleted=1,564, refreshed=160, +protected-skip=11, fail=0**. For the 1 modified protected file, pushed the Datto version beside the user's +edit as "...Datto Copy.docx"; the 10 deleted/moved protected are SP-only (no Datto copy) -> left as-is. + +**Froze the source.** Stopped + disabled the Datto Workplace Server service on ACG-DWP-X-BB so the source +no longer changes (also resolves the "reappearing files" complaint by removing the stale SP copies). 
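The mirror pass described above (re-validate each change-list row against the frozen Datto set, delete stale copies, refresh modified files, skip protected ones) can be sketched roughly as follows. This is an illustrative Python model, not the actual `mirror-execute.ps1`: the row keys (`path`, `action`), the function name `plan_mirror`, and the decision labels are hypothetical, and no Graph calls are made.

```python
from collections import Counter


def plan_mirror(changelist, datto_paths, protected_paths):
    """Decide what to do with each change-list row.

    changelist      -- iterable of dicts with hypothetical keys 'path' and
                       'action' ('deleted' or 'modified')
    datto_paths     -- set of paths present in the frozen Datto source
    protected_paths -- set of paths users touched directly in SharePoint
    Returns (decisions, counts); decisions is a list of (path, decision).
    """
    decisions = []
    for row in changelist:
        path, action = row["path"], row["action"]
        if path in protected_paths:
            # User-touched files are never overwritten or deleted.
            decision = "protected-skip"
        elif action == "deleted":
            # Re-validate: delete only if Datto really no longer has the file.
            decision = "delete" if path not in datto_paths else "skip-still-in-datto"
        elif action == "modified":
            # Re-validate: refresh only if the source file still exists.
            decision = "refresh" if path in datto_paths else "skip-source-gone"
        else:
            decision = "skip-unknown-action"
        decisions.append((path, decision))
    counts = Counter(decision for _, decision in decisions)
    return decisions, counts
```

Separating the planning step from the side effects would let the counts be reviewed before anything is deleted; the real script's structure may differ.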
+ +**Ticket #32187 (Datto, Syncro 109277420).** Posted a highly-detailed public+email completion/remediation +note and billed **5.0h Labor - Remote Business ($150/hr = $750)**. Posted #bot-alerts notifications. + +### Key Decisions (update) + +- **Datto = source of truth** for the reconcile; SharePoint mirrored to it. Deletes go to the SP recycle + bin (recoverable 93 days), never hard-deleted. +- **Protect user-touched SP files** — never overwrite/delete the 11 flagged via audit log; for the one + edited file, keep both (Datto pushed as "Datto Copy") rather than overwrite. +- **Relocate Quality content to QSD** rather than rebuild under the restored old site, matching the planned + reorg; let the old site auto-purge. +- **Root-cause correction (Mike):** the "reappearing" files were NOT a Datto resurrection. They were stale + SP copies sitting in SharePoint since the April additive push (files deleted from Datto after April were + never removed from SP). Rewrote the ticket note's root-cause section accordingly. + +### Problems Encountered (update) + +- **SP REST "Unsupported app only token"** -> SharePoint requires a cert-based token; granted + Sites.FullControl.All and used a client_assertion JWT (x5t = cert thumbprint b64url). Fixed. +- **Chunked-upload 400 into brand-new folders** -> switched to simple-PUT (auto-creates parents). Fixed. +- **Long-path SKIP-nofile** -> `\\?\` prefix for [IO.File] reads (not for Rename-Item/File.Move in PS5.1). +- **Filename 400s** = leading/trailing spaces / trailing dots -> SanRemote trim; renamed 19 source files. +- **Datto file-locks** -> FileShare.ReadWrite shared read. Fixed. +- **Background poller broke** (curl --data-binary @file errored each iteration due to $0-relative temp paths + under run_in_background) -> read the mirror log directly instead. Logged as friction. 
+- **bot-alert missing link** -> first #bot-alerts post for #32187 omitted the mandated `-> ` tail; the + helper posts text verbatim and does not auto-append. Reposted correctly + logged friction. + +### Configuration Changes (update) + +- Created on ACG-DWP-X-BB / scratchpad: `delta-recon-v2.ps1`, `delta-upload-v3.ps1`, `mirror-execute.ps1`, + `consolidated_changelist.csv`, divergence CSVs, `mirror-execute.log`. +- Stopped + disabled Datto Workplace Server service on ACG-DWP-X-BB. +- Renamed 19 Datto source files (whitespace/trailing-dot cleanup) under + `C:\Users\Public\Desktop\Datto Workplace Server Projects`. + +### Pending / Incomplete Tasks (update) + +- **Mail:** MX cutover still pending (Batch 1 complete). Then authorize Workspace write scopes + (apps.licensing + admin.directory.user + Licensing API), unlicense migrated Google users, run Batch 2. +- **SP-only user files** (Shift Coms / DEMO and similar) — decide whether to fold into Datto. +- Old `/sites/QualityDepartment` auto-purges ~7/26 (no action needed). + +### Reference Information (update) + +- Datto migration ticket: #32187 (Syncro id 109277420). Comment id 420992239 (public+email); + line item id 43043687 (5.0h Labor - Remote Business, product 1190473, $150). +- #bot-alerts: message_id 1520266361996316802 (corrected, with link). +- SP site IDs — Donor: `birthbiologic.sharepoint.com,bcbfa272-dc85-424c-af66-3f14c75ffeb4,8b0975dd-...`; + Admin: `...,1baf65c1-c4b3-4602-9111-1f99ae800023,...`; Supply: `...,4700ecf3-25ba-41b6-918c-9fe620038172,...`; + QSD: `...,3173c017-58bd-406a-8858-2c969667336f,...`. +- Tenant 19a568e8-9e88-413b-9341-cbc224b39145; Graph app client 709e6eed-0711-4875-9c44-2d3518c47063. 
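The sanitize-aware reconcile mentioned in the Session Summary (match on both raw and sanitized paths, where sanitizing trims leading/trailing spaces and trailing dots) might look like the sketch below. It is a Python illustration, not the contents of `delta-recon-v2.ps1`: the function names are hypothetical, paths are assumed to use `/` separators, and the case-insensitive comparison is an assumption.

```python
def sanitize_segment(segment):
    """Trim leading/trailing spaces and trailing dots from one path segment."""
    # Spaces and dots can alternate at the end ("name. ."), so strip both
    # from the right, then strip spaces from the left.
    return segment.rstrip(" .").lstrip(" ")


def sanitize_path(path):
    """Apply sanitize_segment to every '/'-separated segment of a path."""
    return "/".join(sanitize_segment(part) for part in path.split("/"))


def genuinely_missing(local_paths, remote_paths):
    """Return local paths absent remotely under both raw and sanitized names."""
    # Assumption: remote names are compared case-insensitively.
    remote = {p.lower() for p in remote_paths}
    missing = []
    for path in local_paths:
        if path.lower() in remote or sanitize_path(path).lower() in remote:
            continue  # present under its raw or its sanitized name
        missing.append(path)
    return missing
```

Checking the sanitized form as well as the raw one avoids re-uploading a file that already landed under a cleaned-up name.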
diff --git a/errorlog.md b/errorlog.md index 129eeeae..bc091a95 100644 --- a/errorlog.md +++ b/errorlog.md @@ -17,6 +17,12 @@ Categories (the `[type]` tag): _(none)_ = skill/command execution failure · +2026-06-27 | GURU-5070 | syncro/bot-alert | [friction] posted bot-alert without the mandated '-> ' tail; format is [SYNCRO] #num (cust) - summary -> https://computerguru.syncromsp.com/tickets/ [ctx: ref=syncro.md#post-to-bot-alerts] + +2026-06-27 | GURU-5070 | birthbio/datto-sharepoint | [correction] assumed 'reappearing' files were a Datto two-way-sync resurrection from the transfer VM; correct: they were stale copies left in SharePoint since the April additive push (deleted from Datto later, never removed from SP). Mirror cleanup removed them. + +2026-06-27 | GURU-5070 | bash/background-poller | [friction] background poll script used $0-relative temp files ($0.l.json etc.) for curl --data-binary @file; under run_in_background $0 didn't resolve to a writable path so every poll errored + it never detected completion. Fix: use absolute scratchpad paths in background scripts, not $0-relative + 2026-06-26 | Howard-Home | syncro/billing | [correction] hand-rolled add_line_item API calls from memory instead of using the /syncro skill; malformed tickets reached Winter for cleanup. Correct: route ALL Syncro billing/invoicing through the skill. Generalized to a CORE skill-first rule. 
[ctx: rule=skill-first memory=feedback_skill_first_routing tickets=#32193,#32194] 2026-06-26 | Howard-Home | bash/env | [friction] used relative .claude/scripts/rmm-auth.sh after an earlier cd into a skill scripts dir (cwd persists across Bash calls) -> 'No such file or directory'; fix: cd /c/claudetools first or use absolute paths [ctx: ref=2026-06-25-edr-rollout cwd-drift note] @@ -25,6 +31,12 @@ Categories (the `[type]` tag): _(none)_ = skill/command execution failure · 2026-06-26 | Howard-Home | discord-dm | Discord send to howard (DM) failed [ctx: http=400 resp={"message": "Invalid Form Body", "code": 50035, "errors": {"content": {"_errors"] +2026-06-26 | GURU-5070 | output/style | [correction] used emoji in responses despite CLAUDE.md NO-EMOJIS rule; Mike corrected. Use ASCII markers only [ctx: ref=CLAUDE.md-key-rules] + +2026-06-26 | GURU-5070 | rmm | [correction] bypassed /rmm skill (raw API via Windows curl due to broken git-bash curl) and skipped the mandatory [RMM] #dev-alerts post on write ops; alert is required regardless of dispatch method + +2026-06-26 | GURU-5070 | post-bot-alert | Discord POST failed (non-200/unreachable) [ctx: channel=#dev-alerts http=400 resp={"message": "The request body contains invalid JSON.", "code": 50109}] + 2026-06-26 | GURU-5070 | bash/env | [friction] git-bash /mingw64/bin/curl quarantined by Windows Defender -> RMM helpers (rmm-ps.sh/rmm-auth.sh) fail 'Permission denied'; workaround use C:/Windows/System32/curl.exe [ctx: machine=GURU-5070 fix=defender-exclusion-on-git-mingw64-bin] 2026-06-26 | Howard-Home | rmm/smb-testing | [friction] RMM-dispatched net use/net view/Test-Path/Get-SmbConnection are UNRELIABLE for SMB client testing - they fail with error 67 / RPC 1702 / 'none' even for KNOWN-GOOD targets (Karen's NAS she uses daily; Crystal had a live 5-open-file server session but Get-SmbConnection via RMM showed none). The agent-injected process lacks the user's real network-logon session. 
Wasted a long investigation treating these artifacts as a CS-SERVER SMB outage; server truth (Get-SmbSession) showed 7 live users + 30 open files + new sessions. VALIDATE SMB with Get-SmbSession server-side or a REAL interactive test, never RMM-dispatched client cmds. [ctx: host=CS-SERVER client=cascades ref=drive-map-verify] diff --git a/session-logs/2026-06/2026-06-27-mike-jupiter-plexrequest-seerr-dns-relay.md b/session-logs/2026-06/2026-06-27-mike-jupiter-plexrequest-seerr-dns-relay.md new file mode 100644 index 00000000..eeffa28c --- /dev/null +++ b/session-logs/2026-06/2026-06-27-mike-jupiter-plexrequest-seerr-dns-relay.md @@ -0,0 +1,133 @@ +## User +- **User:** Mike Swanson (mike) +- **Machine:** GURU-5070 +- **Role:** admin + +> Work performed 2026-06-26 (Phoenix); log saved just after midnight 2026-06-27. UOS Rocky +> update and Wolkin SMB fix from earlier in the same session are logged separately +> (`session-logs/2026-06/2026-06-26-mike-uos-rocky-update.md`, +> `clients/wolkin/session-logs/2026-06/2026-06-26-mike-wolkin-smb-zerotier-adapter.md`). + +## Session Summary + +Restored the `plexrequest.azcomputerguru.com` service on Jupiter, which the user reported down. +NPM proxies that hostname to `172.16.3.31:5055`. Diagnosis showed the backing container ("Seerr", +Unraid template `my-Seerr.xml`, appdata `/mnt/user/appdata/seerr`) had been removed entirely from +Docker (gone from `docker ps -a`; every other container came back after a Docker restart ~15h +prior, Seerr did not). The image (`ghcr.io/seerr-team/seerr:latest`) and appdata were intact, so +the container was recreated on br0 `172.16.3.31`. It served, but redirected to `/setup` — +`initialized:false`. Investigation found this Seerr instance was a half-finished May-27 migration +that was never configured; the real working data lived in the old `binhex-overseerr` appdata +(`/mnt/user/appdata/binhex-overseerr/overseerr`), whose container was also gone. 
+ +The user confirmed Seerr is the maintained successor to Overseerr (Overseerr being abandoned) and +asked to migrate. Verified the old Overseerr data was a genuine configured instance +(`initialized:true`, Plex + Radarr/Sonarr configured, real user accounts, 191 requests in a +180KB db + 4MB uncommitted WAL). Backed up, copied the Overseerr config into Seerr's appdata, +fixed ownership to `99:100`, and started Seerr — its built-in auto-migration ran +("Overseerr to Seerr migration completed successfully"), preserving all data (191 requests, users, +Plex/Radarr/Sonarr config). Verified public access (HTTP 200 -> `/login`, Cloudflare-fronted). + +The user then reported it was "REALLY slow." Backend measured fast (8ms API, 110ms login), so the +cause was elsewhere: every external lookup inside the container took ~4s. Root cause was DNS — +Jupiter's primary resolver `172.16.3.50` is dead (100% ping loss, `:53` times out ~5s), yet it is +first in the host `/etc/resolv.conf`, so every cache-miss lookup waited for it before falling back +to 8.8.8.8. Seerr (very DNS-heavy via TMDB) was worst hit. Fixed Seerr by recreating it with +`--dns 1.1.1.1 --dns 8.8.8.8` and `LOG_LEVEL=info` (was `debug`, dumping a full Radarr JSON per +title), and added it to the Unraid autostart list. In-container lookups dropped 4s -> ~0s. + +Finally, to revive `.50` for every other device/config still pointed at it, deployed a `dns-relay` +container (dnsmasq `4km3/dnsmasq`) on br0 `172.16.3.50` forwarding all queries to the gateway +`172.16.0.1` (pfSense unbound, verified healthy). Verified resolution through `.50` from a sibling +container and a LAN client (0.32s cold / 0.04s cached). Also fixed the Unraid `my-Seerr.xml` +template (was `bridge`; set to br0/.31) so a UI re-apply won't break the NPM target. Closed with a +general advisory on forcing SharePoint docs to open in desktop Office (no changes made). 
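The resolver comparison behind the DNS diagnosis above (a dead `172.16.3.50` versus a healthy `172.16.0.1`) can be reproduced with a small probe. The sketch below is a hedged Python illustration using only the standard library; the function names are hypothetical, it sends a single type-A query over UDP, and it treats any reply as success without validating the response.

```python
import socket
import struct
import time


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet (type A, class IN) for hostname."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def time_lookup(resolver_ip, hostname, timeout=6.0):
    """Return seconds until resolver_ip replies, or None on timeout/unreachable."""
    packet = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (resolver_ip, 53))
            sock.recvfrom(512)  # any reply counts; the answer is not parsed
        except OSError:  # socket.timeout is a subclass of OSError
            return None
        return time.monotonic() - start
```

Calling `time_lookup("172.16.3.50", "api.themoviedb.org")` and then the same lookup against `172.16.0.1` would show whether the first resolver is the one adding the delay.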
+ +## Key Decisions +- Recreated Seerr on **br0 `172.16.3.31`** (not bridge) to match the NPM target and the sibling + binhex containers' addressing; the template's `bridge` was stale. +- Chose to **migrate Overseerr -> Seerr** rather than finish a fresh Seerr setup, to preserve the + 191 requests / users / Plex config. Migration is the official supported path (auto on first boot). +- Copied the Overseerr appdata (cp -a, all three sqlite files) so the **4MB WAL replayed** rather + than checkpointing the source — source left untouched as a rollback. +- Fixed Seerr slowness with a **per-container `--dns`** override (immediate, low-risk) instead of + changing the host DNS config, then solved the broader problem with a **relay at `.50`** so no + LAN devices need repointing. +- Set `LOG_LEVEL=info` — the template default `debug` generated heavy per-title log IO. +- Left the dead `.50` host-DNS-config decision (Unraid Network Settings) to the user; only flagged + it. The relay covers LAN clients; Jupiter's own host can't use it (ipvlan host->own-container). + +## Problems Encountered +- **plexrequest down** — backing Seerr container removed from Docker. Recreated from intact + image+appdata on br0 `.31`. +- **Recreated Seerr showed `/setup`** — appdata was a never-configured half-migration. Resolved by + migrating the real Overseerr data in. +- **"REALLY slow"** — root cause dead primary DNS `172.16.3.50` adding ~5s per cache-miss lookup. + Fixed Seerr via `--dns`; deployed `dns-relay` at `.50 -> 0.1` for everything else. +- **plink first connect hung** in `-batch` (host key uncached). Resolved by pinning + `-hostkey SHA256:czsrHxWg1cPekUeyn1D5V+u8oXgI0f5QUXRdJBv9tPc`. +- **`python3` absent on Unraid host** — used `grep` + host `/usr/bin/sqlite3` for db/settings checks. 
+- **ipvlan off-subnet TCP** — curl to `172.16.3.31:5055` from GURU-5070 returned 000 while ping + worked; verified the service from the Jupiter host / sibling container instead (a vantage-point + artifact, not a fault). + +## Configuration Changes +- **Jupiter (live, not in git):** + - Recreated container `Seerr` (br0 `172.16.3.31:5055`, `--init --user 99:100`, + `--restart unless-stopped`, `--dns 1.1.1.1 --dns 8.8.8.8`, `LOG_LEVEL=info`, + image `ghcr.io/seerr-team/seerr:latest`, appdata `/mnt/user/appdata/seerr`). + - Migrated `/mnt/user/appdata/binhex-overseerr/overseerr` -> `/mnt/user/appdata/seerr` (chown 99:100). + - New container `dns-relay` (br0 `172.16.3.50:53`, dnsmasq `4km3/dnsmasq`, + `--no-resolv --no-hosts --server=172.16.0.1 --cache-size=1000`, `--restart unless-stopped`). + - `/var/lib/docker/unraid-autostart`: added `dns-relay` (first), replaced stale `binhex-overseerr` + with `Seerr`. Backup `*.bak-20260626`. + - `/boot/config/plugins/dockerMan/templates-user/my-Seerr.xml`: `bridge` -> + `br0`, `` -> `172.16.3.31`. Backup `*.bak-20260626`. +- **Repo (this commit):** `wiki/systems/jupiter.md` — Docker table (Seerr + dns-relay rows), NPM + table (plexrequest=Seerr), Known Issues (plexrequest migration + slowness/DNS fix, dead `.50` + resolver entry), frontmatter bumped. + +## Credentials & Secrets +- Jupiter (Unraid) root SSH — already vaulted: `infrastructure/jupiter-unraid-primary.sops.yaml` + (`credentials.username`=root, `credentials.password`). Used via PuTTY `plink`. No new credentials + created. Jupiter SSH host key fingerprint: `SHA256:czsrHxWg1cPekUeyn1D5V+u8oXgI0f5QUXRdJBv9tPc`. + +## Infrastructure & Servers +- **Jupiter** `172.16.3.20` (Unraid, Dell, br0 ipvlan subnet `172.16.0.0/22`, gw `172.16.0.1`). +- **plexrequest** = `Seerr` container, br0 `172.16.3.31:5055`, behind NPM + (`plexrequest.azcomputerguru.com`, Cloudflare-fronted -> bare curl returns 403, use browser UA). 
+- **dns-relay** container, br0 `172.16.3.50:53` -> forwards `172.16.0.1`. +- **Dead resolver:** `172.16.3.50` (old primary DNS, host down) — now impersonated by the relay. +- **pfSense gateway / DNS:** `172.16.0.1:53` (unbound), healthy (0.06s lookups). +- Plex server (binhex-plexpass) br0 `172.16.3.32:32400` — Seerr's configured media server. + +## Commands & Outputs +- SSH: `plink -ssh -pw -batch -hostkey SHA256:czsr... root@172.16.3.20 ''` +- Migration (key): `cp -a /mnt/user/appdata/binhex-overseerr/overseerr /mnt/user/appdata/seerr`; + `chown -R 99:100 /mnt/user/appdata/seerr`; `docker start Seerr` -> + `[Seerr Migration]: Yeah! Overseerr to Seerr migration completed successfully!` +- DNS proof: in-container `nslookup api.themoviedb.org` 4.02s (before) -> ~0s (after `--dns`). + `nslookup api.themoviedb.org 172.16.3.50` from host = timeout/host-unreachable; + `... 172.16.0.1` = 0.06s. LAN client via relay: 0.32s cold / 0.04s cached. +- Verify: `curl -sk http://172.16.3.31:5055/api/v1/status` -> `{"version":"3.2.0",...}`; + public (browser UA) `/` -> 200 -> `/login`. DB: 191 `media_request` rows. + +## Pending / Incomplete Tasks +- **Jupiter host's own DNS** still lists dead `.50` first; host can't use the relay (ipvlan + host->own-container). Optional: set DNS1=`172.16.0.1` in Unraid Settings -> Network Settings + (`/boot/config/network.cfg` `DNS_SERVER1`). Pending user decision on what `.50` was. +- **Seerr Plex Scan error** post-migration: `Cannot read properties of undefined (reading 'some')` + — re-select Plex libraries in Seerr -> Settings -> Plex to clear (UI task). +- `dns-relay` has **no Unraid template** (created via `docker run`) — optional to add one. +- A failed local sign-in by mike@azcomputerguru.com was logged during testing — use Plex SSO or + reset the local Seerr password if needed. +- Backups to clean up later: `/mnt/user/appdata/_migbackup_20260626/overseerr-source.tgz`, + `/mnt/user/appdata/seerr.empty.preMig`. 
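The request-count check under Commands & Outputs (191 `media_request` rows) was done with the host `sqlite3` binary. An equivalent check could be sketched in Python as below, run from a machine that has Python against a copy of the database, since `python3` is absent on the Unraid host. The function name `count_rows` is hypothetical and the database path is left as a parameter because the log does not give it.

```python
import sqlite3


def count_rows(db_path, table):
    """Return the number of rows in table, opening the database read-only."""
    # Table names cannot be bound as SQL parameters, so accept only simple
    # identifiers (letters, digits, underscores).
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    # mode=ro asks SQLite to open the file without write access.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    finally:
        conn.close()
```

Comparing `count_rows(path, "media_request")` before and after a migration is one simple way to confirm that no requests were dropped.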
+ +## Reference Information +- Seerr migration docs: https://docs.seerr.dev/migration-guide/ +- Image: `ghcr.io/seerr-team/seerr:latest` (v3.2.0); relay image `4km3/dnsmasq:latest` (dnsmasq 2.91). +- SharePoint "open in desktop app" advisory (no change made): per-library Advanced settings -> + "Open in the client application"; site-collection feature "Open Documents in Client Applications + by Default"; or per-user Sync/"Add shortcut to OneDrive". Caveat: Business Basic (no desktop + Office license) users can hit an error instead of browser fallback. diff --git a/wiki/systems/jupiter.md b/wiki/systems/jupiter.md index d7fd8dad..cb98b532 100644 --- a/wiki/systems/jupiter.md +++ b/wiki/systems/jupiter.md @@ -2,11 +2,12 @@ type: system name: jupiter display_name: Jupiter -last_compiled: 2026-05-24 -compiled_by: DESKTOP-0O8A1RL/claude-main +last_compiled: 2026-06-26 +compiled_by: GURU-5070/claude-main sources: - credentials.md - .claude/memory/infra_office_network.md + - 2026-06-26 plexrequest Overseerr->Seerr migration (mike) backlinks: - systems/gururmm-build - systems/pluto @@ -36,6 +37,8 @@ Not documented. iDRAC available at 172.16.1.73 (DHCP) for OOB management. | `npm` | 1880 (HTTP), 18443 (HTTPS), 7818 (admin) | Nginx Proxy Manager — handles all external reverse proxying | | `gitea` | 3000 (HTTP), 2222 (SSH) | Internal Gitea git server; http://172.16.3.20:3000 | | `seafile` + mysql + elasticsearch + memcached | 8082 | Seafile Pro file sync stack | +| `dns-relay` | br0 `172.16.3.50`:53 | **DNS relay** — dnsmasq (`4km3/dnsmasq`) forwarding all queries to the gateway `172.16.0.1` (pfSense unbound). Stood up 2026-06-26 to revive the dead `172.16.3.50` resolver IP so every device/config hardcoded to `.50` works without being touched. `--no-resolv --no-hosts --server=172.16.0.1 --cache-size=1000`, `--restart unless-stopped`, **first in the autostart list** (DNS up before other containers). 
dnsmasq's default `local-service` limits answers to the `172.16.0.0/22` LAN (not an open resolver). No Unraid template (created via `docker run`). | +| `Seerr` | br0 `172.16.3.31`:5055 | Plex request manager (Overseerr successor). Runs on br0 with a static IP + `--init --user 99:100`, `--restart unless-stopped`. Image `ghcr.io/seerr-team/seerr:latest`, appdata `/mnt/user/appdata/seerr`. Template `my-Seerr.xml` fixed to br0/.31 on 2026-06-26 (was `bridge` — a UI re-apply in bridge mode would break the NPM `.31` target). **In the Unraid autostart list** as of 2026-06-26 (`/var/lib/docker/unraid-autostart`, replacing the stale `binhex-overseerr` entry), so it survives an array stop/start. | **NPM → 443 routing:** iptables PREROUTING rule on Jupiter: `dpt:443 → 172.17.0.2:443` (NPM Docker bridge IP). Persisted in `/boot/config/go` so it survives reboots. @@ -72,7 +75,7 @@ Not documented. iDRAC available at 172.16.1.73 (DHCP) for OOB management. | rmm-api.azcomputerguru.com | 172.16.3.20:3001 | **STALE** — actual GuruRMM API is on 172.16.3.30:3001; update this in NPM admin | | unifi.azcomputerguru.com | 172.16.3.29:11443 | **UOS Server** (UniFi OS). Verified from NPM API 2026-06-15 — earlier `.28:8443` was stale. The real HTTPS port is **11443** (8443/443 are closed). See [[uos-server]]. | | sync.azcomputerguru.com | 172.16.3.20:8082 | Seafile Pro | -| plexrequest.azcomputerguru.com | 172.16.3.31:5055 | Plex request manager | +| plexrequest.azcomputerguru.com | 172.16.3.31:5055 | **Seerr** (Plex request manager) — `Seerr` Docker container on **br0 `172.16.3.31`**, appdata `/mnt/user/appdata/seerr`. **Migrated Overseerr -> Seerr 3.2.0 on 2026-06-26** (Overseerr is being abandoned; Seerr is its successor). Cloudflare-fronted, so bare `curl` returns 403 — test with a browser UA. See Known Issues for the outage that prompted the migration. | **[ACTION REQUIRED]** Update `rmm-api.azcomputerguru.com` proxy target from `172.16.3.20:3001` → `172.16.3.30:3001` in NPM admin (http://172.16.3.20:7818). @@ -85,11 +88,13 @@ Not documented. 
iDRAC available at 172.16.1.73 (DHCP) for OOB management. ## Known Issues & Quirks +- **[HOST-WIDE] Primary DNS `172.16.3.50` is DEAD but still Jupiter's first resolver (found 2026-06-26):** `/etc/resolv.conf` (generated by `rc.inet1` from Unraid network settings) lists `nameserver 172.16.3.50` first, then `8.8.8.8`, `1.1.1.1`. `172.16.3.50` is **down** (100% ping loss, host-unreachable, `:53` times out ~5s). Result: **every cache-miss DNS lookup on the host AND in every container that forwards to the host eats a ~5s timeout** before falling back to 8.8.8.8 — slows all DNS-heavy containers (Seerr was the worst-hit). Per-container workaround applied to Seerr (`--dns 1.1.1.1 --dns 8.8.8.8`). **FIXED 2026-06-26 via a DNS relay:** stood up the `dns-relay` container (dnsmasq on br0 `172.16.3.50`, see Docker table) forwarding to `172.16.0.1` — `.50` now answers again (0.3s cold / 0.04s cached, verified from a LAN client), so every device/config hardcoded to `.50` works without being repointed. **Caveat — Jupiter's OWN host DNS:** the host's `/etc/resolv.conf` still lists `.50` first, but **ipvlan blocks a host from reaching its own br0 container**, so the host itself can't use the relay and still eats the ~5s fallback for its own lookups. To fix the host specifically, set its DNS1 to `172.16.0.1` directly in Unraid **Settings -> Network Settings** (`/boot/config/network.cfg` `DNS_SERVER1`). LAN clients and other-host devices are unaffected by this caveat — only Jupiter-the-host. - **iptables PREROUTING for port 443** persists via `/boot/config/go` — if NPM routing breaks after a reboot, check this file first. - **iDRAC IP is DHCP** (172.16.1.73) — may drift. Verify before relying on it for OOB access. - **guruRMM API proxy stale** — see NPM table above. Fix before it causes a routing incident. - **Post-power-failure recovery order matters** — see `.claude/POWER_FAILURE_RUNBOOK.md` for the full recovery sequence (Tailscale routes, libvirt/VMs, Seafile, NPM/DNS in order). 
- **VM "Windows Server 2016" (`ACG-DWP-X-BB`) — no LAN (2026-06-07):** guest stuck on APIPA `169.254.157.152`, no DHCP lease. Host side is healthy (vnet8 bridged to br0, forwarding, receiving LAN broadcast); fault is guest-side — single e1000 NIC set to DHCP, pfSense (172.16.0.1) not leasing it. Diagnose via `virsh domifaddr 9 --source agent` and qemu guest-exec `ipconfig /all`. Fix path: `ipconfig /renew` in-guest (stuck-client case) or assign a static IP if that is the intended config. PAUSED pending Mike's DHCP-vs-static decision. +- **plexrequest (Seerr) outage + Overseerr->Seerr migration (2026-06-26):** Reported down. Root cause: the `Seerr` container (NPM target `172.16.3.31:5055`) had been **removed entirely** (gone from `docker ps -a`; everything else came back after a Docker restart, Seerr didn't) — it was a half-finished May-27 migration left `initialized:false`. The old working instance was `binhex-overseerr` (its container was also gone). Fix: recreated the `Seerr` container on br0 `.31`, then **migrated the real Overseerr data into it** (copied `/mnt/user/appdata/binhex-overseerr/overseerr` -> `/mnt/user/appdata/seerr`, chown `99:100`, started Seerr -> auto-migration "Overseerr to Seerr migration completed successfully"). Verified initialized, Plex/Radarr/Sonarr config + 191 requests + users preserved, public 200. Backups: old source untouched + `/mnt/user/appdata/_migbackup_20260626/overseerr-source.tgz`; pre-migration empty config at `/mnt/user/appdata/seerr.empty.preMig`. **Autostart:** added `Seerr` to `/var/lib/docker/unraid-autostart` (replaced the stale `binhex-overseerr`). **"Really slow" -> DNS:** Seerr felt very slow because every external lookup (TMDB metadata/posters) took ~4s — the container forwarded DNS to the host, whose **primary resolver `172.16.3.50` is DEAD** (see separate entry). 
Fixed by recreating Seerr with `--dns 1.1.1.1 --dns 8.8.8.8` (bypasses `.50`) and `LOG_LEVEL=info` (the template default `debug` dumped a full Radarr JSON per title — heavy log IO). In-container lookups went 4s -> ~0s. **Follow-up:** the `[Plex Scan]` job errors post-migration (`Cannot read properties of undefined (reading 'some')`) — re-select Plex libraries in Seerr settings to clear it. ## Backlinks