sync: auto-sync from GURU-5070 at 2026-06-23 21:37:26

Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-23 21:37:26
2026-06-23 21:38:25 -07:00
parent 4df2232bbd
commit 2c9f99e45d

@@ -0,0 +1,115 @@
# 22nd St office "slow internet" investigation (dead PoE switch + AP, Winter fix) + Factorio mod design
## User
- **User:** Mike Swanson (mike)
- **Machine:** GURU-5070
- **Role:** admin
## Session Summary
Two threads. **(1) ACG 22nd St office network troubleshooting** — reported "internet slower than
normal," with Winter's phone showing Wi-Fi "no internet" and her wired PC getting partial page
loads. Worked it top-down. The office pfSense (172.16.0.1, SSH :2248, pfSense 2.8.1) and its single
Cox fiber WAN tested fully healthy: 0% loss, 14 ms, ~570 Mbps down / ~385 up single-stream, 1G
full-duplex WAN NIC with 0 errors, full-1500 MTU passes, DNS reliable, CPU idle, DHCP pool
(172.16.1.1-254, 2 h leases, ~28 active devices) nowhere near exhaustion. Confirmed it is a flat
/22 (172.16.0.0/22) with no VLANs — the `192.168.206.x` seen on Winter's Wi-Fi was stale junk, not
a real guest VLAN.
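
The addressing claims above can be sanity-checked with a few lines of Python. This is a minimal sketch using only the standard-library `ipaddress` module and the addresses already recorded in this log; the function names are illustrative, not part of any tool used in the session.

```python
import ipaddress

# Values taken from this log: flat office LAN and the DHCP pool range.
LAN = ipaddress.ip_network("172.16.0.0/22")
POOL_START = ipaddress.ip_address("172.16.1.1")
POOL_END = ipaddress.ip_address("172.16.1.254")


def on_office_lan(addr: str) -> bool:
    """True if addr falls inside the flat 172.16.0.0/22 LAN."""
    return ipaddress.ip_address(addr) in LAN


def pool_size() -> int:
    """Number of addresses in the DHCP pool (inclusive range)."""
    return int(POOL_END) - int(POOL_START) + 1


def pool_utilization(active_leases: int) -> float:
    """Fraction of the DHCP pool currently leased."""
    return active_leases / pool_size()


for addr in ("172.16.1.158", "172.16.3.29", "192.168.206.48"):
    print(addr, on_office_lan(addr))
# 172.16.1.158 True
# 172.16.3.29 True
# 192.168.206.48 False

print(pool_size(), f"{pool_utilization(28):.0%}")
# 254 11%
```

The stale `192.168.206.48` DNS entry falls outside the /22, which matches the conclusion that it does not belong to any network at this site, and ~28 leases out of 254 leaves the pool roughly 11% used.
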
Pivoted to Winter's PC (DESKTOP-U303G5J, added to GuruRMM mid-session): wired path was clean (1G
FD, 0 loss, 476 Mbps). Her **partial page loads were caused by a broken Wi-Fi adapter** — APIPA
169.254 (no DHCP) with a **stale static DNS 192.168.206.48** that bled into Windows' multi-adapter
DNS resolution and caused intermittent lookup timeouts. **Disabled the Wi-Fi adapter via RMM**
(she's wired) — DNS now clean, resolved.
Then the office Wi-Fi via the UniFi controller (UOS 172.16.3.29; site "22nd St"). gw-audit showed
disconnected devices; ping + controller stat confirmed the **root cause: the US-8-60W PoE switch
(172.16.1.8) is DOWN**, which took the **U7-Pro AP (172.16.1.34) offline** — office running on 1 of
2 APs → Wi-Fi unusable in that AP's area (Winter's phone). Also found the **USW Pro 24 Fiber Port
24 linked at only 10 Mbps** (1G-capable; far end non-UniFi, no LLDP) = a throttled segment, and the
**USW-Lite-8-PoE (172.16.1.12)** up but unmanaged. **Remote fixes:** re-adopted the .12 switch.
Created Syncro ticket **#32454** (Arizona Computer Guru, cust 15353550) with the full findings +
on-site dispatch note (power-cycle the .8 switch, check the port-24 SFP/fiber). The dead AC switch +
the SFP need on-site hands.
**(2) Factorio mod — design discussion** (personal project). Researched Factorio 2.0 mod anatomy
(Grok live web; Gemini CLI auth is broken on this box). Mike wants a **deterministic, cost-based
quality system** replacing vanilla's RNG quality: a Refinery machine converts N lower-quality items
into 1 higher-quality item (no chance), keeping vanilla's quality tiers + perks untouched, with an
escalating compounding cost (≈3/4/5/6 per tier step). Captured the full design + the 2.0 mod-anatomy
reference under `projects/factorio-quality-mod/` (committed earlier this session, 30841fbf).
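
To make "escalating compounding cost" concrete, here is a small worked calculation. It assumes the approximate 3/4/5/6 figures map, in order, onto the four upgrade steps between vanilla's five quality tiers; that mapping and the function name are assumptions for illustration, and the real numbers live in `DESIGN.md`.

```python
from itertools import accumulate
from operator import mul

# Vanilla quality tiers, lowest to highest.
TIERS = ["normal", "uncommon", "rare", "epic", "legendary"]
# Assumed inputs per upgrade step (the approximate 3/4/5/6 from the design note).
STEP_COSTS = [3, 4, 5, 6]


def base_items_per_tier(step_costs=STEP_COSTS):
    """Normal-quality items needed for one item of each tier (compounded)."""
    return [1] + list(accumulate(step_costs, mul))


costs = dict(zip(TIERS, base_items_per_tier()))
print(costs)
# {'normal': 1, 'uncommon': 3, 'rare': 12, 'epic': 60, 'legendary': 360}
```

Under these assumed step costs one legendary item costs 3 × 4 × 5 × 6 = 360 normal items, with no randomness involved.
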
## Key Decisions
- **pfSense exonerated by measurement, not assumption** — ran loss/latency/throughput/MTU/DNS/DHCP
before concluding; ruled out WAN, DHCP exhaustion, and MTU before pivoting downstream.
- **Diagnose from the user's actual machine** — adding Winter's PC to RMM let me prove the wired path
was clean and isolate the dead Wi-Fi adapter as the partial-load cause.
- **Disabled (not reconfigured) Winter's Wi-Fi adapter** — she's wired; killing the broken adapter
removes the bad-DNS interference cleanly and reversibly.
- **On-site dispatch via a Syncro ticket** for the physical items (dead AC PoE switch, 10M SFP) —
can't power-cycle a fully-offline AC switch or reseat an SFP remotely.
- **Factorio mod = full replacement of RNG quality**, keep vanilla tiers/perks, deterministic
Refinery, escalating compounding cost (Mike, 2026-06-23).
## Problems Encountered
- **MSYS path conversion** mangled `/bin/sh` to a Windows path when passed to ssh.exe (Git-Bash
gotcha) — fixed with `MSYS_NO_PATHCONV=1` + a non-slash remote command (`sh -s`).
- **Gemini CLI auth broken** (`throwIneligibleOrProjectIdError` / `_doSetupUser`) — empty responses;
needs interactive `gemini` re-login. Logged to errorlog; fell back to Grok for research.
- **Old mongo shell rejected `else if`** in the uos-mongo query — rewrote with separate `if` blocks.
- **First post-reboot verify (CCroom1New, earlier)** read stale uptime in the reboot grace window —
re-verified after the box was down/up.
- **VWP-QBS firewall correction (carryover):** logged that the disabled firewall is intentional for
testing — leave it (don't re-flag).
## Configuration Changes
- **DESKTOP-U303G5J (Winter):** disabled the `Wi-Fi` net adapter via RMM (`Disable-NetAdapter`);
Ethernet-only, DNS servers now 172.16.0.1, 8.8.8.8, 1.1.1.1.
- **UniFi 22nd St:** re-adopted USW-Lite-8-PoE 172.16.1.12 (device-control adopt, rc:ok).
- **Syncro #32454** created (Arizona Computer Guru 15353550) — findings + on-site dispatch comment.
- **New project** `projects/factorio-quality-mod/`: `DESIGN.md` + `factorio-mod-anatomy.md` (committed 30841fbf).
- No pfSense changes (read-only diagnostics only).
## Credentials & Secrets
- None created. Read-only use of: office pfSense SSH key (`C:/Users/guru/.ssh/id_ed25519`, admin@172.16.0.1:2248);
`infrastructure/uos-server-ssh-key` (uos-mongo), `infrastructure/uos-server-network-api-rw`
(controller stat read), Syncro mike key, GuruRMM admin. pfSense admin pw vault
`infrastructure/pfsense-firewall` (unused — key auth worked).
## Infrastructure & Servers
- **Office pfSense** 172.16.0.1 (SSH :2248, web :4433), pfSense 2.8.1, single Cox fiber WAN
(igc0 98.181.90.163/31, gw .162), flat LAN igc2 172.16.0.0/22, DHCP kea 172.16.1.1-254 (2 h).
Tailscale 100.119.153.74. Stale dead `OptGW` (184.187.220.90) still monitored (false alarms).
- **UniFi UOS** 172.16.3.29 (controller :11443; mongo via uos-server-ssh-key). Site "22nd St" =
`5f493a90c9e77c010bbb134c` (short `1p7jvx8r`).
- U7-Pro AP `172.16.1.34` (mac 28:70:4e:d5:36:cf) — **DOWN**.
- UAPA6A9 AP `172.16.1.63` — up.
- US-8-60W switch `172.16.1.8` (mac f4:e2:c6:e4:3e:dd) — **DOWN** (AC-powered).
- USW Pro 24 `172.16.1.11` — up; **Fiber Port 24 @ 10M** (10G SFP+ 25/26 unused).
- USW-Lite-8-PoE `172.16.1.12` — re-adopted.
- DMarc USMINI `172.16.1.15` — up.
- **Winter PC** DESKTOP-U303G5J 172.16.1.158, GuruRMM agent `52c90de1-6d58-4654-a6ed-b779b8ad93fc`.
- Syncro: Arizona Computer Guru cust **15353550** (7437 E 22nd St).
## Commands & Outputs
- pfSense diag over SSH (key auth): `MSYS_NO_PATHCONV=1 ssh ... -p 2248 admin@172.16.0.1 sh -s <<EOF`
feeding ping/netstat/pfctl/ifconfig and a curl speed test on stdin. Down ~71.2 MB/s (50 MB from Cloudflare), up ~48 MB/s.
- UOS mongo: `echo 'db.device.find({site_id:...})...' | bash .claude/scripts/uos-mongo.sh`.
- Controller stat: `POST $BASE/api/auth/login` → `GET /proxy/network/api/s/1p7jvx8r/stat/device` (RW cred).
- unifi-wifi: `gw-audit.sh "22nd St"`, `switch-audit.sh "22nd St" --all-ports`, `device-control.sh "22nd St" adopt <mac> --apply`.
- Research: `ask-grok.sh text --prompt-file ...` (Gemini failed auth).
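
The MB/s figures from the curl speed test and the Mbps figures in the Session Summary describe the same measurements. A quick conversion, assuming decimal megabytes and 8 bits per byte (the helper name is illustrative):

```python
def mbytes_per_s_to_mbits_per_s(mbytes_per_s: float) -> float:
    """Convert megabytes/second to megabits/second (8 bits per byte)."""
    return mbytes_per_s * 8


for label, rate in (("down", 71.2), ("up", 48.0)):
    print(f"{label}: {rate} MB/s = {mbytes_per_s_to_mbits_per_s(rate):.1f} Mbps")
# down: 71.2 MB/s = 569.6 Mbps
# up: 48.0 MB/s = 384.0 Mbps
```

Both results agree with the ~570 / ~385 Mbps quoted earlier, allowing for rounding in the approximate MB/s readings.
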
## Pending / Incomplete Tasks
- **#32454 on-site (HIGH):** power-cycle/check US-8-60W (172.16.1.8) → restores U7-Pro AP + office Wi-Fi.
- **#32454 on-site:** inspect/reseat Fiber Port 24 SFP+fiber on USW Pro 24 (10M→1G); trace what it feeds.
- Optional: confirm the "USW Pro Max 16 PoE" (controller-offline/never-onboarded) is actually in service.
- VWP-QBS firewall stays OFF intentionally (do not re-enable) until VWP testing done.
- **Gemini CLI:** interactive re-login needed on GURU-5070 to restore that research path.
- Office pfSense cruft: remove the dead `OptGW` gateway (stops false down-alarms); needs a go-ahead first (single point of failure).
- **Factorio mod:** next = feasibility check (deterministic quality-up recipe; neutralize quality-module RNG). See `projects/factorio-quality-mod/DESIGN.md`.
## Reference Information
- Ticket #32454 (id 112996018) https://computerguru.syncromsp.com/tickets/112996018 (comment 420428386).
- #dev-alerts/#bot-alerts posts this session: Winter Wi-Fi, ticket #32454.
- Factorio project: `projects/factorio-quality-mod/{DESIGN.md,factorio-mod-anatomy.md}` (commit 30841fbf).
- Related earlier logs: clients/valleywide/.../2026-06-23-mike-vwp-smb1-orders-xp-g-drive.md; session-logs/2026-06/2026-06-23-mike-vwp-qbs-firewall-ccroom1new-uac.md.