# Session Log: 2026-02-25

## Session Summary

Continued diagnostics on Peaceful Spirit Country Club UCG Ultra speed issues. Performed SSH-based monitoring, identified ECM crash-loop patterns, rebooted gateway, and ran 15-minute stability monitoring. Gateway fully exonerated -- issue confirmed as Cox plant-side.

---

## Peaceful Spirit Country Club - UCG Ultra Continued Diagnostics

### Pre-Reboot Findings (via SSH)

Connected via VPN to 192.168.0.10 after fixing SSH key (had to add to `/root/.ssh/authorized_keys` directly -- GUI-added key required password).

**ECM crash-loop confirmed ongoing:**

- ECM was NOT loaded (`lsmod | grep ecm` = empty)
- Cycle pattern from dmesg: runs 2-6 minutes, crashes, stays down 15-39 minutes
- Last cycle before reboot: init at 89499s, exit at 89638s (~2 min run), then never reloaded

**Other findings:**

- Load average: 1.26 (elevated, CPU handling all forwarding without ECM)
- Memory: 1169 MB / 2947 MB (40%), 65 MB swap used
- IDS/IPS: confirmed OFF (no suricata process)
- eth4 RX: 4 errors, 4 CRC errors (physical layer corruption from modem)
- WAN link flap: eth4 went down for 6 seconds at 76591s (modem sync loss)
- QUIC reassembly failures: multiple bursts, including triple failure at 96270s
- WireGuard tunnel: down (VPN was hung, had to be restarted on our side)

### Reboot and Hardware Acceleration

User rebooted UCG Ultra. Initial post-reboot check (7 min uptime):

- ECM was NOT loaded -- initially suspected PCIe probe failure (`qcom-pcie: probe of 20000000.pcie failed with error -110`)
- Actual cause: **Hardware Acceleration was disabled in UI settings**
- User re-enabled Hardware Acceleration
- ECM loaded immediately: `ECM init` at 669s, `ECM init complete` at 669s

### 15-Minute Stability Monitoring

Ran automated check every 60 seconds for 15 minutes (08:24 - 08:39).

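
For future re-runs, a once-a-minute check like the one described above could be sketched roughly as follows. This is not the script used in this session; it is a minimal sketch that assumes Python 3 is available on the gateway, that the WAN interface is `eth4`, and that the module shows up in `/proc/modules` under a name containing `ecm` (mirroring `lsmod | grep ecm`). The function names are illustrative.

```python
import time
from pathlib import Path

IFACE = "eth4"   # WAN interface named in this log
MODULE = "ecm"   # assumed substring of the module name, as in `lsmod | grep ecm`
COUNTERS = ("rx_errors", "rx_crc_errors", "rx_dropped", "tx_dropped")


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg content."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def module_loaded(proc_modules_text, name):
    """True if any module name in /proc/modules content contains `name`."""
    lines = (line for line in proc_modules_text.splitlines() if line.strip())
    return any(name in line.split()[0] for line in lines)


def read_counters(iface, base="/sys/class/net"):
    """Read the interface error/drop counters from sysfs as integers."""
    stats = Path(base) / iface / "statistics"
    return {name: int((stats / name).read_text()) for name in COUNTERS}


def snapshot():
    """Collect load, ECM module state and interface counters in one pass."""
    return {
        "load": parse_loadavg(Path("/proc/loadavg").read_text()),
        "ecm_loaded": module_loaded(Path("/proc/modules").read_text(), MODULE),
        "counters": read_counters(IFACE),
    }


def monitor(checks=15, interval=60):
    """Print one line per check, with counter deltas relative to the first read."""
    baseline = snapshot()["counters"]
    for n in range(1, checks + 1):
        snap = snapshot()
        deltas = {k: v - baseline[k] for k, v in snap["counters"].items()}
        status = "loaded" if snap["ecm_loaded"] else "NOT LOADED"
        print(f"check {n}/{checks} load={snap['load']} ecm={status} deltas={deltas}")
        if n < checks:
            time.sleep(interval)
```

Calling `monitor(15, 60)` on the gateway would approximate the 15-minute window; QUIC reassembly failures and link flaps are only visible in dmesg and are not covered by this sketch.
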

**Results:**

- ECM: STABLE for entire 15 minutes -- zero crashes, zero restarts
- RX errors: 0 across all 15 checks
- CRC errors: 0 across all 15 checks
- Drops: 0 both directions
- QUIC failures: 0
- Link flaps: 0
- dmesg: clean -- only the initial ECM init message

**Load trend:**

| Time  | Load (1m) | Load (5m) | Load (15m) |
|-------|-----------|-----------|------------|
| 08:24 | 1.53      | 1.43      | 0.92       |
| 08:28 | 1.33      | 1.43      | 1.04       |
| 08:32 | 1.74      | 1.57      | 1.19       |
| 08:36 | 2.12      | 1.72      | 1.33       |
| 08:38 | 2.32      | 1.80      | 1.38       |
| 08:39 | 1.74      | 1.73      | 1.38       |

Load persistently above 1.0 -- likely WireGuard VPN crypto (can't be offloaded to ECM).

### Configuration Changes Made

1. **IDS/IPS:** Disabled (was on High) -- done 2026-02-25 earlier
2. **Hardware Acceleration:** Re-enabled after reboot
3. **MSS Clamping:** Changed from Custom 1452 to Auto
   - iptables now shows `clamp to PMTU` on tun1 only (correct behavior)
   - No MSS rules on eth4/WAN (confirmed -- MSS setting never affected WAN traffic)

### Speed Test Results

- Post-reboot with ECM running: **29/28 Mbps** (300/30 provisioned)
- Upload hitting near-provisioned speed (28 of 30)
- Download at ~10% of provisioned (29 of 300)
- Occasionally achieves full provisioned speeds (200-278 Mbps seen previously)

### Final Status Check (08:41, 33 min uptime)

- ECM: loaded, stable
- Load: 1.25 (trending down)
- Memory: 981 MB / 2947 MB (33%), 2 MB swap
- eth4: 0 errors, 0 CRC, 0 drops
- dmesg: clean since boot
- MSS: Auto, clamp to PMTU on tun1 only

### Sequential Thinking Re-Evaluation

Performed full sequential thinking analysis (8 steps) re-evaluating all evidence:

**Two overlapping problems identified:**

**Problem 1 - Cox Plant (Primary):**

- Speed decays from 200+ to 70 Mbps under sustained load = marginal DOCSIS channels de-bonding
- 50% packet loss at all packet sizes = not MTU or gateway related
- Download degraded, upload stable = downstream RF path
- New modem, same symptoms = rules out CPE
- Persists with all gateway
  configurations tested
- Occasionally hits provisioned speed = CMTS config is correct, channels are marginal

**Problem 2 - Gateway ECM (Secondary, resolved):**

- ECM crash-loop amplified plant symptoms (caused <1 Mbps drops)
- Resolved by: disabling IDS/IPS, rebooting, re-enabling HW acceleration
- 15-minute monitoring confirms stable operation

### Summary Prepared for Cox Tech

> **Site:** Peaceful Spirit Country Club
> **Circuit:** 300/30 Mbps | **IP:** 98.190.129.150
> **Modem:** New (replaced prior day) - same symptoms
>
> Download speeds start at 200+ then decay to 29-70 Mbps under load. Intermittent drops to <1 Mbps. 50% packet loss at all sizes. Upload stable at 28-29 Mbps. Modem intermittently achieves full provisioned speed, proving CMTS config is correct.
>
> Customer gateway fully eliminated: 15 min monitoring shows zero errors at every layer, hardware offload stable, zero CRC errors.
>
> Pattern consistent with marginal downstream DOCSIS channels bonding/de-bonding as signal conditions fluctuate.
>
> Tech should check: downstream signal levels/SNR, uncorrectable codewords, T3/T4 timeouts, tap/drop/connectors for corrosion, amplifier health, node health.

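
If the ECM crash-loop returns, the run and down durations quoted in the pre-reboot findings can be recomputed from saved `dmesg` output. The sketch below is one hedged way to do that. The `ECM init` text appears in this log's dmesg excerpts, but `ECM exit` is only a guess at the unload message and may need adjusting, and the bracketed seconds-since-boot prefix is the usual dmesg format rather than something confirmed on this device.

```python
import re

# "ECM init" is taken from this log; "ECM exit" is an assumed message text.
INIT_MARKER = "ECM init"
EXIT_MARKER = "ECM exit"

# Matches the usual "[  123.456789] message" dmesg prefix.
LINE_RE = re.compile(r"^\[\s*(\d+(?:\.\d+)?)\]\s*(.*)$")


def ecm_events(dmesg_text):
    """Return (seconds_since_boot, "init" | "exit") tuples in log order."""
    events = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp, message = float(match.group(1)), match.group(2)
        if EXIT_MARKER in message:
            events.append((stamp, "exit"))
        elif INIT_MARKER in message and "complete" not in message:
            # "ECM init complete" is skipped so each load counts once.
            events.append((stamp, "init"))
    return events


def ecm_cycles(events):
    """Return (run_durations, down_durations) in seconds."""
    runs, downs = [], []
    last_init = last_exit = None
    for stamp, kind in events:
        if kind == "init":
            if last_exit is not None:
                downs.append(stamp - last_exit)
                last_exit = None
            last_init = stamp
        else:
            if last_init is not None:
                runs.append(stamp - last_init)
                last_init = None
            last_exit = stamp
    return runs, downs


# Illustrative lines built from the timestamps recorded in this log.
sample = """\
[89499.000000] ECM init
[89499.000000] ECM init complete
[89638.000000] ECM exit
"""
runs, downs = ecm_cycles(ecm_events(sample))
print(runs, downs)  # [139.0] []
```

The 139-second run matches the "~2 min run" noted for the last cycle before reboot; an empty down list corresponds to ECM never reloading afterwards.
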

---

## SSH Access Reference

- **Host:** 192.168.0.10 (via VPN) or 98.190.129.150 (WAN)
- **User:** root
- **Key:** `~/.ssh/ucg_peaceful_spirit` (ed25519)
- **Public key:** `ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKBw+BK25MXpm91XBtDsSp7K0nTcKwFDLFZDx7tAO/N8 claude@claudetools`
- **Auth method:** Key added to `/root/.ssh/authorized_keys` (NOT via UniFi GUI)
- **Note:** GUI-added keys require password; direct authorized_keys works with key-only

### Current UCG Config (post-changes)

- Hardware Acceleration: ON
- IDS/IPS: Disabled
- MSS Clamping: Auto (clamp to PMTU on VPN tunnels)
- Jumbo Frames: OFF
- SNMP: OFF
- ARP Cache: Min DHCP lease
- Auto Firewall State Timeouts: ON
- Global NAT: Auto
- Connection Tracking: FTP, H.323, GRE, PPTP, TFTP

---

## Pending Tasks

### Peaceful Spirit

- [ ] Cox tech visit -- confirm plant-side fix resolves speed issues
- [ ] After Cox fix: re-test speeds to verify 300 Mbps sustained
- [ ] Consider re-enabling IDS/IPS at Medium/Low after Cox plant is fixed
- [ ] Monitor ECM stability over coming days
- [ ] Investigate persistent high load (1.2-2.3) -- likely WireGuard related

### From Previous Session (2026-02-24)

- [ ] Yealink: Get IP Discovery Tool from distributor for serial extraction
- [ ] Yealink: Test browser-based scanner (tools/yealink-serial-scanner.html)
- [ ] Yealink: Onboard remaining phones into YMCS
- [ ] Yealink: Build OIT VoIP templates when ready for migration
- [ ] Clean up tools/test-yealink.ps1