unifi-wifi: pfSense gateway access via SSH (pfSense-ssh.sh) + pfSense health section; layer OFF HOLD

DECISION (Mike, 2026-06-16): drop the RESTAPI package — VPN + SSH shell reads the same data and makes
changes. Confirmed Cascades pfSense is Plus 25.07-RELEASE (current; the "too old" premise was wrong) and
admin SSH = real shell (no menu). The upgrade/package blocker is moot; compat layer is off hold.

- NEW scripts/pfsense-ssh.sh: audit (version/WAN-media/gateway-events/DHCP-exhaustion/states/DNS/load/NIC),
  dhcp (pool utilization + no-free-leases), run "<cmd>" (arbitrary, incl changes; operator-gated). Cred
  from clients/<slug>/pfsense-firewall; system OpenSSH via askpass. Validated live on Cascades.
- audit report: added "pfSense health check (2026-06-16)" — DHCP NOT exhausted (192.168.0.0/22 pool 270/507,
  0 no-free-leases), DNS up, dual-WAN stable (no gateway flaps), states/load healthy => gateway is NOT a
  WiFi factor; the 2.4 GHz RF work is the sole fix. (Minor: igc3/WAN2 I225 2.5G counter quirk, not a fault.)
- ROADMAP §E + SKILL.md updated to the SSH backend decision; REST pfsense-backend.sh kept dormant/optional.
- Remaining: named gated CONTROL verbs over SSH (easyrule block-ips, pf/fw toggles) + optional gw-* dispatch.
- Closed obsolete coord todo (upgrade-pfSense-for-RESTAPI).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-16 18:42:54 -07:00
parent e42ad8f163
commit 58ecc5ad40
4 changed files with 119 additions and 17 deletions

View File

@@ -47,3 +47,30 @@ no UniFi gateway (pfSense firewall). All collectors ran clean.
(All changes via the gated `apply-radio`/`apply-wlan`/`channel-plan` scripts — per zone, with rollback +
live validation. Nothing applied in this audit.)
---
## pfSense health check (2026-06-16) — ruling out the gateway as a WiFi factor
Investigated the Cascades pfSense (`192.168.0.1`, **pfSense Plus 25.07-RELEASE**, Netgate) over the site
VPN via SSH, to confirm whether any gateway-side issue contributes to the "WiFi bad for some users"
symptom. **Verdict: pfSense is healthy and is NOT a contributor — the problem is RF-side (2.4 GHz).**
| Area | Finding | WiFi impact |
|---|---|---|
| **DHCP exhaustion** | **0** "no free leases" events in dhcpd.log. WiFi/AP pool `192.168.0.0/22` (range 192.168.2.23.254, cap ~507) only **270 active (~53%)**; per-unit /28s + `10.0.20/.50` all have headroom | **Ruled out** (was the top suspect) |
| **DNS** | unbound resolver running | Fine |
| **WAN** | Dual Cox — WAN1 `184.191.143.62/30`, WAN2 `72.211.21.217/27`, both active **full-duplex**, `WAN_Group` gateway group, **no loss/down events** logged | Fine |
| **Firewall states** | 28,368 / 790,000 limit | Fine |
| **CPU / mbuf / uptime** | load 0.6, mbufs nominal, 10-day uptime | Healthy |
**Architecture:** per-unit design — **199 DHCP subnets**, mostly `10.x.y.0/28` per apartment (assisted-
living L2 isolation) + the `192.168.0.0/22` staff/AP network (APs + most WiFi clients). Active DHCP
backend is **ISC** (Kea config present but dormant).
**Minor (not WiFi-related):** `igc3`/WAN2 logged 1707 input-errors + 1707 "collisions", but the link is
2.5GbE full-duplex/active with zero gateway loss — consistent with the known Intel I225/I226 2.5G counter
quirk, not a real fault. No action needed unless WAN2 misbehaves.
**Conclusion:** gateway/DHCP/DNS/WAN are not bottlenecking the wireless. The 2.4 GHz remediation
(power-down + coverage-redundancy disables) remains the correct and sole fix for the client-experience tail.