sync: auto-sync from HOWARD-HOME at 2026-06-15 20:40:48
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-15 20:40:48
@@ -1,8 +1,12 @@
 #!/usr/bin/env bash
 # live-stats.sh — Plane-2 live RF/airtime from the UOS Network API (classic session API).
-# Gives CURRENT per-AP per-radio cu_total / cu_self / num_sta / satisfaction / tx_retries and the
-# AP RF-neighbor table — for before/after validation of changes and (the neighbor table) the
-# materials-aware AP-to-AP coverage graph that unlocks confident radio DISABLES.
+# Gives CURRENT per-AP per-radio cu_total / cu_self / num_sta / retry% and device-level
+# satisfaction, plus the worst-clients view (signal / retry% / satisfaction_reason) — for
+# before/after validation of changes.
+# NOTE: satisfaction is populated at the DEVICE level (per-radio is -1 on this controller),
+# and tx_retries is a cumulative counter so we report radio_table_stats.tx_retries_pct (a rate).
+# TODO: AP-to-AP RF-neighbor table (for confident radio DISABLES) — not a single API field;
+# build it from the `rogue` collection by matching our own APs' vap_table BSSIDs. Not done yet.
 #
 # AUTH (provision once): the classic API needs a controller admin session. Create a dedicated
 # READ-ONLY admin in the UniFi UI (OS Settings -> Admins -> add a Viewer), then vault it:
@@ -52,21 +56,23 @@ echo "[INFO] site short=$SHORT"
 
 curl -sk -b "$CJ" "$base/proxy/network/api/s/$SHORT/stat/device" | python -c "
 import sys,json
-for d in json.load(sys.stdin).get('data',[]):
-    if d.get('type')!='uap': continue
-    print('AP',d.get('name'),'clients=',d.get('num_sta'))
+aps=[d for d in json.load(sys.stdin).get('data',[]) if d.get('type')=='uap']
+print('# APs reporting:',len(aps))
+for d in sorted(aps,key=lambda a:str(a.get('name'))):
+    # device-level satisfaction is the populated one (per-radio satisfaction is -1 on this controller)
+    print('AP',d.get('name'),'clients=',d.get('num_sta'),'satisfaction=',d.get('satisfaction'))
     for r in d.get('radio_table_stats',[]):
-        print(' ',r.get('radio'),'ch',r.get('channel'),'cu_total',r.get('cu_total'),'cu_self_rx',r.get('cu_self_rx'),'cu_self_tx',r.get('cu_self_tx'),'num_sta',r.get('num_sta'),'tx_retries',r.get('tx_retries'),'satisfaction',r.get('satisfaction'))
+        print(' ',r.get('radio'),'ch',r.get('channel'),'cu_total',r.get('cu_total'),'cu_self_rx',r.get('cu_self_rx'),'cu_self_tx',r.get('cu_self_tx'),'num_sta',r.get('num_sta'),'retry%',r.get('tx_retries_pct'))
-    # RF neighbor table (materials-aware AP-to-AP visibility) if present
-    for n in (d.get('radio_table') or []):
-        pass
-" 2>&1 | head -60
+" 2>&1
 
 if [ "$WANT_CLIENTS" = "--clients" ]; then
-echo "=== clients (rssi/rate/retries) ==="
+echo "=== worst wireless clients by satisfaction (signal / retry% / why) ==="
 curl -sk -b "$CJ" "$base/proxy/network/api/s/$SHORT/stat/sta" | python -c "
 import sys,json
-for c in json.load(sys.stdin).get('data',[])[:40]:
-    print(' ',c.get('hostname') or c.get('mac'),'ap',c.get('ap_mac'),'rssi',c.get('rssi'),'signal',c.get('signal'),'tx_rate',c.get('tx_rate'),'retries',c.get('tx_retries'),'sat',c.get('satisfaction'))
-" 2>&1 | head -45
+cs=[c for c in json.load(sys.stdin).get('data',[]) if not c.get('is_wired')]
+cs.sort(key=lambda c:(c.get('satisfaction') if isinstance(c.get('satisfaction'),(int,float)) else 999))
+print('# wireless clients:',len(cs),' (worst 40 by satisfaction)')
+for c in cs[:40]:
+    print(' ',(c.get('hostname') or c.get('mac')),'sat',c.get('satisfaction'),'signal',c.get('signal'),'noise',c.get('noise'),'retry%',c.get('wifi_tx_retries_percentage'),'band',c.get('radio'),'ch',c.get('channel'),'why',c.get('satisfaction_reason'))
+" 2>&1
 fi
@@ -71,3 +71,55 @@ Key audit output: 2.4 cu_total 74–94% / interf 61–81% / ~1 client; retry 40
- unifi-wifi skill: `.claude/skills/unifi-wifi/` (methodology.md, data-access.md, interference-model.md).
- Prior wireless log: `clients/cascades-tucson/session-logs/2026-05-16-howard-wireless-diagnostic.md`.
- UOS system wiki: `wiki/systems/uos-server.md`.

---

## Update: 20:40 PT - RW cred arrived, live (Plane 2) re-look, live-stats.sh accuracy fixes

Mike vaulted the RW controller admin (`infrastructure/uos-server-network-api-rw`) within ~30 min
of the request, so the live Network API (Plane 2) became available. Re-audited Cascades live and,
in doing so, found and fixed accuracy bugs in `unifi-wifi/scripts/live-stats.sh` that had skewed the
earlier read.

### live-stats.sh bugs fixed (held a coord lock; messaged Mike e8be889f)
1. `stat/device` output was hard-capped at `head -60` -> only ~15 of 77 APs were shown (we'd been
   judging the whole site from a 15-AP sample). Removed the cap; verified 77/77 now.
2. `satisfaction` was read per-radio (always `-1` on this controller). DEVICE-level satisfaction
   IS populated -> switched to `d.get('satisfaction')`.
3. `tx_retries` was the raw cumulative counter (scales with traffic, misleading). Switched to
   `radio_table_stats.tx_retries_pct` (a true rate).
4. `--clients` took an unsorted `[:40]` slice (hid over 90% of the 574 clients). Now sorts
   worst-by-satisfaction and prints signal/noise/retry%/`satisfaction_reason`.
5. RF-neighbor table left as a documented TODO in the header (not a single API field; must be built
   from the `rogue` BSSID cross-ref vs each AP's `vap_table`). It's what unlocks confident radio
   DISABLES; until then power-down/channel/width are the safe levers.
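
The cross-reference described in item 5 might look roughly like the sketch below. It is not the implemented feature: the function name `build_neighbor_table` is hypothetical, and the field names (`mac`, `name`, `vap_table[].bssid` on devices; `bssid`, `ap_mac`, `rssi` on rogue entries) are assumptions that should be checked against a live pull from this controller first.

```python
from collections import defaultdict

def build_neighbor_table(devices, rogues):
    """Map each hearing AP to the own-fleet APs it hears, keeping the strongest RSSI.

    All field names used here are assumptions, not confirmed against this controller.
    """
    own_bssid = {}  # BSSID broadcast by one of our APs -> that AP's name
    ap_name = {}    # AP base MAC -> AP name
    for d in devices:
        if d.get('type') != 'uap':
            continue
        name = d.get('name') or d.get('mac')
        ap_name[(d.get('mac') or '').lower()] = name
        for vap in d.get('vap_table') or []:
            bssid = (vap.get('bssid') or '').lower()
            if bssid:
                own_bssid[bssid] = name

    neighbors = defaultdict(dict)
    for r in rogues:
        heard = own_bssid.get((r.get('bssid') or '').lower())
        if heard is None:
            continue  # a foreign BSSID, not one of ours
        hearer = ap_name.get((r.get('ap_mac') or '').lower())
        rssi = r.get('rssi')
        if hearer is None or hearer == heard or not isinstance(rssi, (int, float)):
            continue
        # Keep the strongest reading per (hearer, heard) pair.
        prev = neighbors[hearer].get(heard)
        if prev is None or rssi > prev:
            neighbors[hearer][heard] = rssi
    return {k: dict(v) for k, v in neighbors.items()}
```

An AP whose every client-serving area is also heard strongly by two or more neighbors would be a candidate for a radio DISABLE; an AP that no neighbor hears should stay on.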

### Corrected diagnosis (the data fix changed a conclusion)
- Accurate avg retry RATE: **2.4GHz 11.2%** > 5GHz clear 9.0% ~= 5GHz DFS 8.4%. My mid-session
  claim that "5GHz/DFS is now the #1 problem" was an ARTIFACT of the raw counter + 15-AP sample and
  is WITHDRAWN. On the rate, DFS is NOT retrying worse than clear channels.
- **2.4GHz is the primary pain band** (highest retry; 27 of the 40 worst clients are on 2.4, retry
  11-42%, mostly IoT/legacy: Ring cams, robotic cleaner, smart plugs, EPSON printer, Poly phone,
  handheld scanners, Watch). The original 2.4 power-down/prune plan stands as #1.
- **DFS = resilience risk, not throughput killer.** 55/77 5GHz radios on DFS near Davis-Monthan =
  radar-vacate exposure (client drops), worth moving off DFS for stability, but not urgent for
  performance.
- **6GHz still dead (1 client of 574)**: top untapped clean/non-DFS capacity; steering remains a
  key opportunity.
- **AP-level satisfaction 95-100 across the fleet**: network is healthy on average; pain is in the
  client tail = consistent with "bad for SOME users."
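
A per-band retry-rate comparison like the one in the first bullet could be computed along these lines. This is a sketch under assumptions: the helper names `band_of` and `avg_retry_by_band` are hypothetical, the mean is an unweighted average over radios (the log does not say how its averages were weighted), and the DFS test uses the US 5GHz DFS ranges (channels 52-64 and 100-144).

```python
def band_of(radio, channel):
    """Classify a radio entry as 2.4GHz, 5GHz clear, 5GHz DFS, or 6GHz."""
    if radio == 'ng':
        return '2.4GHz'
    if radio == '6e':
        return '6GHz'
    if radio == 'na':
        try:
            ch = int(channel)
        except (TypeError, ValueError):
            return '5GHz unknown'
        # US DFS channels: 52-64 (U-NII-2A) and 100-144 (U-NII-2C).
        is_dfs = 52 <= ch <= 64 or 100 <= ch <= 144
        return '5GHz DFS' if is_dfs else '5GHz clear'
    return 'other'

def avg_retry_by_band(devices):
    """Unweighted mean of radio_table_stats.tx_retries_pct per band group."""
    sums = {}
    for d in devices:
        if d.get('type') != 'uap':
            continue
        for r in d.get('radio_table_stats') or []:
            pct = r.get('tx_retries_pct')
            if not isinstance(pct, (int, float)):
                continue  # skip radios that report no rate
            band = band_of(r.get('radio'), r.get('channel'))
            total, count = sums.get(band, (0.0, 0))
            sums[band] = (total + pct, count + 1)
    return {band: round(total / count, 1) for band, (total, count) in sums.items()}
```

Feeding it the `data` list from a full `stat/device` pull (all 77 APs, not a truncated sample) avoids the sampling artifact described above.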

### Cascades live snapshot (2026-06-15 ~20:30 PT)
- 77/77 APs reporting; 574 wireless clients. Band split: na (5GHz) ~87 / ng (2.4) ~21 sampled per
  pull / 6e ~1.
- 2.4 cu_total 69-94% live (saturation confirmed). 5GHz cu_total mostly <40%.
- Worst clients: RingStickupCam (sat 60), Galaxy-A32-5G (60), DIRECTV (71), Samsung on ch1 (76,
  retry 42%).

### Updated next steps
- [ ] Build Floor-pilot DRY-RUN: 2.4 power-down to Low on one floor (e.g. Floor 4), validate live
  cu_total/retry% before+after via the fixed live-stats.sh. Get explicit go before `--apply`.
- [ ] Implement the AP-to-AP RF-neighbor table (rogue BSSID x vap_table) to enable safe DISABLEs.
- [ ] 6GHz steering plan; 5GHz 80->40MHz + non-DFS channel plan (resilience).
- [ ] Coord msgs this update: RW-cred request 6b98282f (+todo cbb355ef); live-stats fix e8be889f.
- [ ] pfSense `.ovpn` (Howard handling): needed for per-AP watch-ap.sh live stream.
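
For the before/after validation in the first item, one possible approach is to save the parsed `stat/device` `data` list before and after the change and diff the per-radio figures. The sketch below assumes exactly that; `radio_metrics` and `compare_snapshots` are hypothetical helpers, not part of live-stats.sh.

```python
def radio_metrics(devices):
    """Index (AP name, radio) -> (cu_total, tx_retries_pct) for every AP radio."""
    out = {}
    for d in devices:
        if d.get('type') != 'uap':
            continue
        for r in d.get('radio_table_stats') or []:
            out[(d.get('name'), r.get('radio'))] = (r.get('cu_total'), r.get('tx_retries_pct'))
    return out

def compare_snapshots(before, after, radio='ng'):
    """Return (ap_name, cu_total_delta, retry_pct_delta) rows for one radio type."""
    b, a = radio_metrics(before), radio_metrics(after)
    rows = []
    for key in sorted(set(b) & set(a), key=str):
        if key[1] != radio:
            continue
        (cu_b, rt_b), (cu_a, rt_a) = b[key], a[key]
        if None in (cu_b, rt_b, cu_a, rt_a):
            continue  # radio did not report one of the metrics
        rows.append((key[0], cu_a - cu_b, round(rt_a - rt_b, 1)))
    return rows
```

Negative deltas on both columns for the pilot floor's `ng` radios would support the power-down; a single pull is noisy, so comparing several pulls on each side is safer.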