sync: auto-sync from GURU-5070 at 2026-06-15 18:09:05
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-15 18:09:05
@@ -27,7 +27,14 @@ controller knows and making prioritized, validated changes. Built for any site;
 ```
 Outputs 2.4/5/6 config summary, the per-channel neighbor-density (interference) map, and flagged
 issues (2.4 over-provisioning, 40/80/160MHz width, off-1/6/11 channels, min-RSSI off, high power).
-2. **Interpret** the flags against `methodology.md` (fix order: prune 2.4 -> shrink cells/power ->
+2. **Rank airtime-reduction candidates** — which radios to disable / power down, from real history
+(`ace_stat`: per-AP airtime `cu_total`/`cu_interf`/`num_sta` + the `wifi_connectivity_event` roam
+graph). Works for any band:
+```bash
+bash .claude/skills/unifi-wifi/scripts/model-rank.sh <site> [days=7] [band=ng|na|6e|all]
+```
+(Cascades 2.4: 75 radios at 74–94% utilization, 61–81% interference, ~1 client each → disable/power-down.)
+3. **Interpret** the flags against `methodology.md` (fix order: prune 2.4 -> shrink cells/power ->
 min data rates -> manual 1/6/11 plan -> min-RSSI + roaming -> steer to 6GHz).
 3. **Recommend** a prioritized, per-zone change plan. Roll out per zone, not site-wide at once.
 
@@ -10,8 +10,12 @@ the **Cascades** site (`site_id 685f39068e65331c46ef6dd2`) as the hard case (77
 
 | Plane | Source | Reach | Holds |
 |---|---|---|---|
-| **Config + history** | Mongo `ace` via `uos-mongo.sh` | root SSH, fully available now | radio config, the interference map, channel-plan settings, AP/client inventory |
-| **Live RF/airtime** | Controller Network API (`stat/device`, `stat/sta`) | needs a session / integration key — NOT yet wired | current channel utilization, per-client RSSI/retries/tx-rate, AP satisfaction, num_sta |
+| **Config** | Mongo `ace` via `uos-mongo.sh` | root SSH, available now | radio config, foreign-interference map (`rogue`), channel-plan, device/floorplan |
+| **History** | Mongo `ace_stat` (`db.getSiblingDB('ace_stat')`) | root SSH, available now | per-AP/per-band airtime time-series (`<band>-cu_total/cu_interf/num_sta`, satisfaction, retries) in `stat_hourly`/`stat_daily`; the **roam graph** in `wifi_connectivity_event`; per-client history |
+| **Live (optional)** | Controller Network API (`stat/device`, `stat/sta`) | needs a session / integration key — not wired | *current* utilization + the live RF-neighbor table; nice for before/after validation |
+
+> The accumulated **history plane (`ace_stat`)** is the key source for the interference / airtime
+> model — it already holds what the UniFi UI shows. See [interference-model.md](interference-model.md).
 
 The live per-AP utilization and per-client RF stats are **NOT persisted in Mongo** (the `device`
 collection carries config but no `radio_table_stats`; the `user`/client collection only keeps
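Editor's sketch (not part of the commit): the History plane row in this hunk names flat per-band fields such as `<band>-cu_total`, `<band>-cu_interf` and `<band>-num_sta`. The snippet below shows one way such documents could be averaged per AP for a single band. It runs on plain objects in Node.js rather than against Mongo; the function name `averageAirtime` and the sample documents are hypothetical, and only the field names come from the diff.

```javascript
// Sketch only: average the flat per-band fields of stat_hourly-style documents per AP.
// `averageAirtime` and `sampleDocs` are illustrative; the data is invented, not from a controller.
function averageAirtime(docs, band) {
  const acc = {};
  for (const d of docs) {
    if (!d.ap) continue; // skip rows without an AP mac
    const cu = d[band + '-cu_total'];
    const intf = d[band + '-cu_interf'];
    const sta = d[band + '-num_sta'];
    if (cu == null && intf == null) continue; // no sample for this band in this row
    const p = acc[d.ap] || (acc[d.ap] = { cu: 0, intf: 0, sta: 0, n: 0 });
    p.cu += cu || 0;
    p.intf += intf || 0;
    p.sta += sta || 0;
    p.n += 1;
  }
  const out = {};
  for (const ap of Object.keys(acc)) {
    const p = acc[ap];
    out[ap] = { cu: p.cu / p.n, intf: p.intf / p.n, sta: p.sta / p.n, samples: p.n };
  }
  return out;
}

// Invented sample rows: two 2.4 GHz (ng) samples for one AP, one 5 GHz (na) sample for another.
const sampleDocs = [
  { ap: 'aa:aa', 'ng-cu_total': 80, 'ng-cu_interf': 60, 'ng-num_sta': 1 },
  { ap: 'aa:aa', 'ng-cu_total': 90, 'ng-cu_interf': 70, 'ng-num_sta': 3 },
  { ap: 'bb:bb', 'na-cu_total': 20, 'na-cu_interf': 5, 'na-num_sta': 12 },
];
console.log(averageAirtime(sampleDocs, 'ng'));
```

Averaging over samples rather than summing keeps APs with different amounts of history comparable, which is the same choice the committed script makes with its per-band sample counter.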
@@ -1,61 +1,66 @@
-# AP interference / airtime-reduction model — design + data feasibility
+# AP interference / airtime-reduction model — design + data
 
 Goal (per Mike): a fleet/site-level model that decides **which AP radios to disable, and where to
 reduce power**, to cut total airtime contention while preserving client coverage. Per AP **per
-radio, all bands** (not 2.4-only, not per-client). Inputs: each AP's view of neighboring APs (RF)
-+ historical client connections.
+radio, all bands** (not 2.4-only, not per-client). Inputs: each AP's neighbor/overlap relationships
++ historical client connections + airtime.
 
-## Data feasibility (probed on Cascades 2026-06-15)
-| Signal the model wants | In Mongo `ace`? | Source to use |
+## Data — it's ALL in the controller (corrected 2026-06-15)
+First pass only checked the `ace` DB (config/current state) and wrongly concluded the history
+"isn't there." It is — in **`ace_stat`** (1.5 GB of accumulated time-series) and the
+`wifi_connectivity_event` collection. **No external collector is needed**; the controller already
+retains it. Three databases on the UOS Mongo (port 27117):
+
+| DB | Holds | Use |
 |---|---|---|
-| **Our AP ↔ our AP RF visibility** (A hears B at RSSI r) | **NO** — `rogue` is FOREIGN APs only; our managed APs are filtered out (0 rows match our SSIDs) | **Live Network API** `stat/device` neighbor table / triggered RF scan (Plane 2) |
-| **Historical client→AP connections / roam overlap** | **NO** — `user` keeps only `last_uplink_mac` (last AP); no sessions, `alarm` empty, `stat` collections empty | **Accumulated `stat/sta` polling** over time (Plane 2 + a collector) |
-| **Physical AP coordinates** | **NO** — 0 APs placed on the 1 floorplan | derive coarse topology from **AP names** (room#/floor encoded) |
-| Radio config (channel/band/width/power/min_rssi) | **YES** | Mongo `device.radio_table` (Plane 1) |
-| Foreign interference per channel | **YES** | Mongo `rogue` aggregate (Plane 1) |
+| `ace` | config + current state | radio_table (channel/band/width/power/min_rssi), `rogue` (FOREIGN interference), `channelplan`, device/floorplan |
+| **`ace_stat`** | **time-series history** | the model's airtime + client history (below) |
+| `ace_audit` | audit log | — |
 
-**Conclusion:** the interference graph the model needs (our-AP mutual RSSI + client overlap)
-**cannot** be built from Mongo. It requires **Plane 2 (the live Network API)** plus a **collector**
-that accumulates snapshots over time. Mongo gives config + foreign-interference + (via names) a
-coarse topology prior to seed the model before enough live data is collected.
+### `ace_stat.stat_hourly` / `stat_daily` (o:'ap') — per-AP per-band airtime history
+Flat per-band fields (`ng`=2.4, `na`=5, `6e`=6): **`<band>-cu_total`** (channel utilization %),
+**`<band>-cu_interf`** (airtime lost to interference %), **`<band>-cu_self_rx/tx`**, **`<band>-num_sta`**,
+`<band>-satisfaction`, `<band>-tx_retries`. Plus `ap` (AP mac), `site_id`, `time` (ms epoch). This
+is the historical airtime/interference/load profile per AP per band — the core "who's contending"
+signal. (Also o:'user' per-client retries/anomalies; o:'site' rollups.)
 
-## Model design
-Per band `b` (ng/na/6e), build a weighted graph over AP radios:
+### `ace_stat.wifi_connectivity_event` — the roam graph (empirical AP adjacency)
+Each doc: `from_endpoint{mac(AP), channel, channel_width, band, rssi}`, `to_endpoint{mac(AP), ...}`,
+`client_mac`, `successful`, `time`, `_class:WIFI_ROAMING`. Every roam is an **edge between two APs a
+real client handed off between** → empirical coverage overlap/adjacency, weighted by volume, with
+the handoff RSSI (coverage quality). This is the "historical connections → which APs cover the same
+space" signal, and it's better than a raw RF scan because it reflects where clients actually move.
 
-- **Nodes:** each AP's radio on band `b`.
-- **RF edges** `w_rf(A,B)`: from the live neighbor table — how strongly A hears B (and vice-versa),
-scaled up when same/overlapping channel. Strong mutual RSSI on the same channel = high
-co-channel interference.
-- **Overlap edges** `w_ov(A,B)`: fraction of clients that have associated with BOTH A and B over
-the collection window (built by snapshotting `stat/sta` every N minutes). High overlap = they
-cover the same space → one is redundant.
-- **Per-radio metrics:** `load` (num_sta, live `cu_total`), `unique_coverage` (clients only this
-radio serves at good RSSI), `interference_contribution` (Σ strong RF edges on same channel).
+### Still useful from `ace` / setup
+- `rogue` aggregate = FOREIGN co-channel interference per channel (Plane-1 audit).
+- **Floorplan coords**: not set today (0/77 APs placed). Mike is willing to use the floorplan
+feature → once APs are placed, `device` x/y gives true physical distance for adjacency edges
+(complements the roam graph, esp. for APs that rarely roam-share).
+- Live Network API (`stat/device`/`stat/sta`) still adds *current* utilization + the AP's live
+RF-neighbor table; nice-to-have for before/after validation, not required to build the model.
 
-**Recommendation logic (greedy, coverage-safe):**
-1. **Disable a radio** when: high `interference_contribution` AND high `coverage_redundancy`
-(its clients keep good signal from neighbors) AND `unique_coverage ≈ 0`. Disable the worst
-offender, recompute the graph, repeat until a redundancy floor is hit (don't open holes).
-2. **Reduce power** when interference is high but `unique_coverage > 0` (can't disable without a
-hole) — shrink the cell to cut contention while keeping coverage.
-3. **Leave** radios that carry unique coverage and contribute little interference.
-Band weighting: 2.4 prunes most aggressively (most redundant + least capacity value); 5/6 lighter;
-6GHz usually keep (clean band, steer up). Output = ranked per-AP-per-radio actions with the metric
-that justified each, applied **per zone** with live before/after validation.
+## Model
+Per band `b`, over a history window (default 7d):
+- **Per-radio airtime**: avg `cu_total`, avg `cu_interf`, avg `num_sta` from `stat_hourly`.
+- **Overlap/redundancy**: roam volume per AP and per AP-pair from `wifi_connectivity_event` (high
+roam-share to neighbors ⇒ a client leaving this radio has somewhere to land ⇒ redundant).
+- **Score (v1, `model-rank.sh`)**: `(cu_total + cu_interf) * log(1+roams) / (clients+1)` — high =
+this radio burns airtime in a contended band AND its clients are mobile/redundant ⇒ shrink or
+disable. Hint: **DISABLE** when high roam + near-zero clients; **POWER-DOWN** when busy but holds
+some unique load.
+- **v2 (greedy, coverage-safe)**: iteratively disable the top candidate, recompute overlap so we
+never open a hole (stop when a radio's clients would lose their only good-RSSI neighbor), and add
+floorplan-distance edges once APs are placed. Output = ranked per-AP-per-radio actions with the
+metric that justified each, applied per zone with before/after validation.
 
-## Prerequisites to build it (the real next step)
-1. **Wire Plane 2** — provision a dedicated **read-only UniFi admin or Network integration API key**
-on `.29` (doable with our root SSH), vault as `infrastructure/uos-server-network-api`. Gives
-`stat/device` (live neighbor RSSI, `cu_total`, `num_sta`, satisfaction) + `stat/sta` (client→AP).
-2. **Stand up a collector** — a periodic job (cron on `.30`/a fleet host) snapshotting
-`stat/device` + `stat/sta` into a small store (sqlite/postgres). The **overlap + RF matrix
-accrue over the collection window** (a week+ gives a usable model; longer = better). This is the
-"historical look at devices connected" Mike asked for — the controller doesn't retain it, so we
-accumulate it ourselves.
-3. **Build the model** on the accumulated data; seed early recommendations from the Mongo config +
-AP-name topology prior until enough live data exists.
+## v1 result (Cascades 2.4GHz, 7d) — the smoking gun
+`model-rank.sh cascades 7 ng`: 2.4 radios run **cu_total 74–94%, cu_interf 61–81%, ~0.3–2.6 clients
+each** across 75 APs. Translation: 2.4 is saturated and mostly interference, serving almost no one —
+a textbook case to disable 2.4 on most APs and power down the rest. Run `na`/`6e` for the 5/6GHz
+picture (expected: keep, with 6GHz the clean capacity band).
 
 ## Status
-Phase 1 (config + foreign-interference audit) is built (`scripts/audit-site.sh`). The interference
-model is **blocked on Plane 2 + the collector** — needs a go to provision the UniFi API account and
-stand up the collector.
+- `scripts/audit-site.sh` — config + foreign-interference audit (Plane 1).
+- `scripts/model-rank.sh` — **v1 airtime-reduction ranker from real history** (this doc). Works now.
+- Next: v2 greedy coverage-safe optimizer + floorplan-distance edges (after APs are placed on the
+floorplan) + optional live-API before/after validation.
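Editor's sketch (not part of the commit): the v1 score quoted in the Model section, `(cu_total + cu_interf) * log(1+roams) / (clients+1)`, can be tried on plain numbers without a controller. The formula and the hint thresholds (more than 50 roams with fewer than 3 average clients, or `cu + interf` above 40) are taken from `model-rank.sh` in this commit; the function names `v1Score` and `hintFor` and the three sample radios are hypothetical.

```javascript
// Sketch only: v1 ranking heuristic on plain objects, runnable in Node.js.
// Formula and thresholds mirror model-rank.sh; names and sample values are invented.
function v1Score(avgCu, avgIntf, roams, avgClients) {
  // airtime pressure, weighted by roam redundancy, discounted by client load
  return (avgCu + avgIntf) * Math.log(1 + roams) / (1 + avgClients);
}

function hintFor(r) {
  if (r.roam > 50 && r.sta < 3) return 'DISABLE (redundant, low load)';
  if (r.cu + r.intf > 40) return 'POWER-DOWN (busy)';
  return 'review';
}

// Hypothetical per-radio averages over a history window.
const radios = [
  { name: 'C', cu: 15, intf: 5, sta: 4, roam: 10 },
  { name: 'A', cu: 90, intf: 75, sta: 0.5, roam: 200 },
  { name: 'B', cu: 60, intf: 30, sta: 8, roam: 20 },
];

const ranked = radios
  .map((r) => Object.assign({}, r, { score: v1Score(r.cu, r.intf, r.roam, r.sta), hint: hintFor(r) }))
  .sort((a, b) => b.score - a.score);

for (const r of ranked) {
  console.log(r.name, r.score.toFixed(1), r.hint);
}
```

One property worth noticing: a radio with zero recorded roams scores 0 regardless of how saturated it is, because `log(1 + 0)` is 0. Under this heuristic such a radio is never proposed for shrinking, which is conservative when there is no evidence that its clients have another AP to land on.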
63
.claude/skills/unifi-wifi/scripts/model-rank.sh
Normal file
@@ -0,0 +1,63 @@
+#!/usr/bin/env bash
+# model-rank.sh — rank AP radios as airtime-reduction candidates from ACCUMULATED history.
+# Data (all from ace_stat, already collected by the controller — no new collector needed):
+# - airtime/interference: stat_hourly (o:'ap') <band>-cu_total, <band>-cu_interf, <band>-num_sta
+# - coverage overlap: wifi_connectivity_event (client roams between AP pairs)
+# Per band (ng/na/6e). A radio is a strong DISABLE candidate when it carries high interference
+# airtime AND its clients heavily roam to other APs (redundant coverage); a POWER-DOWN candidate
+# when busy/interfering but with less roam redundancy. This is a v1 ranker, not the final greedy
+# optimizer — see references/interference-model.md.
+#
+# Usage: bash .claude/skills/unifi-wifi/scripts/model-rank.sh <site-name|site_id> [days=7] [band=ng|na|6e|all]
+set -euo pipefail
+REPO="$(git rev-parse --show-toplevel 2>/dev/null || echo .)"
+UOS="$REPO/.claude/scripts/uos-mongo.sh"
+arg="${1:?usage: model-rank.sh <site> [days] [band]}"; DAYS="${2:-7}"; BAND="${3:-ng}"
+if [[ "$arg" =~ ^[0-9a-f]{24}$ ]]; then SITE="$arg"; else
+  SITE="$(bash "$UOS" --sites 2>/dev/null | grep -vi 'pq.html' | grep -i "$arg" | awk '{print $1}' | head -1 || true)"
+  [ -n "$SITE" ] || { echo "[ERROR] no site matching '$arg'"; exit 1; }
+fi
+echo "[INFO] site=$SITE window=${DAYS}d band=$BAND"
+
+cat <<JS | bash "$UOS" 2>&1 | grep -viE 'pq.html|post-quantum|store now|server may need'
+var SITE='$SITE', DAYS=$DAYS, BAND='$BAND';
+var ace=db.getSiblingDB('ace'), st=db.getSiblingDB('ace_stat');
+var since = new Date().getTime() - DAYS*86400000;
+var bands = (BAND=='all') ? ['ng','na','6e'] : [BAND];
+// ap mac -> name
+var name={}; ace.device.find({site_id:SITE,type:'uap'},{mac:1,name:1}).forEach(function(a){name[a.mac]=a.name||a.mac;});
+// roam volume per AP (coverage-overlap proxy: high roam = redundant neighbors exist)
+var roam={};
+st.wifi_connectivity_event.find({site_id:SITE, time:{\$gte:since}},{from_endpoint:1,to_endpoint:1}).forEach(function(e){
+  [e.from_endpoint&&e.from_endpoint.mac, e.to_endpoint&&e.to_endpoint.mac].forEach(function(m){ if(m) roam[m]=(roam[m]||0)+1; });
+});
+// airtime profile per AP per band from stat_hourly
+var prof={};
+st.stat_hourly.find({o:'ap', site_id:SITE, time:{\$gte:since}}).forEach(function(d){
+  var ap=d.ap; if(!ap) return; if(!prof[ap])prof[ap]={};
+  bands.forEach(function(b){
+    var cu=d[b+'-cu_total'], intf=d[b+'-cu_interf'], sta=d[b+'-num_sta'];
+    if(cu==null && intf==null) return;
+    if(!prof[ap][b])prof[ap][b]={cu:0,intf:0,sta:0,n:0};
+    var p=prof[ap][b]; p.cu+=(cu||0); p.intf+=(intf||0); p.sta+=(sta||0); p.n++;
+  });
+});
+bands.forEach(function(b){
+  var rows=[];
+  for(var ap in prof){ var p=prof[ap][b]; if(!p||!p.n) continue;
+    var avgCu=p.cu/p.n, avgIntf=p.intf/p.n, avgSta=p.sta/p.n, rm=roam[ap]||0;
+    // score: airtime pressure (cu+interf) weighted, * redundancy(roam) / (load+1)
+    var score = (avgCu + avgIntf) * Math.log(1+rm) / (1+avgSta);
+    rows.push({ap:ap, name:name[ap]||ap, cu:avgCu, intf:avgIntf, sta:avgSta, roam:rm, score:score});
+  }
+  rows.sort(function(a,b){return b.score-a.score;});
+  print("\\n==== band="+b+" top airtime-reduction candidates (disable/power-down) ====");
+  print(" rank AP cu% interf% ~clients roams score hint");
+  rows.slice(0,15).forEach(function(r,i){
+    var hint = (r.roam>50 && r.sta<3) ? "DISABLE (redundant, low load)" : (r.cu+r.intf>40 ? "POWER-DOWN (busy)" : "review");
+    print(" "+(i+1)+"\\t"+(r.name.substring(0,24)+" ").substring(0,24)+" "+r.cu.toFixed(0)+"\\t"+r.intf.toFixed(0)+"\\t"+r.sta.toFixed(1)+"\\t"+r.roam+"\\t"+r.score.toFixed(1)+"\\t"+hint);
+  });
+  print(" (APs profiled on "+b+": "+rows.length+")");
+});
+print("\\n[note] v1 heuristic: score = (cu_total + cu_interf) * log(1+roams) / (clients+1). High = busy+interfered AND clients have somewhere else to roam = safe to shrink/disable. Validate per-zone before applying. Full greedy coverage-safe optimizer = v2 (interference-model.md).");
+JS
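Editor's sketch (not part of the commit): the script above counts roam volume per AP, while the model doc also calls for roam volume per AP pair. A minimal way to derive undirected pair weights from `wifi_connectivity_event`-style documents is sketched below on plain objects in Node.js. The function name `roamEdges`, the choice to skip events marked `successful: false`, and the sample events are assumptions; only the `from_endpoint.mac`, `to_endpoint.mac` and `successful` field names come from the diff.

```javascript
// Sketch only: count roams per undirected AP pair from wifi_connectivity_event-style docs.
// Assumption: failed roams (successful === false) are ignored; the commit does not specify this.
function roamEdges(events) {
  const edges = new Map();
  for (const e of events) {
    if (e.successful === false) continue;
    const a = e.from_endpoint && e.from_endpoint.mac;
    const b = e.to_endpoint && e.to_endpoint.mac;
    if (!a || !b || a === b) continue; // need two distinct APs to form an edge
    const key = a < b ? a + '|' + b : b + '|' + a; // same key for A->B and B->A
    edges.set(key, (edges.get(key) || 0) + 1);
  }
  return edges;
}

// Invented sample events using short placeholder macs.
const sampleEvents = [
  { from_endpoint: { mac: 'ap-a' }, to_endpoint: { mac: 'ap-b' }, successful: true },
  { from_endpoint: { mac: 'ap-b' }, to_endpoint: { mac: 'ap-a' }, successful: true },
  { from_endpoint: { mac: 'ap-a' }, to_endpoint: { mac: 'ap-c' }, successful: true },
  { from_endpoint: { mac: 'ap-a' }, to_endpoint: { mac: 'ap-b' }, successful: false },
  { from_endpoint: { mac: 'ap-c' } }, // no destination AP: skipped
];

const edges = roamEdges(sampleEvents);
for (const [pair, count] of edges) {
  console.log(pair, count);
}
```

Pair weights like these are what a coverage-safe pass would need in order to ask which specific neighbor absorbs a radio's clients, something the per-AP total cannot answer.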