sync: auto-sync from HOWARD-HOME at 2026-06-18 12:31:06
Author: Howard Enos Machine: HOWARD-HOME Timestamp: 2026-06-18 12:31:06
@@ -0,0 +1,139 @@
# Cascades: Voice VLAN 30 live migration (all Poly + desktop) + network-logging plan

## User

- **User:** Howard Enos (howard)
- **Machine:** Howard-Home
- **Role:** tech

## Session Summary

Continuation of the 2026-06-17 VOICE VLAN 30 build (see `2026-06-17-howard-voice-vlan30-build.md`).
This session executed the live device migration onto the isolated VLAN 30 and produced a spec for
network observability. Work spanned the 06-17 -> 06-18 date boundary.

First, the Vertical-Remote management desktop was moved. Howard set the USW-16-PoE port 16 native VLAN
to VOICE, but the desktop kept its old `192.168.2.180` lease. Diagnosis (pfSense + UniFi
controller) showed nothing misconfigured: re-VLANing a wired port does not bounce the NIC link, so
Windows held its old lease, and its unicast renewal to the old DHCP server was (correctly) blocked by
the VOICE isolation rules. A UniFi client block/unblock is a MAC filter, not a link bounce, so it
had no effect. Fixed by bouncing port 16 via the controller API (PUT `rest/device` with the port 16
`port_overrides` entry set to `forward:disabled`, then restored, leaving ports 1-8 untouched); the
desktop then re-DHCP'd to `10.0.30.201`.
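The bounce described above amounts to sending two `port_overrides` payloads: one with port 16 disabled, and one identical to the original. A minimal sketch of building that pair is below. The helper name `build_bounce_payloads` and the `port_idx` / `native_networkconf_id` field names are assumptions for illustration; only `port_overrides`, `forward:disabled`, and `forward:customize` appear in this log. No HTTP call is made here.

```python
import copy


def build_bounce_payloads(port_overrides, port_idx):
    """Return (disabled_payload, restore_payload) for bouncing one switch port.

    port_overrides is the list currently stored on the device. It is never
    mutated, so the restore payload is an exact copy of the original state.
    Field names ("port_idx", "forward") are assumptions, not confirmed API.
    """
    restore = copy.deepcopy(port_overrides)
    disabled = copy.deepcopy(port_overrides)

    found = False
    for entry in disabled:
        if entry.get("port_idx") == port_idx:
            entry["forward"] = "disabled"
            found = True
    if not found:
        # The port had no override yet, so add one only to the "down" payload.
        disabled.append({"port_idx": port_idx, "forward": "disabled"})

    return {"port_overrides": disabled}, {"port_overrides": restore}


# Illustrative data: every other port's entry is carried through unchanged.
current = [
    {"port_idx": 1, "forward": "customize"},
    {"port_idx": 16, "forward": "customize",
     "native_networkconf_id": "6a32e0194e709ad31ad161e6"},
]
down, up = build_bounce_payloads(current, 16)
```

Sending the full list both times is the point of the design: a PUT that carried only the port 16 entry could drop the overrides on the other ports.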

Second, Howard re-keyed the 22 Poly WiFi phones to the voice PPSK over ~2 hours. As each phone
joined, the controller `/stat/sta` endpoint was polled to map the new 10.0.30.x lease to the phone's
location/owner. A WiFi re-auth triggers a fresh DHCP exchange, so the Poly phones needed no bounce. The
first phone (Lauren Hasselman, Accounting Director) was validated end-to-end: dial tone + an
outbound call to a cell phone. All 22 Poly phones plus the desktop (23 devices) ended up on VOICE,
each pulling a clean lease and isolated from PHI/LAN/VLAN20/mgmt. A living inventory doc was created
(`docs/network/voice-phone-inventory.md`) and the wiki Voice-VLAN entry flipped PLANNED -> IN
PROGRESS.
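The polling step can be sketched as a filter over an already-fetched `/stat/sta` response. This assumes the response is a JSON object with a `data` list whose entries carry `mac`, `ip`, `network`, `vlan`, `ap_mac`, and `hostname` fields. Those field names and the `voice_clients` helper are assumptions for illustration; the filter itself (`network==VOICE or vlan==30`) is the one used in this session. Results are keyed by MAC, since leases are dynamic.

```python
def voice_clients(sta_response):
    """Map MAC -> lease details for every client on the VOICE network.

    sta_response is the parsed JSON body of a /stat/sta call. Field names
    are assumptions and may differ between controller versions.
    """
    clients = {}
    for row in sta_response.get("data", []):
        if row.get("network") == "VOICE" or row.get("vlan") == 30:
            clients[row["mac"].lower()] = {
                "ip": row.get("ip"),
                "ap_mac": row.get("ap_mac"),
                "hostname": row.get("hostname"),
            }
    return clients


# Illustrative response: one VOICE phone and one unrelated (made-up) LAN client.
sample = {
    "data": [
        {"mac": "48:25:67:64:95:6B", "ip": "10.0.30.220", "network": "VOICE"},
        {"mac": "aa:bb:cc:00:00:01", "ip": "192.168.1.50", "network": "LAN"},
    ]
}
on_voice = voice_clients(sample)
```

Joining this mapping against the inventory by MAC gives the location/owner for each new lease.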

Third, Howard raised the need for network logging to track devices that drop/get kicked and to
root-cause the ongoing Cascades network issues. Investigation found the UniFi controller had
retained ZERO client events/alarms for the Cascades site over the past 7 days, and pfSense logs
locally in tiny circular buffers, i.e., drop/kick history is not being captured at all. A "plan only"
spec was written (`docs/network/network-logging-plan.md`) recommending the Synology cascadesDS (DSM
Log Center syslog server) as the on-site collector (CS-SERVER ruled out as the fragile EOL DC), with
pfSense + UniFi/AP syslog as sources and a 1-2 min client snapshotter to fill the controller's
history gap.
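The core of a client snapshotter like the one proposed above is a comparison between two consecutive polls. A minimal sketch follows, assuming each snapshot is a dict of MAC -> IP built from the controller's client list. The `diff_snapshots` name and the snapshot shape are hypothetical; the actual design lives in `network-logging-plan.md` and may differ.

```python
def diff_snapshots(prev, curr):
    """Compare two MAC -> IP snapshots taken 1-2 minutes apart.

    Returns the MACs that disappeared, the MACs that appeared, and the MACs
    present in both polls whose IP changed. Lists are sorted for stable output.
    """
    prev_macs, curr_macs = set(prev), set(curr)
    return {
        "dropped": sorted(prev_macs - curr_macs),
        "joined": sorted(curr_macs - prev_macs),
        "ip_changed": sorted(
            mac for mac in prev_macs & curr_macs if prev[mac] != curr[mac]
        ),
    }


# Illustrative snapshots with made-up MACs.
before = {"aa:bb:cc:00:00:01": "10.0.30.202", "aa:bb:cc:00:00:02": "10.0.30.203"}
after = {"aa:bb:cc:00:00:02": "10.0.30.210", "aa:bb:cc:00:00:03": "10.0.30.204"}
changes = diff_snapshots(before, after)
# changes["dropped"]    -> ["aa:bb:cc:00:00:01"]
# changes["joined"]     -> ["aa:bb:cc:00:00:03"]
# changes["ip_changed"] -> ["aa:bb:cc:00:00:02"]
```

Writing each non-empty diff with a timestamp to the collector would give the drop/kick timeline that the controller is not retaining.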

Finally, a sync hit a rebase conflict because controller-query scratch files written to the repo
CWD (`.sta.json` etc.) were swept into a commit by `git add -A`, and a stray `curl.exe` process held
a lock on the file. Killed the process, untracked `.sta.json`, gitignored the temp patterns, and
pushed clean.
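One way to keep scratch output from being swept into a commit is to write it outside the working tree entirely. The sketch below uses Python's own temp directory, so the path is native to the interpreter that later reads it. The `write_scratch` helper and the `unifi-scratch` folder name are hypothetical, not part of the existing tooling.

```python
import json
import tempfile
from pathlib import Path


def write_scratch(name, payload, scratch_dir=None):
    """Write a JSON scratch file outside the repo and return its path.

    Defaults to a folder under the OS temp directory as resolved by Python,
    so nothing lands in the repo CWD where `git add -A` would pick it up.
    """
    if scratch_dir:
        base = Path(scratch_dir)
    else:
        base = Path(tempfile.gettempdir()) / "unifi-scratch"
    base.mkdir(parents=True, exist_ok=True)
    path = base / name
    path.write_text(json.dumps(payload), encoding="utf-8")
    return path


# Example: persist a client-poll result, then read it back from the same path.
saved = write_scratch("sta.json", {"data": []})
loaded = json.loads(saved.read_text(encoding="utf-8"))
```

Because the same interpreter resolves the path for both the write and the read, this avoids depending on how a shell maps `/tmp`.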

## Key Decisions

- **Desktop cutover via port bounce, not NIC change.** Confirmed desktop is DHCP; the fix for a
  stuck lease after re-VLAN is a link bounce (port disable/enable or PoE power-cycle), not a NIC
  reconfig and not a UniFi client block/unblock.
- **Read drop/kick state from the UniFi controller, not pfSense SSH**, after pfSense sshd began
  rate-limiting following many rapid SSH calls. Controller API (`/stat/sta`) was the healthy path
  and also gives AP/location hints.
- **Track phones by MAC + location, not IP** (leases are dynamic; a phone may renew to a different
  10.0.30.x).
- **Network-logging collector = Synology Log Center, NOT CS-SERVER.** CS-SERVER is the fragile
  EOL/degraded-RAID single DC; adding syslog ingestion to it is unacceptable. pfSense/UniFi are
  sources, not retention/search stores. Synology keeps it on-site and off the DC.
- **Plan only for logging** (per Howard): spec written, build scheduled later.

## Problems Encountered

- **Desktop stuck on 192.168.2.180 after port moved to VLAN 30:** stale DHCP lease; renewal
  blocked by VOICE isolation. Resolved by bouncing port 16 via controller API -> re-DHCP to
  10.0.30.201.
- **UniFi controller PUT returned HTTP 403:** UniFi OS requires a CSRF token on writes. Resolved by
  reading `x-updated-csrf-token` from the login response headers and sending it as `X-CSRF-Token`.
- **pfSense SSH began failing (exit 255)** while ping still succeeded: sshd rate-limiting after many
  rapid `pfsense-ssh.sh` calls. Switched to the UniFi controller API for subsequent reads.
- **Git-Bash `/tmp` path mismatch:** msys `curl -o /tmp/x.json` wrote to a location Windows python
  could not read (FileNotFoundError). Switched to CWD-relative scratch files.
- **Scratch files committed + rebase blocked:** CWD-relative `.sta.json` got swept into a commit by
  sync's `git add -A`, and a stray locked `curl.exe` (PID 25252) held the file, blocking the rebase.
  Killed the process, `git rm --cached .sta.json`, gitignored `.sta.json`/`.dev.json`/`.q*`/etc.,
  committed, and pushed. (Lesson: write API scratch OUTSIDE the repo or use the ignored `.tmp-` prefix.)
- **Earlier errorlog rebase conflict** (concurrent GURU-5070 entry): resolved keeping both entries.
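The 403 fix above comes down to carrying one header value from the login response into every write. A minimal sketch of that step is below; it takes the login response headers as a plain mapping and makes no network call. The `csrf_write_headers` name is hypothetical; the two header names are the ones recorded in this log.

```python
def csrf_write_headers(login_headers):
    """Build headers for a controller write from the login response headers.

    Header names are matched case-insensitively, since HTTP clients differ
    in how they normalise them. Raises if the token header is absent.
    """
    lowered = {key.lower(): value for key, value in login_headers.items()}
    token = lowered.get("x-updated-csrf-token")
    if not token:
        raise ValueError("login response carried no x-updated-csrf-token header")
    return {"X-CSRF-Token": token, "Content-Type": "application/json"}


# Illustrative login headers with a made-up token value.
headers = csrf_write_headers({"X-Updated-CSRF-Token": "example-token"})
# headers -> {"X-CSRF-Token": "example-token", "Content-Type": "application/json"}
```

Failing loudly when the token is missing is deliberate: a write sent without it would only come back as another 403.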

## Configuration Changes

- **Created** `clients/cascades-tucson/docs/network/voice-phone-inventory.md`: living inventory, 23
  devices on VOICE (desktop + 22 Poly) with MAC/IP/location.
- **Created** `clients/cascades-tucson/docs/network/network-logging-plan.md`: observability spec
  (build later).
- **Updated** `clients/cascades-tucson/docs/network/voice-vlan-cutover.md`: added the
  bounce-to-re-DHCP CRITICAL step; fixed stale NIC-change/OpenVPN/reservation references.
- **Updated** `wiki/clients/cascades-tucson.md`: Voice VLAN PLANNED -> IN PROGRESS, two locations.
- **Updated** `.claude/memory/MEMORY.md` + created `project_cascades_isolated_vlan_pattern.md` (prior
  session, synced).
- **Updated** `.gitignore`: ignore controller-query scratch patterns; `git rm --cached .sta.json`.
- **pfSense (prior session, this thread):** VOICE rule protocol TCP -> Any via PHP config API.
- **UniFi:** bounced USW-16-PoE port 16 (disable/restore) via controller API; temporary, restored
  to exact original (native VOICE, forward:customize).

## Credentials & Secrets

- **VOICE PPSK key** `V0!c38863171` — vault `clients/cascades-tucson/wifi-voice-ppsk.sops.yaml`
(created prior session, pushed). Entered on all 22 Poly phones this session.
- UniFi controller RW: vault `infrastructure/uos-server-network-api-rw` (used for reads + the port
bounce). pfSense admin: vault `clients/cascades-tucson/pfsense-firewall`. Synology: vault
`clients/cascades-tucson/synology-cascadesds`.

## Infrastructure & Servers

- **VOICE VLAN 30:** `10.0.30.0/24`, gw `10.0.30.1` (pfSense `igc1.30`/opt241), DHCP `.100-.250`,
  DNS `8.8.8.8/1.1.1.1`. Isolation = Guest-clone (any-proto quick blocks to 192.168.0.0/22 +
  10.0.0.0/8 + 172.16.0.0/12, then pass any).
- **UniFi VOICE network** id `6a32e0194e709ad31ad161e6` (VLAN Only). USW-16-PoE mac
  `d8:b3:70:21:94:5f`, device_id `685f39078e65331c46ef7e90`. UOS `172.16.3.29:11443`, site
  `va6iba3v`.
- **Synology cascadesDS** `192.168.0.120` (DSM up on :5001): proposed logging collector.
- **Jupiter** `172.16.3.20` (Unraid/Docker, hosts UniFi VM): fallback collector host.
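The VOICE isolation rule order listed above (three quick blocks, then pass any) can be modelled with the standard `ipaddress` module to sanity-check which destinations a VOICE client can reach. This is a simplified sketch that matches on destination address only. It does not model pfSense state tracking, same-subnet traffic that never reaches the firewall, or any rules not listed in this log. The `voice_verdict` name is hypothetical.

```python
import ipaddress

# Block destinations in the order given in the VOICE isolation description.
BLOCKED_NETS = [
    ipaddress.ip_network("192.168.0.0/22"),
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
]


def voice_verdict(dst):
    """Return the first matching action for traffic from VOICE to dst."""
    addr = ipaddress.ip_address(dst)
    for net in BLOCKED_NETS:
        if addr in net:
            # "quick" semantics: the first matching block rule wins.
            return f"block ({net})"
    return "pass"


print(voice_verdict("192.168.0.120"))  # block (192.168.0.0/22)
print(voice_verdict("172.16.3.29"))    # block (172.16.0.0/12)
print(voice_verdict("8.8.8.8"))        # pass
```

Under this model, any collector on the 192.168.0.0/22 or 172.16.0.0/12 side is unreachable from VOICE clients, which is the intended isolation.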

## Commands & Outputs

- Controller client poll (mapping phones): `POST /api/auth/login` -> GET
  `/proxy/network/api/s/va6iba3v/stat/sta`, filter `network==VOICE or vlan==30`.
- Port bounce: `PUT /proxy/network/api/s/va6iba3v/rest/device/<id>` body
  `{"port_overrides":[... port16 forward:disabled ...]}` with `X-CSRF-Token`, then restore.
- Drop/kick history check: `POST .../stat/event {"within":168,"_limit":5000}` and `.../stat/alarm`
  both returned **0** records for Cascades -> controller not retaining client history.
- Result: 23 devices on VOICE (`10.0.30.201` desktop + `.202`-`.223` the 22 Poly).

## Pending / Incomplete Tasks

- **8 wired AudioCodes** (USW-16-PoE ports 1-8): flip each port -> VOICE + **PoE power-cycle** each
  to re-DHCP. Not yet done.
- **Christine's last name** (room 515, `10.0.30.220`, mac `48:25:67:64:95:6b`): flagged VERIFY in
  the inventory (Howard unsure; "~Nyuda").
- **Network logging build:** execute `network-logging-plan.md` (step 1: confirm Synology
  model/RAM/DSM -> Log Center-only vs Container Manager Graylog/Loki).
- Confirm phones register to cloud PBX (assumed; dial-tone proven on one). Add the Part A 5b pinhole
  only if a phone fails to register.

## Reference Information

- Runbook: `clients/cascades-tucson/docs/network/voice-vlan-cutover.md`
- Inventory: `clients/cascades-tucson/docs/network/voice-phone-inventory.md`
- Logging plan: `clients/cascades-tucson/docs/network/network-logging-plan.md`
- Prior log: `clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-voice-vlan30-build.md`
- Memory: `.claude/memory/project_cascades_isolated_vlan_pattern.md`
- Vault PPSK: `clients/cascades-tucson/wifi-voice-ppsk.sops.yaml`