claudetools/.claude/memory/reference_pfsense_25_07_ops.md at dafcec5bcec2c3329ff451fb761820249e3ddddc


Howard Enos f36fb97eb8 sync: auto-sync from HOWARD-HOME at 2026-06-17 22:46:27

Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-17 22:46:27

2026-06-17 22:46:37 -07:00


name: reference_pfsense_25_07_ops
description: pfSense Plus 25.07 operational quirks learned during the Cascades power-outage recovery: plain-text logs (NOT clog), clean dhcpd restart via pfSsh.php, reboot the upstream modem after a config restore, ZFS power-loss resilience
metadata:
  type: reference
Learned on the Cascades pfSense (192.168.0.1, Plus 25.07-RELEASE, ZFS) during the 2026-06-17 power-outage recovery. Access: bash .claude/skills/unifi-wifi/scripts/pfsense-ssh.sh cascades-tucson run "<cmd>" (admin SSH = real shell). Incident report: clients/cascades-tucson/reports/2026-06-17-power-outage-incident.md.

Logs are PLAIN TEXT (ASCII), not clog binary. clog /var/log/dhcpd.log returns EMPTY on 25.07 → do NOT conclude "logs are empty / service not logging." Read with tail/grep/cat directly; file /var/log/*.log confirms ASCII text. (Burned a whole hypothesis on this: the DHCP server was actually fine.)
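A minimal sketch of a guard that could be run before concluding anything about a log. The helper name log_has_text is hypothetical (not a pfSense tool); it only checks that the file is non-empty and contains at least one line grep treats as text, relying on the -I flag of GNU/BSD grep.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the log exists, is non-empty,
# and contains at least one line that grep considers text.
log_has_text() {
    log="$1"
    # -s: file exists and has a size greater than zero
    [ -s "$log" ] || { echo "empty or missing: $log" >&2; return 1; }
    # -I: treat binary files as non-matching; -q: no output, status only
    grep -Iq . "$log" || { echo "no text lines found (binary or blank): $log" >&2; return 2; }
}

# On the firewall (commented out so the sketch is safe to source anywhere):
# log_has_text /var/log/dhcpd.log && tail -n 50 /var/log/dhcpd.log
```

If the guard passes, reading the file with tail or grep is the right tool and an empty clog result can be ignored.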
Clean single-instance DHCP restart from shell: echo "services_dhcpd_configure();" | /usr/local/sbin/pfSsh.php (regenerates /var/dhcpd/etc/dhcpd.conf + restarts ONE dhcpd; kills duplicates). A power-loss/dirty boot can leave two dhcpd processes fighting → clients get DISCOVER→OFFER but never REQUEST/ACK. Verify: pgrep -f "dhcpd -user" | wc -l should be 1. Test config: dhcpd -t -cf /var/dhcpd/etc/dhcpd.conf.
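The instance check above can be wrapped so the result is unambiguous in a script. This is a sketch only: dhcpd_instance_verdict is a hypothetical helper, and the return codes are an arbitrary choice. It strips whitespace first because BSD wc -l pads its output with leading spaces.

```shell
#!/bin/sh
# Hypothetical helper: turn a dhcpd process count into a verdict.
dhcpd_instance_verdict() {
    # Strip whitespace: BSD wc -l pads the number with leading spaces.
    count=$(printf '%s' "$1" | tr -d '[:space:]')
    case "$count" in
        ''|*[!0-9]*) echo "invalid count: '$1'"; return 3 ;;
        0) echo "down: no dhcpd running"; return 1 ;;
        1) echo "ok: exactly one dhcpd" ;;
        *) echo "duplicate: $count dhcpd processes"; return 2 ;;
    esac
}

# On the firewall (commented out so the sketch is safe to source anywhere):
# dhcpd_instance_verdict "$(pgrep -f 'dhcpd -user' | wc -l)"
```

A "duplicate" verdict is the cue to run the services_dhcpd_configure() restart and then check again.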
pfSsh.php is SLOW to load (~20-40s). SSH commands that invoke it need a long timeout (50s+) or they time out mid-run and you can't tell whether the action took effect.
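One way to make the timeout explicit is to bound the call with the timeout utility and treat its exit status 124 as "outcome unknown". This is a sketch under assumptions: run_bounded is a hypothetical wrapper, the 90-second figure is an arbitrary margin above the 50s+ noted above, and whether pfsense-ssh.sh has its own timeout setting is not known here.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command under a deadline and report the result
# on stderr so the command's own stdout stays clean.
run_bounded() {
    limit="$1"; shift
    timeout "$limit" "$@"
    rc=$?
    case "$rc" in
        0)   echo "completed" >&2 ;;
        # timeout(1) exits with 124 when the deadline expires
        124) echo "timed out after ${limit}s: outcome unknown, verify before retrying" >&2 ;;
        *)   echo "failed with exit code $rc" >&2 ;;
    esac
    return "$rc"
}

# From the workstation (commented out; 90 is an assumed margin, not a measured value):
# run_bounded 90 bash .claude/skills/unifi-wifi/scripts/pfsense-ssh.sh cascades-tucson \
#     run 'echo "services_dhcpd_configure();" | /usr/local/sbin/pfSsh.php'
```

On a 124 result, check the dhcpd process count before retrying, since the restart may have completed on the firewall even though the SSH call was cut off.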
After a pfSense config restore/replace, REBOOT the upstream modem (Cox at Cascades) to re-sync the WAN — skipping this prolongs post-restore issues. Add to any restore runbook.
ZFS root is power-loss resistant: zpool status -x → "all pools are healthy"; config.xml survived an unclean power-off intact. An HTTP 50x error on the GUI right after a dirty boot is usually transient (services still starting).
DHCP "offers but never completes" on ONE segment/switch = asymmetric L2 forwarding (DISCOVER reaches pfSense + OFFER sent on the right iface/subnet, but REQUEST=0/ACK=0 because the reply doesn't reach the client). Root cause is the switch (re-adopted with stale forwarding/bad port profile), NOT pfSense — fix = reset/re-adopt that switch. See reference_cascades_fr_gpo_fix for other Cascades infra notes.
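The REQUEST=0/ACK=0 pattern can be spotted per interface straight from the plain-text log. The sketch below assumes the usual ISC dhcpd message lines, where the message type appears as its own word and the receiving interface follows the word "via"; dhcp_counts_by_iface is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count DHCP handshake stages per interface in a dhcpd log.
dhcp_counts_by_iface() {
    awk '
    {
        type = ""; iface = ""
        for (i = 1; i <= NF; i++) {
            # Message type is a standalone word, e.g. DHCPDISCOVER
            if ($i ~ /^DHCP(DISCOVER|OFFER|REQUEST|ACK)$/) type = $i
            # The interface (or relay address) is the word after "via"
            if ($i == "via" && i < NF) iface = $(i + 1)
        }
        if (type != "" && iface != "") { seen[iface] = 1; n[iface, type]++ }
    }
    END {
        for (f in seen)
            printf "%s DISCOVER=%d OFFER=%d REQUEST=%d ACK=%d\n", f,
                n[f, "DHCPDISCOVER"], n[f, "DHCPOFFER"],
                n[f, "DHCPREQUEST"], n[f, "DHCPACK"]
    }' "$1" | sort
}

# On the firewall (commented out so the sketch is safe to source anywhere):
# dhcp_counts_by_iface /var/log/dhcpd.log
```

An interface showing DISCOVER and OFFER counts but zero REQUEST and ACK matches the asymmetric-forwarding symptom. When the affected switch shares an interface with healthy ones the counts blur together, so grouping by client MAC would be the next refinement.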
