From 373883fb4889b83cd4eae46c70b41cd2b4c38d7f Mon Sep 17 00:00:00 2001
From: Mike Swanson
Date: Tue, 23 Jun 2026 21:04:04 -0700
Subject: [PATCH] sync: auto-sync from GURU-5070 at 2026-06-23 21:03:04

Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-23 21:03:04
---
 ...ke-laptopninja-debloat-vpn-fix-fail2ban.md | 77 +++++++++++++++++++
 clients/grabb-durando/vpn-fail2ban.ps1        | 25 ++++++
 errorlog.md                                   |  6 ++
 3 files changed, 108 insertions(+)
 create mode 100644 clients/grabb-durando/session-logs/2026-06/2026-06-23-mike-laptopninja-debloat-vpn-fix-fail2ban.md
 create mode 100644 clients/grabb-durando/vpn-fail2ban.ps1

diff --git a/clients/grabb-durando/session-logs/2026-06/2026-06-23-mike-laptopninja-debloat-vpn-fix-fail2ban.md b/clients/grabb-durando/session-logs/2026-06/2026-06-23-mike-laptopninja-debloat-vpn-fix-fail2ban.md
new file mode 100644
index 00000000..96672698
--- /dev/null
+++ b/clients/grabb-durando/session-logs/2026-06/2026-06-23-mike-laptopninja-debloat-vpn-fix-fail2ban.md
@@ -0,0 +1,77 @@
+## User
+- **User:** Mike Swanson (mike)
+- **Machine:** GURU-5070
+- **Role:** admin
+
+## Session Summary
+
+New-asset provisioning + remote-access work for Grabb & Durando's user Jeff Williams (laptop LapTopNinja, Workgroup, ARM64 ASUS Copilot+, at the "Jeff's House" site). All work via GuruRMM (LapTopNinja agent ccb55043; GND-SERVER agent cd086074).
+
+De-bloated LapTopNinja: killed + fully removed the "PC App Store" PUP (Fast Corporation LTD — processes, a "Watchdog" persistence helper, service, folder, registry, autostart; its interactive uninstaller hung under -Wait so removal was done manually). Removed ASUS OEM bloat via silent MSI uninstall (Virtual Assistant, GlideX, StoryCube) + Virtual Pet (run-key/folder) + leftover ASUS folders (AdobePromotion, GlideX/StoryCube residue, AsLogDumpTool). Removed new Outlook (Microsoft.OutlookForWindows, all-users). Left AsusScreenXpert (ScreenPad hardware dependency — pending Mike's confirm) and the Microsoft-signed promo-looking Appx (benign placeholders).
+
+Set up remote access. Investigation showed GND-SERVER (Windows Server 2019, DC for gd.local, 192.168.242.200) runs a Windows RRAS VPN with SSTP still exposed on 443 (PPTP + L2TP were correctly closed at the UniFi). SSTP uses a valid public Let's Encrypt cert (vpn.grabblaw.com). Recommended reusing SSTP over installing OpenVPN (same security posture, zero install, on the same DC). Enabled jwilliams' dial-in (msNPAllowDialin) and created the SSTP client profile on LapTopNinja.
+
+VPN then immediately disconnected after sign-in (reason code 829). Deep diagnosis: it authenticated, got a pool IP, fully connected, then RRAS tore it down ~2s later — reproducible for ALL users and a minimal profile (reproduced with the gd.local Administrator break-glass cred via rasdial). Ruled out cert (SstpSvc hash matched live 443 cert), IP-pool conflict (pool free), auth, and subnet overlap. Root cause: the RRAS RemoteAccess service was in a stale state; **restarting RemoteAccess fixed it** — the real "Grabb & Durando Office" profile then stayed connected (tunnel IP 192.168.242.251, office reachable).
+
+Hardened against the SSTP brute-force seen in the logs (user1/test/vpn/admin from Contabo/abuse VPS ranges): deployed a PowerShell fail2ban on GND-SERVER (scheduled task, auto-blocks repeat offenders via Windows Firewall) AND blocked the two attacker /24s at the UGW3 edge (after fixing the gw-control block-ips bug that made the rule proto=gre).
+
+Diagnosed application crashes on LapTopNinja (SystemSettings/TabTip/Explorer-Taskbar) as build-level shell bugs from the 26H1 / build-28000 preview Windows train the Copilot+ ARM laptop is on (not de-bloat, not corruption — store clean, SFC + Settings re-register done, crashes persisted). Added an RDP desktop shortcut to 192.168.242.61. Decision: jwill is travelling; the machine is functional as-is, and it will be **reinstalled to stable retail Win11 24H2 when he returns** (coord todo 24eb7cb8).
+
+## Key Decisions
+
+- **Reuse SSTP, not install OpenVPN** — SSTP is TLS-over-443 with a public LE cert (same security as Mike's OpenVPN goal), already working, zero install; OpenVPN would add attack surface on the same DC for no gain.
+- **Fix VPN by restarting RemoteAccess** — after isolating the drop to a server-side, all-users, all-profiles teardown (not cert/IP/auth/profile), a service restart cleared the stale RRAS state.
+- **Defense-in-depth brute-force block** — fail2ban on the server (catches new individual offenders) + edge /24 drop on the UGW3 (keeps the abuse ranges off the DC entirely).
+- **Did NOT remove AsusScreenXpert** — it drives the ScreenPad secondary display; removing it blind could break hardware. Left the Microsoft-signed odd-named Appx alone (benign).
+- **Defer LapTopNinja reinstall** — crashes are cosmetic/self-recovering build bugs; reinstall to stable 24H2 when jwill returns rather than disrupt his travel.
+- **RDP shortcut on Public Desktop** — shows for jwill regardless of OneDrive desktop redirection.
+
+## Problems Encountered
+
+- **PC App Store interactive uninstaller hung** under Start-Process -Wait (no /S support as SYSTEM) -> manual removal; a "Watchdog" process held the folder open until killed.
+- **VPN connect-then-drop (829):** stale RRAS service; fixed by restarting RemoteAccess. 829 is non-standard (net helpmsg blank); diagnosis required reproducing via rasdial + isolating client vs server.
+- **gw-control block-ips created a proto=gre drop rule** (cloned the disabled PPTP GRE rule's schema) -> ineffective; fixed via controller REST PUT setting protocol=all. Logged as a skill bug.
+- **vault get-field returned a wrong 4-char value** for bare `password` (recurring resolution bug) -> 401; fixed by using `credentials.password`. Logged (cites the 2026-06-22 gitea entry).
+- **UniFi controller REST 401/403:** REST path needs the site SHORT name (resolve via /api/self/sites), not the _id; and the CSRF token must be read case-insensitively (resp.headers.get, not dict()). Both logged.
+
+## Configuration Changes
+
+- **LapTopNinja (ccb55043):** removed PC App Store + ASUS bloat + new Outlook; created SSTP VPN profile "Grabb & Durando Office"; re-registered immersivecontrolpanel + SFC; added RDP shortcut `C:\Users\Public\Desktop\Office PC (192.168.242.61).rdp`.
+- **GND-SERVER (cd086074):** enabled `jwilliams` msNPAllowDialin=$true; restarted RemoteAccess service (VPN fix); deployed `C:\Scripts\vpn-fail2ban.ps1` + scheduled task "VPN Fail2Ban" (every 10 min, SYSTEM) + Windows Firewall rule `Fail2Ban-VPN-Block` (seeded 161.97.182.206, 185.141.216.32). (Administrator dial-in was temporarily toggled for testing and reverted.)
+- **UniFi UGW3 (Grabb and Durando site, short ui1n7yfh):** address-group `gw-control blocklist` (161.97.182.0/24, 185.141.216.0/24) + WAN_IN drop rule `Block gw-control blocklist` (protocol corrected gre->all).
+- **Repo:** `clients/grabb-durando/vpn-fail2ban.ps1` (version-controlled copy of the deployed script); this session log.
+
+## Credentials & Secrets
+
+- No new credentials created. Used existing vault: `clients/grabb-durando/gd-local-domain-admin.sops.yaml` (gd.local\Administrator break-glass — used for VPN rasdial repro; already noted as having transited the RMM command log) and `infrastructure/uos-server-network-api-rw.sops.yaml` (UOS controller admin `claudetools`, field `credentials.password`).
+- VPN auth = users' gd.local AD credentials (jwilliams enters his on first connect; profile has RememberCredential).
+
+## Infrastructure & Servers
+
+- **GND-SERVER:** Windows Server 2019, DC for `gd.local`, LAN 192.168.242.200, single NIC. RRAS VPN: SSTP (443), RAS dial-in pool 192.168.242.249-254 (server RAS iface .249). PPTP/L2TP closed.
+- **VPN endpoint:** `vpn.grabblaw.com` -> office WAN 174.76.185.203 (UGW3 forwards 80,443 -> .200). Let's Encrypt cert (win-acme, monthly renew; 3 certs present, current valid to 2026-09-20).
+- **LapTopNinja:** Workgroup, ARM64 ASUS Copilot+, Win11 build 28000.2340 (DisplayVersion 26H1, br_release — preview train). User profile `jwill` = AD `jwilliams` (Jeff Williams, Domain Admin). Agent ccb55043-b310-47df-afe3-2671c8ff113c.
+- **UniFi:** Grabb and Durando on the UOS controller 172.16.3.29:11443 (site _id 6080d597f91fdd010f7c7155, short `ui1n7yfh`), UGW3 gateway. NOT in the cloud Site Manager.
+- **RDP target:** 192.168.242.61 (office, over VPN).
+
+## Commands & Outputs
+
+- RMM auth: `eval "$(bash .claude/scripts/rmm-auth.sh)"` -> $TOKEN/$RMM. Dispatch via POST /api/agents//command.
+- VPN fix: `Restart-Service RemoteAccess -Force` on GND-SERVER -> tunnel stays up (validated: tunnel IP 192.168.242.251, Test-Connection 192.168.242.200 = True).
+- Edge block fix: controller login (172.16.3.29:11443) -> resolve short name via /api/self/sites -> PUT /proxy/network/api/s//rest/firewallrule/ protocol=all (CSRF via resp.headers.get, case-insensitive).
+- VPN profile: SSTP, MSChapv2, split-tunnel, route 192.168.242.0/24, DnsSuffix gd.local, RememberCredential.
+
+## Pending / Incomplete Tasks
+
+- **Reinstall LapTopNinja to stable Win11 24H2 when jwill returns** (coord todo 24eb7cb8) — then redo de-bloat + VPN profile + RDP shortcut.
+- **AsusScreenXpert** removal — pending Mike's ScreenPad confirmation.
+- **Prevent VPN recurrence:** add a post-Let's-Encrypt-renewal hook to auto-restart RemoteAccess (offered; not yet done) so the stale-RRAS drop doesn't recur ~monthly.
+- **RDP shortcut username** not pre-filled (prompts on connect) — set if Mike wants.
+- Security note: `jwilliams` is a Domain Admin VPNing from a workgroup laptop — worth tightening later.
+
+## Reference Information
+
+- Coord todo: 24eb7cb8 (LapTopNinja reinstall). RMM agents: LapTopNinja ccb55043, GND-SERVER cd086074.
+- fail2ban: `clients/grabb-durando/vpn-fail2ban.ps1` (deployed at C:\Scripts\ on GND-SERVER).
+- Vault: clients/grabb-durando/gd-local-domain-admin.sops.yaml, infrastructure/uos-server-network-api-rw.sops.yaml.
+- gw-control: `.claude/skills/unifi-wifi/scripts/gw-control.sh "Grabb and Durando" pf-list|fw-list|block-ips`.
diff --git a/clients/grabb-durando/vpn-fail2ban.ps1 b/clients/grabb-durando/vpn-fail2ban.ps1
new file mode 100644
index 00000000..ffdbb15a
--- /dev/null
+++ b/clients/grabb-durando/vpn-fail2ban.ps1
@@ -0,0 +1,25 @@
+# VPN Fail2Ban for RRAS/SSTP on GND-SERVER - auto-blocks brute-force source IPs.
+# Deployed by ClaudeTools 2026-06-23 to C:\Scripts\vpn-fail2ban.ps1; runs via scheduled task
+# "VPN Fail2Ban" every 10 min as SYSTEM. Scans RRAS failed-auth events (System/RemoteAccess id 20271,
+# which carry the real public source IP via the UGW3 DNAT) and adds repeat offenders (>=5 fails/3h)
+# to the Windows Firewall inbound block rule 'Fail2Ban-VPN-Block'. Blocklist persisted at
+# C:\Scripts\vpn-blocklist.txt; actions logged to C:\Scripts\vpn-fail2ban.log.
+$ErrorActionPreference='SilentlyContinue'
+$LookbackMin=180; $Threshold=5
+$RuleName='Fail2Ban-VPN-Block'
+$Dir='C:\Scripts'; $BlockFile=Join-Path $Dir 'vpn-blocklist.txt'; $LogFile=Join-Path $Dir 'vpn-fail2ban.log'
+if(-not (Test-Path $Dir)){ New-Item -ItemType Directory -Path $Dir | Out-Null }
+function IsPublic($ip){ -not ($ip -match '^(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.|127\.|169\.254\.|0\.)') }
+# 1) tally failed VPN auth source IPs from RRAS event 20271
+$since=(Get-Date).AddMinutes(-$LookbackMin); $counts=@{}
+Get-WinEvent -FilterHashtable @{LogName='System';ProviderName='RemoteAccess';Id=20271;StartTime=$since} -ErrorAction SilentlyContinue | ForEach-Object {
+    if($_.Message -match 'from\s+(\d{1,3}(?:\.\d{1,3}){3})'){ $ip=$Matches[1]; $counts[$ip]=[int]$counts[$ip]+1 }
+}
+$offenders=@($counts.GetEnumerator() | Where-Object { $_.Value -ge $Threshold -and (IsPublic $_.Key) } | ForEach-Object { $_.Key })
+# 2) merge with persisted blocklist
+$blocked=@(); if(Test-Path $BlockFile){ $blocked=@(Get-Content $BlockFile | Where-Object { $_ -and (IsPublic $_) }) }
+$new=@($offenders | Where-Object { $_ -notin $blocked })
+if($new.Count -gt 0){ $blocked=@(($blocked+$new) | Select-Object -Unique); $blocked | Set-Content $BlockFile; "$(Get-Date -Format s) blocked: $($new -join ', ')" | Add-Content $LogFile }
+# 3) (re)apply single inbound block rule
+Remove-NetFirewallRule -DisplayName $RuleName -ErrorAction SilentlyContinue
+if($blocked.Count -gt 0){ New-NetFirewallRule -DisplayName $RuleName -Direction Inbound -Action Block -RemoteAddress $blocked -Profile Any -Description 'Auto-blocked VPN brute-force sources (ClaudeTools fail2ban)' | Out-Null }
diff --git a/errorlog.md b/errorlog.md
index 6f7be373..3831a4c9 100644
--- a/errorlog.md
+++ b/errorlog.md
@@ -17,6 +17,12 @@ Categories (the `[type]` tag): _(none)_ = skill/command execution failure ·
+2026-06-24 | GURU-5070 | unifi-wifi/controller-rest | [friction] CSRF token missed because read via dict(resp.headers) (case-sensitive); UniFi returns X-Csrf-Token mixed-case -> PUT got 403. Use resp.headers.get() (case-insensitive) to capture X-CSRF-Token/X-Updated-Csrf-Token
+
+2026-06-24 | GURU-5070 | unifi-wifi/gw-control block-ips | [friction] block-ips clones an existing WAN_IN rule's schema; if it clones the PPTP GRE rule it creates a DROP rule with proto=gre -> ineffective against TCP/UDP brute-force. Had to PUT protocol=all. Fix: block-ips should force protocol=all on the new rule
+
+2026-06-24 | GURU-5070 | vault/get-field | [friction] get-field password returned a wrong 4-char value (not credentials.password) -> caused 401 login; always use the FULL dotted path credentials.password, don't rely on bare key [ctx: ref=errorlog 2026-06-22 gitea same class; entry=uos-server-network-api-rw]
+
 2026-06-24 | GURU-5070 | agy/ask-gemini | gemini CLI auth/setup failure (throwIneligibleOrProjectIdError, _doSetupUser) - empty response after 3 attempts; needs interactive 'gemini' re-login [ctx: task=factorio-research]
 
 2026-06-24 | GURU-5070 | agy | gemini returned no response (empty after 3 attempts) [ctx: mode=search err= at process.processTicksAndRejections (node:internal/process/task_queues:104:]