Diagnosed azcomputerguru.com 521 errors: Cox's BGP route to specific Cloudflare origin-pull prefixes (162.158.0.0/16, 172.64.0.0/13, 173.245.48.0/20, 141.101.64.0/18) is broken from 72.194.62.0/29. Confirmed by TCP probe matrix from pfSense WAN, traceroute latency comparison, and state-table showing 0 inbound CF connections while direct-internet traffic still reached origin.

Deployed Cloudflare Tunnel 'acg-origin' on Jupiter Unraid as a Docker container. Routes 4 proxied hostnames (azcomputerguru.com, analytics., community., radio.) through the tunnel with HTTPS backend to IX 172.16.3.10:443 with per-ingress SNI matching. All 4 hostnames return 200 OK through CF edge after the cutover.

Repo hygiene:
- Merged clients/ix-server/ into clients/internal-infrastructure/ (IX is internal infra, not a paying-client account). Git detected the session-log files as renames so history is preserved. Updated 4 stale path references in 2 files.
- Moved cox-bgp ticket draft out of projects/dataforth-dos/ (wrong project) to clients/internal-infrastructure/vendor-tickets/.
- Relocated tunnel-setup helper scripts from projects/dataforth-dos/datasheet-pipeline/implementation/ to clients/internal-infrastructure/scripts/cloudflared-tunnel-setup/. Deleted superseded/abandoned login attempts. Sanitized hardcoded Jupiter/pfSense SSH passwords to pull from SOPS vault at runtime; Cloudflare token reads from env var (tokens still in 1Password, vault entry is metadata-only).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Session Log — Internal Infrastructure — 2026-04-13
Cloudflare Tunnel deployment for azcomputerguru.com + Cox BGP diagnosis
Earlier 2026-04-13 work (SCMVAS git push, merge conflict resolution) is in
projects/dataforth-dos/session-logs/2026-04-12-session.md. This log picks up
when the user reported that azcomputerguru.com was still returning 521 after the
initial Cloudflare recovery.
Session Summary
User reported azcomputerguru.com returning 521 "Web server is down" through Cloudflare, despite:
- CF SSL mode being "Full" (not Strict)
- Origin IX server (172.16.3.10) responding 200 OK internally
- Origin reachable from external ISPs (non-CF path)
What was accomplished
- Diagnosed root cause: Cox ISP has broken BGP routing from our netblock (72.194.62.0/29) to specific Cloudflare IP prefixes. TCP:443 from pfSense WAN succeeds to 104.16/17/26 ranges but times out to 162.158.0.0/16, 172.64.0.0/13, 173.245.48.0/20, 141.101.64.0/18. ICMP traceroute to affected prefixes shows ~173ms (cross-country peering) vs ~3.6ms for working prefixes, which points to asymmetric/distant routing. Inbound CF→origin state count was 0 while direct-internet state count was 285, confirming only the CF path was broken.
- Deployed Cloudflare Tunnel on Jupiter (Unraid) as a permanent workaround. Tunnel reverses connection direction (outbound from container, using working CF prefixes), eliminating dependency on Cox's broken inbound routing.
- Cut over 4 proxied hostnames to the tunnel via CF DNS API:
  - azcomputerguru.com, analytics., community., radio.
  - All 4 now return HTTP 200 OK through CF edge → tunnel → IX HTTPS vhost (SNI-matched)
- Drafted Cox BGP escalation ticket with evidence (TCP matrix, traceroute comparison, state-table counts). Saved to vendor-tickets/.
- Folder reorganization:
  - Moved Cox ticket from projects/dataforth-dos/datasheet-pipeline/implementation/ (wrong: not a Dataforth file) → clients/internal-infrastructure/vendor-tickets/2026-04-13-cox-bgp-cloudflare-routing.md
  - Merged misnamed clients/ix-server/ into clients/internal-infrastructure/ (IX is internal infra, not a client). Session logs moved; folder removed; 4 stale path references updated across 2 files.
Key decisions & rationale
- Option C: tunnel on Jupiter Docker rather than pfSense (cloudflared isn't a pfSense package, firmware upgrades would wipe it) or IX (scoped to IX only; other internal origins would need separate tunnels). Jupiter already runs Unraid with many containers; cloudflared fits the existing pattern. One tunnel can route to any internal LAN IP.
- HTTPS backend (not HTTP) with originServerName: <hostname> + noTLSVerify: true. Initial HTTP backend caused WordPress "force HTTPS" redirect loop on community/radio (they had HSTS/canonical-URL rules IX's other sites lacked).
- --user 65532 (container default) with chown 65532:65532 on host volume. An earlier --user root attempt wrote cert to /root/.cloudflared (outside bind mount) instead of /home/nonroot/.cloudflared.
- Detached container for tunnel login. Earlier foreground attempts got killed when SSH exec_command hit its 9-minute timeout; detached container (cf-login) persists independent of SSH.
- Didn't grey-cloud DNS (the quick-but-ugly fix); tunnel gives permanent architectural solution that survives future Cox BGP flaps.
Problems encountered and resolutions
| Problem | Resolution |
|---|---|
| Cloudflare token (Full DNS) lacks Zone Settings + Analytics permissions; couldn't read SSL/TLS mode or per-PoP origin-status | Used pfSense-side diagnostics (TCP probes + traceroute + state table) instead; conclusive without needing Analytics |
| mkdir: no space left on device on /mnt/user/appdata/cloudflared despite cache showing 181GB free | shfs (Unraid FUSE overlay) was being overly strict near 81% cache usage; bypassed by writing directly to /mnt/cache/appdata/cloudflared (raw cache pool, same physical SSD, skips shfs) |
| cert.pem: permission denied writing to bind-mount volume | Container runs as UID 65532 (nonroot), host dir was owned by nobody:users (99:100). Chowned host dir to 65532:65532 before retry |
| --user root workaround wrote cert to /root/.cloudflared, outside the mount | Dropped --user override after fixing host UID ownership |
| Foreground docker run --rm for login got killed by SSH exec timeout after 9 min | Used docker run -d --name cf-login (detached); container persists through SSH session endings |
| Tailscale was stopped mid-session (user moved to different network); lost all 172.16.x routes | User reconnected to local net; resumed |
| WordPress 301 redirect loop on community/radio after tunnel cutover | Switched tunnel origin from http://172.16.3.10:80 → https://172.16.3.10:443 with originServerName per ingress + noTLSVerify: true |
| Cox ticket draft initially saved under Dataforth project folder (wrong place) | User flagged; moved to clients/internal-infrastructure/vendor-tickets/ |
| clients/ix-server/ existed as a separate folder when IX is internal infra | Merged clients/ix-server/ (2 session logs) into clients/internal-infrastructure/session-logs/, removed empty folder, fixed 4 path references in 2 files |
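The cert.pem permission failure above came down to the host directory's owner not matching the container's UID. A minimal preflight sketch, assuming the check runs on the Docker host before the container is started; `owner_mismatch` is a hypothetical helper, not one of the session's scripts:

```python
import os

CONTAINER_UID = 65532  # nonroot UID the cloudflared image runs as, per this log


def owner_mismatch(path, expected_uid=CONTAINER_UID):
    """Return the owning UID of path if it differs from expected_uid, else None."""
    actual_uid = os.stat(path).st_uid
    return None if actual_uid == expected_uid else actual_uid
```

Calling `owner_mismatch("/mnt/cache/appdata/cloudflared")` before `docker run` would have reported UID 99 in the state described in the table, which is the cue to chown the directory first.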
Credentials
Cloudflare API tokens (from 1Password)
- Full DNS token:
DRRGkHS33pxAUjQfRDzDeVPtt6wwUU6FwtXqOzNj- Permissions: Zone:Read, DNS:Read/Edit (confirmed; actual scope narrower than 1Password note implies — lacks Zone Settings, Analytics, Tunnel)
- Token ID:
48607a8ba656e02050e97ae4b1b8fcdf
- Legacy token:
U1UTbBOWA4a69eWEBiqIbYh0etCGzrpTU4XaKp7w- Token ID:
162711358e386f178d81bb09ca800148 - Same limited scope (analytics.read also denied)
- Token ID:
- Account: Mike@azcomputerguru.com's Account, Pro Website plan
- Zone: azcomputerguru.com, zone ID 1beb9917c22b54be32e5215df2c227ce
- Vault entry: services/cloudflare.sops.yaml (contains metadata only; token values are in 1Password, not SOPS vault yet)
Jupiter (Unraid primary)
- SSH:
root / Th1nk3r^99##on 172.16.3.20:22 - Vault:
infrastructure/jupiter-unraid-primary.sops.yaml - iDRAC: 172.16.1.73,
root / Window123!@#-idrac
IX Server (origin)
- SSH:
root / Gptf*77ttb!@#!@#on 172.16.3.10:22 (internal) / 72.194.62.5 (public) - OS: CloudLinux 9.7 (RHEL 9 family), WHM/cPanel, Apache
- WHM: port 2087, cPanel: 2083
- Vault:
infrastructure/ix-server.sops.yaml
pfSense Firewall
- SSH:
admin / r3tr0gradE99!!on 172.16.0.1:2248 - OS: pfSense 2.8.1 (FreeBSD 15.0-CURRENT)
- WAN: 98.181.90.163/31, public IP block 72.194.62.2-.10 (all bound to igc0)
- Vault:
infrastructure/pfsense-firewall.sops.yaml - Note: no IDS/IPS installed (no suricata/snort/pfBlockerNG), firewalld disabled, 5706 states at time of diag
Infrastructure & Servers
Tunnel deployment
| Component | Value |
|---|---|
| Tunnel name | acg-origin |
| Tunnel UUID | 78d3e58f-1979-4f0e-a28b-98d6b3c3d867 |
| Tunnel target hostname | 78d3e58f-1979-4f0e-a28b-98d6b3c3d867.cfargotunnel.com |
| Host | Jupiter (172.16.3.20) |
| Docker container name | cloudflared (restart=unless-stopped) |
| Docker image | cloudflare/cloudflared:latest |
| Host volume | /mnt/cache/appdata/cloudflared/ (direct cache SSD, chowned 65532:65532) |
| Config file | /mnt/cache/appdata/cloudflared/config.yml |
| Cert file | /mnt/cache/appdata/cloudflared/cert.pem |
| Credentials file | /mnt/cache/appdata/cloudflared/78d3e58f-1979-4f0e-a28b-98d6b3c3d867.json |
| Active CF PoPs | phx01 ×2, lax11 (4 tunnel connections) |
DNS records updated (all proxied, zone azcomputerguru.com)
| Hostname | Before | After |
|---|---|---|
| azcomputerguru.com | A 72.194.62.5 (not proxied — was a bug; now is) | CNAME 78d3e58f-...cfargotunnel.com proxied |
| analytics.azcomputerguru.com | A 72.194.62.5 proxied | CNAME 78d3e58f-...cfargotunnel.com proxied |
| community.azcomputerguru.com | A 72.194.62.5 proxied | CNAME 78d3e58f-...cfargotunnel.com proxied |
| radio.azcomputerguru.com | A 72.194.62.5 proxied | CNAME 78d3e58f-...cfargotunnel.com proxied |
Note: azcomputerguru.com was proxied=False before the cutover (record ID c865ce7849e3567383433d74e5845f99). That's odd — it was serving through CF (as evidenced by the 521 responses which only CF serves) but the A record flag was False. Possibly via www CNAME + CF magic. Replaced with a proper proxied CNAME.
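For reference, the A → CNAME replacement in the table can be expressed as a single overwrite call against the Cloudflare DNS records endpoint. This is a sketch only: it builds the request without sending it, `build_cname_update` is a hypothetical name, and the session's actual cutover script may have issued the change differently.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def build_cname_update(zone_id, record_id, hostname, tunnel_uuid, token):
    """Build (but do not send) a request that overwrites a record with a proxied CNAME."""
    payload = {
        "type": "CNAME",
        "name": hostname,
        "content": f"{tunnel_uuid}.cfargotunnel.com",
        "proxied": True,
        "ttl": 1,  # 1 means "automatic" in the Cloudflare API
    }
    return urllib.request.Request(
        url=f"{API_BASE}/zones/{zone_id}/dns_records/{record_id}",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the change; the token should come from an environment variable rather than being hardcoded.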
Paths this session
- Local: D:\claudetools\clients\internal-infrastructure\ (new target after reorg)
- Local (old, removed): D:\claudetools\clients\ix-server\
- Local scripts: D:\claudetools\projects\dataforth-dos\datasheet-pipeline\implementation\jupiter_tunnel_*.py (should eventually move; they're tunnel-setup helpers, not Dataforth)
- Jupiter: /mnt/cache/appdata/cloudflared/ (tunnel config/cert)
- IX: No changes persisted (cloudflared briefly installed via dnf then removed; /root/.cloudflared/ deleted)
Commands & Outputs
Diagnostic cascade (definitive answer)
From pfSense (172.16.0.1):
$ for ip in 104.16.0.1 104.17.0.1 104.26.0.1 162.158.0.1 162.158.100.1 172.64.0.1 172.67.0.1 173.245.48.1 141.101.64.1; do
printf "%-16s " $ip; nc -z -v -w 2 $ip 443 2>&1 | head -1
done
104.16.0.1 OK Connection succeeded
104.17.0.1 OK Connection succeeded
104.26.0.1 OK Connection succeeded
162.158.0.1 FAIL Operation timed out
162.158.100.1 FAIL Operation timed out
172.64.0.1 FAIL Operation timed out
172.67.0.1 FAIL Operation timed out
173.245.48.1 FAIL Operation timed out
141.101.64.1 FAIL Operation timed out
$ pfctl -s states | grep "172.16.3.10:443" | wc -l
285 # non-CF users reaching origin fine
$ pfctl -s states | egrep "^[^|]*(104\.(2[6-9])|162\.(158|159)|172\.(64|67))" | head
# 0 results for 162.158.x inbound; 162.159.x outbound-only (initiated from LAN)
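The TCP probe matrix above can be reproduced from any host with Python, which may help when re-checking after Cox responds. A minimal sketch, assuming a 2-second timeout like the nc run; `probe` and `probe_matrix` are hypothetical names, and the `connect` parameter exists only so the logic can be exercised without network access:

```python
import socket


def probe(ip, port=443, timeout=2.0, connect=socket.create_connection):
    """Return "OK" if a TCP handshake completes within timeout, else "FAIL: <reason>"."""
    try:
        conn = connect((ip, port), timeout=timeout)
    except OSError as exc:  # covers timeouts, refusals, and unreachable networks
        return f"FAIL: {exc}"
    conn.close()
    return "OK"


def probe_matrix(ips, **kwargs):
    """Probe each IP in turn and return a {ip: result} mapping."""
    return {ip: probe(ip, **kwargs) for ip in ips}
```

Running `probe_matrix([...])` with the nine addresses from the loop above gives a like-for-like comparison, as long as it is run from a host that egresses through the same WAN.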
Tunnel completion (final state)
=== [2] create tunnel acg-origin ===
Created tunnel acg-origin with id 78d3e58f-1979-4f0e-a28b-98d6b3c3d867
=== [4] DNS cutover (A -> CNAME) ===
[azcomputerguru.com] current: type=A content=72.194.62.5 proxied=False id=c865ce7849e3567383433d74e5845f99
[OK] -> CNAME 78d3e58f-1979-4f0e-a28b-98d6b3c3d867.cfargotunnel.com proxied
[analytics.azcomputerguru.com] ... [OK]
[community.azcomputerguru.com] ... [OK]
[radio.azcomputerguru.com] ... [OK]
=== [6] wait for tunnel connections ===
[try 14] connections registered: 4
=== after HTTPS backend switch ===
azcomputerguru.com: HTTP 200 Server=cloudflare
analytics.azcomputerguru.com: HTTP 200 Server=cloudflare
community.azcomputerguru.com: HTTP 200 Server=cloudflare
radio.azcomputerguru.com: HTTP 200 Server=cloudflare
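The "wait for tunnel connections" step is a poll-until-ready loop. The sketch below shows one way to write it; it is not the session's script. It assumes cloudflared logs one line containing "Registered tunnel connection" per established connection (worth verifying against the running image), and `read_logs` is a hypothetical callable that returns the current container log text.

```python
import time

# Assumption: one log line containing this phrase appears per established connection.
MARKER = "Registered tunnel connection"


def count_registered(log_text, marker=MARKER):
    """Count log lines that mention a registered tunnel connection."""
    return sum(1 for line in log_text.splitlines() if marker in line)


def wait_for_connections(read_logs, wanted=4, tries=30, delay=2.0, sleep=time.sleep):
    """Poll read_logs() until `wanted` connections appear; return (attempt, count)."""
    count = 0
    for attempt in range(1, tries + 1):
        count = count_registered(read_logs())
        if count >= wanted:
            return attempt, count
        sleep(delay)
    return tries, count
```

In practice `read_logs` could wrap `docker logs cloudflared` via `subprocess.run`; a caller should treat a returned count below `wanted` as a failed cutover precondition.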
Cloudflare auth URLs issued (4 rounds before success)
Only the final one mattered — fresh container after chown fix:
https://dash.cloudflare.com/argotunnel?aud=&callback=https%3A%2F%2Flogin.cloudflareaccess.org%2F7RFAWDCIvWpHtiq0TsoMGEjV9zALX0xwmy1HZssO7mk%3D
Configuration Changes
On Jupiter (172.16.3.20)
New: /mnt/cache/appdata/cloudflared/config.yml
tunnel: 78d3e58f-1979-4f0e-a28b-98d6b3c3d867
credentials-file: /home/nonroot/.cloudflared/78d3e58f-1979-4f0e-a28b-98d6b3c3d867.json
ingress:
  - hostname: azcomputerguru.com
    service: https://172.16.3.10:443
    originRequest:
      originServerName: azcomputerguru.com
      noTLSVerify: true
  - hostname: analytics.azcomputerguru.com
    service: https://172.16.3.10:443
    originRequest:
      originServerName: analytics.azcomputerguru.com
      noTLSVerify: true
  - hostname: community.azcomputerguru.com
    service: https://172.16.3.10:443
    originRequest:
      originServerName: community.azcomputerguru.com
      noTLSVerify: true
  - hostname: radio.azcomputerguru.com
    service: https://172.16.3.10:443
    originRequest:
      originServerName: radio.azcomputerguru.com
      noTLSVerify: true
  - service: http_status:404
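Because every ingress rule follows the same pattern (same origin, SNI equal to the hostname, catch-all last), the config can be generated rather than hand-edited when hostnames are added. A sketch under that assumption; `build_ingress` and `render_config` are hypothetical helpers, and the YAML is rendered by hand to avoid a third-party dependency:

```python
def build_ingress(hostnames, origin="https://172.16.3.10:443"):
    """Return ingress rules: one SNI-matched HTTPS rule per hostname, then a catch-all."""
    rules = [
        {
            "hostname": host,
            "service": origin,
            "originRequest": {"originServerName": host, "noTLSVerify": True},
        }
        for host in hostnames
    ]
    rules.append({"service": "http_status:404"})  # catch-all must come last
    return rules


def render_config(tunnel_id, credentials_file, hostnames):
    """Render the rules as config.yml text in the same layout as the file above."""
    lines = [f"tunnel: {tunnel_id}", f"credentials-file: {credentials_file}", "ingress:"]
    for rule in build_ingress(hostnames):
        if "hostname" in rule:
            lines += [
                f"  - hostname: {rule['hostname']}",
                f"    service: {rule['service']}",
                "    originRequest:",
                f"      originServerName: {rule['originRequest']['originServerName']}",
                "      noTLSVerify: true",
            ]
        else:
            lines.append(f"  - service: {rule['service']}")
    return "\n".join(lines) + "\n"
```

Adding a hostname then means appending to one list and restarting the container, which keeps the originServerName values from drifting out of sync with the hostnames.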
New container: cloudflared (auto-restart via --restart=unless-stopped). Run command:
docker run -d --name cloudflared --restart=unless-stopped \
-v /mnt/cache/appdata/cloudflared:/home/nonroot/.cloudflared \
cloudflare/cloudflared:latest \
tunnel --config /home/nonroot/.cloudflared/config.yml run
Repo reorganization
| Action | From | To |
|---|---|---|
| Moved | projects/dataforth-dos/datasheet-pipeline/implementation/cox-bgp-ticket-draft.md | clients/internal-infrastructure/vendor-tickets/2026-04-13-cox-bgp-cloudflare-routing.md |
| Moved | clients/ix-server/session-logs/2026-03-16-ix-account-cleanup.md | clients/internal-infrastructure/session-logs/ |
| Moved | clients/ix-server/session-logs/2026-04-11-smart-slider-security-scan.md | clients/internal-infrastructure/session-logs/ |
| Removed | clients/ix-server/ (empty after moves) | n/a |
| Edited | session-logs/2026-04-11-session.md | 3x clients/ix-server/ → clients/internal-infrastructure/ |
| Edited | clients/internal-infrastructure/session-logs/2026-04-11-smart-slider-security-scan.md | 1x path update |
Scripts in projects/dataforth-dos/datasheet-pipeline/implementation/ relevant to tunnel setup but not yet moved (next session decision):
- jupiter_tunnel_login5.py, jupiter_tunnel_login4.py, jupiter_tunnel_login3.py, jupiter_tunnel_login2.py, jupiter_tunnel_login.py (multiple login attempts, keep only the detached one)
- jupiter_tunnel_complete.py: the one that did the full cutover
- jupiter_tunnel_fix_https.py: the HTTPS backend switchover
- ix_install_cloudflared.py, ix_tunnel_login.py (IX-side, abandoned)
- cf_analytics.py: GraphQL probe (showed analytics.read permission missing)
- pfsense_diag.py, pfsense_diag2.py, pfsense_trace.py: the diagnostic cascade
- cox-bgp-ticket-draft.md: already moved
Pending / Incomplete / Open Items
Action items for user
- Submit Cox BGP ticket (file ready at clients/internal-infrastructure/vendor-tickets/2026-04-13-cox-bgp-cloudflare-routing.md). Fixing their routing is the permanent root-cause fix; until then the tunnel is the mitigation. No SLA for this.
- Populate Cloudflare token in SOPS vault. Currently
services/cloudflare.sops.yamlhas metadata only — nocredentials:block. Token values live in 1Password. For pipeline automation it would be nicer to have them in SOPS like everything else:bash D:/vault/scripts/vault.sh edit services/cloudflare.sops.yaml # add credentials: { api_token_full_dns: DRRGkHS33pxAUjQfRDzDeVPtt6wwUU6FwtXqOzNj, api_token_legacy: U1UTbBOWA4a69eWEBiqIbYh0etCGzrpTU4XaKp7w, dns_zone_id: 1beb9917c22b54be32e5215df2c227ce } -
- Consider expanding tunnel ingress to cover more proxied hostnames (if Cox BGP stays broken, other proxied hostnames would intermittently 521 too):
  - plex.azcomputerguru.com → 72.194.62.4 (Jupiter NPM): could route through tunnel to https://172.16.3.20:18443 (NPM is already on Jupiter, could bypass public IP entirely)
  - plexrequest.azcomputerguru.com, rustdesk., sync., secure., backups., enterpriseenrollment., enterpriseregistration., info., mail., store., ui.: most are external-proxied CNAMEs, don't need tunnel; a few to Jupiter (.4) could benefit
  - Not urgent unless 521 recurs on one of them
- Script cleanup: move tunnel-setup helper scripts out of projects/dataforth-dos/datasheet-pipeline/implementation/ (wrong project). Candidate targets: clients/internal-infrastructure/scripts/cloudflared/ or similar. Not touched today.
- Commit this work: the tunnel DNS changes are already live. Local file changes (moves, log, ticket draft) not yet committed.
Vault hygiene (from earlier today, still pending)
clients/dataforth/ad2.sops.yaml: stale shell-escape backslash incredentials.password(storesPaper123\!@#; real isPaper123!@#).
Dataforth follow-ups (unrelated to today but still open)
- Verify C:\Shares\test\scripts\Sync-FromNAS-rsync.ps1 includes the VASLOG - Engineering Tested subfolder for ongoing Engineering-tested .txt ingestion.
Reference Information
Cloudflare Tunnel management
To view logs:
ssh root@172.16.3.20 'docker logs cloudflared --tail 30'
To list tunnels:
docker run --rm -v /mnt/cache/appdata/cloudflared:/home/nonroot/.cloudflared cloudflare/cloudflared:latest tunnel list
To restart after config change:
docker restart cloudflared
# or stop + start for a fresh container state
To rotate the tunnel (delete + recreate):
docker run --rm -v /mnt/cache/appdata/cloudflared:/home/nonroot/.cloudflared cloudflare/cloudflared:latest tunnel delete -f acg-origin
# then re-run create + config steps
Cloudflare API one-liners
List DNS records for a hostname:
curl -H "Authorization: Bearer $CF_TOKEN" "https://api.cloudflare.com/client/v4/zones/$ZONE/dns_records?name=azcomputerguru.com"
Quick site probe:
curl -sI -A "Mozilla/5.0 Chrome/120.0" https://azcomputerguru.com/
# Expect: HTTP/1.1 200 OK Server=cloudflare
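To check all four hostnames in one pass, the curl probe can be wrapped in a small function that reports status and Server header in the same shape as the post-cutover output earlier in this log. A sketch with hypothetical names (`summarize`, `probe_site`); the `opener` parameter is there only to allow offline testing:

```python
import urllib.error
import urllib.request


def summarize(status, headers):
    """Format a result line like the post-cutover output in this log."""
    return f"HTTP {status} Server={headers.get('Server', '?')}"


def probe_site(hostname, opener=urllib.request.urlopen, timeout=10):
    """Send a HEAD request to https://<hostname>/ and summarize the response."""
    req = urllib.request.Request(
        f"https://{hostname}/",
        method="HEAD",
        headers={"User-Agent": "Mozilla/5.0 Chrome/120.0"},
    )
    try:
        with opener(req, timeout=timeout) as resp:
            return summarize(resp.status, resp.headers)
    except urllib.error.HTTPError as err:  # error statuses such as 521 still carry headers
        return summarize(err.code, err.headers)
```

A 521 would surface here as "HTTP 521 Server=cloudflare", which distinguishes a Cloudflare-side origin failure from a DNS or TLS error (those raise other exceptions and are deliberately not swallowed).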
Useful paths and ports
| Resource | Value |
|---|---|
| Jupiter appdata | /mnt/cache/appdata/cloudflared/ |
| IX internal | http://172.16.3.10:80, https://172.16.3.10:443 |
| pfSense SSH | ssh admin@172.16.0.1 -p 2248 |
| Cloudflare API base | https://api.cloudflare.com/client/v4/zones/1beb9917c22b54be32e5215df2c227ce |
Cloudflare-IP prefix status (as of 2026-04-13 ~08:30)
| Prefix | Route via Cox | TCP:443 from pfSense |
|---|---|---|
| 104.16.0.0/13 | local/short path | OK |
| 104.24.0.0/14 | local/short path | OK |
| 162.158.0.0/16 | distant/broken | FAIL (timeout) |
| 172.64.0.0/13 | distant/broken | FAIL (timeout) |
| 173.245.48.0/20 | distant/broken | FAIL (timeout) |
| 141.101.64.0/18 | distant/broken | FAIL (timeout) |
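When a new Cloudflare address shows up in logs or a state table, it helps to map it back to a row in the table above. A minimal sketch using the standard `ipaddress` module; the `PREFIX_STATUS` mapping simply mirrors the table as of 2026-04-13 and would need updating as routing changes:

```python
import ipaddress

# Mirrors the prefix table above (TCP:443 result from pfSense on 2026-04-13).
PREFIX_STATUS = {
    "104.16.0.0/13": "OK",
    "104.24.0.0/14": "OK",
    "162.158.0.0/16": "FAIL",
    "172.64.0.0/13": "FAIL",
    "173.245.48.0/20": "FAIL",
    "141.101.64.0/18": "FAIL",
}


def classify(ip, table=PREFIX_STATUS):
    """Return (prefix, status) for the first listed prefix containing ip."""
    addr = ipaddress.ip_address(ip)
    for prefix, status in table.items():
        if addr in ipaddress.ip_network(prefix):
            return prefix, status
    return None, "unknown"
```

For example, `classify("172.67.0.1")` returns `("172.64.0.0/13", "FAIL")`, matching the probe result for that address, while an address outside every listed prefix returns `(None, "unknown")`.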
Related Logs
- Earlier today: projects/dataforth-dos/session-logs/2026-04-12-session.md (SCMVAS deploy finish + git merge conflict resolution)
- Earlier related: session-logs/2026-04-06-session.md (ScreenConnect redirect + UniFi OS VM); shows public IP block context
- Earlier related: clients/internal-infrastructure/session-logs/2026-04-11-smart-slider-security-scan.md (IX WP audit, originally at clients/ix-server/)
- Remote (pulled today): commit 499fd5d "Session log: Gitea recovery (Jupiter cache full)"; explains earlier intermittent Gitea 502s and Jupiter cache pressure seen today
Last Updated: 2026-04-13
Next Actions: submit Cox ticket; consider populating Cloudflare vault entry; monitor tunnel for 24h; cleanup misplaced helper scripts.