sync: auto-sync from GURU-5070 at 2026-06-18 12:49:38
Author: Mike Swanson Machine: GURU-5070 Timestamp: 2026-06-18 12:49:38
@@ -34,6 +34,7 @@
- [reference_sqlx_migrations_immutable](reference_sqlx_migrations_immutable.md) -- NEVER edit an already-applied sqlx migration file — even a comment. sqlx::migrate! checksums each file at compile time and validates against _sqlx_migrations at startup; a changed checksum crash-loops the server with "migration N was previously applied but has been modified". Code review MUST flag any edit to an applied migration.
- [AD2 SSH MTU blackhole](ad2-ssh-mtu-blackhole.md) — AD2 SSH "lockouts"/mid-session read-errors over the Dataforth OpenVPN were a PMTU blackhole (tunnel PMTU ~1424 vs adapter MTU 1500), NOT a ban/account-lockout/flaky tunnel. Fix: pin the OpenVPN adapter MTU to 1400 (done on GURU-5070 via its SYSTEM RMM agent); permanent = `mssfix 1360` on the OpenVPN server. Diagnose over RMM, not SSH.
- [DSCA33/45 resolved via Hoffman](project_dsca33_45_resolved_via_hoffman.md) — The "lost" DSCA33/45 spec files are recoverable from the Hoffman API (original certs survived the wipe); do NOT ask John. 56/58 models mined into projects/dataforth-dos/dsca33-45-templates.json; only DSCA33-1948 + DSCA45-1746 (24 units) lack an original. AD2 handoff: DSCA33-45-HOFFMAN-RECOVERY-2026-06-18.md.
- [AD2 comms via sync only](ad2-comms-via-sync-only.md) — The AD2 Dataforth-box Claude session is coord-API-isolated (Gitea only); coord msg/lock/todo never reach it. Coordinate with AD2 ONLY via git /sync (committed docs + ## Note blocks).

## Users
- [Howard Enos](user_howard.md) — Mike's brother, technician, full access. Machines: ACG-TECH03L, Howard-Home (authoritative in users.json).
21
.claude/memory/ad2-comms-via-sync-only.md
Normal file
@@ -0,0 +1,21 @@
---
name: ad2-comms-via-sync-only
description: The AD2 (Dataforth) Claude session is coord-API-isolated — reach it ONLY via git /sync (committed notes/docs), never coord messages/locks
metadata:
  type: feedback
---

The AD2 Dataforth-box Claude session is **network-isolated from the ACG coord API** (172.16.3.30 —
the Dataforth network can't reach it); it only has Gitea/git access. So coord-API **messages, locks,
and todos NEVER reach AD2**. ALL inter-session coordination with AD2 must go through git **`/sync`**:
committed handoff docs and `## Note for <user>` blocks in synced session logs, which AD2 reads when
it pulls. A coord lock on an AD2-only file (e.g. `datasheet-exact.js`) is also meaningless — only the
AD2 session edits that box.

**Why:** burned a round of `coord msg send AD2` + lock that were silent no-ops (Mike: "You can't
coord with AD2 — all comms needs to be via sync").

**How to apply:** to hand work to or coordinate with the AD2 session, write it into a committed doc
(e.g. `projects/dataforth-dos/*HANDOFF*.md`) and/or a `## Note for <user>` block in a session log,
then `/sync`. Do NOT use the coord skill for AD2. (Coord API is still fine for non-isolated ACG
machines.) [[prefer-ssh-over-rmm]]
@@ -0,0 +1,119 @@
# 2026-06-18 — Darrell Delphen — Outlook email links failing (ISP SNI block)

## User

- **User:** Mike Swanson (mike)
- **Machine:** GURU-5070
- **Role:** admin

## Session Summary

Darrell Delphen reported that links in Outlook email would not open on one workstation
(DDDOffice072023), failing with "can't reach this page" / `ERR_CONNECTION_ABORTED` against a
`url.emailprotection.link` URL, while links opened from Gmail worked fine. The failing host is
Intermedia's Email Protection "Safe Link" rewriter: every link in Intermedia-protected mail is
rewritten to `https://url.emailprotection.link/...`, so all Outlook-delivered links routed through it.

Diagnosis was done entirely over GuruRMM against agent `000ed57d-fd05-4001-871c-244f43155c16`. DNS
resolved correctly and TCP 443 connected, but the TLS handshake died with SChannel `0x80090326`
(SEC_E_ILLEGAL_MESSAGE: "message received was unexpected or badly formatted"). The endpoint's TLS
stack was clean: FIPS off, no SCHANNEL protocol/cipher overrides, no cipher-suite GPO, only Windows
Defender (no third-party AV/proxy/VPN/LSP/WFP callout). The same node, `199.193.205.140`, completed
the handshake with the real SNI from GURU-5070 on a different network, proving the origin was healthy
and the interference was on the endpoint's path. A blast-radius sweep showed only
`url.emailprotection.link` failed while `google.com`, `microsoft.com`, `outlook.office365.com`,
`cloudflare.com`, `badssl.com`, and even `login.serverdata.net` (same Intermedia infra) all succeeded;
MTU was fine. An SNI-varied handshake to the same IP isolated it conclusively: `example.com` and
`login.serverdata.net` SNIs succeeded 12/12 while `url.emailprotection.link` failed 12/12, across
interleaved source ports. The failure was deterministic and SNI-keyed, not a flow-hash/LAG fault.
Root cause: a network device on the path performing SNI-based content inspection that corrupted the
handshake for that one hostname.

The gateway turned out to be an ISP-provided Extreme **EXOS** device the client had no login for. The
fix path was therefore ISP escalation. An escalation packet was drafted, and Cloudflare WARP was
installed on the workstation as an interim workaround that tunnels past the SNI block. With WARP
connected, egress moved to Cloudflare (104.28.152.216) and the real-SNI handshake succeeded (TLS 1.2,
HTTP 200). The ISP then disabled a "NetIQ" web/URL-filtering feature on the gateway, which cleared the
block at the source. After WARP was disconnected the native ISP path was verified working (5/5
handshakes + HTTP 200, egress back to 167.89.210.225), so WARP was uninstalled and the machine
returned to normal. Work was documented and billed on Syncro ticket #32437: private technical note,
customer-facing/emailed summary, and 1.0h remote labor ($150.00, invoice #1650728058).

## Key Decisions

- Diagnosed exclusively via repeated GuruRMM `SslStream`/`Test-NetConnection` probes rather than
  asking the client to run tools, which was faster and reproducible.
- Used an SNI-varied handshake to the *same fixed IP* as the decisive test. It separated
  destination/server problems from path interference and proved the block was keyed on the SNI string.
- Ran a 12x repeatability test per SNI to distinguish a faulty LAG/ECMP member (intermittent, 5-tuple
  keyed) from deliberate content matching (deterministic). Result was 0/12 for the blocked SNI vs
  12/12 for the control SNIs: deterministic.
- Chose Cloudflare WARP as the interim workaround because the block is SNI-based; any tunnel that
  encrypts egress past the EXOS hides the SNI. WARP is the lightest deploy.
- Installed/connected WARP in stages (install, then connect, then verify) so each step could be
  confirmed before flipping the tunnel, given the agent bounces on network-stack changes.
- Emailed the customer a plain-language summary (do_not_email:false) and kept the technical detail in
  a hidden note.

## Problems Encountered

- **Shell state not persisting between Bash calls** — `$TOKEN`/`$RMM` from `rmm-auth.sh` were gone on
  the next call (first dispatch produced no output). Fixed by `eval "$(rmm-auth.sh)"` inside every Bash
  invocation.
- **PowerShell single-quotes collided with bash single-quoted `SCRIPT='...'`** — embedded
  `'C:\Program Files\...'` terminated the bash string (`FilesCloudflareCloudflare: command not found`).
  Fixed by defining the script via a quoted heredoc `SCRIPT=$(cat <<'PS' ... PS)`.
- **WARP install/connect bounced the RMM agent** — commands returned `interrupted` ("Agent restarted
  during execution") because installing/connecting WARP resets the network stack/WFP. The agent
  auto-reconnected (through WARP after connect); verified state with follow-up commands.
- **First WARP uninstall found nothing** — the product registers as **"Cloudflare One Client"**, not
  "Cloudflare WARP", so the DisplayName filter missed it. Found the GUID
  `{9E49837E-2971-413F-9587-119FA819E572}` and removed via `msiexec /x`.
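
The quoting collision in the second item above, and its heredoc fix, can be reproduced in isolation. This is a minimal sketch only: the PowerShell body and the `Example Vendor` path are hypothetical placeholders, not the script that was dispatched in this session.

```bash
#!/usr/bin/env bash

# Broken form (left commented out): the embedded ' closes the bash string early,
# so the text after the space in "Program Files" is parsed as a separate word.
# SCRIPT='& 'C:\Program Files\Example Vendor\tool.exe' --status'

# Fixed form: with a quoted delimiter (<<'PS'), bash performs no expansion and no
# quote processing on the heredoc body, so single quotes, backslashes, and $ all
# reach the variable untouched.
SCRIPT=$(cat <<'PS'
$exe = 'C:\Program Files\Example Vendor\tool.exe'
& $exe --status
PS
)

# Print the script exactly as it would be sent to the remote PowerShell host.
printf '%s\n' "$SCRIPT"
```

Quoting the delimiter matters: an unquoted `<<PS` would let bash expand `$exe` to an empty string before the script ever left the local shell.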

## Configuration Changes

- **Endpoint DDDOffice072023:** Cloudflare WARP (Cloudflare One Client 2026.4.1390.0) installed,
  registered, connected, then later disconnected and **fully uninstalled**. Net change to the machine: none.
- **ISP gateway (not us):** ISP disabled the "NetIQ" web/URL-filtering feature on the Extreme EXOS device.
- No changes to the ClaudeTools repo code.

## Credentials & Secrets

None discovered, created, or rotated this session.

## Infrastructure & Servers

- **Endpoint:** DDDOffice072023 — Windows, GuruRMM agent `000ed57d-fd05-4001-871c-244f43155c16`
  (v0.6.66), RMM client "AZ Computer Guru" / site "Discovery test site". LAN gateway 192.168.1.1,
  ISP egress 167.89.210.225, WARP egress (while active) 104.28.152.216.
- **ISP gateway:** Extreme Networks **EXOS** device, ISP-provided/managed (client has no login). Was
  running a "NetIQ" URL-filtering feature doing SNI inspection.
- **Blocked destination:** `url.emailprotection.link` → CNAME `urlrs.gslb.serverdata.net` → A
  199.193.205.140 (Intermedia Email Protection link-rewriter). GSLB pool also advertises
  199.193.200.65 / 64.78.20.65 / 162.244.196.65, all TCP-dead from any network (not an ISP block).
- **GuruRMM API:** http://172.16.3.30:3001 (JWT via `rmm-auth.sh`).

## Commands & Outputs

- Decisive SNI test (same IP, varied SNI), via RMM PowerShell:
  - SNI `url.emailprotection.link` → `0x80090326` SEC_E_ILLEGAL_MESSAGE (0/12 ok)
  - SNI `example.com` / `login.serverdata.net` → TLS 1.2 AES256 (12/12 ok)
- Off-network control (GURU-5070) to 199.193.205.140 with real SNI → OK TLS 1.2.
- Post-ISP-fix native verify: real-SNI handshake 5/5 OK + `Invoke-WebRequest` HEAD → HTTP 200,
  egress 167.89.210.225.
- WARP removal: `msiexec /x {9E49837E-2971-413F-9587-119FA819E572} /qn /norestart` → exit 0;
  warp-svc/warp-cli/install-dir/uninstall-entry all gone.
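
The decisive test listed above was run with PowerShell `SslStream` over RMM, and those scripts are not reproduced in this log. As a rough equivalent for re-running the check from a Linux host on the affected network, the sketch below swaps in `openssl s_client`, assuming `openssl` and GNU `timeout` are installed. The function names `probe_sni`, `tally_sni`, and `classify_tally` are hypothetical helpers, not part of any existing tooling.

```bash
#!/usr/bin/env bash

# One TLS handshake to a fixed IP with a chosen SNI; prints "ok" or "fail".
# s_client does not check the certificate hostname by default, so a mismatched
# SNI such as example.com still counts as "ok" if the handshake completes.
probe_sni() {
  local ip=$1 sni=$2
  if timeout 10 openssl s_client -connect "${ip}:443" -servername "${sni}" \
       </dev/null >/dev/null 2>&1; then
    echo ok
  else
    echo fail
  fi
}

# Repeat the probe N times and print "<ok>/<n>". Each attempt is a new TCP
# connection, so the source port varies between attempts.
# The optional 4th argument names the probe function, so a stub can stand in.
tally_sni() {
  local ip=$1 sni=$2 n=$3 probe=${4:-probe_sni} ok=0 i
  for ((i = 0; i < n; i++)); do
    if [ "$("$probe" "$ip" "$sni")" = ok ]; then
      ok=$((ok + 1))
    fi
  done
  echo "${ok}/${n}"
}

# Interpret a tally: all-fail or all-pass is deterministic, a mix is intermittent.
classify_tally() {
  local ok=${1%/*} n=${1#*/}
  if [ "$ok" -eq 0 ]; then
    echo deterministic-fail
  elif [ "$ok" -eq "$n" ]; then
    echo deterministic-ok
  else
    echo intermittent
  fi
}

# Example invocation (commented out because it needs network access):
# for sni in url.emailprotection.link example.com login.serverdata.net; do
#   t=$(tally_sni 199.193.205.140 "$sni" 12)
#   echo "$sni $t $(classify_tally "$t")"
# done
```

Holding the IP constant while varying only the SNI is what separates a server-side fault from path interference: a `deterministic-fail` for one hostname next to `deterministic-ok` for the others, all against the same IP, points to content matching on the path, while an `intermittent` tally would point toward a flow-keyed fault such as a bad LAG/ECMP member.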

## Pending / Incomplete Tasks

- None functional. Block is resolved at the ISP. If the issue recurs, suspect the EXOS "NetIQ"
  feature being re-enabled — re-run the SNI-varied handshake test to confirm.
- Optional: if WARP is ever rolled to more machines as a workaround, harden it (force auto-connect,
  lock client) and note egress moves to Cloudflare (breaks office-static-IP allowlisting).

## Reference Information

- **Syncro ticket:** #32437 (id 112766479) — https://computerguru.syncromsp.com/tickets/112766479
  - Private note id 419714810; public/emailed summary id 419714813
  - Line item id 42925426 (Labor - Remote Business 1.0h @ $150)
  - Invoice #1650728058, total $150.00; status Invoiced
- **Customer:** Darrell Delphen (Syncro customer_id 35996725), no prepaid block.
- **SChannel error:** 0x80090326 = SEC_E_ILLEGAL_MESSAGE (handshake message malformed/unexpected) —
  signature of in-path TLS/SNI tampering when paired with same-IP success off-network.
@@ -21,6 +21,8 @@ Categories (the `[type]` tag): _(none)_ = skill/command execution failure ·

2026-06-18 | Howard-Home | rmm | [friction] agent returns exit -1 'Failed to execute command' on a ~7KB multi-line powershell body sent as one command; split into <2KB section scripts and each ran fine [ctx: host=DESKTOP-TRCIEJA agent=0.6.66]

2026-06-18 | GURU-5070 | coord/ad2-comms | [correction] tried to coordinate with the AD2 session via coord API msg+lock; AD2 is network-isolated (Gitea only, no coord API) so those were no-ops. ALL inter-session comms with AD2 must go via git /sync (committed notes/docs).

2026-06-18 | GURU-5070 | syncro | comment POST piped straight to jq failed with 'jq: parse error: Invalid numeric literal at line 1 col 10' and left it AMBIGUOUS whether the note posted (GET-verify showed it had NOT); per no-retry rule had to GET first, then re-post. Robust pattern that worked: jq -n payload to a file, POST with --data-binary @file, capture response to a file, then GET-verify by subject. Skill's curl|jq comment pattern should adopt this. [ctx: ticket=32441 skill=syncro pattern=curl-pipe-jq]

2026-06-18 | GURU-5070 | post-bot-alert | Discord POST failed (non-200/unreachable) [ctx: channel=#bot-alerts http=400 resp={"message": "The request body contains invalid JSON.", "code": 50109}]