# 2026-04-30 — cPanel CVE-2026-41940 incident response on IX + WebSvr + 1Password skill hardening
## User
- **User:** Mike Swanson (mike)
- **Machine:** GURU-BEAST-ROG
- **Role:** admin
- **Session span:** 2026-04-29 ~14:30 PT rolling into 2026-04-30 ~07:15 PT (~6 hours of active engagement)
## Session Summary
The session began with a follow-up email to Michelle Sora, pitching a migration from GoDaddy-resold M365 to direct billing with consolidation of `pro-techservices.co` and `pro-techhelps.com` under one tenant. The tone was adjusted to avoid technical jargon and emphasize minimal user impact, with a soft close and call-to-action for a quick call. Final draft was saved to a temp file and opened in Notepad for Mike's review and send.
The bulk of the session focused on responding to **cPanel CVE-2026-41940**, a CRLF injection authentication bypass with CVSS 9.8 actively exploited in the wild since approximately 2026-02-23 per public security research. After verifying both ACG cPanel servers (WebSvr on CentOS 7 + cPanel 11.110.0.97; IX on CloudLinux 9 + cPanel 11.134.0.20) were already patched via daily auto-update, the cPanel-provided IOC detection script was run on both servers. Initial findings showed 7 flagged sessions on WebSvr and 9 on IX (16 in total), including root sessions, plus four uncommented RSA keys in IX's `/root/.ssh/authorized_keys`, the classic attacker-persistence fingerprint.
Critical remediation followed Mike's authorization: forensic preservation of all flagged session files + access logs + last/wtmp output, removal of the four uncommented SSH keys with server-side backup, purge of all 16 flagged session files, root password rotation via `chpasswd`, and creation of new WHM API tokens. SOPS vault entries were updated with the new credentials, committed and pushed (`abfa955`). Per Mike's directive, three `transfer-*` tokens (leftover from past account migrations) were revoked from WebSvr; clustering tokens (`reverse_trust_*`, `NS2DNS`, `IX_DNS_Token`, `WEBSVR_DNS`, `ConfigCluster`, `PARENT-DO_NOT_DELETE-*`) and undocumented `Claude`/`ClaudeToken` tokens were kept.
A subsequent forensic deep-dive cleared the picture significantly. **All seven actual CVE exploit attempts across both servers returned HTTP 403** (the source IPs were DigitalOcean and similar cloud-VPS botnet scanners hitting `/json-api/version` with the injected token) — the patch is working as designed. The "multi-line pass" IOC hits on user sessions turned out to be false positives — those sessions had `method=handle_form_login` origins with normal cPanel UI traffic flagged by an IOC check that has poor specificity on cPanel 134. The unidentified `76.18.103.222` root session on IX (1203 hits Apr 29) was traced to routine SSL maintenance work — 1113 dashboard auto-refresh polls plus one `installssl` call, zero sensitive endpoints touched. Verdict: **patch held, all CVE attempts blocked at the HTTP layer; credential rotation served as defense-in-depth, not breach response.**
The session closed with two pieces of process improvement. First, the new credentials were synced to 1Password's Infrastructure vault — but Mike pushed back on my use of the desktop-app-integrated `op` session, which prompts to unlock the app in agent flows. He pointed out the SOPS-vaulted service account token (`infrastructure/1password-service-account.sops.yaml`) that should be used. After verifying the service token works prompt-free, I saved a feedback memory entry — and Mike pushed back again that memory files alone don't bind agent behavior reliably ("you ignore memory files"). The directive was then baked directly into `.claude/commands/1password.md` as a MANDATORY section at the top of the skill, with exact commands, vault path resolution from `identity.json`, scope details, and failure-mode guidance. Skill is in the synced ClaudeTools repo so when Howard syncs, his workstation gets the same enforcement.
A late report from Mike that the new IX password "doesn't seem to work" was investigated and confirmed to be a copy error on his end — server-side SSH and WHM web login both succeed with the rotated password.
## Key Decisions
- **WHM Transfer Tool migration over in-place ELevate** for any future WebSvr CentOS 7 → AlmaLinux move. Lower risk profile, parallel testing, easy rollback. ELevate makes more sense for AL8→AL9 single-hop later.
- **Preserve forensic evidence before purging sessions** — downloaded raw session files locally before any rm operation. Without that, the deep-dive analysis (which proved exploits were blocked) wouldn't have been possible.
- **Remove uncommented SSH keys despite uncertainty about Rob's identity.** Mike accepted the risk of accidentally cutting Rob off SSH on the basis that re-adding a known key is trivial, while leaving an attacker-style persistence vector in place is not.
- **Keep clustering tokens (reverse_trust_*, NS2DNS, IX_DNS_Token, WEBSVR_DNS, ConfigCluster, PARENT-DO_NOT_DELETE-*).** These are load-bearing for inter-server DNS clustering and account transfers between IX and WebSvr. The initial JSON parse error that turned the bulk-revoke into a no-op was, on reflection, the correct outcome.
- **Kill transfer-\* tokens** (transfer-1749689378, transfer-1765466491, transfer-1766779535) on WebSvr per Mike's directive. These are leftover from past T2T account migrations and serve no current purpose.
- **Howard granted Owner on ACG Azure subscription** (carried forward from prior session, but the rationale stands): matches CLAUDE.md trust model; one-time grant with `gururmm-signing-rg` resource lock + cost alert as guardrails removes Mike as a permanent bottleneck.
- **Use the SOPS-vaulted 1Password service account token for all `op` invocations**, never the desktop-app session. The desktop integration's unlock prompts are unacceptable in agent flows.
- **Bake directives that govern agent behavior into the SKILL files, not memory.** Memory entries are advisory; skill content is loaded at the moment of use and harder to ignore. Confirmed by Mike — "you ignore memory files."
## Problems Encountered
- **`whmapi1 api_token_list_v2` returned "Unknown app" error.** cPanel's API method name was different from what the skill docs implied. Worked around by reading `/var/cpanel/authn/api_tokens_v2/whostmgr/root.json` directly via SFTP+Python.
- **JSON parse error in initial token-revocation script.** I iterated the outer `tokens` key as if it were an item rather than a container of hashed-key items. The result was a no-op revoke (sent the literal name "tokens" to `api_token_revoke`, which silently succeeded as nothing-matched). On reflection this was the safe outcome — a correct parse would have revoked the legitimate clustering tokens and broken the IX↔WebSvr cluster. Caught and re-implemented correctly later.
- **`sops set` flags `--value-stdin` and `--value-file` are not implemented in sops 3.12.2 on Windows** despite being documented in `--help`. Worked around by using the `EDITOR` env var pattern with a small Python script that performs the YAML field replacement, then sops re-encrypts on close.
- **EDITOR path mangling in Git Bash.** Both backslash (`C:\path\script.py`) and forward-slash (`C:/path/script.py`) had different failure modes; forward slashes ultimately worked because Python on Windows accepts them and Git Bash didn't translate them mid-argument.
- **Git Bash MSYS path translation of `/cpsess...` arguments.** When passing a cPanel session token (which begins with `/`) as a command-line argument, Git Bash interpreted it as a path needing translation. Fixed by passing the token without leading slash.
- **`vault.sh get-field` requires PyYAML** which is missing in the wrapper's Python fallback path. Worked around by direct `sops -d` + grep + sed for the rest of the session. Filed mentally as a follow-up to fix the wrapper.
- **Memory file alone wasn't enough.** Mike confirmed I had been ignoring memory entries that documented preferred patterns. Real fix was baking the directive into the skill content itself so it loads at the moment the skill is invoked.
- **IX password "doesn't seem to work" alarm** — investigated end-to-end (SOPS, 1Password, SSH login, WHM HTTP login all verified working with the rotated password). Resolved as a copy-paste error on Mike's end.
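The token-revocation parse bug described above comes down to iterating the wrong level of the JSON. A minimal sketch of the corrected selection logic follows. The file layout (a top-level `tokens` object keyed by token hash, each entry carrying a `name`) is an assumption inferred from the description in this log, not a verified schema, and `select_for_revocation` is a hypothetical helper rather than the code in `revoke_transfer_tokens.py`.

```python
import json

# Only names with these prefixes are candidates for revocation.
# Clustering tokens (NS2DNS, reverse_trust_*, ...) never match.
REVOKE_PREFIXES = ("transfer-",)


def select_for_revocation(raw_json):
    """Return the sorted names of tokens that match a revoke prefix."""
    doc = json.loads(raw_json)
    # "tokens" is the container; its values are the per-token entries.
    entries = doc.get("tokens", {})
    names = [entry.get("name", "") for entry in entries.values()]
    return sorted(n for n in names if n.startswith(REVOKE_PREFIXES))


# Illustrative data only; hashes and the reverse_trust name are made up.
sample = json.dumps({
    "tokens": {
        "hash-a": {"name": "transfer-1749689378"},
        "hash-b": {"name": "NS2DNS"},
        "hash-c": {"name": "reverse_trust_example"},
    }
})
print(select_for_revocation(sample))
```

Selecting by an explicit allow-list of prefixes, rather than revoking everything that is not on a keep-list, means an unexpected token name is left alone by default.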
## Configuration Changes
### Files created
- `.claude/memory/feedback_1password_service_token.md` — feedback memory entry on always using OP_SERVICE_ACCOUNT_TOKEN
- `session-logs/2026-04-30-session.md` — this file
### Files modified (ClaudeTools repo)
- `.claude/commands/1password.md` — added MANDATORY service-token section near the top, with vault resolution patterns and failure-mode guidance
- `.claude/memory/MEMORY.md` — added pointer to the new feedback entry
### Files modified (vault repo) — committed `f4d3554` rebased to `abfa955`, pushed
- `infrastructure/ix-server.sops.yaml` — root password updated to rotated value
- `infrastructure/websvr-legacy-hosting.sops.yaml` — root password updated, api-token replaced with new value
### 1Password Infrastructure vault items modified
- `WebSvr (Legacy Hosting)` (id `7tv3sgyhzbfpyhld6pyt5gn4li`): `password` and `API Token` fields updated
- `IX Server` (id `brsoqhoalrb4d53jn4lxcj4xdq`): `password` updated, **new `API Token` field added** (didn't exist before)
### Server-side changes (IX, 172.16.3.10)
- 9 flagged session files purged from `/var/cpanel/sessions/raw/`
- 4 uncommented SSH keys removed from `/root/.ssh/authorized_keys` (server-side backup at `/root/.ssh/authorized_keys.bak.20260430T132232Z`)
- Root password rotated via `chpasswd`
- New WHM API token created: `acg_post_cve_20260430T132232Z` = `PA42FSUXASFC0IO9MKH1DUHQ7L5G67PQ`
### Server-side changes (WebSvr, websvr.acghosting.com)
- 7 flagged session files purged from `/var/cpanel/sessions/raw/`
- Root password rotated via `chpasswd`
- New WHM API token created: `acg_post_cve_20260430T132423Z` = `81YBBUPPHZ2EEMWJ42WUN2VIOS01X9U5`
- Three transfer-* tokens revoked: `transfer-1749689378`, `transfer-1765466491`, `transfer-1766779535`
### Temp files (forensic evidence — preserved on local workstation)
- `C:/Users/guru/AppData/Local/Temp/cpanel-ioc-evidence/ix/` — 9 raw session files, access_log snapshot (~110 MB), last/wtmp dump, authorized_keys backup, NEW_ROOT_PASSWORD + NEW_API_TOKEN value files, IOC + forensic logs
- `C:/Users/guru/AppData/Local/Temp/cpanel-ioc-evidence/websvr/` — same structure, 7 session files, ~253 MB access_log
- `C:/Users/guru/AppData/Local/Temp/cpanel-ioc-evidence/session_76.18.103.222_dump.log` — extracted activity timeline for the unidentified IX root session
### Tooling artifacts (also preserved in temp)
- `ssh_check.py` — paramiko-based IOC scan + triage runner
- `ssh_remediate.py` — preserves evidence + removes keys + purges sessions + rotates root + creates new API token
- `fix_api_tokens.py` — secondary token enumeration via direct `root.json` read
- `revoke_transfer_tokens.py` — targeted whmapi1 revoke for specific token names
- `forensic_pass.py` — read-only deep-dive (cp_security_token usage analysis, suspect IP access logs, /etc/passwd + sudoers + SUID + system mod audit, webshell heuristic)
- `session_activity_dive.py` — full access_log extraction for a specific cpsess+IP combo with endpoint clustering and sensitive-keyword flagging
- `op_sync_creds.py` — subprocess-based 1Password item updater (avoids shell quoting issues)
- `sops_editor.py` — EDITOR-mode YAML field setter for SOPS-vaulted files
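For reference, the EDITOR-mode pattern behind `sops_editor.py` can be sketched as below. This is not the actual script: the environment variable names `SOPS_SET_KEY` and `SOPS_SET_VALUE` are hypothetical, and the line-based replacement assumes the target key appears once as a plain `key: value` line. sops opens the decrypted file in whatever `EDITOR` points to, passes the temp file path as the last argument, and re-encrypts when the editor exits.

```python
import json
import os
import re
import sys


def set_yaml_field(text, key, value):
    """Replace the value on the first `key:` line, keeping its indentation."""
    pattern = re.compile(r"^([ \t]*)" + re.escape(key) + r":.*$", re.MULTILINE)
    # A JSON string literal is also a valid YAML double-quoted scalar,
    # which keeps characters such as '#', '{' or '>' from being misparsed.
    quoted = json.dumps(value)
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{key}: {quoted}", text, count=1
    )
    if count == 0:
        raise KeyError(key)
    return new_text


def main(argv, env):
    """Edit the file named in argv[1] using key/value taken from env."""
    key = env.get("SOPS_SET_KEY")      # hypothetical variable name
    value = env.get("SOPS_SET_VALUE")  # hypothetical variable name
    if len(argv) != 2 or key is None or value is None:
        return 1
    with open(argv[1], encoding="utf-8") as fh:
        text = fh.read()
    with open(argv[1], "w", encoding="utf-8") as fh:
        fh.write(set_yaml_field(text, key, value))
    return 0


if __name__ == "__main__":
    main(sys.argv, os.environ)
```

Passing the value through the environment rather than the command line keeps it out of shell history and sidesteps the Git Bash quoting issues noted under Problems Encountered.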
## Credentials & Secrets (UNREDACTED)
### Rotated 2026-04-30
| Service | Username | Value |
|---|---|---|
| IX server SSH/WHM root | root | `t4qygLl7{1zJcUj#022W^FBQ>}qYp-Od` |
| WebSvr SSH/WHM root | root | `[3H+_f.Yh4c0>@egH[6L!?u]S3s[9C82` |
| IX WHM API token (`acg_post_cve_20260430T132232Z`) | n/a | `PA42FSUXASFC0IO9MKH1DUHQ7L5G67PQ` |
| WebSvr WHM API token (`acg_post_cve_20260430T132423Z`) | n/a | `81YBBUPPHZ2EEMWJ42WUN2VIOS01X9U5` |
All four are stored in the 1Password Infrastructure vault. The two root passwords and the WebSvr API token are also in the SOPS vault entries; the IX vault entry doesn't currently have an `api-token` field structure, so the IX API token is in 1Password but not yet in SOPS.
### Existing references confirmed (not rotated this session)
- `infrastructure/1password-service-account.sops.yaml` — Agentic-RW service token, kind=`api-key`. Used for prompt-free `op` CLI access. Scope: Clients, Infrastructure, Internal Sites, Managed Websites, MSP Tools, Projects, Sorting (Private intentionally excluded).
- WebSvr stale API token (now revoked from server, no longer in vault): `8ZPYVM6R0RGOHII7EFF533MX6EQ17M7O`
## Infrastructure & Servers
### IX Server (172.16.3.10 / `ix.azcomputerguru.com` / public 72.194.62.5)
- OS: CloudLinux 9.7 (TuxCare ELS kernel)
- cPanel: 11.134.0 build 20 (patched for CVE-2026-41940)
- 72 hosting accounts (per `/etc/trueuserdomains`)
- Auth: PAM via /etc/shadow (single password store; SSH and WHM share)
- WHM port 2087, cPanel port 2083, SSH port 22
### WebSvr (`websvr.acghosting.com`)
- External IPs: **162.248.93.78, 162.248.93.81, 162.248.93.233** (the vault entry shows .81; the public-IP probe returned .233)
- OS: CloudLinux 7.9 (CentOS 7 base — past EOL)
- cPanel: 11.110.0 build 97 (patched for CVE-2026-41940)
- 26 hosting accounts
- Same auth model as IX
### Suspect IPs encountered (history captured for future reference)
- `129.222.129.230` — confirmed Mike (today's WHM session on IX)
- `129.222.143.18` — likely Rob (web guy), Dec 15 2025 SSH burst on IX, no WHM activity
- `76.18.103.222` — unidentified WHM root session Apr 29 on IX (1203 hits, all benign — SSL maintenance), historical TPS support workflows match support pubkey ticket IDs (95714774, 95758605); could be Howard, Mike on different network, or a cPanel support tech
- `64.139.88.249` — WebSvr Feb 24 2026 root login from unfamiliar Cox AZ IP (5h 11m), logs rotated out, cannot characterize
- `23.180.120.132`, `143.198.113.39`, `159.65.217.152`, `149.102.229.144`, `195.177.94.161` — botnet scanner IPs (DigitalOcean, etc.) hitting `/json-api/version` with injected tokens. **All HTTP 403, all blocked.**
## Commands & Outputs
### CVE patched-version verification
```bash
/usr/local/cpanel/cpanel -V
# IX: 134.0 (build 20) <-- patched build for v134 stream
# WebSvr: 110.0 (build 97) <-- patched build for v110 stream
cat /etc/cpupdate.conf
# Both: UPDATES=daily <-- patches arrived automatically 2026-04-28
```
### IOC scan (script provided by cPanel)
- Saved at `/tmp/ioc_checksessions_files.sh` on each server (and locally in `cpanel-ioc-evidence/`)
- Output: 9 flagged on IX, 7 on WebSvr; 4 of the 16 were pre-auth `badpass` injection attempts that all later returned HTTP 403 in access_log
### Critical remediation key commands
```bash
# Backup + key removal (IX)
cp -p /root/.ssh/authorized_keys /root/.ssh/authorized_keys.bak.<TS>
sed -i.removed-<TS> '<line_nums>d' /root/.ssh/authorized_keys
# Session purge
for f in <flagged_files>; do rm -f "/var/cpanel/sessions/raw/$f"; done
# Password rotation
echo 'root:<NEWPW>' | chpasswd
# WHM API token rotation
whmapi1 api_token_create token_name='acg_post_cve_<TS>'
whmapi1 api_token_revoke token_name='<old_name>'
```
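The `<line_nums>` placeholder in the key-removal step has to come from somewhere. One way to find entries with no trailing comment is sketched below, assuming the usual `[options] keytype base64 [comment]` layout of `authorized_keys`. `uncommented_key_lines` is a hypothetical helper, not part of `ssh_remediate.py`, and it does not handle option strings that contain quoted spaces.

```python
# Key-type prefixes recognised by this sketch.
KEY_TYPES = ("ssh-rsa", "ssh-ed25519", "ssh-dss", "ecdsa-sha2-", "sk-")


def uncommented_key_lines(text):
    """Return 1-based line numbers of key entries that have no comment field."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        for i, field in enumerate(fields):
            if field.startswith(KEY_TYPES):
                # fields[i + 1] is the base64 blob; anything after it is the comment.
                if len(fields) <= i + 2:
                    hits.append(lineno)
                break
    return hits


# Illustrative data only; the key blobs are truncated placeholders.
sample = (
    "ssh-ed25519 AAAAC3Nza mike@workstation\n"
    "ssh-rsa AAAAB3Nza\n"
    "# note\n"
    "ssh-rsa AAAAB3Nzb\n"
)
print(uncommented_key_lines(sample))
```

A missing comment is only a heuristic: it flags entries worth reviewing and does not prove that a key is malicious.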
### 1Password CLI prompt-free auth (the new pattern)
```bash
SVC=$(sops -d /c/Users/guru/vault/infrastructure/1password-service-account.sops.yaml 2>/dev/null \
| grep -E '^\s*credential:' | sed -E 's/^\s*credential:\s*//' | head -1)
export OP_SERVICE_ACCOUNT_TOKEN="$SVC"
op whoami # User Type: SERVICE_ACCOUNT — no prompts
op item get "<id>" --vault Infrastructure --format json # NB: --vault required for service accounts
```
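The same extraction can be done without PyYAML, which matters given the `vault.sh get-field` dependency problem noted earlier. The sketch below mirrors the `grep | sed | head -1` pipeline in Python. `first_field` and `load_service_token` are hypothetical names, and the approach assumes `credential` is a plain unquoted scalar on one line.

```python
import re
import subprocess


def first_field(decrypted_yaml, key):
    """Return the value of the first `key:` line, or None if absent."""
    pattern = re.compile(r"^\s*" + re.escape(key) + r":\s*(.*)$")
    for line in decrypted_yaml.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1).strip()
    return None


def load_service_token(path):
    """Decrypt a SOPS file with `sops -d` and pull out the credential field."""
    result = subprocess.run(
        ["sops", "-d", path], capture_output=True, text=True, check=True
    )
    return first_field(result.stdout, "credential")


# Illustrative input only; this is not a real token.
sample = "kind: api-key\ncredential: EXAMPLE-NOT-A-REAL-TOKEN\n"
print(first_field(sample, "credential"))
```

The returned value would then be placed in `OP_SERVICE_ACCOUNT_TOKEN` in the environment of the `op` subprocess rather than passed on the command line.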
### Verification of rotated IX password (after Mike's "not working" alarm)
```
SSH (paramiko): OK — connected as root, /etc/shadow shows hash dated 20573
WHM web login: HTTP 200, status:1, /cpsess3383026374 issued
SOPS vault value: exact match
1Password value: exact match
```
→ password is correct end-to-end; Mike's "not working" was a copy error on his side.
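The `20573` in the SSH check is the last-change field from `/etc/shadow`, which counts days since 1970-01-01. A short conversion confirms that it corresponds to the rotation date (`shadow_lastchg_to_date` is a name chosen here for illustration):

```python
from datetime import date, timedelta


def shadow_lastchg_to_date(days):
    """Convert the /etc/shadow last-change field (days since epoch) to a date."""
    return date(1970, 1, 1) + timedelta(days=days)


print(shadow_lastchg_to_date(20573))  # 2026-04-30
```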
## Pending / Incomplete Tasks
### Mike's outstanding items
- [ ] Send the Michelle Sora email when ready (`C:/Users/guru/AppData/Local/Temp/michelle-email-draft.txt`)
- [ ] Verify identity of `76.18.103.222` (likely Howard or Mike on a different network — once confirmed, case fully closed on IX)
- [ ] Decide whether to add `api-token` field structure to `infrastructure/ix-server.sops.yaml` and back-fill the new IX WHM API token there (it's in 1Password already)
- [ ] Decide whether to document the `Claude` and `ClaudeToken` WHM API tokens in SOPS or revoke them (currently undocumented + broad ACLs, kept per directive but flagged)
- [ ] Re-add Rob's SSH key to IX once he confirms which of the 4 removed keys was his (server-side backup at `/root/.ssh/authorized_keys.bak.20260430T132232Z`)
### Tracked TODOs (not blocking)
- [ ] Fix `vault.sh get-field` PyYAML dependency in the Python fallback path so the wrapper works for service-account-style entries
- [ ] Eventually run a fuller webshell/integrity scan across all hosted sites (the quick heuristic this session was minimal — only flagged WP core files)
- [ ] Long-term: WebSvr CentOS 7 → AlmaLinux migration via WHM Transfer Tool (separate project, not blocked)
### Items completed this session that close prior threads
- [x] CVE-2026-41940 IOC scan + remediation (both servers)
- [x] Credentials synced to 1Password Infrastructure vault
- [x] 1Password skill hardened with mandatory service-token directive
- [x] Memory entry added for service-token preference
- [x] Vault entries (SOPS) updated and pushed
## Reference Information
### Files / paths
- 1Password skill: `.claude/commands/1password.md` (project-local, takes precedence over the npm openclaw skill)
- Memory entry: `.claude/memory/feedback_1password_service_token.md`
- Memory index: `.claude/memory/MEMORY.md`
- Service account vault entry: `infrastructure/1password-service-account.sops.yaml`
- IX vault entry: `infrastructure/ix-server.sops.yaml`
- WebSvr vault entry: `infrastructure/websvr-legacy-hosting.sops.yaml`
- Forensic evidence directory: `C:/Users/guru/AppData/Local/Temp/cpanel-ioc-evidence/`
### Vault scope (Agentic-RW service account, verified 2026-04-30)
Visible: Clients, Infrastructure, Internal Sites, Managed Websites, MSP Tools, Projects, Sorting
Excluded: Private (intentional)
### CVE-2026-41940 reference
- CVSS 9.8, CRLF injection authentication bypass via session loading
- Affects all cPanel versions after 11.40 including DNSOnly
- Patched in: 11.110.0.97, 11.118.0.63, 11.126.0.54, 11.132.0.29, 11.134.0.20, 11.136.0.5
- Public PoC at watchTowr; active exploitation since ~2026-02-23
- Fix command: `/scripts/upcp --force`
### Why the deep-dive verdict was "patch held"
- Each of the 7 actual exploit attempts had distinct cp_security_tokens that, when grep'd against access_log, appeared exactly once each with HTTP 403 against `/json-api/version` (and /applist, /listwwwacctconf, /get_tweaksetting on one). No HTTP 200 with an injected token from any external IP. The patch's session-validation logic is doing its job.
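The check behind this verdict can be sketched as follows. It is an illustration rather than the code in `forensic_pass.py`: the log layout (a quoted request line followed directly by the status code) is an assumption, and the tokens and IPs in the sample are placeholders, not values from the incident.

```python
import re
from collections import defaultdict

# Assumed layout: ... "METHOD /path HTTP/x.y" STATUS ...
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+) HTTP/[\d.]+" (\d{3})')


def statuses_by_token(log_lines, tokens):
    """Map each token to the HTTP statuses of requests whose path contains it."""
    seen = defaultdict(list)
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        path, status = match.group(1), int(match.group(2))
        for token in tokens:
            if token in path:
                seen[token].append(status)
    return dict(seen)


def all_blocked(log_lines, tokens):
    """True only if every token was seen and every request with it got a 403."""
    by_token = statuses_by_token(log_lines, tokens)
    for token in tokens:
        statuses = by_token.get(token, [])
        if not statuses or any(status != 403 for status in statuses):
            return False
    return True


# Illustrative data only (documentation IP range, fake tokens).
sample_log = [
    '203.0.113.10 - - [29/Apr/2026:10:00:00 -0700] '
    '"GET /cpsess0000000001/json-api/version HTTP/1.1" 403 0',
    '203.0.113.11 - - [29/Apr/2026:10:05:00 -0700] '
    '"GET /cpsess0000000002/json-api/applist HTTP/1.1" 403 0',
]
sample_tokens = ["cpsess0000000001", "cpsess0000000002"]
print(all_blocked(sample_log, sample_tokens))
```

Treating a token that never appears in the log as "not blocked" is deliberate: absence of evidence (for example, a rotated log) should not be read as a pass.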
---
## Update: 11:25 — Multi-client MSP work (Tedards, Bardach, Dataforth, Cascades/Golden Corral)
## User
- **User:** Mike Swanson (mike)
- **Machine:** DESKTOP-0O8A1RL
- **Role:** admin
## Session Summary
The session opened with a request to reset the webmail password for `accounting@tucsongoldencorral.com` on the Neptune Exchange server (67.206.163.124, Exchange 2016). WinRM was firewalled even after a VPN change, and browser automation via ECP at `https://neptune.acghosting.com/ecp` was attempted but interrupted — Mike resolved the password directly via Active Directory on DC16.
The session then shifted to Dataforth M365, granting Dan Center (`dcenter@dataforth.com`) FullAccess to Joel Lohr's (`jlohr@dataforth.com`) mailbox. This was executed via Exchange Operator InvokeCommand (`Add-MailboxPermission`) and completed cleanly with AutoMapping enabled.
Significant remediation tool work followed. The `onboard-tenant.sh` script was patched to assign the **Conditional Access Administrator** directory role to the Tenant Admin service principal at onboard time (resolving a 403 on CA policy Graph endpoints), and Howard's independently discovered `Policy.Read.All` backfill block was retained. A `# TODO(howard)` comment was added to the `role_assigned()` function documenting the PIM roleAssignmentSchedules gap. tedards.net was fully onboarded to the remediation tool suite.
Bardach client work: confirmed Barbara Bardach (`barbara@bardach.net`) holds Exchange Online Plan 2 + EXCHANGEARCHIVE licenses (100GB primary, 110GB archive). Auto-expanding archive was enabled via Exchange Operator InvokeCommand (`Enable-Mailbox -AutoExpandingArchive`), returning `AutoExpandingArchiveEnabled: true`. The bardach.net tenant was freshly onboarded this session.
QuickBooks Desktop 2024 "Missing PDF component" error on Yvonne Tedards' Windows 11 machine: the Amyuni PDF Converter virtual printer was missing entirely. Root cause identified as Windows 11 Protected Print Mode blocking legacy unsigned printer drivers. Steps given: disable Protected Print Mode in Settings, then run QB Repair from Programs and Features. Awaiting confirmation.
Syncro ticket management for Tedards: logged 30 min Remote Business ($75) on ticket #32219 (QB error), and created new ticket #32228 for the email delivery issue with `lindsay@agencyzoomify.com` (no billing yet).
Full DKIM setup for tedards.net was completed end-to-end via automation: selector1/selector2 CNAME values retrieved from M365 Exchange Online, added to the tedards.net DNS zone via WHM API (zone lives directly in WHM on the ACG IX server — no separate cPanel account), and DKIM enabled via `Set-DkimSigningConfig`. Final status: `Enabled: true, Status: Valid`. A `p=none` DMARC record was also added. A cron job was scheduled at 1:17 PM to auto-escalate DMARC to `p=quarantine` if DNS validation passes.
## Key Decisions
- **ECP browser automation for Neptune password reset abandoned** in favor of AD on DC16. WinRM blocked externally; AD reset is the correct tool for Exchange 2016 on-prem.
- **onboard-tenant.sh CA Admin fix via script** rather than ad-hoc patching. Idempotent; safe to re-run against existing tenants.
- **role_assigned() PIM gap flagged as TODO for Howard** — fix requires querying `roleAssignmentSchedules` in addition to `roleAssignments`; deferred to Howard who discovered it.
- **tedards.net DKIM handled via WHM API directly** — no separate cPanel account exists; zone is in WHM under ACG server account. Full automation, no browser required.
- **DMARC escalation deferred 2 hours** to allow propagation verification before moving from `p=none` to `p=quarantine`.
- **Bardach auto-expanding archive** chosen over additional Archive licenses. Exchange Online Plan 2 includes auto-expanding at no extra cost; archive quota becomes unlimited.
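The `role_assigned()` gap can be illustrated with a small sketch. The real function lives in `onboard-tenant.sh` (bash); this Python version only shows the logic of checking both sources. It assumes the caller has already fetched the `value` arrays from the Graph `roleManagement/directory/roleAssignments` and `roleManagement/directory/roleAssignmentSchedules` endpoints, and the IDs in the sample are placeholders.

```python
def role_assigned(principal_id, role_definition_id, assignments, schedules):
    """True if the principal holds the role directly or via a PIM schedule."""

    def matches(item):
        return (
            item.get("principalId") == principal_id
            and item.get("roleDefinitionId") == role_definition_id
        )

    # Checking only `assignments` is the gap flagged for Howard: a role
    # granted through PIM may be visible only in `schedules`.
    return any(matches(i) for i in assignments) or any(matches(i) for i in schedules)


# Illustrative data only; these are not real object or role IDs.
direct = []
pim = [{"principalId": "sp-123", "roleDefinitionId": "role-ca-admin"}]
print(role_assigned("sp-123", "role-ca-admin", direct, pim))
```

Keeping the check a pure function over already-fetched lists makes the idempotency of a re-run easy to reason about: the script assigns the role only when this returns false.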
## Problems Encountered
- **investigator-exo 401 on tedards.net Exchange Online**: Security Investigator app returns 401 on InvokeCommand. Resolved by switching to exchange-op tier which has `full_access_as_app` Exchange role.
- **WHM account search returned no results for tedards**: tedards.net DNS zone managed directly in WHM (no cPanel account). Confirmed via `dumpzone` API.
- **Get-DkimSigningConfig with Domain parameter returned null**: M365 InvokeCommand rejects the `Domain` parameter on this cmdlet. Resolved by calling with empty parameters and filtering client-side.
- **M365 returned CnameMissing immediately after enabling DKIM**: stale negative cache. Records resolved correctly from 8.8.8.8. Re-running enable after 5 seconds returned `Enabled: true, Status: Valid`.
## Infrastructure and DNS Changes
### tedards.net DNS (WHM on 72.194.62.5)
| Record | Type | Value | Action |
|---|---|---|---|
| selector1._domainkey.tedards.net | CNAME | selector1-tedards-net._domainkey.tedards.w-v1.dkim.mail.microsoft | Added |
| selector2._domainkey.tedards.net | CNAME | selector2-tedards-net._domainkey.tedards.w-v1.dkim.mail.microsoft | Added |
| _dmarc.tedards.net | TXT | v=DMARC1; p=none; sp=none; adkim=r; aspf=r; | Added |
### M365 Changes
| Tenant | Action |
|---|---|
| dataforth.com | dcenter FullAccess to jlohr mailbox (Exchange Online) |
| bardach.net | Auto-expanding archive enabled for barbara@bardach.net |
| tedards.net | DKIM enabled (Enabled: true, Status: Valid) |
## Syncro Tickets
| Ticket | Client | Action |
|---|---|---|
| #32219 (ID 109545451) | Bill/Yvonne Tedards | 30 min Remote Business logged — QB PDF component fix ($75) |
| #32228 (ID 109697650) | Bill/Yvonne Tedards | Created — email delivery issue with lindsay@agencyzoomify.com (no billing yet) |
## Pending Tasks
- **QuickBooks PDF fix confirmation**: Yvonne Tedards, Win11. Steps given (disable Protected Print Mode + QB Repair). Awaiting result.
- **Tedards DMARC escalation**: cron scheduled 1:17 PM to escalate p=none to p=quarantine. Session-only — if Claude exits, run manually.
- **Tedards email issue** (ticket #32228): inability to send/receive email to/from lindsay@agencyzoomify.com. Not yet investigated.
- **Backfill onboard-tenant.sh** against 6 ACG tenants: bg-builders, cascades-tucson, cw-concrete, dataforth, heieck-org, mvan. Scheduled for 21:00 PT per note to Howard.
- **Howard TODO**: Fix `role_assigned()` in onboard-tenant.sh to also query `roleAssignmentSchedules` for PIM-managed assignments.
- **Cascades**: Grant Howard Contributor on `rg-audit-cascadestucson` once he creates the RG.
## Reference
- Neptune Exchange ECP: https://neptune.acghosting.com/ecp (Exchange 2016, on-prem)
- WHM API base: https://72.194.62.5:2087 (credentials in vault: infrastructure/ix-server.sops.yaml)
- tedards.net tenant ID: 4fcbb1f4-fbf9-4548-a93e-7d14a3c091e6
- bardach.net tenant ID: dd4a82e8-85a3-44ac-8800-07945ab4d95f
- Syncro API base: https://computerguru.syncromsp.com/api/v1 (vault: msp-tools/syncro.sops.yaml)
- onboard-tenant.sh: D:/claudetools/.claude/skills/remediation-tool/scripts/onboard-tenant.sh
---
## Note for Howard
**TIME TRACKING WORKFLOW ISSUE: 31 Tickets Missing Time Entries**
Mike clarified the business rule after my investigation: **ALL tickets should have time entries logged in Syncro** (even warranty/free work), with only cancelled tickets excepted. Time tracking is required for reporting/metrics, separate from billing decisions.
**What I Found:**
**Billing Status: ✅ WORKING CORRECTLY**
- 29 out of 31 tickets have proper invoices attached
- 2 tickets correctly have no invoices (Non-Billable/Cancelled)
- Revenue is being captured properly
**Time Tracking Status: ❌ BYPASSED ENTIRELY**
- All 29 invoiced tickets used **manual invoice line items** instead of time tracking
- Hours were typed into invoice descriptions ("Applied 1.5 Prepay Hours")
- Syncro time tracking system shows 00:00:00 for all 31 tickets
- This prevents time-based reporting (hours per client, technician productivity, etc.)
**Pattern Analysis:**
Current workflow appears to be:
1. Do work
2. Create invoice with manual line items
3. Type hours into description text
4. Close ticket without logging time
Proper workflow should be:
1. Do work
2. **Log time entry on ticket** (records hours in Syncro)
3. Time entry auto-generates invoice line item
4. Invoice sent
**Breakdown of 29 Invoiced Tickets:**
- **20 line items:** Prepay hour deductions (hours in description: "Applied X.0 Prepay Hours")
- **16 line items:** Direct charges (labor + products, billed at standard rates)
**Examples:**
- #32223 (Kittle): Manual "$75.0 - M365 user provisioning...0.5 hrs" — should have 0.5hr time entry
- #32218 (Instrumental): "Applied 1.5 Prepay Hours" — should have 1.5hr time entry
- #32167 (Cascades): "Applied 1.0 Prepay Hours" + "Applied 2.0 Prepay Hours" — should have 3.0hr time entry
- #32156 (Cascades): "Applied 8.0 Prepay Hours" — should have 8.0hr time entry
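If the missing time entries are back-filled, the hours can be recovered from the invoice text. The sketch below assumes the descriptions follow the "Applied X Prepay Hours" wording quoted above. `prepay_hours` is a hypothetical helper and does not call the Syncro API.

```python
import re

# Matches the wording quoted in the examples, e.g. "Applied 1.5 Prepay Hours".
PREPAY_RE = re.compile(r"Applied (\d+(?:\.\d+)?) Prepay Hours?")


def prepay_hours(descriptions):
    """Sum the prepay hours mentioned across a ticket's invoice line items."""
    return sum(float(h) for d in descriptions for h in PREPAY_RE.findall(d))


# Ticket #32167 from the examples: two line items, one 3.0hr time entry.
print(prepay_hours(["Applied 1.0 Prepay Hours", "Applied 2.0 Prepay Hours"]))  # 3.0
```

Direct-charge line items such as the one on #32223 use a different wording ("0.5 hrs") and would need their own pattern.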
**Action Required:**
Per Mike's clarification:
1. **All tickets need time entries** — even warranty/free work should log time (mark as "Warranty" or appropriate type)
2. **Review workflow** — ensure time tracking discipline going forward
3. **Reporting impact** — missing time data means we can't accurately report on:
   - Hours spent per client
   - Technician productivity
   - Average ticket resolution time
   - Prepay hour burn rates
**Note on the 2 Non-Invoiced Tickets:**
- #32083 (DAnaise.com) — "Onsite - Alicia's computer freezing" — Non-Billable (still needs time entry if work was done)
- #32022 (Michael Johnson) — "*Cancelled* Onsite - Printer error" — Cancelled (no time entry needed)
**Note on Sombra (#32225):** Per Mike, RMM enrollment doesn't require billing, but if any actual work was done, it should have a time entry.
---
## Update: 17:10 — Tedards email diagnosis, DMARC escalation, billing
## User
- **User:** Mike Swanson (mike)
- **Machine:** DESKTOP-0O8A1RL
- **Role:** admin
## Session Summary
Diagnosed an email delivery issue for Tedards where emails from `lindsay@agencyzoomify.com` were routing to trash without any client-side rule. Checked Exchange Online inbox rules for `y226@tedards.net` (29 rules found, none targeting agencyzoomify.com) and reviewed the junk email configuration (blocked senders list did not include agencyzoomify.com). DNS email authentication for agencyzoomify.com was checked: SPF covers Titan Email and M365 with `~all` fallback, DMARC is set to `p=quarantine`, but DKIM records (selector1/selector2 CNAMEs) are entirely absent. Root cause identified as DMARC quarantine policy with no DKIM alignment — EOP at the receiving side quarantines messages that fail DMARC. Recommended adding `lindsay@agencyzoomify.com` to Yvonne's trusted senders as an immediate workaround, and advised that Lindsay's IT needs to enable DKIM in M365 for agencyzoomify.com. Mike has not yet confirmed the trusted senders add — still pending.
The tedards.net DMARC escalation cron job fired at 1:17 PM. DKIM was confirmed still `Enabled: true, Status: Valid` in M365. The `_dmarc.tedards.net` TXT record was resolving cleanly from public DNS (`p=none`). The old record (WHM zone line 19) was removed via `removezonerecord` and a new `p=quarantine` record was added via `addzonerecord`. Verification via nslookup from 8.8.8.8 confirmed the new record live.
Sync pulled Howard's new client stub for Sombra Residential LLC — a Windows Server 2012 box (labelled Server2013, actually WS2012 build 9200) enrolled in GuruRMM today. Machine is EOL since 2023-10-10 and running unpatched. Howard flagged it for Mike to discuss migration path with the client.
Billing was logged for the DKIM/DMARC work after showing Mike a preview: new Syncro ticket #32231 created (status Resolved), 1hr Remote Business at $150.
## Key Decisions
- **Trusted senders add pending explicit confirmation** — adding to the junk bypass list is a tenant-side change that affects mail filtering posture; held for Mike's yes.
- **DMARC escalated to p=quarantine rather than p=reject** — quarantine is a safe production policy; p=reject requires higher confidence in DKIM/SPF coverage and should be a deliberate next step.
- **Billing preview shown before submitting** — after missing the preview on the QB ticket earlier in the session, adopted pattern of showing subject/description/labor/amount before any Syncro POST.
## Problems Encountered
- **agencyzoomify.com has no DKIM** — `selector1._domainkey.agencyzoomify.com` returns NXDOMAIN. Their DMARC is `p=quarantine` which means any message failing DMARC alignment (likely on DKIM since SPF alignment depends on envelope-from) gets quarantined at the recipient. Not a tedards.net issue — it is entirely on the sending side.
## Infrastructure and DNS Changes
### tedards.net DNS (WHM on 72.194.62.5)
| Record | Change |
|---|---|
| `_dmarc.tedards.net` TXT | Updated: `p=none` → `p=quarantine; sp=quarantine; adkim=r; aspf=r;` |
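A quick way to confirm which policy a DMARC TXT value carries, before and after an escalation like this one, is to split it into tags. The sketch below is a minimal parser for the `tag=value;` syntax and does no DNS lookup; `parse_dmarc` and `policy` are names chosen here for illustration.

```python
def parse_dmarc(record):
    """Split a DMARC TXT value into a dict of lowercase tag names to values."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            name, _, value = part.partition("=")
            tags[name.strip().lower()] = value.strip()
    return tags


def policy(record):
    """Return the p= policy, rejecting values that are not DMARC records."""
    tags = parse_dmarc(record)
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record")
    return tags.get("p")


print(policy("v=DMARC1; p=quarantine; sp=quarantine; adkim=r; aspf=r;"))
```

An escalation job could compare `policy()` of the live record against the expected value and only replace the zone record when they differ.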
## Syncro Tickets
| Ticket | Client | Action |
|---|---|---|
| #32231 (ID 109712846) | Bill/Yvonne Tedards | Created + 1hr Remote Business — DKIM/DMARC setup ($150) |
## Pending Tasks
- **Trusted senders add for Yvonne** — add `lindsay@agencyzoomify.com` to `y226@tedards.net` trusted senders via `Set-MailboxJunkEmailConfiguration`. Mike to confirm.
- **lindsay@agencyzoomify.com DKIM** — advise Yvonne to pass to Lindsay: enable DKIM in M365 Defender portal for agencyzoomify.com. Without it, their `p=quarantine` DMARC will continue causing delivery issues at other recipients too.
- **Sombra Residential WS2012 EOL** — Server2013 (actually WS2012, EOL 2023-10-10) enrolled by Howard. Needs migration path discussion with client. sysadmin account password also needs to be captured in vault.
- **QB PDF fix** (Yvonne Tedards) — awaiting confirmation that disabling Protected Print Mode + QB Repair resolved the issue.
- **Tedards email issue ticket #32228** — `lindsay@agencyzoomify.com` delivery problem. Root cause found; fix pending.
## Reference
- tedards.net Exchange mailboxes: `bt@tedards.net` (Bill), `y226@tedards.net` (Yvonne)
- tedards.net tenant ID: `4fcbb1f4-fbf9-4548-a93e-7d14a3c091e6`
- WHM API: `https://72.194.62.5:2087` (vault: `infrastructure/ix-server.sops.yaml`)
- agencyzoomify.com DKIM status: NO RECORDS — selector1/selector2 NXDOMAIN
- agencyzoomify.com DMARC: `v=DMARC1; p=quarantine; rua=mailto:lindsay@agencyzoomify.com`
- Sombra Residential vault: `clients/sombra-residential/server2013.sops.yaml`
- Syncro ticket #32228: Tedards email issue (no billing yet)
- Syncro ticket #32231: Tedards DKIM/DMARC ($150 logged)