2026-04-30 — cPanel CVE-2026-41940 incident response on IX + WebSvr + 1Password skill hardening

User

  • User: Mike Swanson (mike)
  • Machine: GURU-BEAST-ROG
  • Role: admin
  • Session span: 2026-04-29 ~14:30 PT rolling into 2026-04-30 ~07:15 PT (~6 hours of active engagement)

Session Summary

The session began with a follow-up email to Michelle Sora, pitching a migration from GoDaddy-resold M365 to direct billing with consolidation of pro-techservices.co and pro-techhelps.com under one tenant. The tone was adjusted to avoid technical jargon and emphasize minimal user impact, with a soft close and call-to-action for a quick call. Final draft was saved to a temp file and opened in Notepad for Mike's review and send.

The bulk of the session focused on responding to cPanel CVE-2026-41940, a CRLF injection authentication bypass (CVSS 9.8) that has been actively exploited in the wild since approximately 2026-02-23, per public security research. After verifying that both ACG cPanel servers (WebSvr on CentOS 7 + cPanel 11.110.0.97; IX on CloudLinux 9 + cPanel 11.134.0.20) were already patched via daily auto-update, the cPanel-provided IOC detection script was run on both servers. Initial findings showed 7 flagged sessions on WebSvr and 9 on IX (16 in total), including root sessions, plus four uncommented RSA keys in IX's /root/.ssh/authorized_keys, the classic attacker-persistence fingerprint.

Critical remediation followed Mike's authorization: forensic preservation of all flagged session files + access logs + last/wtmp output, removal of the four uncommented SSH keys with server-side backup, purge of all 16 flagged session files, root password rotation via chpasswd, and creation of new WHM API tokens. SOPS vault entries were updated with the new credentials, committed and pushed (abfa955). Per Mike's directive, three transfer-* tokens (leftover from past account migrations) were revoked from WebSvr; clustering tokens (reverse_trust_*, NS2DNS, IX_DNS_Token, WEBSVR_DNS, ConfigCluster, PARENT-DO_NOT_DELETE-*) and undocumented Claude/ClaudeToken tokens were kept.

A subsequent forensic deep-dive cleared the picture significantly. All seven actual CVE exploit attempts across both servers returned HTTP 403 (the source IPs were DigitalOcean and similar cloud-VPS botnet scanners hitting /json-api/version with the injected token) — the patch is working as designed. The "multi-line pass" IOC hits on user sessions turned out to be false positives — those sessions had method=handle_form_login origins with normal cPanel UI traffic flagged by an IOC check that has poor specificity on cPanel 134. The unidentified 76.18.103.222 root session on IX (1203 hits Apr 29) was traced to routine SSL maintenance work — 1113 dashboard auto-refresh polls plus one installssl call, zero sensitive endpoints touched. Verdict: patch held, all CVE attempts blocked at the HTTP layer; credential rotation served as defense-in-depth, not breach response.

The session closed with two pieces of process improvement. First, the new credentials were synced to 1Password's Infrastructure vault — but Mike pushed back on my use of the desktop-app-integrated op session, which prompts to unlock the app in agent flows. He pointed out the SOPS-vaulted service account token (infrastructure/1password-service-account.sops.yaml) that should be used. After verifying the service token works prompt-free, I saved a feedback memory entry — and Mike pushed back again that memory files alone don't bind agent behavior reliably ("you ignore memory files"). The directive was then baked directly into .claude/commands/1password.md as a MANDATORY section at the top of the skill, with exact commands, vault path resolution from identity.json, scope details, and failure-mode guidance. Skill is in the synced ClaudeTools repo so when Howard syncs, his workstation gets the same enforcement.

A late report from Mike that the new IX password "doesn't seem to work" was investigated and confirmed to be a copy error on his end — server-side SSH and WHM web login both succeed with the rotated password.

Key Decisions

  • WHM Transfer Tool migration over in-place ELevate for any future WebSvr CentOS 7 → AlmaLinux move. Lower risk profile, parallel testing, easy rollback. ELevate makes more sense for AL8→AL9 single-hop later.
  • Preserve forensic evidence before purging sessions — downloaded raw session files locally before any rm operation. Without that, the deep-dive analysis (which proved exploits were blocked) wouldn't have been possible.
  • Remove uncommented SSH keys despite uncertainty about Rob's identity. Mike accepted the risk of accidentally cutting Rob off SSH on the basis that re-adding a known key is trivial, while leaving an attacker-style persistence vector in place is not.
  • Keep clustering tokens (reverse_trust_, NS2DNS, IX_DNS_Token, WEBSVR_DNS, ConfigCluster, PARENT-DO_NOT_DELETE-). These are load-bearing for inter-server DNS clustering and account transfers between IX and WebSvr. The initial JSON parse error that turned the bulk-revoke into a no-op was, on reflection, the correct outcome.
  • Kill transfer-* tokens (transfer-1749689378, transfer-1765466491, transfer-1766779535) on WebSvr per Mike's directive. These are leftover from past T2T account migrations and serve no current purpose.
  • Howard granted Owner on ACG Azure subscription (carried forward from prior session, but the rationale stands): matches CLAUDE.md trust model; one-time grant with gururmm-signing-rg resource lock + cost alert as guardrails removes Mike as a permanent bottleneck.
  • Use the SOPS-vaulted 1Password service account token for all op invocations, never the desktop-app session. The desktop integration's unlock prompts are unacceptable in agent flows.
  • Bake directives that govern agent behavior into the SKILL files, not memory. Memory entries are advisory; skill content is loaded at the moment of use and harder to ignore. Confirmed by Mike — "you ignore memory files."

Problems Encountered

  • whmapi1 api_token_list_v2 returned "Unknown app" error. cPanel's API method name was different from what the skill docs implied. Worked around by reading /var/cpanel/authn/api_tokens_v2/whostmgr/root.json directly via SFTP+Python.
  • JSON parse error in initial token-revocation script. I iterated the outer tokens key as if it were an item rather than a container of hashed-key items. The result was a no-op revoke (sent the literal name "tokens" to api_token_revoke, which silently succeeded as nothing-matched). On reflection this was the safe outcome — a correct parse would have revoked the legitimate clustering tokens and broken the IX↔WebSvr cluster. Caught and re-implemented correctly later.
  • sops set flags --value-stdin and --value-file are not implemented in sops 3.12.2 on Windows despite being documented in --help. Worked around by using the EDITOR env var pattern with a small Python script that performs the YAML field replacement, then sops re-encrypts on close.
  • EDITOR path mangling in Git Bash. The backslash form (C:\path\script.py) and the forward-slash form (C:/path/script.py) initially failed in different ways; the forward-slash form ultimately worked because Python on Windows accepts forward slashes and Git Bash didn't translate them mid-argument.
  • Git Bash MSYS path translation of /cpsess... arguments. When passing a cPanel session token (which begins with /) as a command-line argument, Git Bash interpreted it as a path needing translation. Fixed by passing the token without leading slash.
  • vault.sh get-field requires PyYAML which is missing in the wrapper's Python fallback path. Worked around by direct sops -d + grep + sed for the rest of the session. Filed mentally as a follow-up to fix the wrapper.
  • Memory file alone wasn't enough. Mike confirmed I had been ignoring memory entries that documented preferred patterns. Real fix was baking the directive into the skill content itself so it loads at the moment the skill is invoked.
  • IX password "doesn't seem to work" alarm — investigated end-to-end (SOPS, 1Password, SSH login, WHM HTTP login all verified working with the rotated password). Resolved as a copy-paste error on Mike's end.
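
The JSON parse bug described above can be sketched as follows. This is a minimal illustration, assuming root.json has a top-level tokens object keyed by hash, with a name field in each record; the real layout was not recorded in this log, and the hash keys and the reverse_trust_example name are hypothetical.

```python
import json

# Hypothetical stand-in for root.json. The layout is an assumption based only
# on the description "a container of hashed-key items".
SAMPLE = json.loads("""
{
  "tokens": {
    "hash-a": {"name": "transfer-1749689378"},
    "hash-b": {"name": "NS2DNS"},
    "hash-c": {"name": "reverse_trust_example"}
  }
}
""")

# Names and prefixes the log says must survive (clustering and agent tokens).
KEEP_PREFIXES = ("reverse_trust_", "PARENT-DO_NOT_DELETE-")
KEEP_NAMES = {"NS2DNS", "IX_DNS_Token", "WEBSVR_DNS", "ConfigCluster",
              "Claude", "ClaudeToken"}


def buggy_names(doc):
    # Iterating the outer dict yields its keys, so this returns ["tokens"].
    return [key for key in doc]


def token_names(doc):
    # Descend into the container and read each record's name.
    return [rec["name"] for rec in doc["tokens"].values()]


def revocable(names):
    # Only names outside the keep-list are candidates for api_token_revoke.
    return [n for n in names
            if n not in KEEP_NAMES and not n.startswith(KEEP_PREFIXES)]
```

With the sample data, buggy_names returns the literal "tokens" (the no-op revoke seen in the session), while revocable(token_names(SAMPLE)) leaves only the transfer-* entry. Putting an explicit keep-list in front of any bulk revoke is the guard that was missing the first time.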

Configuration Changes

Files created

  • .claude/memory/feedback_1password_service_token.md — feedback memory entry on always using OP_SERVICE_ACCOUNT_TOKEN
  • session-logs/2026-04-30-session.md — this file

Files modified (ClaudeTools repo)

  • .claude/commands/1password.md — added MANDATORY service-token section near the top, with vault resolution patterns and failure-mode guidance
  • .claude/memory/MEMORY.md — added pointer to the new feedback entry

Files modified (vault repo) — committed f4d3554 rebased to abfa955, pushed

  • infrastructure/ix-server.sops.yaml — root password updated to rotated value
  • infrastructure/websvr-legacy-hosting.sops.yaml — root password updated, api-token replaced with new value

1Password Infrastructure vault items modified

  • WebSvr (Legacy Hosting) (id 7tv3sgyhzbfpyhld6pyt5gn4li): password and API Token fields updated
  • IX Server (id brsoqhoalrb4d53jn4lxcj4xdq): password updated, new API Token field added (didn't exist before)

Server-side changes (IX, 172.16.3.10)

  • 9 flagged session files purged from /var/cpanel/sessions/raw/
  • 4 uncommented SSH keys removed from /root/.ssh/authorized_keys (server-side backup at /root/.ssh/authorized_keys.bak.20260430T132232Z)
  • Root password rotated via chpasswd
  • New WHM API token created: acg_post_cve_20260430T132232Z = PA42FSUXASFC0IO9MKH1DUHQ7L5G67PQ

Server-side changes (WebSvr, websvr.acghosting.com)

  • 7 flagged session files purged from /var/cpanel/sessions/raw/
  • Root password rotated via chpasswd
  • New WHM API token created: acg_post_cve_20260430T132423Z = 81YBBUPPHZ2EEMWJ42WUN2VIOS01X9U5
  • Three transfer-* tokens revoked: transfer-1749689378, transfer-1765466491, transfer-1766779535

Temp files (forensic evidence — preserved on local workstation)

  • C:/Users/guru/AppData/Local/Temp/cpanel-ioc-evidence/ix/ — 9 raw session files, access_log snapshot (~110 MB), last/wtmp dump, authorized_keys backup, NEW_ROOT_PASSWORD + NEW_API_TOKEN value files, IOC + forensic logs
  • C:/Users/guru/AppData/Local/Temp/cpanel-ioc-evidence/websvr/ — same structure, 7 session files, ~253 MB access_log
  • C:/Users/guru/AppData/Local/Temp/cpanel-ioc-evidence/session_76.18.103.222_dump.log — extracted activity timeline for the unidentified IX root session

Tooling artifacts (also preserved in temp)

  • ssh_check.py — paramiko-based IOC scan + triage runner
  • ssh_remediate.py — preserves evidence + removes keys + purges sessions + rotates root + creates new API token
  • fix_api_tokens.py — secondary token enumeration via direct root.json read
  • revoke_transfer_tokens.py — targeted whmapi1 revoke for specific token names
  • forensic_pass.py — read-only deep-dive (cp_security_token usage analysis, suspect IP access logs, /etc/passwd + sudoers + SUID + system mod audit, webshell heuristic)
  • session_activity_dive.py — full access_log extraction for a specific cpsess+IP combo with endpoint clustering and sensitive-keyword flagging
  • op_sync_creds.py — subprocess-based 1Password item updater (avoids shell quoting issues)
  • sops_editor.py — EDITOR-mode YAML field setter for SOPS-vaulted files
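
For reference, the EDITOR-mode pattern behind sops_editor.py could look roughly like the sketch below. It is not the script used in the session. It assumes sops hands the decrypted temp file path to the editor as the last argument, that the target field is a single-line scalar, and that the key and value arrive through two environment variables whose names (SOPS_SET_KEY, SOPS_SET_VALUE) are invented here.

```python
import json
import os
import re
import sys


def set_field(text, key, value):
    """Replace the scalar on the first `key:` line, keeping its indentation."""
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}:)[ \t]*.*$", re.MULTILINE)
    # json.dumps yields a double-quoted scalar, which YAML also accepts.
    quoted = json.dumps(value)
    new_text, count = pattern.subn(lambda m: f"{m.group(1)} {quoted}", text, count=1)
    if count == 0:
        raise KeyError(f"field not found: {key}")
    return new_text


def main(path):
    key = os.environ["SOPS_SET_KEY"]      # hypothetical variable name
    value = os.environ["SOPS_SET_VALUE"]  # hypothetical variable name
    with open(path, encoding="utf-8") as fh:
        text = fh.read()
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(set_field(text, key, value))


# Only act when the variables are set, so importing the file has no effect.
if __name__ == "__main__" and "SOPS_SET_KEY" in os.environ:
    main(sys.argv[-1])
```

The regex approach avoids a PyYAML dependency and keeps the secret off the command line, but it only matches the first line with that key name, so duplicate key names in nested sections would need a stricter match.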

Credentials & Secrets (UNREDACTED)

Rotated 2026-04-30

Service Username Value
IX server SSH/WHM root root t4qygLl7{1zJcUj#022W^FBQ>}qYp-Od
WebSvr SSH/WHM root root [3H+_f.Yh4c0>@egH[6L!?u]S3s[9C82
IX WHM API token (acg_post_cve_20260430T132232Z) n/a PA42FSUXASFC0IO9MKH1DUHQ7L5G67PQ
WebSvr WHM API token (acg_post_cve_20260430T132423Z) n/a 81YBBUPPHZ2EEMWJ42WUN2VIOS01X9U5

All four are stored in the 1Password Infrastructure vault. The two root passwords and the WebSvr API token are also in the SOPS vault entries. The IX SOPS entry doesn't currently have an api-token field structure, so the IX API token is in 1Password but not yet in SOPS.

Existing references confirmed (not rotated this session)

  • infrastructure/1password-service-account.sops.yaml — Agentic-RW service token, kind=api-key. Used for prompt-free op CLI access. Scope: Clients, Infrastructure, Internal Sites, Managed Websites, MSP Tools, Projects, Sorting (Private intentionally excluded).
  • WebSvr stale API token (now revoked from server, no longer in vault): 8ZPYVM6R0RGOHII7EFF533MX6EQ17M7O

Infrastructure & Servers

IX Server (172.16.3.10 / ix.azcomputerguru.com / public 72.194.62.5)

  • OS: CloudLinux 9.7 (TuxCare ELS kernel)
  • cPanel: 11.134.0 build 20 (patched for CVE-2026-41940)
  • 72 hosting accounts (per /etc/trueuserdomains)
  • Auth: PAM via /etc/shadow (single password store; SSH and WHM share)
  • WHM port 2087, cPanel port 2083, SSH port 22

WebSvr (websvr.acghosting.com)

  • External IPs: 162.248.93.78, 162.248.93.81, 162.248.93.233 (the vault entry shows .81; the public-IP probe returned .233)
  • OS: CloudLinux 7.9 (CentOS 7 base — past EOL)
  • cPanel: 11.110.0 build 97 (patched for CVE-2026-41940)
  • 26 hosting accounts
  • Same auth model as IX

Suspect IPs encountered (history captured for future reference)

  • 129.222.129.230 — confirmed Mike (today's WHM session on IX)
  • 129.222.143.18 — likely Rob (web guy), Dec 15 2025 SSH burst on IX, no WHM activity
  • 76.18.103.222: unidentified WHM root session on IX, Apr 29 (1203 hits, all benign SSL maintenance). Historical TPS support workflows match the support pubkey ticket IDs (95714774, 95758605). Could be Howard, Mike on a different network, or a cPanel support tech.
  • 64.139.88.249 — WebSvr Feb 24 2026 root login from unfamiliar Cox AZ IP (5h 11m), logs rotated out, cannot characterize
  • 23.180.120.132, 143.198.113.39, 159.65.217.152, 149.102.229.144, 195.177.94.161 — botnet scanner IPs (DigitalOcean, etc.) hitting /json-api/version with injected tokens. All HTTP 403, all blocked.

Commands & Outputs

CVE patched-version verification

/usr/local/cpanel/cpanel -V
# IX:     134.0 (build 20)   <-- patched build for v134 stream
# WebSvr: 110.0 (build 97)   <-- patched build for v110 stream
cat /etc/cpupdate.conf
# Both:   UPDATES=daily      <-- patches arrived automatically 2026-04-28

IOC scan (script provided by cPanel)

  • Saved at /tmp/ioc_checksessions_files.sh on each server (and locally in cpanel-ioc-evidence/)
  • Output: 9 flagged on IX, 7 on WebSvr; 4 of the 16 were pre-auth badpass injection attempts that all later returned HTTP 403 in access_log

Critical remediation key commands

# Backup + key removal (IX)
cp -p /root/.ssh/authorized_keys /root/.ssh/authorized_keys.bak.<TS>
sed -i.removed-<TS> '<line_nums>d' /root/.ssh/authorized_keys

# Session purge
for f in <flagged_files>; do rm -f "/var/cpanel/sessions/raw/$f"; done

# Password rotation
echo 'root:<NEWPW>' | chpasswd

# WHM API token rotation
whmapi1 api_token_create token_name='acg_post_cve_<TS>'
whmapi1 api_token_revoke  token_name='<old_name>'
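
The line numbers fed to the sed deletion above have to come from somewhere. The sketch below shows one way to list key entries that carry no trailing comment. It is a simplified heuristic, not the check ssh_check.py performed: it splits on whitespace, so an options string containing spaces would confuse it.

```python
KEY_TYPES = ("ssh-", "ecdsa-", "sk-")


def uncommented_key_lines(text):
    """Return 1-based line numbers of key entries with no trailing comment."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comment lines are not keys
        fields = stripped.split()
        # Locate the key-type field; anything before it is an options string.
        idx = next((i for i, f in enumerate(fields) if f.startswith(KEY_TYPES)), None)
        if idx is None:
            continue
        # Expected layout: key type, base64 blob, then an optional comment.
        if len(fields) <= idx + 2:
            hits.append(lineno)
    return hits
```

A missing comment is only a hint. Legitimate keys are sometimes pasted without one, which is why the removed keys were backed up server-side before deletion.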

1Password CLI prompt-free auth (the new pattern)

SVC=$(sops -d /c/Users/guru/vault/infrastructure/1password-service-account.sops.yaml 2>/dev/null \
  | grep -E '^\s*credential:' | sed -E 's/^\s*credential:\s*//' | head -1)
export OP_SERVICE_ACCOUNT_TOKEN="$SVC"
op whoami    # User Type: SERVICE_ACCOUNT — no prompts
op item get "<id>" --vault Infrastructure --format json   # NB: --vault required for service accounts
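
The same grep + sed extraction can be done in Python without PyYAML, which may also be a starting point for the vault.sh get-field fallback. This is a sketch under the assumption that the field is a single-line scalar; the function name is hypothetical and it does not process YAML escape sequences.

```python
import re


def extract_field(decrypted_yaml, key):
    """Return the scalar on the first `key:` line, or None if absent."""
    match = re.search(rf"^[ \t]*{re.escape(key)}:[ \t]*(.*)$",
                      decrypted_yaml, re.MULTILINE)
    if match is None:
        return None
    value = match.group(1).strip()
    # Drop one pair of matching surrounding quotes, if present.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        value = value[1:-1]
    return value
```

The input would be the text printed by sops -d, and the result would be exported as OP_SERVICE_ACCOUNT_TOKEN before any op call.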

Verification of rotated IX password (after Mike's "not working" alarm)

SSH (paramiko):     OK — connected as root, /etc/shadow shows hash dated 20573
WHM web login:      HTTP 200, status:1, /cpsess3383026374 issued
SOPS vault value:   exact match
1Password value:    exact match

→ password is correct end-to-end; Mike's "not working" was a copy error on his side.
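
The "hash dated 20573" value is the last-change field of /etc/shadow, which counts days since 1970-01-01. A quick conversion confirms it matches the rotation date:

```python
from datetime import date, timedelta


def shadow_lastchg_to_date(days_since_epoch):
    """Convert the last-change field of an /etc/shadow entry to a calendar date."""
    return date(1970, 1, 1) + timedelta(days=days_since_epoch)


print(shadow_lastchg_to_date(20573))  # 2026-04-30
```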

Pending / Incomplete Tasks

Mike's outstanding items

  • Send the Michelle Sora email when ready (C:/Users/guru/AppData/Local/Temp/michelle-email-draft.txt)
  • Verify identity of 76.18.103.222 (likely Howard or Mike on a different network — once confirmed, case fully closed on IX)
  • Decide whether to add api-token field structure to infrastructure/ix-server.sops.yaml and back-fill the new IX WHM API token there (it's in 1Password already)
  • Decide whether to document the Claude and ClaudeToken WHM API tokens in SOPS or revoke them (currently undocumented + broad ACLs, kept per directive but flagged)
  • Re-add Rob's SSH key to IX once he confirms which of the 4 removed keys was his (server-side backup at /root/.ssh/authorized_keys.bak.20260430T132232Z)

Tracked TODOs (not blocking)

  • Fix vault.sh get-field PyYAML dependency in the Python fallback path so the wrapper works for service-account-style entries
  • Eventually run a fuller webshell/integrity scan across all hosted sites (the quick heuristic this session was minimal — only flagged WP core files)
  • Long-term: WebSvr CentOS 7 → AlmaLinux migration via WHM Transfer Tool (separate project, not blocked)

Items completed this session that close prior threads

  • CVE-2026-41940 IOC scan + remediation (both servers)
  • Credentials synced to 1Password Infrastructure vault
  • 1Password skill hardened with mandatory service-token directive
  • Memory entry added for service-token preference
  • Vault entries (SOPS) updated and pushed

Reference Information

Files / paths

  • 1Password skill: .claude/commands/1password.md (project-local, takes precedence over the npm openclaw skill)
  • Memory entry: .claude/memory/feedback_1password_service_token.md
  • Memory index: .claude/memory/MEMORY.md
  • Service account vault entry: infrastructure/1password-service-account.sops.yaml
  • IX vault entry: infrastructure/ix-server.sops.yaml
  • WebSvr vault entry: infrastructure/websvr-legacy-hosting.sops.yaml
  • Forensic evidence directory: C:/Users/guru/AppData/Local/Temp/cpanel-ioc-evidence/

Vault scope (Agentic-RW service account, verified 2026-04-30)

Visible: Clients, Infrastructure, Internal Sites, Managed Websites, MSP Tools, Projects, Sorting

Excluded: Private (intentional)

CVE-2026-41940 reference

  • CVSS 9.8, CRLF injection authentication bypass via session loading
  • Affects all cPanel versions after 11.40 including DNSOnly
  • Patched in: 11.110.0.97, 11.118.0.63, 11.126.0.54, 11.132.0.29, 11.134.0.20, 11.136.0.5
  • Public PoC at watchTowr; active exploitation since ~2026-02-23
  • Fix command: /scripts/upcp --force

Why the deep-dive verdict was "patch held"

  • Each of the 7 actual exploit attempts had distinct cp_security_tokens that, when grep'd against access_log, appeared exactly once each with HTTP 403 against /json-api/version (and /applist, /listwwwacctconf, /get_tweaksetting on one). No HTTP 200 with an injected token from any external IP. The patch's session-validation logic is doing its job.
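
The token-versus-status check behind that verdict can be sketched as below. It is not forensic_pass.py itself. It assumes an Apache-style access_log line with the quoted request followed by the status code, and the sample tokens and IPs used to exercise it would be made up.

```python
import re
from collections import defaultdict

# Assumes an Apache-style line: ... "METHOD /path HTTP/x.y" STATUS ...
STATUS_RE = re.compile(r'"[A-Z]+ \S+ HTTP/[\d.]+" (\d{3})\b')


def statuses_by_token(log_lines, tokens):
    """Map each token to the HTTP status codes of the lines that contain it."""
    seen = defaultdict(list)
    for line in log_lines:
        match = STATUS_RE.search(line)
        if match is None:
            continue
        for token in tokens:
            if token in line:
                seen[token].append(int(match.group(1)))
    return {token: seen.get(token, []) for token in tokens}


def patch_held(result):
    """True when no token-bearing request received a 2xx response."""
    return all(not (200 <= status < 300)
               for codes in result.values() for status in codes)
```

A token that maps to a single 403 matches the pattern reported above; any 2xx against an injected token would reverse the verdict.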

Update: 11:25 — Multi-client MSP work (Tedards, Bardach, Dataforth, Cascades/Golden Corral)

User

  • User: Mike Swanson (mike)
  • Machine: DESKTOP-0O8A1RL
  • Role: admin

Session Summary

The session opened with a request to reset the webmail password for accounting@tucsongoldencorral.com on the Neptune Exchange server (67.206.163.124, Exchange 2016). WinRM was firewalled even after a VPN change, and browser automation via ECP at https://neptune.acghosting.com/ecp was attempted but interrupted — Mike resolved the password directly via Active Directory on DC16.

The session then shifted to Dataforth M365, granting Dan Center (dcenter@dataforth.com) FullAccess to Joel Lohr's (jlohr@dataforth.com) mailbox. This was executed via Exchange Operator InvokeCommand (Add-MailboxPermission) and completed cleanly with AutoMapping enabled.

Significant remediation tool work followed. The onboard-tenant.sh script was patched to assign the Conditional Access Administrator directory role to the Tenant Admin service principal at onboard time (resolving a 403 on CA policy Graph endpoints), and Howard's independently discovered Policy.Read.All backfill block was retained. A # TODO(howard) comment was added to the role_assigned() function documenting the PIM roleAssignmentSchedules gap. tedards.net was fully onboarded to the remediation tool suite.

Bardach client work: confirmed Barbara Bardach (barbara@bardach.net) holds Exchange Online Plan 2 + EXCHANGEARCHIVE licenses (100GB primary, 110GB archive). Auto-expanding archive was enabled via Exchange Operator InvokeCommand (Enable-Mailbox -AutoExpandingArchive), returning AutoExpandingArchiveEnabled: true. The bardach.net tenant was freshly onboarded this session.

QuickBooks Desktop 2024 "Missing PDF component" error on Yvonne Tedards' Windows 11 machine: the Amyuni PDF Converter virtual printer was missing entirely. Root cause identified as Windows 11 Protected Print Mode blocking legacy unsigned printer drivers. Steps given: disable Protected Print Mode in Settings, then run QB Repair from Programs and Features. Awaiting confirmation.

Syncro ticket management for Tedards: logged 30 min Remote Business ($75) on ticket #32219 (QB error), and created new ticket #32228 for the email delivery issue with lindsay@agencyzoomify.com (no billing yet).

Full DKIM setup for tedards.net was completed end-to-end via automation: selector1/selector2 CNAME values retrieved from M365 Exchange Online, added to the tedards.net DNS zone via WHM API (zone lives directly in WHM on the ACG IX server — no separate cPanel account), and DKIM enabled via Set-DkimSigningConfig. Final status: Enabled: true, Status: Valid. A p=none DMARC record was also added. A cron job was scheduled at 1:17 PM to auto-escalate DMARC to p=quarantine if DNS validation passes.

Key Decisions

  • ECP browser automation for Neptune password reset abandoned in favor of AD on DC16. WinRM blocked externally; AD reset is the correct tool for Exchange 2016 on-prem.
  • onboard-tenant.sh CA Admin fix via script rather than ad-hoc patching. Idempotent; safe to re-run against existing tenants.
  • role_assigned() PIM gap flagged as TODO for Howard — fix requires querying roleAssignmentSchedules in addition to roleAssignments; deferred to Howard who discovered it.
  • tedards.net DKIM handled via WHM API directly — no separate cPanel account exists; zone is in WHM under ACG server account. Full automation, no browser required.
  • DMARC escalation deferred 2 hours to allow propagation verification before moving from p=none to p=quarantine.
  • Bardach auto-expanding archive chosen over additional Archive licenses. Exchange Online Plan 2 includes auto-expanding archiving at no extra cost; the archive can then grow past its fixed quota, up to Microsoft's 1.5 TB cap.

Problems Encountered

  • investigator-exo 401 on tedards.net Exchange Online: Security Investigator app returns 401 on InvokeCommand. Resolved by switching to exchange-op tier which has full_access_as_app Exchange role.
  • WHM account search returned no results for tedards: tedards.net DNS zone managed directly in WHM (no cPanel account). Confirmed via dumpzone API.
  • Get-DkimSigningConfig with Domain parameter returned null: M365 InvokeCommand rejects the Domain parameter on this cmdlet. Resolved by calling with empty parameters and filtering client-side.
  • M365 returned CnameMissing immediately after enabling DKIM: stale negative cache. Records resolved correctly from 8.8.8.8. Re-running enable after 5 seconds returned Enabled: true, Status: Valid.

Infrastructure and DNS Changes

tedards.net DNS (WHM on 72.194.62.5)

| Record | Type | Value | Action |
| --- | --- | --- | --- |
| selector1._domainkey.tedards.net | CNAME | selector1-tedards-net._domainkey.tedards.w-v1.dkim.mail.microsoft | Added |
| selector2._domainkey.tedards.net | CNAME | selector2-tedards-net._domainkey.tedards.w-v1.dkim.mail.microsoft | Added |
| _dmarc.tedards.net | TXT | v=DMARC1; p=none; sp=none; adkim=r; aspf=r; | Added |

M365 Changes

| Tenant | Action |
| --- | --- |
| dataforth.com | dcenter FullAccess to jlohr mailbox (Exchange Online) |
| bardach.net | Auto-expanding archive enabled for barbara@bardach.net |
| tedards.net | DKIM enabled (Enabled: true, Status: Valid) |

Syncro Tickets

| Ticket | Client | Action |
| --- | --- | --- |
| #32219 (ID 109545451) | Bill/Yvonne Tedards | 30 min Remote Business logged for QB PDF component fix ($75) |
| #32228 (ID 109697650) | Bill/Yvonne Tedards | Created for email delivery issue with lindsay@agencyzoomify.com (no billing yet) |

Pending Tasks

  • QuickBooks PDF fix confirmation: Yvonne Tedards, Win11. Steps given (disable Protected Print Mode + QB Repair). Awaiting result.
  • Tedards DMARC escalation: cron scheduled 1:17 PM to escalate p=none to p=quarantine. Session-only — if Claude exits, run manually.
  • Tedards email issue (ticket #32228): inability to send/receive email to/from lindsay@agencyzoomify.com. Not yet investigated.
  • Backfill onboard-tenant.sh against 6 ACG tenants: bg-builders, cascades-tucson, cw-concrete, dataforth, heieck-org, mvan. Scheduled for 21:00 PT per note to Howard.
  • Howard TODO: Fix role_assigned() in onboard-tenant.sh to also query roleAssignmentSchedules for PIM-managed assignments.
  • Cascades: Grant Howard Contributor on rg-audit-cascadestucson once he creates the RG.
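
For the role_assigned() TODO, the shape of the fix might look like the sketch below. onboard-tenant.sh is a shell script, so this Python version is only an illustration of the logic. It assumes the two Microsoft Graph v1.0 collections roleAssignments and roleAssignmentSchedules under roleManagement/directory, filters on principalId server-side, and compares roleDefinitionId client-side. The get_json parameter is a hypothetical injected helper that performs the authenticated GET (and any URL encoding) and returns parsed JSON.

```python
GRAPH = "https://graph.microsoft.com/v1.0/roleManagement/directory"


def role_assigned(get_json, principal_id, role_definition_id):
    """True if the principal holds the role directly or through a PIM schedule.

    get_json is any callable that takes a URL and returns the parsed JSON body.
    It is injected so the check can be exercised without network access.
    """
    flt = f"$filter=principalId eq '{principal_id}'"
    for collection in ("roleAssignments", "roleAssignmentSchedules"):
        body = get_json(f"{GRAPH}/{collection}?{flt}")
        for item in body.get("value", []):
            if item.get("roleDefinitionId") == role_definition_id:
                return True
    return False
```

Checking both collections keeps the onboarding step idempotent for tenants where the role was granted through PIM rather than as a direct assignment.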

Reference


Note for Howard

TIME TRACKING WORKFLOW ISSUE: 31 Tickets Missing Time Entries

Mike clarified the business rule after my investigation: ALL tickets should have time entries logged in Syncro (even warranty/free work), with only cancelled tickets excepted. Time tracking is required for reporting/metrics, separate from billing decisions.

What I Found:

Billing Status: WORKING CORRECTLY

  • 29 out of 31 tickets have proper invoices attached
  • 2 tickets correctly have no invoices (Non-Billable/Cancelled)
  • Revenue is being captured properly

Time Tracking Status: BYPASSED ENTIRELY

  • All 29 invoiced tickets used manual invoice line items instead of time tracking
  • Hours were typed into invoice descriptions ("Applied 1.5 Prepay Hours")
  • Syncro time tracking system shows 00:00:00 for all 31 tickets
  • This prevents time-based reporting (hours per client, technician productivity, etc.)

Pattern Analysis:

Current workflow appears to be:

  1. Do work
  2. Create invoice with manual line items
  3. Type hours into description text
  4. Close ticket without logging time

Proper workflow should be:

  1. Do work
  2. Log time entry on ticket (records hours in Syncro)
  3. Time entry auto-generates invoice line item
  4. Invoice sent

Breakdown of 29 Invoiced Tickets:

  • 20 line items: Prepay hour deductions (hours in description: "Applied X.0 Prepay Hours")
  • 16 line items: Direct charges (labor + products, billed at standard rates)

Examples:

  • #32223 (Kittle): Manual "$75.0 - M365 user provisioning...0.5 hrs" — should have 0.5hr time entry
  • #32218 (Instrumental): "Applied 1.5 Prepay Hours" — should have 1.5hr time entry
  • #32167 (Cascades): "Applied 1.0 Prepay Hours" + "Applied 2.0 Prepay Hours" — should have 3.0hr time entry
  • #32156 (Cascades): "Applied 8.0 Prepay Hours" — should have 8.0hr time entry
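
If the missing time entries are back-filled, the hours can be recovered from the invoice description text. A minimal sketch, assuming the descriptions follow the "Applied X Prepay Hours" wording quoted in the examples (direct-charge line items would need separate handling):

```python
import re

PREPAY_RE = re.compile(r"Applied (\d+(?:\.\d+)?) Prepay Hours")


def prepay_hours(line_item_descriptions):
    """Sum the hours quoted in 'Applied X Prepay Hours' invoice descriptions."""
    total = 0.0
    for text in line_item_descriptions:
        for hours in PREPAY_RE.findall(text):
            total += float(hours)
    return total
```

For #32167, the two line items "Applied 1.0 Prepay Hours" and "Applied 2.0 Prepay Hours" sum to the 3.0hr entry noted above.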

Action Required:

Per Mike's clarification:

  1. All tickets need time entries — even warranty/free work should log time (mark as "Warranty" or appropriate type)
  2. Review workflow — ensure time tracking discipline going forward
  3. Reporting impact — missing time data means we can't accurately report on:
    • Hours spent per client
    • Technician productivity
    • Average ticket resolution time
    • Prepay hour burn rates

Note on the 2 Non-Invoiced Tickets:

  • #32083 (DAnaise.com) — "Onsite - Alicia's computer freezing" — Non-Billable (still needs time entry if work was done)
  • #32022 (Michael Johnson) — "Cancelled Onsite - Printer error" — Cancelled (no time entry needed)

Note on Sombra (#32225): Per Mike, RMM enrollment doesn't require billing, but if any actual work was done, it should have a time entry.


Update: 17:10 — Tedards email diagnosis, DMARC escalation, billing

User

  • User: Mike Swanson (mike)
  • Machine: DESKTOP-0O8A1RL
  • Role: admin

Session Summary

Diagnosed an email delivery issue for Tedards where emails from lindsay@agencyzoomify.com were routing to trash without any client-side rule. Checked Exchange Online inbox rules for y226@tedards.net (29 rules found, none targeting agencyzoomify.com) and reviewed the junk email configuration (blocked senders list did not include agencyzoomify.com). DNS email authentication for agencyzoomify.com was checked: SPF covers Titan Email and M365 with ~all fallback, DMARC is set to p=quarantine, but DKIM records (selector1/selector2 CNAMEs) are entirely absent. Root cause identified as DMARC quarantine policy with no DKIM alignment — EOP at the receiving side quarantines messages that fail DMARC. Recommended adding lindsay@agencyzoomify.com to Yvonne's trusted senders as an immediate workaround, and advised that Lindsay's IT needs to enable DKIM in M365 for agencyzoomify.com. Mike has not yet confirmed the trusted senders add — still pending.

The tedards.net DMARC escalation cron job fired at 1:17 PM. DKIM was confirmed still Enabled: true, Status: Valid in M365. The _dmarc.tedards.net TXT record was resolving cleanly from public DNS (p=none). The old record (WHM zone line 19) was removed via removezonerecord and a new p=quarantine record was added via addzonerecord. Verification via nslookup from 8.8.8.8 confirmed the new record live.

Sync pulled Howard's new client stub for Sombra Residential LLC — a Windows Server 2012 box (labelled Server2013, actually WS2012 build 9200) enrolled in GuruRMM today. Machine is EOL since 2023-10-10 and running unpatched. Howard flagged it for Mike to discuss migration path with the client.

Billing was logged for the DKIM/DMARC work after showing Mike a preview: new Syncro ticket #32231 created (status Resolved), 1hr Remote Business at $150.

Key Decisions

  • Trusted senders add pending explicit confirmation — adding to the junk bypass list is a tenant-side change that affects mail filtering posture; held for Mike's yes.
  • DMARC escalated to p=quarantine rather than p=reject — quarantine is a safe production policy; p=reject requires higher confidence in DKIM/SPF coverage and should be a deliberate next step.
  • Billing preview shown before submitting — after missing the preview on the QB ticket earlier in the session, adopted pattern of showing subject/description/labor/amount before any Syncro POST.

Problems Encountered

  • agencyzoomify.com has no DKIM: selector1._domainkey.agencyzoomify.com returns NXDOMAIN. Their DMARC is p=quarantine, so any message that fails DMARC alignment gets quarantined at the recipient. With no DKIM signature for the domain, a pass depends entirely on SPF alignment, which in turn depends on the envelope-from domain. Not a tedards.net issue; it is entirely on the sending side.
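
The alignment rule at work here can be written out as a small sketch. It is a simplification: the organizational domain is taken as the last two labels, whereas real evaluators consult the Public Suffix List, and the example domains used to exercise it are hypothetical rather than taken from the actual message headers.

```python
def org_domain(domain):
    # Simplification: treat the last two labels as the organizational domain.
    # Real evaluators consult the Public Suffix List instead.
    return ".".join(domain.lower().rstrip(".").split(".")[-2:])


def aligned(header_from, auth_domain, mode="r"):
    """Relaxed ('r') compares organizational domains; strict ('s') needs an exact match."""
    if mode == "s":
        return header_from.lower() == auth_domain.lower()
    return org_domain(header_from) == org_domain(auth_domain)


def dmarc_pass(header_from, spf_pass, spf_domain, dkim_pass, dkim_domain,
               aspf="r", adkim="r"):
    """DMARC passes when SPF or DKIM both passes and aligns with the From domain."""
    spf_ok = (spf_pass and spf_domain is not None
              and aligned(header_from, spf_domain, aspf))
    dkim_ok = (dkim_pass and dkim_domain is not None
               and aligned(header_from, dkim_domain, adkim))
    return spf_ok or dkim_ok
```

A message whose SPF passes for an unrelated envelope-from domain and which carries no aligned DKIM signature fails this check, which is the scenario suspected for the agencyzoomify.com mail.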

Infrastructure and DNS Changes

tedards.net DNS (WHM on 72.194.62.5)

| Record | Change |
| --- | --- |
| _dmarc.tedards.net TXT | Updated: p=none → p=quarantine; sp=quarantine; adkim=r; aspf=r; |
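
The record rewrite itself is simple enough to sketch. This is an illustration of the tag edit only, not the cron job that ran: it assumes the record is a flat list of tag=value pairs and leaves the removezonerecord / addzonerecord calls out.

```python
def parse_dmarc(record):
    """Split a DMARC TXT value into an ordered tag -> value dict."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            tags[name.strip()] = value.strip()
    return tags


def escalate(record, policy="quarantine"):
    """Return the record with p (and sp, if present) set to the given policy."""
    tags = parse_dmarc(record)
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record")
    tags["p"] = policy
    if "sp" in tags:
        tags["sp"] = policy
    return " ".join(f"{name}={value};" for name, value in tags.items())
```

Applied to the p=none record added earlier in the day, escalate() yields the p=quarantine value shown in the table, with adkim and aspf left untouched.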

Syncro Tickets

| Ticket | Client | Action |
| --- | --- | --- |
| #32231 (ID 109712846) | Bill/Yvonne Tedards | Created + 1hr Remote Business for DKIM/DMARC setup ($150) |

Pending Tasks

  • Trusted senders add for Yvonne — add lindsay@agencyzoomify.com to y226@tedards.net trusted senders via Set-MailboxJunkEmailConfiguration. Mike to confirm.
  • lindsay@agencyzoomify.com DKIM — advise Yvonne to pass to Lindsay: enable DKIM in M365 Defender portal for agencyzoomify.com. Without it, their p=quarantine DMARC will continue causing delivery issues at other recipients too.
  • Sombra Residential WS2012 EOL — Server2013 (actually WS2012, EOL 2023-10-10) enrolled by Howard. Needs migration path discussion with client. sysadmin account password also needs to be captured in vault.
  • QB PDF fix (Yvonne Tedards) — awaiting confirmation that disabling Protected Print Mode + QB Repair resolved the issue.
  • Tedards email issue (ticket #32228): lindsay@agencyzoomify.com delivery problem. Root cause found; fix pending.

Reference

  • tedards.net Exchange mailboxes: bt@tedards.net (Bill), y226@tedards.net (Yvonne)
  • tedards.net tenant ID: 4fcbb1f4-fbf9-4548-a93e-7d14a3c091e6
  • WHM API: https://72.194.62.5:2087 (vault: infrastructure/ix-server.sops.yaml)
  • agencyzoomify.com DKIM status: NO RECORDS — selector1/selector2 NXDOMAIN
  • agencyzoomify.com DMARC: v=DMARC1; p=quarantine; rua=mailto:lindsay@agencyzoomify.com
  • Sombra Residential vault: clients/sombra-residential/server2013.sops.yaml
  • Syncro ticket #32228: Tedards email issue (no billing yet)
  • Syncro ticket #32231: Tedards DKIM/DMARC ($150 logged)