Files
claudetools/session-logs/2026-04-29-session.md
Mike Swanson 447b90e092 Session log: Cascades audit retention design + Pro-Tech Services email investigation
Cascades:
- Approved Howard's corrected 4-policy CA bypass design
- Caught + fixed policy 3 GDAP bug (Service provider users exclusion)
- Decided hybrid LAW + Storage Account audit retention (ACG-billed,
  reuse existing Trusted Signing Azure subscription, westus2)
- Wrote full audit retention runbook for Howard
- Reshaped break-glass to two accounts (split-storage YubiKeys)
- Documented Cascades M365 admin model (admin@/sysadmin@ Connect-excluded
  by design; local AD Administrator separate identity layer)
- Decided Howard gets Owner on ACG sub with guardrails (resource lock +
  cost alert) instead of per-RG Contributor

Pro-Tech Services:
- DNS recon of pro-techhelps.com + pro-techservices.co
- Diagnosed calendar invite delivery issue (DKIM domain mismatch +
  no DMARC = strict receivers silently drop invites)
- Drafted non-technical IT-provider migration email to Michelle Sora

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 17:05:41 -07:00

187 lines
17 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# 2026-04-29 — Cascades audit retention design + Pro-Tech Services email investigation
## User
- **User:** Mike Swanson (mike)
- **Machine:** GURU-BEAST-ROG
- **Role:** admin
- **Session span:** 2026-04-29 ~13:30 PT to ~14:50 PT (~1.5 hours, two work threads)
## Session Summary
The session began with the Cascades Tucson HIPAA work, starting with a `/sync` pull that included recent commits from Howard and Mike, along with a local radio-show commit pushed. Howard's note on the Cascades CA fix Path B execution and his `onboard-tenant.sh` patch were surfaced for review. The Conditional Access bypass design was reviewed and approved, with a critical bug identified in policy 3 that would block GDAP admins during the CA cutover. The bug was corrected by explicitly excluding service provider users from the policy. A concern about the "device bootstrap chicken-and-egg" was raised but later resolved upon clarification that bootstrap users are always admins with MFA on Cascades Wi-Fi.
The session then transitioned to the Audit Retention Design, where a hybrid architecture of Log Analytics Workspace and Storage Account with lifecycle policy was decided. The billing model was set to ACG-billed within the HIPAA-tier MRR, and the existing ACG Azure subscription `e507e953-2ce9-4887-ba96-9b654f7d3267` (the GuruRMM Trusted Signing sub) was reused with isolated `rg-audit-*` resource groups. A full runbook was written to document the design and implementation steps. Mike's question about how break-glass relates to existing admin@/sysadmin@/GDAP partner access prompted a reshape from one break-glass account to two, with FIDO2 YubiKeys split between Cascades and ACG. Mike clarified that admin@ and sysadmin@ are intentionally Connect-excluded (cloud-only by design); this was documented in `clients/cascades-tucson/CONTEXT.md` as a permanent reference.
For RBAC, Mike asked whether Howard could grant himself Contributor on the new audit RGs rather than Mike doing it. Decision: grant Howard Owner on the entire ACG subscription, matching the existing trust model in `CLAUDE.md`, with two guardrails — a `CanNotDelete` lock on `gururmm-signing-rg` and a $50/mo Cost Management alert. This unblocks all future MSP-side Azure self-service and removes Mike as a permanent bottleneck.
A separate thread emerged when Mike asked to check DNS for `pro-techhelps.com`. After pulling records, Mike provided email headers from a "test" message from Jenny Holtsclaw at `pro-techservices.co` (a different domain). DNS recon on both showed they are GoDaddy-resold M365 (`NETORG6702699.onmicrosoft.com` tenant) with no custom DKIM and no DMARC. Inky flagged the test message PCL=4 (high phishing suspicion) but did not quarantine. The diagnosis: DKIM signs only with the `.onmicrosoft.com` fallback selector (not aligned with the From: domain), and combined with absent DMARC, strict receivers (Google Workspace, healthcare, finance) silently drop calendar invites while regular email passes — explaining the "some recipients never receive invites" symptom Jenny had reported. Mike confirmed both domains belong to one company that only sends from `.co`, and decided to pitch Michelle Sora (the business contact) on migrating off GoDaddy onto direct M365 with both domains consolidated under one cleanly-managed tenant. Drafted an email for Michelle — first version too technical/peer-tone, revised to non-technical IT-provider framing once Mike clarified Michelle is "stubborn and afraid of change." Final draft written to a temp file and opened in Notepad for Mike's review.
## Key Decisions
- **Hybrid audit retention architecture** (LAW 90d interactive + Storage Account 6yr cold archive). Both fed by the same Diagnostic Settings export — single ingest, two retention tiers. Rationale: balance live forensics against compliance archive cost.
- **ACG-billed audit retention**, bundled into HIPAA-tier MRR. ~$0.501.00/mo per tenant. Avoids Cascades Azure subscription friction; supports "we handle compliance, you don't think about it" MSP positioning.
- **Reuse existing ACG Azure subscription** `e507e953-2ce9-4887-ba96-9b654f7d3267` (the GuruRMM Trusted Signing sub) rather than provisioning a new one. RG isolation between signing and audit. Future split to a dedicated compliance subscription deferred until 3+ HIPAA tenants or audit ask demands it.
- **Two break-glass accounts at Cascades** (`breakglass1-csc@` and `breakglass2-csc@`) per Microsoft official guidance. FIDO2 YubiKeys split-stored: one at Cascades sealed envelope, one in ACG safe. Cost $50 vs $25 single — trivial.
- **Howard granted Owner on the ACG subscription** rather than per-RG Contributor. Matches `CLAUDE.md` trust model; one-time grant unblocks all future self-service. Guardrails: `CanNotDelete` lock on `gururmm-signing-rg` + $50/mo Cost Management alert.
- **CA policy 3 user targeting revised** to exclude "Service provider users" (GDAP foreign principals) in addition to `SG-External-Signin-Allowed` and `SG-Break-Glass`. Without this, ACG GDAP partner admins lose remote MSP access at CA cutover.
- **admin@ and sysadmin@ at Cascades are intentionally Connect-excluded** (cloud-only by design). Local AD `Administrator` is a separate on-prem identity layer, not part of the M365 admin model. Documented in CONTEXT.md.
- **UAL handling deferred**: poll-based Office 365 Management Activity API harvester to write JSON blobs to the per-tenant Storage Account. Build after pilot CA cutover validated and Cascades audit retention runs cleanly for 30 days. Estimated 4-6 hours dev + test.
- **Backfill sweep moved to Howard's column.** Mike originally took ownership; reallocated when it became clear Howard can run it locally during his next session and surface results in his own log.
- **Email to Michelle framed as IT-provider follow-up, not technical pitch.** No DNS/DKIM/DMARC jargon. GoDaddy positioned as the external constraint that locks everyone (including Mike) out of fixing it properly. Reassurance front-and-center: same email addresses, same Outlook, nothing changes for the team.
## Problems Encountered
- **Policy 3 GDAP bug** (Cascades CA design). Howard's policy 3 ("All users" minus `SG-External-Signin-Allowed`) would block ACG GDAP partner admins because Microsoft's "All users" CA target includes service-provider foreign principals. Caught while walking through Mike's question on how break-glass relates to admin@/sysadmin@/GDAP. Resolved: explicit "Service provider users" exclusion added to the design.
- **Bootstrap chicken-and-egg false alarm.** Initially raised concern that new shared phones (non-compliant until first user sign-in) would be trapped by policy 2 (require MFA on non-compliant device). Mike pointed out the bootstrap user is always an admin (sysadmin@ or Howard) on Cascades Wi-Fi — admin has MFA, policy 1 doesn't fire on-network, policy 2 prompts MFA which admin satisfies, device flips compliant. No CA exception needed. Concern retracted in favor of operational rule: phones don't get handed to caregivers until SDM bootstrap is done.
- **Misread of "schedule with Howard."** When Mike said "schedule with Howard to get that going," I interpreted it as invoking the `/schedule` skill (remote agent on cron). Started the AskUserQuestion flow. Mike clarified he meant coordinate with Howard via session log when convenient for him, not create a remote agent. Backed out. Coordination is now via the existing session log + cross-user note hook.
- **First Michelle email draft was too technical.** Initial draft used peer-MSP tone with DKIM/DMARC explanation, ~450 words. Mike clarified he's already her IT provider and Michelle is "stubborn and afraid of change" — needed non-technical, reassurance-forward framing without DNS jargon. Revised to ~280 words with GoDaddy-as-constraint framing, "nothing changes for users" emphasis, and a soft close on scheduling a call.
## Configuration Changes
### Files created
- `.claude/skills/remediation-tool/references/audit-retention-runbook.md` — full design + per-tenant runbook (Phase 1 ACG-side resources, Phase 2 customer-tenant Diagnostic Settings, Phase 3 verification, Phase 4 deferred OMA harvester, operational + cost notes)
- `clients/cascades-tucson/session-logs/2026-04-29-mike-audit-retention-design-and-handoff.md` — full handoff log for Howard (covers all 6 decisions, runbook reference, order of operations, four-paths failure-domain map)
- `session-logs/2026-04-29-session.md` — this file
### Files modified
- `clients/cascades-tucson/CONTEXT.md` — added "M365 admin model" section (full admin landscape: on-prem AD Administrator, Connect-excluded admin@/sysadmin@, GDAP partner principals, two break-glass accounts; CA targeting consequences)
### Files NOT touched (orthogonal to this session, deliberately not committed in this log's commit)
- `projects/radio-show/audio-processor/server/main.py` — pre-existing modification, separate radio-show Q&A scoring work
- `projects/radio-show/audio-processor/classify_qa_quality.py` — pre-existing untracked file from radio-show work
- `.claude/scheduled_tasks.lock` — transient runtime file
### Temp files (not part of repo)
- `C:/Users/guru/AppData/Local/Temp/michelle-email-draft.txt` — Michelle Sora email draft (Mike opened in Notepad to review/send)
- `C:/Users/guru/AppData/Local/Temp/save_narrative_prompt.txt` — Ollama prompt for this log
- `C:/Users/guru/AppData/Local/Temp/save_narrative_output.md` — Ollama narrative output
## Commands & Outputs
### /sync at session start
- Pulled 3 commits (Howard cascades close-out, Mike earlier auto-sync, Mike PIM gap note)
- Pushed 1 local commit (radio-show `import_to_sqlite.py` from earlier work)
- Vault: clean both directions
- HEAD after sync: `6b63c15`
### DNS reconnaissance — pro-techhelps.com
```
MX: protechhelps-com0i.mail.protection.outlook.com (pri 0)
SPF: "v=spf1 include:spf.protection.outlook.com -all" [direct M365]
DMARC: NXDOMAIN
DKIM: selector1/selector2 NXDOMAIN
WHOIS: registered 2016-12-09, GoDaddy registrar, NS69/NS70.DOMAINCONTROL.COM
A: 3.33.130.190, 15.197.148.33 (AWS-hosted)
```
### DNS reconnaissance — pro-techservices.co
```
MX: protechservices-co0i.mail.protection.outlook.com (pri 0)
SPF: "v=spf1 include:secureserver.net -all" [GoDaddy-resold path]
"NETORG6702699.onmicrosoft.com" [marker]
"v=verifydomain MS=9047794" [M365 verification]
DMARC: NXDOMAIN
DKIM: selector1/selector2 NXDOMAIN
A: 76.223.105.230, 13.248.243.5 (AWS-hosted)
Tenant: NETORG6702699.onmicrosoft.com (GoDaddy-resold M365)
```
### Email header analysis (Jenny Holtsclaw test message)
- From: `Jenny Holtsclaw <jholtsclaw@pro-techservices.co>` to `mike@azcomputerguru.com`
- Subject: `test`, sent 2026-04-29 22:31:34 UTC
- DKIM: `pass header.d=NETORG6702699.onmicrosoft.com` (fallback signer, not aligned with From: domain)
- SPF: pass on first hop, fail on Inky relay (normal forwarding artifact)
- DMARC: bestguesspass (no policy)
- compauth: `pass reason=130`
- Inky verdict: `X-IPW-Inky-PCL: 4` (high phish suspicion), `X-IPW-Inky-SCL: 0` (not spam), `X-Inky-Quarantine: False`
## Credentials & Secrets
No new credentials created or rotated this session. Existing references confirmed:
- **ACG Azure subscription**: `e507e953-2ce9-4887-ba96-9b654f7d3267`
- Vault: `services/azure-trusted-signing.sops.yaml`
- Existing usage: `gururmm-signing-rg` for GuruRMM Trusted Signing cert profile
- New planned usage: `rg-audit-*` resource groups for HIPAA-tier audit retention
- **Cascades tenant ID**: `207fa277-e9d8-4eb7-ada1-1064d2221498`
- **Cascades Tenant Admin SP appId**: `709e6eed-0711-4875-9c44-2d3518c47063` (objectId in Cascades: `a5fa89a9-b735-4e10-b664-f042e265d137`)
- **Cascades Conditional Access Administrator role template ID**: `b1be1c3e-b65d-4f19-8427-f6fa0d97feb9`
- **Microsoft Graph appRole `Policy.Read.All`**: `246dd0d5-5bd0-4def-940b-0421030a5b68`
- **Pro-Tech Services tenant**: `NETORG6702699.onmicrosoft.com` (GoDaddy-resold; not under ACG management)
## Infrastructure & Servers
- ACG Azure subscription `e507e953-2ce9-4887-ba96-9b654f7d3267` — westus2 region, owner: Mike Swanson, currently single resource group `gururmm-signing-rg`. Planned expansion: `rg-audit-cascadestucson` and future `rg-audit-*` per HIPAA-tier client.
- Cascades on-prem: CS-SERVER `192.168.2.254` (DC + file server, domain `cascades.local`); pfSense `192.168.0.1`; Synology `192.168.0.120:5000`. Cascades named location id `061c6b06-b980-40de-bff9-6a50a4071f6f` (both WANs trusted: `72.211.21.217/32` + `184.191.143.62/32`).
- Pro-Tech Services external: M365 EOP MX endpoints (`protechservices-co0i.mail.protection.outlook.com`, `protechhelps-com0i.mail.protection.outlook.com`); Inky phishfence (`ipw.inkyphishfence.com` 100.24.129.5) in inbound mail path to ACG.
## Pending / Incomplete Tasks
### Mike's outstanding items (Cascades audit retention)
- [ ] Order TWO YubiKeys ($50 total — one for Cascades sealed envelope via Howard, one for ACG safe)
- [ ] One-time: grant Howard Owner on ACG subscription
```bash
az role assignment create \
--assignee howard.enos@azcomputerguru.com \
--role "Owner" \
--scope "/subscriptions/e507e953-2ce9-4887-ba96-9b654f7d3267"
```
- [ ] One-time: resource lock on signing RG
```bash
az lock create --name signing-protect --lock-type CanNotDelete \
--resource-group gururmm-signing-rg \
--notes "Protect GuruRMM Trusted Signing infra from accidental deletion"
```
- [ ] One-time: $50/mo cost alert via Cost Management UI
### Mike's outstanding items (Pro-Tech Services)
- [ ] Review the Michelle Sora email draft in Notepad (`C:/Users/guru/AppData/Local/Temp/michelle-email-draft.txt`)
- [ ] Send the email when ready; possibly draft a parallel Jenny variant if needed
- [ ] If Michelle accepts: schedule discovery call covering user count, tenant access verification (admin.microsoft.com test), other domains, third-party integrations
- [ ] If migration proceeds: build proposal, T2T migration plan, license cutover sequencing
### Howard's outstanding items (Cascades, in order)
1. Backfill sweep against 6 ACG tenants (~5 min, idempotent): bg-builders, cascades-tucson, cw-concrete, dataforth, heieck-org, mvan
2. Pilot Outlook + LinkRx/Helpany apps + first phone compliance flip (resume from prior session)
3. Pilot user + cloud group `SG-Caregivers-Pilot`
4. Audit retention buildout (RG, Storage Account with lifecycle, LAW, Diagnostic Settings) per the runbook
5. Two break-glass accounts + YubiKey enrollment + `SG-Break-Glass` group + sign-in alert KQL rule on LAW
6. CA policies in Report-only → 24-48h review → flip to On
7. Phase 3 production rollout (after AD prereq cleanup + Entra Connect staging exit)
### Tracked TODOs (not blocking)
- [ ] Teach `role_assigned` helper about `roleAssignmentSchedules` (cosmetic noise only on PIM-managed assignments)
- [ ] Build OMA Activity API harvester for UAL (4-6 hours dev when ready, after Cascades audit retention runs cleanly for 30d)
- [ ] Codify `onboard-tenant.sh --enable-audit-archive` flag (after pilot validated)
- [ ] Codify `onboard-tenant.sh --enable-breakglass` flag (after 3-4 HIPAA tenants on pattern)
## Reference Information
### Files / paths
- Audit retention runbook: `.claude/skills/remediation-tool/references/audit-retention-runbook.md`
- Cascades handoff log: `clients/cascades-tucson/session-logs/2026-04-29-mike-audit-retention-design-and-handoff.md`
- Cascades CONTEXT (M365 admin model added): `clients/cascades-tucson/CONTEXT.md`
- Howard's prior session log: `clients/cascades-tucson/session-logs/2026-04-29-howard-cascades-bypass-pilot-phase-b-buildout.md`
- onboard-tenant.sh (now patched with CA Admin role + Policy.Read.All backfill): `.claude/skills/remediation-tool/scripts/onboard-tenant.sh`
- Michelle email draft (temp): `C:/Users/guru/AppData/Local/Temp/michelle-email-draft.txt`
### Vault entries referenced
- `services/azure-trusted-signing.sops.yaml` — ACG Azure subscription
- `clients/cascades-tucson/m365-admin.sops.yaml` — Cascades M365 admin
- `clients/cascades-tucson/m365-sysadmin.sops.yaml` — Cascades M365 sysadmin
- `clients/cascades-tucson/wifi-cscnet.sops.yaml`, `mdm-service-account.sops.yaml`, `pfsense-firewall.sops.yaml`
### Future vault entries (not yet created)
- `clients/cascades-tucson/breakglass1.sops.yaml` — primary break-glass (when account created)
- `clients/cascades-tucson/breakglass2.sops.yaml` — secondary break-glass
### External
- Microsoft Diagnostic Settings on Entra: ARM endpoint `https://management.azure.com/providers/microsoft.aadiam/diagnosticSettings/{name}?api-version=2017-04-01-preview` (NOT a Graph endpoint — Howard validates exact call shape during dry-run)
- Microsoft directory role template — Conditional Access Administrator: `b1be1c3e-b65d-4f19-8427-f6fa0d97feb9`
- Office 365 Management Activity API (UAL harvester future work): `/api/v1.0/{tenantId}/activity/feed/subscriptions/content?contentType=...`
### Cost model (audit retention)
- Per HIPAA-tier tenant: ~$0.501.00/mo (LAW $0.23 ingest + $0.03 retention + Storage $0.15 lifecycle blended)
- ACG cumulative at 5 HIPAA tenants: ~$510/mo
- Forensics rehydration (archive blob): ~$50100 per incident retrieval, one-time per event