Session log: Cascades audit retention design + Pro-Tech Services email investigation

Cascades:
- Approved Howard's corrected 4-policy CA bypass design
- Caught + fixed policy 3 GDAP bug (Service provider users exclusion)
- Decided hybrid LAW + Storage Account audit retention (ACG-billed,
  reuse existing Trusted Signing Azure subscription, westus2)
- Wrote full audit retention runbook for Howard
- Reshaped break-glass to two accounts (split-storage YubiKeys)
- Documented Cascades M365 admin model (admin@/sysadmin@ Connect-excluded
  by design; local AD Administrator separate identity layer)
- Decided Howard gets Owner on ACG sub with guardrails (resource lock +
  cost alert) instead of per-RG Contributor

Pro-Tech Services:
- DNS recon of pro-techhelps.com + pro-techservices.co
- Diagnosed calendar invite delivery issue (DKIM domain mismatch +
  no DMARC = strict receivers silently drop invites)
- Drafted non-technical IT-provider migration email to Michelle Sora

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-04-29 17:05:41 -07:00
parent 6b63c154d2
commit 447b90e092
4 changed files with 706 additions and 0 deletions

View File

@@ -0,0 +1,280 @@
# Audit Retention Runbook (HIPAA-tier)
ACG-side architecture for capturing and retaining 6-year audit logs from customer M365 tenants. First implementation: Cascades Tucson.
## Why this exists
HIPAA §164.312(b) requires audit controls; §164.316(b)(2)(i) requires 6-year retention.
M365 native retention falls short of 6 years on every relevant log source:
| Source | Native | Gap to 6yr |
|---|---|---|
| Entra sign-in / audit / provisioning logs | 30d | 5y 11m |
| Purview Unified Audit Log (Exchange/SP/OD/Teams) | 180d | 5.5y |
| Intune audit | 1y | 5y |
| Defender alerts | 30d | 5y 11m |
We close the gap by exporting via Diagnostic Settings to ACG-owned destinations and supplementing UAL with a poll-based harvester.
## Architecture
Hybrid: Log Analytics for live forensics + Storage Account for cold archive.
```
Customer Tenant (Cascades, etc.)
Diagnostic Settings ──┬──> [LAW] law-<short>-audit (90d interactive)
└──> [SA] stor<short>audit (lifecycle: hot 30d -> cool 60d -> archive 6y -> delete)
Customer Tenant
/v1.0/auditLogs (UAL) ──> ACG Function (poll q4h, per tenant) ──> SA blob path /ual/{yyyy}/{MM}/{dd}/...
```
Both LAW and SA receive the same stream from Diagnostic Settings — one ingest path, two retention tiers. The LAW is for human queries; the SA is for compliance archive.
UAL lacks a Diagnostic Settings hook, so we poll the Office 365 Management Activity API on a schedule and write JSON to the same Storage Account.
## Cost model
Per HIPAA-tier tenant per month: **~$0.501.00**
- LAW ingest: ~$2.30/GB × ~0.1 GB/mo = ~$0.23/mo
- LAW retention (90d): ~$0.10/GB × peak ~0.3 GB = ~$0.03/mo
- Storage Account (cool/archive blended over 6y): ~$0.15/mo
- Function compute (shared across tenants): rounded to zero
- Egress (only on forensics retrieval): pay-per-use, typically zero/mo
ACG cumulative at 5 HIPAA tenants: ~$510/mo. Budget headroom for forensics rehydration: ~$50100 per incident retrieval (one-time).
## Prerequisites
### ACG-side (one-time)
- **Azure subscription:** reuse existing `e507e953-2ce9-4887-ba96-9b654f7d3267` — the ACG-owned subscription set up for GuruRMM Trusted Signing (cert profile `gururmm-public-trust` under `gururmm-signing-rg`). Vault entry: `services/azure-trusted-signing.sops.yaml`.
- Rationale: Mike already has Owner on this sub; no new billing relationship needed; single tenant boundary; Azure RBAC + RG-level tagging keeps audit data isolated from signing data.
- **Existing usage in this sub:** `gururmm-signing-rg` (Trusted Signing for GuruRMM agent binaries). Audit RGs (`rg-audit-*`) will be RG-isolated from signing.
- Future split: when we have 3+ HIPAA tenants or a compliance audit requires hard boundary, move audit RGs to a dedicated `acg-msp-compliance` subscription via `az resource move`.
- **RBAC for Howard:** Owner at the subscription level — matches the existing operational trust model (Howard has "Full trust — same access as admin" per `CLAUDE.md`). One-time grant unblocks all future MSP-side Azure self-service. Mike runs:
```bash
az role assignment create \
--assignee howard.enos@azcomputerguru.com \
--role "Owner" \
--scope "/subscriptions/e507e953-2ce9-4887-ba96-9b654f7d3267"
```
Guardrails to keep Owner-Howard low-risk:
- Resource lock on `gururmm-signing-rg`: `az lock create --name signing-protect --lock-type CanNotDelete --resource-group gururmm-signing-rg`
- PAYG cost alert at ~$50/mo via Cost Management (UI task)
- **Region:** `westus2` for all audit resources. Latency-friendly to Tucson, mature service availability, no HIPAA-relevant cost difference vs other US regions.
### Customer-tenant side (per onboarded HIPAA tenant)
- **Tenant Admin SP must have** `Policy.Read.All` (already in updated `onboard-tenant.sh`)
- **Tenant Admin SP must have** the directory role **Security Administrator** OR a custom role with `Microsoft.Insights/diagnosticSettings/write` to create Diagnostic Settings on Entra. (Conditional Access Administrator alone does NOT cover Monitor scope.)
- **Tenant Admin app manifest must include** `AuditLog.Read.All` and either Graph's `IdentityRiskyUser.Read.All` (already present per `SEC_INV_GRAPH_ROLES`) or follow-on for Defender export
### Tag schema (apply to every resource)
```
client = cascadestucson
tier = hipaa
service = audit
cost-center = msp-audit
created-by = howard | mike | onboard-tenant.sh
```
## Per-tenant onboarding — Cascades example
Substitute `<short>` = `cascades` (lowercase, no punctuation, ≤8 chars). Substitute `<full>` = `cascadestucson`.
### Phase 1: ACG-side resource provisioning
Howard runs from his workstation with az CLI logged into ACG home tenant:
```bash
SUB="e507e953-2ce9-4887-ba96-9b654f7d3267"
SHORT="cascades"
FULL="cascadestucson"
REGION="westus2"
RG="rg-audit-${FULL}"
az account set --subscription "$SUB"
# Resource group
az group create --name "$RG" --location "$REGION" \
--tags client="$FULL" tier=hipaa service=audit cost-center=msp-audit created-by=howard
# Storage Account (must be globally unique, lowercase alphanumeric, 3-24 chars)
SA_NAME="stor${SHORT}audit"
az storage account create \
--name "$SA_NAME" \
--resource-group "$RG" \
--location "$REGION" \
--sku Standard_LRS \
--kind StorageV2 \
--access-tier Cool \
--min-tls-version TLS1_2 \
--allow-blob-public-access false \
--tags client="$FULL" tier=hipaa service=audit cost-center=msp-audit
# Containers
SA_KEY=$(az storage account keys list -g "$RG" -n "$SA_NAME" --query '[0].value' -o tsv)
for c in entra-signin entra-audit entra-provisioning intune-audit defender-alerts ual; do
az storage container create --name "$c" --account-name "$SA_NAME" --account-key "$SA_KEY"
done
# Lifecycle policy: hot 30d -> cool 60d -> archive 6y -> delete
cat > /tmp/lifecycle.json <<'EOF'
{
"rules": [{
"name": "hipaa-6y-tier-down",
"enabled": true,
"type": "Lifecycle",
"definition": {
"filters": { "blobTypes": ["blockBlob"] },
"actions": {
"baseBlob": {
"tierToCool": { "daysAfterModificationGreaterThan": 30 },
"tierToArchive": { "daysAfterModificationGreaterThan": 90 },
"delete": { "daysAfterModificationGreaterThan": 2190 }
}
}
}
}]
}
EOF
az storage account management-policy create \
--account-name "$SA_NAME" \
--resource-group "$RG" \
--policy @/tmp/lifecycle.json
# Immutability (legal hold) — defer until pilot validated.
# When ready: az storage container immutability-policy create ...
# Log Analytics Workspace
LAW_NAME="law-${SHORT}-audit"
az monitor log-analytics workspace create \
--resource-group "$RG" \
--workspace-name "$LAW_NAME" \
--location "$REGION" \
--retention-time 90 \
--tags client="$FULL" tier=hipaa service=audit cost-center=msp-audit
```
### Phase 2: Customer-tenant Diagnostic Settings
Performed against Cascades tenant using Tenant Admin token:
```bash
CASCADES_TENANT="207fa277-e9d8-4eb7-ada1-1064d2221498"
TOKEN=$(bash .claude/skills/remediation-tool/scripts/get-token.sh "$CASCADES_TENANT" tenant-admin)
LAW_RESOURCE="/subscriptions/${SUB}/resourceGroups/${RG}/providers/Microsoft.OperationalInsights/workspaces/${LAW_NAME}"
SA_RESOURCE="/subscriptions/${SUB}/resourceGroups/${RG}/providers/Microsoft.Storage/storageAccounts/${SA_NAME}"
# Entra Diagnostic Settings (covers sign-in + audit + provisioning + non-interactive)
curl -X PUT \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
"https://graph.microsoft.com/beta/auditLogs/directoryAudits" \
-d @- <<EOF
{
"name": "acg-audit-export",
"logs": [
{"category": "AuditLogs", "enabled": true},
{"category": "SignInLogs", "enabled": true},
{"category": "NonInteractiveUserSignInLogs", "enabled": true},
{"category": "ServicePrincipalSignInLogs", "enabled": true},
{"category": "ManagedIdentitySignInLogs", "enabled": true},
{"category": "ProvisioningLogs", "enabled": true},
{"category": "ADFSSignInLogs", "enabled": true},
{"category": "RiskyUsers", "enabled": true},
{"category": "UserRiskEvents", "enabled": true}
],
"workspaceId": "${LAW_RESOURCE}",
"storageAccountId": "${SA_RESOURCE}"
}
EOF
```
Note: Entra Diagnostic Settings actually go through Azure Resource Manager (not Graph), and the proper endpoint is:
```
PUT https://management.azure.com/providers/microsoft.aadiam/diagnosticSettings/{name}?api-version=2017-04-01-preview
```
Authenticate against ARM (`https://management.azure.com`), not Graph. The Tenant Admin SP needs `Microsoft.AzureActiveDirectory/diagnosticSettings/write` permission, granted via the Security Administrator directory role. Howard: validate the working endpoint during dry-run; the cURL above is the conceptual shape, not the exact call.
### Phase 3: Verification (1h after setup)
```bash
# Query LAW for recent sign-ins
az monitor log-analytics query \
--workspace "$LAW_NAME" \
--resource-group "$RG" \
--analytics-query "SigninLogs | take 5 | project TimeGenerated, UserPrincipalName, ResultType"
# Confirm Storage Account is receiving blobs
az storage blob list --container-name insights-logs-signinlogs \
--account-name "$SA_NAME" --account-key "$SA_KEY" --num-results 5
```
If LAW returns rows and SA has blobs, the export is live.
### Phase 4: UAL harvester (deferred — separate buildout)
UAL has no Diagnostic Settings export. Approach when we get to it:
- Azure Function (Python or PowerShell), timer trigger every 4h
- Per onboarded tenant: managed identity granted `ActivityFeed.Read` against Office 365 Management API
- Polls `/api/v1.0/{tenantId}/activity/feed/subscriptions/content?contentType=Audit.AzureActiveDirectory|Audit.Exchange|Audit.SharePoint|Audit.General|DLP.All`
- Writes raw JSON to `<sa>/ual/{yyyy}/{MM}/{dd}/{tenantId}/{contentType}-{timestamp}.json`
- Deduplicates via `contentId`
Codify the design once we've run it manually for a few weeks against Cascades. Estimated build: 4-6 hours dev + test.
## Operational
### Quarterly verification (per tenant, ~10 min)
1. Run a `SigninLogs | summarize count() by bin(TimeGenerated, 1d) | order by TimeGenerated desc` query in LAW. Expect daily volume.
2. Spot-check Storage Account container blob counts and timestamps.
3. Confirm lifecycle policy hasn't drifted: `az storage account management-policy show -g $RG -n $SA_NAME`.
4. Cost: `az consumption usage list --start-date $(date -d '30 days ago' +%Y-%m-%d) --end-date $(date +%Y-%m-%d) --query "[?contains(instanceId,'$SA_NAME')||contains(instanceId,'$LAW_NAME')]"` — should be ~$1/mo per tenant.
### Forensics retrieval
- **090 days:** KQL on LAW directly. Sub-second queries.
- **90 days 6 years:** rehydrate blob from archive tier.
```bash
az storage blob set-tier --tier Hot --rehydrate-priority Standard \
--account-name $SA_NAME --container-name <c> --name <blob>
```
Standard rehydrate SLA: ~15 hours. High-priority: ~1 hour, costs ~10x more.
### When to upgrade subscription split
- Triggers: 3+ HIPAA tenants, an external compliance audit asking about subscription scope, or one tenant generating >10 GB/month
- Path: provision new subscription `acg-msp-compliance`, move RGs via `az resource move`, update Diagnostic Settings destination ARM IDs
## Onboarding integration (codify after pilot validated)
Once Cascades is running cleanly for 30 days, fold the per-tenant Phase 1 + Phase 2 into `onboard-tenant.sh` as a flag:
```bash
bash onboard-tenant.sh <tenant-id> --enable-audit-archive --client-shortname <short>
```
Implementation outline:
- Read `--enable-audit-archive` flag
- Provision RG + SA + LAW under ACG sub (idempotent: skip if exists)
- Issue PUT for Diagnostic Settings against the customer tenant
- Append "Audit archive: [OK]" row to the final status table
Until codified, Howard runs the runbook manually per tenant. Cascades is the only HIPAA-tier tenant currently — this is fine.
## Open questions / future work
- **UAL harvester:** designed but not built. Punt until pilot CA cutover is done.
- **Defender for Office 365 export:** does it expose Diagnostic Settings? If not, may need OMA-style poll. Check during Cascades verification.
- **MDE alerts:** ditto.
- **Sentinel:** the natural upgrade path if alerting becomes important. Cost crosses ~$200/mo at first tenant — defer until justified by an actual operational need.
- **Break-glass sign-in alert:** when break-glass admin lands, KQL alert rule on LAW: `SigninLogs | where UserPrincipalName == "breakglass-csc@cascadestucson.com"` → Action Group → email Mike + Howard. Lives in this same LAW.

View File

@@ -21,6 +21,28 @@ Full contact list + Wi-Fi, KPAX, M365 admin, UniFi hardware MACs, GoDaddy are in
| `svc-audit-upload` | service account for Syncro audit upload to `AuditDrop$` share | `clients/cascades-tucson/svc-audit-upload.sops.yaml` | | `svc-audit-upload` | service account for Syncro audit upload to `AuditDrop$` share | `clients/cascades-tucson/svc-audit-upload.sops.yaml` |
| `\\CS-SERVER\homes` | file share at `D:\Homes`; per-user subfolders for folder redirection. Domain Users: Change. Domain Admins: Full. **EncryptData currently false — HIPAA workitem to flip on.** | — | | `\\CS-SERVER\homes` | file share at `D:\Homes`; per-user subfolders for folder redirection. Domain Users: Change. Domain Admins: Full. **EncryptData currently false — HIPAA workitem to flip on.** | — |
## M365 admin model
Tenant ID: `207fa277-e9d8-4eb7-ada1-1064d2221498`
Mike's design intent (confirmed 2026-04-29): **the cloud admin layer is fully separated from the on-prem AD admin layer.**
| Account | Layer | Synced via Connect? | Purpose |
|---|---|---|---|
| On-prem AD `Administrator` | On-prem only | No (separate identity layer) | DC + file server admin, GPO, on-prem services. Never authenticates to M365. |
| `admin@cascadestucson.com` | Cloud-only | **No — intentionally Connect-excluded** | Cascades day-to-day cloud GA |
| `sysadmin@cascadestucson.com` | Cloud-only | **No — intentionally Connect-excluded** | Howard's tech account / cloud admin work |
| ACG GDAP partner principals | Foreign principals | N/A | MSP delivery (Mike + Howard from `@azcomputerguru.com`) |
| `breakglass1-csc@cascadestucson.com` | Cloud-only | No (definitionally) | Emergency primary — FIDO2 YubiKey at Cascades sealed envelope |
| `breakglass2-csc@cascadestucson.com` | Cloud-only | No (definitionally) | Emergency secondary — FIDO2 YubiKey at ACG safe |
**When Entra Connect exits staging mode** (Wave 0.5 G3-G5), admin@ and sysadmin@ stay cloud-only — they must remain in the Connect filter exclusion. Verify after every Connect sync rule change.
CA targeting consequences:
- admin@/sysadmin@: subject to all Cascades CA; must be in `SG-External-Signin-Allowed` for off-network admin work
- `SG-Break-Glass`: excluded from all CA (must add exclusion to every new policy)
- ACG GDAP foreign principals: excluded from blocking policies via the "Service provider users" condition (Microsoft's CA UI), NOT via group membership
## GuruRMM ## GuruRMM
- Client: **Cascades of Tucson** (code `CASC`, id `42e1b0e3-f8b7-4fc5-86bd-06bdbb073b7f`) - Client: **Cascades of Tucson** (code `CASC`, id `42e1b0e3-f8b7-4fc5-86bd-06bdbb073b7f`)

View File

@@ -0,0 +1,218 @@
# 2026-04-29 — Audit retention design + handoff to Howard
## User
- **User:** Mike Swanson (mike) (with Claude coordination)
- **Machine:** GURU-BEAST-ROG
- **Role:** admin
- **Session span:** 2026-04-29 ~13:30 PT (post-/sync follow-up to Howard's overnight session)
## Note for Howard
Closing several decisions that were waiting on me from your 2026-04-29 close-out log. **Read these before resuming the pilot build.**
### 1. CA bypass design — APPROVED with two corrections to policy 3
Your corrected 4-policy design (DELETE existing all-users-MFA + 4 new policies) is approved in shape. Logic checks out: caregiver on Cascades + compliant = password only, off-network = blocked, non-compliant on-site = MFA. **Two corrections to policy 3 user targeting before you build it:**
**Correction 1 — admin off-site access:** `admin@cascadestucson.com` and `sysadmin@cascadestucson.com` need to be in `SG-External-Signin-Allowed` so policy 3 doesn't block you from working from home. (You probably already have this in mind, but make it an explicit checkbox.)
**Correction 2 — GDAP partner access (the bug I caught):** Policy 3 as written ("All users" minus `SG-External-Signin-Allowed`) blocks ACG GDAP partner admins because Microsoft's "All users" *includes* service-provider foreign principals unless explicitly excluded. You and I signing into Cascades from `@azcomputerguru.com` would lose remote MSP access at cutover.
**Revised policy 3 user targeting:**
- Include: All users
- Exclude: `SG-External-Signin-Allowed` + `SG-Break-Glass` + **Service provider users** (the B2B/GDAP exclusion in CA's user picker)
Verify the exact CA UI label — it's the "External users" subselection where "Service provider users" appears as a checkbox. If Microsoft's renamed it again, the goal is "exclude foreign principals coming in via GDAP."
**Other build-time notes still valid:**
- `SG-Break-Glass` group must exist (with breakglass1 + breakglass2 — see §4) before you build any of the 4 policies. Excluding a non-existent group is silent failure.
- Stage in Report-only first, watch logs for 24-48h, watch for any sign-in from `@azcomputerguru.com` foreign principals being blocked — that's the canary on correction 2.
Bootstrap chicken-and-egg concern I raised earlier is retracted (on-network filter excuses policy 1 + 3, policy 2 prompts MFA which admin can satisfy → compliance flips → caregivers clean flow on subsequent sign-ins). Operational rule: phones don't get handed to caregivers until SDM bootstrap is done.
### 2. Audit retention — DECIDED: hybrid LAW + Storage, ACG-billed, on existing sub
Full design and runbook lives at:
**`.claude/skills/remediation-tool/references/audit-retention-runbook.md`**
Read that before you start. Summary:
- **Architecture:** hybrid. LAW for 90-day live forensics + Storage Account for 6-year cold archive (lifecycle: hot 30d → cool 60d → archive 6y → delete). Both fed by the same Diagnostic Settings export — single ingest, two retention tiers.
- **Subscription:** reuse the existing `e507e953-2ce9-4887-ba96-9b654f7d3267` (the GuruRMM Trusted Signing sub — Mike already has Owner). RG-isolated from the signing RG. Vault: `services/azure-trusted-signing.sops.yaml`.
- **Region:** `westus2`
- **Cost:** ~$0.501.00/mo per HIPAA-tier tenant. ACG-billed, bundled into HIPAA-tier MRR.
- **UAL handling:** poll-based harvester (Office 365 Management Activity API → Storage Account blobs). DEFERRED — design only, build after pilot CA cutover. Punt for now; M365 native 180-day UAL retention covers us short-term.
- **Codify path:** once Cascades runs cleanly for 30 days, fold Phase 1 + Phase 2 into `onboard-tenant.sh` as `--enable-audit-archive`.
**RBAC needed before you can start:** Mike needs to grant you Contributor on `rg-audit-cascadestucson` once the RG exists. The az command for that is in the runbook prereqs. Mike: that's on you to run when Howard creates the RG.
**Sequence (per the runbook):**
1. ACG-side resource provisioning (RG, Storage Account, lifecycle policy, LAW) — Howard runs az CLI
2. Customer-tenant Diagnostic Settings (Entra → both destinations) — Tenant Admin token + ARM endpoint
3. Verification (1h after, query LAW, check SA blobs)
4. Defender / Intune Diagnostic Settings — discover during verification, add as available
5. UAL harvester — DEFERRED
**Caveat in the runbook:** the cURL in Phase 2 is conceptual. Entra Diagnostic Settings actually go through ARM (`management.azure.com/providers/microsoft.aadiam/diagnosticSettings/...`), not Graph. Validate the working endpoint during your dry-run. Tenant Admin SP probably needs Security Administrator directory role (or a custom role with `Microsoft.AzureActiveDirectory/diagnosticSettings/write`) on top of CA Admin to create Diagnostic Settings on Entra. Worth confirming early — if true, add to the next `onboard-tenant.sh` patch.
### 3. Backfill sweep — APPROVED, scheduling
Patched `onboard-tenant.sh` will be re-run against all 6 ACG customer tenants tonight (21:00 PT) so they all get the CA Admin role + `Policy.Read.All` backfill: bg-builders, cascades-tucson (idempotent — Howard's PIM-set role will trigger Conflict-fallback, that's fine), cw-concrete, dataforth, heieck-org, mvan. Mike to /schedule the agent or run manually depending on availability tonight.
**Known noise:** the `role_assigned` helper queries legacy `roleAssignments` and may report MISSING for PIM-managed assignments. Script handles it correctly via Conflict-fallback. Cosmetic only. TODO to teach `role_assigned` about `roleAssignmentSchedules` is tracked but not blocking.
### 4. Break-glass admin — design APPROVED (revised: TWO accounts)
Mike's question on the existing admin@ + sysadmin@ + GDAP partner access prompted a reshape. Break-glass is complementary to those, not redundant — see §6 for the four-paths rationale. Net change from my earlier "1 break-glass + 1 YubiKey" recommendation: **two break-glass accounts, two YubiKeys, split physical storage.**
**Accounts:**
| | Primary | Secondary |
|---|---|---|
| UPN | `breakglass1-csc@cascadestucson.com` | `breakglass2-csc@cascadestucson.com` |
| Type | Cloud-only Global Admin, no license | Cloud-only Global Admin, no license |
| FIDO2 | YubiKey #1 | YubiKey #2 |
| Physical storage | Cascades on-site, sealed envelope, sign-out audit | ACG office safe (Mike) |
| Vault entry | `clients/cascades-tucson/breakglass1.sops.yaml` | `clients/cascades-tucson/breakglass2.sops.yaml` |
Both accounts:
- Cloud-only (NOT synced from on-prem) — survives Entra Connect breaks
- Excluded from password expiration policy
- 32-char randomly generated password (separate per account, both vaulted)
- Member of `SG-Break-Glass` group, which is excluded from CA policies 1, 2, 3 and any future CA
- Sign-in alerts via LAW KQL alert rule:
```kql
SigninLogs
| where UserPrincipalName in ("breakglass1-csc@cascadestucson.com", "breakglass2-csc@cascadestucson.com")
| where ResultType == 0 // success only — for any-attempt alert, drop this
```
Action Group → email Mike + Howard immediately. Builds after the LAW lands as part of audit retention.
- Quarterly test: sign in with each, verify FIDO2 works, verify password still valid. Calendar both for the same day.
**Storage split rationale:** isolates failure domains — office fire, ACG compromise, individual rogue actor all need both keys to hit zero recovery, and they're physically separated. Microsoft's official guidance is exactly this two-account pattern.
**Cost:** $50 for two YubiKeys (was $25 for one). Trivial.
**Build order:**
1. Create both accounts (Cloud-only, license-free, GA role)
2. Generate + vault both 32-char passwords
3. Create `SG-Break-Glass` group, add both members
4. Register YubiKey #1 to breakglass1, YubiKey #2 to breakglass2 (both at the workstation, then physically separate)
5. Test sign-in with both before exiting the build session
6. Seal YubiKey #1 in envelope, store at Cascades; YubiKey #2 to ACG safe
7. Verify CA exclusion of `SG-Break-Glass` group is in place on every existing CA policy BEFORE building the new 4-policy design
When we have 3-4 HIPAA tenants on this pattern, codify into `onboard-tenant.sh` as `--enable-breakglass` (creates both accounts + group + alert rule per tenant).
### 5. ALIS BAA + Risk Analysis doc — your action items unchanged
These are still on your to-do (email Meredith for ALIS BAA; draft Risk Analysis doc via Ollama for Mike's review). Independent of CA + audit retention. Do whenever.
### 6. Four admin paths into Cascades — failure-domain map
Mike clarified the admin account model: **`admin@` and `sysadmin@` are intentionally excluded from Entra Connect sync — they live cloud-only.** The local AD `Administrator` is a separate on-prem identity, not part of the M365 admin model.
Final picture:
| Account | Lives | Synced? | CA posture |
|---|---|---|---|
| On-prem AD `Administrator` | Cascades AD only | No — never authenticates to M365 | N/A (out of scope here; managed via on-prem AD security separately) |
| `admin@cascadestucson.com` | Cloud-only (Connect-excluded by design) | No, intentional | Subject to CA + must be in `SG-External-Signin-Allowed` |
| `sysadmin@cascadestucson.com` | Cloud-only (Connect-excluded by design) | No, intentional | Subject to CA + must be in `SG-External-Signin-Allowed` |
| ACG GDAP partner (Mike + Howard from `@azcomputerguru.com`) | Foreign principal | N/A | Subject to Cascades CA — must be excluded from policy 3 via "Service provider users" |
| `breakglass1-csc@` / `breakglass2-csc@` | Cloud-only (definitionally) | No | `SG-Break-Glass` group excluded from all CA |
Failure modes each path covers:
| Path | Failure modes it covers |
|---|---|
| `admin@` / `sysadmin@` | Day-to-day Cascades-side admin work; immune to Entra Connect issues (excluded from sync) |
| ACG GDAP partner | Day-to-day MSP delivery; no dependency on Cascades-side identities |
| Break-glass (×2) | CA misconfig (the live risk during cutover next week), accidental disable/lockout of admin@/sysadmin@, ACG compromise, GDAP relationship revoke |
**Why break-glass is still needed even though admin@/sysadmin@ are already cloud-only:** the differentiator is **CA exclusion** and **ACG-independence**, not Connect immunity. During CA cutover, a misconfigured policy can lock out admin@ + sysadmin@ + GDAP foreign principals simultaneously (especially under the policy 3 GDAP bug). Break-glass is the only path that survives all three.
The on-prem AD `Administrator` account is "its own thing" per Mike — managed via on-prem AD security practices, not part of this design. If/when HIPAA on-prem hardening lands (privileged access workstation pattern, LAPS, audit, MFA-via-smartcard, etc.), that's a separate Wave.
### 7. Audit retention build dependencies
Your CA pilot cutover does NOT block on audit retention being live. Pilot phase is on Report-only and only flipping a small set of policies for `SG-Caregivers-Pilot`. Audit retention is HIPAA-tier infra and primarily matters before **Phase 3 (production rollout to all caregivers)** and before any real ePHI events flow through the system.
**Order of operations from here:**
1. (You) Finish pilot Outlook + LinkRx/Helpany apps + first phone enrollment + compliance flip
2. (You) Create pilot user + cloud group `SG-Caregivers-Pilot`
3. (You) Stand up audit retention (runbook ref above) — can interleave with #1-2 while waiting on app sync delays
4. (You) Build break-glass admin (depends on audit retention LAW for sign-in alerts)
5. (You) Stage 4 new CA policies in Report-only, assigned to `SG-Caregivers-Pilot` only
6. (You) 24-48h Report-only review, fix gaps
7. (You) Flip CA to On
8. (You) Run pilot validation tests
9. (You) Phase 3 production rollout (after AD prereq cleanup + Entra Connect staging exit)
Audit retention slots in around #3 because it doesn't block anything and the LAW has to exist before #4. If Microsoft's Entra Diagnostic Settings endpoint fights you, fall back to manual Azure portal setup for now (we can codify the API later) and don't let it block CA progress.
---
## Configuration Decisions Made
- **CA bypass design:** APPROVED as written by Howard (no changes)
- **Audit retention architecture:** APPROVED hybrid LAW + Storage, ACG-billed
- **Audit retention subscription:** reuse `e507e953-2ce9-4887-ba96-9b654f7d3267` (GuruRMM Trusted Signing sub) with isolated `rg-audit-*` resource groups
- **Audit retention region:** `westus2`
- **Audit retention RBAC:** Mike = Owner (sub level), Howard = Contributor (per-RG scope on `rg-audit-*`)
- **Backfill sweep:** APPROVED, Howard runs at his convenience (6 tenants, idempotent, ~5 min)
- **Break-glass count:** TWO accounts (Microsoft official guidance — primary at Cascades, secondary at ACG)
- **YubiKey order:** $50 (TWO YubiKeys, split storage Cascades/ACG)
- **Break-glass naming:** `breakglass1-csc@cascadestucson.com` and `breakglass2-csc@cascadestucson.com`
- **Policy 3 user targeting:** revised to also exclude Service provider users (GDAP foreign principal exclusion — bug fix)
---
## Files Created / Modified
- **NEW:** `.claude/skills/remediation-tool/references/audit-retention-runbook.md` — full design + per-tenant runbook, including conceptual cURL for Entra Diagnostic Settings (Howard validates exact endpoint during dry-run)
- This session log
---
## Pending / Incomplete (handoff)
### Mike's outstanding items
- [ ] Order TWO YubiKeys ($50 total, Amazon — one to Cascades sealed envelope via Howard, one to ACG safe)
- [ ] **One-time:** Grant Howard Owner on the ACG subscription so he can self-serve all future MSP-side Azure work (audit retention RGs and beyond):
```bash
az role assignment create \
--assignee howard.enos@azcomputerguru.com \
--role "Owner" \
--scope "/subscriptions/e507e953-2ce9-4887-ba96-9b654f7d3267"
```
- [ ] **One-time guardrails to make Owner-Howard low-risk:**
- Resource lock on `gururmm-signing-rg` (CanNotDelete) so the signing infra can't be accidentally removed:
```bash
az lock create --name signing-protect --lock-type CanNotDelete \
--resource-group gururmm-signing-rg \
--notes "Protect GuruRMM Trusted Signing infra from accidental deletion"
```
- PAYG cost alert at ~$50/mo total via Cost Management (Azure portal, 5-min UI task)
- [x] Document in `clients/cascades-tucson/CONTEXT.md` that admin@ and sysadmin@ are intentionally Connect-excluded (cloud-only by design). Done — added "M365 admin model" section.
### Howard's items
- See "Order of operations" above. Audit retention is the new #3 — slots in around the pilot phone enrollment work without blocking it.
- **Backfill sweep against the 6 ACG tenants.** Run `bash .claude/skills/remediation-tool/scripts/onboard-tenant.sh <tenant-id>` against bg-builders, cascades-tucson, cw-concrete, dataforth, heieck-org, mvan to apply the new CA Admin role + Policy.Read.All backfill. Idempotent — safe to re-run. Cascades will be noisy (PIM-managed role triggers Conflict-fallback path); that's expected. Recommend running this BEFORE starting audit retention since (a) it warms up against the patched script, (b) it confirms the baseline directory roles + Graph perms are in place across all tenants, which is a prereq for any future audit-retention codification. ~5 minutes total. Drop a one-liner session log when done.
### Tracked TODOs (not blocking)
- [ ] Teach `role_assigned` helper about `roleAssignmentSchedules` (cosmetic noise only)
- [ ] Build OMA Activity API harvester (4-6 hours dev when ready, after Cascades audit retention runs cleanly for 30d)
- [ ] Codify `onboard-tenant.sh --enable-audit-archive` flag (after pilot validated)
- [ ] Codify `onboard-tenant.sh --enable-breakglass` flag (after 3-4 HIPAA tenants on pattern)
---
## Reference
- Runbook: `.claude/skills/remediation-tool/references/audit-retention-runbook.md`
- Howard's prior session log: `clients/cascades-tucson/session-logs/2026-04-29-howard-cascades-bypass-pilot-phase-b-buildout.md`
- ACG Azure subscription: `e507e953-2ce9-4887-ba96-9b654f7d3267` (vault: `services/azure-trusted-signing.sops.yaml`)
- ACG Trusted Signing RG (existing, separate from audit): `gururmm-signing-rg`
- Cascades tenant ID: `207fa277-e9d8-4eb7-ada1-1064d2221498`

View File

@@ -0,0 +1,186 @@
# 2026-04-29 — Cascades audit retention design + Pro-Tech Services email investigation
## User
- **User:** Mike Swanson (mike)
- **Machine:** GURU-BEAST-ROG
- **Role:** admin
- **Session span:** 2026-04-29 ~13:30 PT to ~14:50 PT (~1.5 hours, two work threads)
## Session Summary
The session began with the Cascades Tucson HIPAA work, starting with a `/sync` pull that included recent commits from Howard and Mike, along with a local radio-show commit pushed. Howard's note on the Cascades CA fix Path B execution and his `onboard-tenant.sh` patch were surfaced for review. The Conditional Access bypass design was reviewed and approved, with a critical bug identified in policy 3 that would block GDAP admins during the CA cutover. The bug was corrected by explicitly excluding service provider users from the policy. A concern about the "device bootstrap chicken-and-egg" was raised but later resolved upon clarification that bootstrap users are always admins with MFA on Cascades Wi-Fi.
The session then transitioned to the Audit Retention Design, where a hybrid architecture of Log Analytics Workspace and Storage Account with lifecycle policy was decided. The billing model was set to ACG-billed within the HIPAA-tier MRR, and the existing ACG Azure subscription `e507e953-2ce9-4887-ba96-9b654f7d3267` (the GuruRMM Trusted Signing sub) was reused with isolated `rg-audit-*` resource groups. A full runbook was written to document the design and implementation steps. Mike's question about how break-glass relates to existing admin@/sysadmin@/GDAP partner access prompted a reshape from one break-glass account to two, with FIDO2 YubiKeys split between Cascades and ACG. Mike clarified that admin@ and sysadmin@ are intentionally Connect-excluded (cloud-only by design); this was documented in `clients/cascades-tucson/CONTEXT.md` as a permanent reference.
For RBAC, Mike asked whether Howard could grant himself Contributor on the new audit RGs rather than Mike doing it. Decision: grant Howard Owner on the entire ACG subscription, matching the existing trust model in `CLAUDE.md`, with two guardrails — a `CanNotDelete` lock on `gururmm-signing-rg` and a $50/mo Cost Management alert. This unblocks all future MSP-side Azure self-service and removes Mike as a permanent bottleneck.
A separate thread emerged when Mike asked to check DNS for `pro-techhelps.com`. After pulling records, Mike provided email headers from a "test" message from Jenny Holtsclaw at `pro-techservices.co` (a different domain). DNS recon on both showed they are GoDaddy-resold M365 (`NETORG6702699.onmicrosoft.com` tenant) with no custom DKIM and no DMARC. Inky flagged the test message PCL=4 (high phishing suspicion) but did not quarantine. The diagnosis: DKIM signs only with the `.onmicrosoft.com` fallback selector (not aligned with the From: domain), and combined with absent DMARC, strict receivers (Google Workspace, healthcare, finance) silently drop calendar invites while regular email passes — explaining the "some recipients never receive invites" symptom Jenny had reported. Mike confirmed both domains belong to one company that only sends from `.co`, and decided to pitch Michelle Sora (the business contact) on migrating off GoDaddy onto direct M365 with both domains consolidated under one cleanly-managed tenant. Drafted an email for Michelle — first version too technical/peer-tone, revised to non-technical IT-provider framing once Mike clarified Michelle is "stubborn and afraid of change." Final draft written to a temp file and opened in Notepad for Mike's review.
## Key Decisions
- **Hybrid audit retention architecture** (LAW 90d interactive + Storage Account 6yr cold archive). Both fed by the same Diagnostic Settings export — single ingest, two retention tiers. Rationale: balance live forensics against compliance archive cost.
- **ACG-billed audit retention**, bundled into HIPAA-tier MRR. ~$0.501.00/mo per tenant. Avoids Cascades Azure subscription friction; supports "we handle compliance, you don't think about it" MSP positioning.
- **Reuse existing ACG Azure subscription** `e507e953-2ce9-4887-ba96-9b654f7d3267` (the GuruRMM Trusted Signing sub) rather than provisioning a new one. RG isolation between signing and audit. Future split to a dedicated compliance subscription deferred until 3+ HIPAA tenants or audit ask demands it.
- **Two break-glass accounts at Cascades** (`breakglass1-csc@` and `breakglass2-csc@`) per Microsoft official guidance. FIDO2 YubiKeys split-stored: one at Cascades sealed envelope, one in ACG safe. Cost $50 vs $25 single — trivial.
- **Howard granted Owner on the ACG subscription** rather than per-RG Contributor. Matches `CLAUDE.md` trust model; one-time grant unblocks all future self-service. Guardrails: `CanNotDelete` lock on `gururmm-signing-rg` + $50/mo Cost Management alert.
- **CA policy 3 user targeting revised** to exclude "Service provider users" (GDAP foreign principals) in addition to `SG-External-Signin-Allowed` and `SG-Break-Glass`. Without this, ACG GDAP partner admins lose remote MSP access at CA cutover.
- **admin@ and sysadmin@ at Cascades are intentionally Connect-excluded** (cloud-only by design). Local AD `Administrator` is a separate on-prem identity layer, not part of the M365 admin model. Documented in CONTEXT.md.
- **UAL handling deferred**: poll-based Office 365 Management Activity API harvester to write JSON blobs to the per-tenant Storage Account. Build after pilot CA cutover validated and Cascades audit retention runs cleanly for 30 days. Estimated 4-6 hours dev + test.
- **Backfill sweep moved to Howard's column.** Mike originally took ownership; reallocated when it became clear Howard can run it locally during his next session and surface results in his own log.
- **Email to Michelle framed as IT-provider follow-up, not technical pitch.** No DNS/DKIM/DMARC jargon. GoDaddy positioned as the external constraint that locks everyone (including Mike) out of fixing it properly. Reassurance front-and-center: same email addresses, same Outlook, nothing changes for the team.
## Problems Encountered
- **Policy 3 GDAP bug** (Cascades CA design). Howard's policy 3 ("All users" minus `SG-External-Signin-Allowed`) would block ACG GDAP partner admins because Microsoft's "All users" CA target includes service-provider foreign principals. Caught while walking through Mike's question on how break-glass relates to admin@/sysadmin@/GDAP. Resolved: explicit "Service provider users" exclusion added to the design.
- **Bootstrap chicken-and-egg false alarm.** Initially raised concern that new shared phones (non-compliant until first user sign-in) would be trapped by policy 2 (require MFA on non-compliant device). Mike pointed out the bootstrap user is always an admin (sysadmin@ or Howard) on Cascades Wi-Fi — admin has MFA, policy 1 doesn't fire on-network, policy 2 prompts MFA which admin satisfies, device flips compliant. No CA exception needed. Concern retracted in favor of operational rule: phones don't get handed to caregivers until SDM bootstrap is done.
- **Misread of "schedule with Howard."** When Mike said "schedule with Howard to get that going," I interpreted it as invoking the `/schedule` skill (remote agent on cron). Started the AskUserQuestion flow. Mike clarified he meant coordinate with Howard via session log when convenient for him, not create a remote agent. Backed out. Coordination is now via the existing session log + cross-user note hook.
- **First Michelle email draft was too technical.** Initial draft used peer-MSP tone with DKIM/DMARC explanation, ~450 words. Mike clarified he's already her IT provider and Michelle is "stubborn and afraid of change" — needed non-technical, reassurance-forward framing without DNS jargon. Revised to ~280 words with GoDaddy-as-constraint framing, "nothing changes for users" emphasis, and a soft close on scheduling a call.
## Configuration Changes
### Files created
- `.claude/skills/remediation-tool/references/audit-retention-runbook.md` — full design + per-tenant runbook (Phase 1 ACG-side resources, Phase 2 customer-tenant Diagnostic Settings, Phase 3 verification, Phase 4 deferred OMA harvester, operational + cost notes)
- `clients/cascades-tucson/session-logs/2026-04-29-mike-audit-retention-design-and-handoff.md` — full handoff log for Howard (covers all 6 decisions, runbook reference, order of operations, four-paths failure-domain map)
- `session-logs/2026-04-29-session.md` — this file
### Files modified
- `clients/cascades-tucson/CONTEXT.md` — added "M365 admin model" section (full admin landscape: on-prem AD Administrator, Connect-excluded admin@/sysadmin@, GDAP partner principals, two break-glass accounts; CA targeting consequences)
### Files NOT touched (orthogonal to this session, deliberately not committed in this log's commit)
- `projects/radio-show/audio-processor/server/main.py` — pre-existing modification, separate radio-show Q&A scoring work
- `projects/radio-show/audio-processor/classify_qa_quality.py` — pre-existing untracked file from radio-show work
- `.claude/scheduled_tasks.lock` — transient runtime file
### Temp files (not part of repo)
- `C:/Users/guru/AppData/Local/Temp/michelle-email-draft.txt` — Michelle Sora email draft (Mike opened in Notepad to review/send)
- `C:/Users/guru/AppData/Local/Temp/save_narrative_prompt.txt` — Ollama prompt for this log
- `C:/Users/guru/AppData/Local/Temp/save_narrative_output.md` — Ollama narrative output
## Commands & Outputs
### /sync at session start
- Pulled 3 commits (Howard cascades close-out, Mike earlier auto-sync, Mike PIM gap note)
- Pushed 1 local commit (radio-show `import_to_sqlite.py` from earlier work)
- Vault: clean both directions
- HEAD after sync: `6b63c15`
### DNS reconnaissance — pro-techhelps.com
```
MX: protechhelps-com0i.mail.protection.outlook.com (pri 0)
SPF: "v=spf1 include:spf.protection.outlook.com -all" [direct M365]
DMARC: NXDOMAIN
DKIM: selector1/selector2 NXDOMAIN
WHOIS: registered 2016-12-09, GoDaddy registrar, NS69/NS70.DOMAINCONTROL.COM
A: 3.33.130.190, 15.197.148.33 (AWS-hosted)
```
### DNS reconnaissance — pro-techservices.co
```
MX: protechservices-co0i.mail.protection.outlook.com (pri 0)
SPF: "v=spf1 include:secureserver.net -all" [GoDaddy-resold path]
"NETORG6702699.onmicrosoft.com" [marker]
"v=verifydomain MS=9047794" [M365 verification]
DMARC: NXDOMAIN
DKIM: selector1/selector2 NXDOMAIN
A: 76.223.105.230, 13.248.243.5 (AWS-hosted)
Tenant: NETORG6702699.onmicrosoft.com (GoDaddy-resold M365)
```
### Email header analysis (Jenny Holtsclaw test message)
- From: `Jenny Holtsclaw <jholtsclaw@pro-techservices.co>` to `mike@azcomputerguru.com`
- Subject: `test`, sent 2026-04-29 22:31:34 UTC
- DKIM: `pass header.d=NETORG6702699.onmicrosoft.com` (fallback signer, not aligned with From: domain)
- SPF: pass on first hop, fail on Inky relay (normal forwarding artifact)
- DMARC: bestguesspass (no policy)
- compauth: `pass reason=130`
- Inky verdict: `X-IPW-Inky-PCL: 4` (high phish suspicion), `X-IPW-Inky-SCL: 0` (not spam), `X-Inky-Quarantine: False`
## Credentials & Secrets
No new credentials created or rotated this session. Existing references confirmed:
- **ACG Azure subscription**: `e507e953-2ce9-4887-ba96-9b654f7d3267`
- Vault: `services/azure-trusted-signing.sops.yaml`
- Existing usage: `gururmm-signing-rg` for GuruRMM Trusted Signing cert profile
- New planned usage: `rg-audit-*` resource groups for HIPAA-tier audit retention
- **Cascades tenant ID**: `207fa277-e9d8-4eb7-ada1-1064d2221498`
- **Cascades Tenant Admin SP appId**: `709e6eed-0711-4875-9c44-2d3518c47063` (objectId in Cascades: `a5fa89a9-b735-4e10-b664-f042e265d137`)
- **Cascades Conditional Access Administrator role template ID**: `b1be1c3e-b65d-4f19-8427-f6fa0d97feb9`
- **Microsoft Graph appRole `Policy.Read.All`**: `246dd0d5-5bd0-4def-940b-0421030a5b68`
- **Pro-Tech Services tenant**: `NETORG6702699.onmicrosoft.com` (GoDaddy-resold; not under ACG management)
## Infrastructure & Servers
- ACG Azure subscription `e507e953-2ce9-4887-ba96-9b654f7d3267` — westus2 region, owner: Mike Swanson, currently single resource group `gururmm-signing-rg`. Planned expansion: `rg-audit-cascadestucson` and future `rg-audit-*` per HIPAA-tier client.
- Cascades on-prem: CS-SERVER `192.168.2.254` (DC + file server, domain `cascades.local`); pfSense `192.168.0.1`; Synology `192.168.0.120:5000`. Cascades named location id `061c6b06-b980-40de-bff9-6a50a4071f6f` (both WANs trusted: `72.211.21.217/32` + `184.191.143.62/32`).
- Pro-Tech Services external: M365 EOP MX endpoints (`protechservices-co0i.mail.protection.outlook.com`, `protechhelps-com0i.mail.protection.outlook.com`); Inky phishfence (`ipw.inkyphishfence.com` 100.24.129.5) in inbound mail path to ACG.
## Pending / Incomplete Tasks
### Mike's outstanding items (Cascades audit retention)
- [ ] Order TWO YubiKeys ($50 total — one for Cascades sealed envelope via Howard, one for ACG safe)
- [ ] One-time: grant Howard Owner on ACG subscription
```bash
az role assignment create \
--assignee howard.enos@azcomputerguru.com \
--role "Owner" \
--scope "/subscriptions/e507e953-2ce9-4887-ba96-9b654f7d3267"
```
- [ ] One-time: resource lock on signing RG
```bash
az lock create --name signing-protect --lock-type CanNotDelete \
--resource-group gururmm-signing-rg \
--notes "Protect GuruRMM Trusted Signing infra from accidental deletion"
```
- [ ] One-time: $50/mo cost alert via Cost Management UI
### Mike's outstanding items (Pro-Tech Services)
- [ ] Review the Michelle Sora email draft in Notepad (`C:/Users/guru/AppData/Local/Temp/michelle-email-draft.txt`)
- [ ] Send the email when ready; possibly draft a parallel Jenny variant if needed
- [ ] If Michelle accepts: schedule discovery call covering user count, tenant access verification (admin.microsoft.com test), other domains, third-party integrations
- [ ] If migration proceeds: build proposal, T2T migration plan, license cutover sequencing
### Howard's outstanding items (Cascades, in order)
1. Backfill sweep against 6 ACG tenants (~5 min, idempotent): bg-builders, cascades-tucson, cw-concrete, dataforth, heieck-org, mvan
2. Pilot Outlook + LinkRx/Helpany apps + first phone compliance flip (resume from prior session)
3. Pilot user + cloud group `SG-Caregivers-Pilot`
4. Audit retention buildout (RG, Storage Account with lifecycle, LAW, Diagnostic Settings) per the runbook
5. Two break-glass accounts + YubiKey enrollment + `SG-Break-Glass` group + sign-in alert KQL rule on LAW
6. CA policies in Report-only → 24-48h review → flip to On
7. Phase 3 production rollout (after AD prereq cleanup + Entra Connect staging exit)
### Tracked TODOs (not blocking)
- [ ] Teach `role_assigned` helper about `roleAssignmentSchedules` (cosmetic noise only on PIM-managed assignments)
- [ ] Build OMA Activity API harvester for UAL (4-6 hours dev when ready, after Cascades audit retention runs cleanly for 30d)
- [ ] Codify `onboard-tenant.sh --enable-audit-archive` flag (after pilot validated)
- [ ] Codify `onboard-tenant.sh --enable-breakglass` flag (after 3-4 HIPAA tenants on pattern)
## Reference Information
### Files / paths
- Audit retention runbook: `.claude/skills/remediation-tool/references/audit-retention-runbook.md`
- Cascades handoff log: `clients/cascades-tucson/session-logs/2026-04-29-mike-audit-retention-design-and-handoff.md`
- Cascades CONTEXT (M365 admin model added): `clients/cascades-tucson/CONTEXT.md`
- Howard's prior session log: `clients/cascades-tucson/session-logs/2026-04-29-howard-cascades-bypass-pilot-phase-b-buildout.md`
- onboard-tenant.sh (now patched with CA Admin role + Policy.Read.All backfill): `.claude/skills/remediation-tool/scripts/onboard-tenant.sh`
- Michelle email draft (temp): `C:/Users/guru/AppData/Local/Temp/michelle-email-draft.txt`
### Vault entries referenced
- `services/azure-trusted-signing.sops.yaml` — ACG Azure subscription
- `clients/cascades-tucson/m365-admin.sops.yaml` — Cascades M365 admin
- `clients/cascades-tucson/m365-sysadmin.sops.yaml` — Cascades M365 sysadmin
- `clients/cascades-tucson/wifi-cscnet.sops.yaml`, `mdm-service-account.sops.yaml`, `pfsense-firewall.sops.yaml`
### Future vault entries (not yet created)
- `clients/cascades-tucson/breakglass1.sops.yaml` — primary break-glass (when account created)
- `clients/cascades-tucson/breakglass2.sops.yaml` — secondary break-glass
### External
- Microsoft Diagnostic Settings on Entra: ARM endpoint `https://management.azure.com/providers/microsoft.aadiam/diagnosticSettings/{name}?api-version=2017-04-01-preview` (NOT a Graph endpoint — Howard validates exact call shape during dry-run)
- Microsoft directory role template — Conditional Access Administrator: `b1be1c3e-b65d-4f19-8427-f6fa0d97feb9`
- Office 365 Management Activity API (UAL harvester future work): `/api/v1.0/{tenantId}/activity/feed/subscriptions/content?contentType=...`
### Cost model (audit retention)
- Per HIPAA-tier tenant: ~$0.501.00/mo (LAW $0.23 ingest + $0.03 retention + Storage $0.15 lifecycle blended)
- ACG cumulative at 5 HIPAA tenants: ~$510/mo
- Forensics rehydration (archive blob): ~$50100 per incident retrieval, one-time per event