Session Log: 2026-04-21
User
- User: Mike Swanson (mike)
- Machine: DESKTOP-0O8A1RL
- Role: admin
Session Summary
This session completed the M365 multi-tenant onboarding initiative. The goal was to onboard all 41 CIPP-managed partner tenants to the ComputerGuru app suite (Security Investigator, Exchange Operator, User Manager, Tenant Admin, Defender Add-on) with minimal customer interaction — customers click one URL (Tenant Admin consent), then the onboard-tenant.sh script handles all remaining programmatic consent and role assignments automatically.
Accomplishments
- Tenant Admin manifest fix (from previous session): Added AppRoleAssignment.ReadWrite.All (GUID: 06b708a9-e830-4db3-a914-8e69da51d44f) to the Tenant Admin app. This was required for the script to programmatically grant appRoleAssignments to other SPs in customer tenants. Fixed via Management app PATCH.
- Re-onboarded martylryan.com and grabblaw.com: These two were consented before the manifest fix. Both needed Tenant Admin re-consent (done by Mike), then a script re-run. Both are now fully onboarded with all apps and directory roles.
- martylryan.com: All 4 apps + Exchange Admin + User Admin + Auth Admin assigned
- grabblaw.com: 3 apps (no MDE) + Exchange Admin + User Admin + Auth Admin assigned; Defender skipped (no MDE license)
- Cascades Tucson GoDaddy admin account (from previous session):
  - Found disabled account admin@NETORGFT4257522.onmicrosoft.com
  - Renamed UPN to admin@cascadestucson.com (domain was verified default)
  - Enabled account, reset password to
Gptf*ttb123!@#-cs - Vaulted at
D:/vault/clients/cascades-tucson/m365-admin.sops.yaml
- Batch tenant sweep: Ran onboard-tenant.sh against all 40 pending tenants. 17 were already fully consented and onboarded successfully. 23 still need initial Tenant Admin consent.
- tenant-consent.html: Updated to show only remaining pending tenants. 19 tenants now marked done (including martylryan + grabblaw post re-consent). 22 still pending.
Files Modified This Session
| File | Change |
|---|---|
| .claude/skills/remediation-tool/scripts/onboard-tenant.sh | Major rewrite: programmatic consent for all 4 non-admin apps after Tenant Admin consent |
| .claude/skills/remediation-tool/references/tenants.md | NEW: full 41-tenant list with display names, domains, tenant IDs, onboarding status, consent URLs |
| .claude/skills/remediation-tool/references/tenant-consent.html | NEW + updated: dark-theme HTML page with clickable consent links; 19 tenants marked done |
| .claude/skills/remediation-tool/references/gotchas.md | Updated: Grabblaw and martylryan marked fully onboarded with dates |
| D:/vault/clients/cascades-tucson/m365-admin.sops.yaml | NEW: SOPS-encrypted admin credentials for Cascades Tucson |
Credentials
Cascades Tucson M365 Admin
- Username: admin@cascadestucson.com
- Password: Gptf*ttb123!@#-cs
- Vault: D:/vault/clients/cascades-tucson/m365-admin.sops.yaml
- Notes: Renamed from admin@NETORGFT4257522.onmicrosoft.com (original GoDaddy-provisioned account)
onboard-tenant.sh Architecture
Flow
- Resolve domain → tenant GUID (openid-configuration)
- Acquire Tenant Admin token (client_credentials) to verify consent
- Locate resource SPs in tenant: Microsoft Graph, Exchange Online, Defender ATP
- For each app (Security Investigator, Exchange Operator, User Manager, Defender Add-on):
  - Create SP if missing (POST /servicePrincipals); sleep 5 after creation for replication
  - Grant all appRoleAssignments idempotently
- Assign directory roles (Exchange Admin to Sec Inv SP; User Admin + Auth Admin to User Mgr SP)
- Print status table
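The first step of the flow (domain to tenant GUID) and the idempotent grant step can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of onboard-tenant.sh: the helper names extract_tenant_id, resolve_tenant, and missing_roles are hypothetical, and the sketch assumes the tenant GUID is read from the issuer field of the openid-configuration document.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read an openid-configuration JSON document on stdin
# and print the tenant GUID embedded in its "issuer" URL.
extract_tenant_id() {
  grep -oE '"issuer"[[:space:]]*:[[:space:]]*"[^"]+"' \
    | grep -oE '[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}' \
    | head -n 1
}

# Hypothetical wrapper: fetch the document for a domain (needs network access).
resolve_tenant() {
  curl -fsS "https://login.microsoftonline.com/$1/v2.0/.well-known/openid-configuration" \
    | extract_tenant_id
}

# Hypothetical helper: print the role IDs in $1 that are absent from $2.
# Both arguments are space-separated lists. Only the missing IDs would need
# a POST, which is what makes a re-run safe.
missing_roles() {
  local wanted="$1" existing="$2" id
  for id in $wanted; do
    case " $existing " in
      *" $id "*) ;;          # already granted, skip
      *) echo "$id" ;;       # not granted yet
    esac
  done
}
```

Computing the difference first, rather than posting every assignment and ignoring conflicts, keeps a second run against an already-onboarded tenant free of write calls.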
Key GUIDs
Permission resource app IDs:
- Microsoft Graph: 00000003-0000-0000-c000-000000000000
- Exchange Online: 00000002-0000-0ff1-ce00-000000000000
- Defender ATP: fc780465-2017-40d4-a0c5-307022471b92
App IDs:
- Security Investigator: bfbc12a4-f0dd-4e12-b06d-997e7271e10c
- Exchange Operator: b43e7342-5b4b-492f-890f-bb5a4f7f40e9
- User Manager: 64fac46b-8b44-41ad-93ee-7da03927576c
- Tenant Admin: 709e6eed-0711-4875-9c44-2d3518c47063
- Defender Add-on: dbf8ad1a-54f4-4bb8-8a9e-ea5b9634635b
Tenant Admin manifest permissions required:
- AppRoleAssignment.ReadWrite.All: 06b708a9-e830-4db3-a914-8e69da51d44f
- Application.ReadWrite.All: 1bfefb4e-e0b5-418b-a88f-73c46d2cc8e9
- Directory.ReadWrite.All: 19dbc75e-c2e2-444c-a770-ec69d8559fc7
Bugs Fixed During Development
- stdout/stderr pollution in create_sp_if_missing: Human-readable status lines were going to stdout, corrupting sp_oid=$(create_sp_if_missing ...). Fix: all status echoes redirected to stderr (>&2).
- Graph replication delay: Newly created SPs need ~5s before appRoleAssignments can be granted. Fix: sleep 5 after successful SP creation.
- jq null iterator: [.value[] | select(...)] threw on fresh SPs with null appRoleAssignments. Fix: [.value[]? | select(...)].
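The stdout/stderr fix can be illustrated with a small stand-in. The function below is a stub, not the real create_sp_if_missing (which calls Graph); the name create_sp_stub and the fake ID format are assumptions made only for the demonstration.

```shell
#!/usr/bin/env bash

# Stand-in for create_sp_if_missing: it only demonstrates the output discipline.
create_sp_stub() {
  local app_id="$1"
  echo "[INFO] creating service principal for ${app_id}" >&2  # status goes to stderr
  echo "oid-for-${app_id}"                                    # result goes to stdout
}

# Command substitution captures stdout only, so sp_oid holds just the ID.
sp_oid=$(create_sp_stub "demo-app")
echo "captured: ${sp_oid}"
```

If the status line were written to stdout instead, sp_oid would contain both lines and any later request built from it would be malformed.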
Onboarding Status (as of 2026-04-21)
Done (19 tenants)
andysmobilefuel.com, tedards.net, cascadestucson.com, cclac.net, cobaltfinearts.com, dataforth.com, glaztech.com, heieck.org, jemaenterprises.com, mvan.onmicrosoft.com, bestmassageintucson.com, rednourlaw.com, reliantpump.services, ridgetopgroup.com, safesitellc.com, sonorangreenllc.com, valleywideplastering.com, martylryan.com, grabblaw.com
Pending — Needs Tenant Admin Consent (22 tenants)
Brian Kahn (briankahn.onmicrosoft.com), cuadro.design, Curtis Plumbing (cparizona.onmicrosoft.com), cwconcretellc.com, Feline Ltd (felineltd.onmicrosoft.com), ICE INC (iceinc.us.com), Instrumental Music (instrumentalmusic.onmicrosoft.com), JR Kennedy (jrkco.com), Khalsa Montessori (khalsamontessorischools.onmicrosoft.com), Kittle Design (kittlearizona.com), LeeAnn Parkinson (lamaddux.com), Patient Care Advocates (pcatucson.com), Putt Land Surveying (puttsurveying.com), Rincon Vista Vet (rinconvistavet.onmicrosoft.com), Russo Law (rrs-law.com), SANDTEKO (SANDTEKOMACHINERY.com), Shave Kevin (az2son.com), Starr Pass Realty (starrpass.com), The Dumpster Guys (dumpsterguys.onmicrosoft.com), The Prairie Schooner (theprairieschooner.onmicrosoft.com), Tucson Golden Corral (tucsongoldencorral.onmicrosoft.com), Tucson Mountain Motors (tucsonmountainmotors.com), Von's Carstar (vonscarstar.com)
Not in CIPP (needs investigation)
- Len's Auto Brokerage (tenant: 5ba99b55-...) — Mike accidentally opened Brian Kahn consent URL logged in as admin@lensautobrokerage.onmicrosoft.com; Len's may not be in CIPP partner list
Pending / Next Steps
- 22 tenants need initial Tenant Admin consent: use tenant-consent.html to send links or open directly; after each consent, run onboard-tenant.sh <domain>
- Len's Auto Brokerage: check if in CIPP, add if not, then onboard
- Brian Kahn: needs Brian Kahn's own Global Admin to click the consent URL (not admin@lensautobrokerage.onmicrosoft.com)
- tenant-consent.html UUID tenants: three entries show GUIDs instead of domains (f5f86b40, dfee2224; cparizona, felineltd, etc. use onmicrosoft.com domains); verify that display names in tenants.md match
Reference
- Consent HTML: D:/claudetools/.claude/skills/remediation-tool/references/tenant-consent.html
- Tenant list: D:/claudetools/.claude/skills/remediation-tool/references/tenants.md
- Onboarding script: D:/claudetools/.claude/skills/remediation-tool/scripts/onboard-tenant.sh
- Gotchas: D:/claudetools/.claude/skills/remediation-tool/references/gotchas.md
- Cascades vault: D:/vault/clients/cascades-tucson/m365-admin.sops.yaml
Update: 07:26 — Cloudflare Tunnel Decommission + pfSense Audit
Summary
Decommissioned the Cloudflare tunnel (cloudflared Docker container on Jupiter), migrated all 9 tunneled services to direct Cloudflare proxy, and conducted a comprehensive pfSense audit removing ~40 stale config objects (NAT rules, filter rules, outbound NAT, IPsec, and aliases).
Background: Why the Tunnel Was Created
A Cox routing issue caused Cloudflare-proxied services to route inefficiently (Cox → Cloudflare PoP → back to Cox WAN). The cloudflared tunnel was created as a workaround — it establishes an outbound connection from Jupiter to Cloudflare PoPs, so all proxied traffic flows through the tunnel rather than requiring port forwards.
Cloudflared Container — DNS Fix
Problem: cloudflared container had no DNS servers configured ([]), causing it to use Docker's default resolver which couldn't reach region1.v2.argotunnel.com. This produced a Failed to refresh DNS local resolver timeout every 5 minutes, causing intermittent slowness.
Fix: Recreated container with explicit DNS:
--dns=1.1.1.1 --dns=1.0.0.1
Container startup confirmed clean after DNS fix.
Tunnel ID: 78d3e58f-1979-4f0e-a28b-98d6b3c3d867
Config location on Jupiter: /mnt/cache/appdata/cloudflared/config.yml
Cloudflare DNS Migration
Key discovery: pfSense has NO NAT rule for port 443 on primary Cox WAN IP (98.181.90.163). All port 443 rules are bound to specific 72.194.62.x IPs. Direct proxy to 98.181.90.163 gave 522 errors because of this.
Solution: Use 72.194.62.10 (which has an existing 443 → NPM:18443 NAT rule) as the target for NPM-backed services.
Services migrated from tunnel CNAME → direct Cloudflare proxy A records:
| Hostname | Old Target | New Target | Backend |
|---|---|---|---|
| git.azcomputerguru.com | tunnel CNAME | 72.194.62.10 | NPM → Jupiter:18443 |
| rmm.azcomputerguru.com | tunnel CNAME | 72.194.62.10 | NPM → Jupiter:18443 |
| rmm-api.azcomputerguru.com | tunnel CNAME | 72.194.62.10 | NPM → Jupiter:18443 |
| plexrequest.azcomputerguru.com | tunnel CNAME | 72.194.62.10 | NPM → Jupiter:18443 |
| sync.azcomputerguru.com | tunnel CNAME | 72.194.62.10 | NPM → Jupiter:18443 |
| azcomputerguru.com | tunnel CNAME | 72.194.62.5 | IX Web Hosting:443 |
| analytics.azcomputerguru.com | tunnel CNAME | 72.194.62.5 | IX Web Hosting:443 |
| community.azcomputerguru.com | tunnel CNAME | 72.194.62.5 | IX Web Hosting:443 |
| radio.azcomputerguru.com | tunnel CNAME | 72.194.62.5 | IX Web Hosting:443 |
All 9 services tested and confirmed working. Container then stopped and removed.
Public IP layout (relevant):
- 72.194.62.5 → IX Web Hosting server (172.16.3.10) via NAT
- 72.194.62.10 → NPM on Jupiter (172.16.3.20:18443) via NAT
- 98.181.90.163/31: Primary Cox WAN, NO port 443 NAT rule
pfSense SSH Access Fix
pfSense SSH was failing non-interactively with "Too many authentication failures" (SSH client tried multiple keys, hit MaxAuthTries before reaching id_ed25519).
Fix: Added id_ed25519 public key to pfSense admin user via web GUI (port 4433). Had to include webguicss=pfSense.css and dashboardcolumns=2 fields in the form POST to avoid theme validation errors.
SSH command: ssh -o StrictHostKeyChecking=no -i C:/Users/guru/.ssh/id_ed25519 -p 2248 admin@172.16.0.1
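A client-side alternative, noted here as an assumption rather than something done in this session: OpenSSH's IdentitiesOnly=yes option tells the client to offer only the key named with -i, which may avoid exhausting MaxAuthTries with other keys. The helper name pfsense_ssh_cmd is hypothetical; it only prints the command line.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print an ssh command that offers only the given key.
pfsense_ssh_cmd() {
  local key="$1" port="$2" target="$3"
  printf 'ssh -o IdentitiesOnly=yes -i %s -p %s %s\n' "$key" "$port" "$target"
}

# Key path, port, and target are the ones recorded above.
pfsense_ssh_cmd "C:/Users/guru/.ssh/id_ed25519" 2248 "admin@172.16.0.1"
```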
Vault updated: D:/vault/infrastructure/pfsense-firewall.sops.yaml — added web_port, ssh_key, ssh_cmd fields.
pfSense Audit — Rules Removed
All removals were done by uploading PHP scripts via SCP, executing on pfSense, then reloading filter with pfSsh.php playback svc restart filter.
Config backup pattern: /cf/conf/config.xml.bak-<description>-<timestamp>
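A backup name following this pattern could be generated as below. This is a sketch only: the helper name backup_path is hypothetical, and the timestamp format (YYYYmmdd-HHMMSS) is an assumption, since the log does not record the format actually used.

```shell
#!/usr/bin/env bash

# Hypothetical helper: build a backup path following the pattern above.
# The timestamp format is an assumption.
backup_path() {
  local description="$1"
  printf '/cf/conf/config.xml.bak-%s-%s\n' "$description" "$(date +%Y%m%d-%H%M%S)"
}

# Example use before running a removal script:
# cp /cf/conf/config.xml "$(backup_path round1-tsm)"
backup_path round1-tsm
```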
Round 1 — TSM Network (dead server):
- NAT: TSM Network HTTP forward (72.194.62.x → TSM)
- NAT: TSM Network HTTPS forward
- NAT: LDAP to DC16
- FILTER: Associated pass rules
Round 2 — Neptune, IPsec, Gitea SSH, orphans:
- NAT: Neptune Exchange HTTP/HTTPS forwards
- NAT: 172.16.3.25 wildcard forward
- NAT: 172.16.3.25 HTTP/HTTPS forwards
- NAT: Gitea SSH forward (72.194.62.x:22 → Jupiter) — superseded by Cloudflare proxy
- FILTER: All associated pass rules
- FILTER: Orphaned LDAP filter rule
- FILTER: Neptune pass rules
- IPSEC: Phase 1 + Phase 2 for 184.182.208.116 (Mike's house — no longer needed)
Round 3 — Seafile:
- NAT: 72.194.62.9 Seafile/Sync forward — Seafile desktop client uses sync.azcomputerguru.com (now via NPM on .10), not a dedicated IP; .9 rule was orphaned
- FILTER: Associated pass rule
Round 4 — Neptune outbound NAT:
- OUTBOUND NAT: NEPTUNE_Internal → 72.194.62.7 masquerade rule
Round 5 — Neptune Exchange filter (missed in Round 2):
- FILTER: Rule with destination NEPTUNE_Internal:Exchange_Ports (was a filter rule, not NAT — earlier script only checked NAT)
Total rules removed: ~22 NAT/filter/IPsec rules
pfSense Audit — Aliases Removed (22)
All_Ports, EX1_Internal, Emby_Ports, Exchange_Ports, Exchange_VIP,
MailProtector_LDAP, NEPTUNE_Internal, Nextcloud_Local, NPM_Ports,
OwnCloud_Ports, RNAT_Webhost, RustDesk_Server, RustDesk_Server_Internal,
SpamIssue, Syslog, UNMS, Unifi_SSL, Unraid_Jupiter, Unraid_Sync,
VIP_NO_AUTODISCOVER, VPN_Ports, Webhost_Internal
Remaining aliases (all active/valid):
Cloudflare, FiberGW, HTTP_HTTPS, ICE_Users, NPM_Server, Unifi_Server, Unifi_TCP, Unifi_UDP, Webhost_TCP, Webhost_UDP, Tailscale, TFTP Server, WireGuard
pfSense Items Investigated — Left Alone
| Item | Decision |
|---|---|
| Golden Corral (72.194.62.6 → 172.16.1.6, HTTP_HTTPS) | Leave as-is — live client, working, no RDP exposed (80/443 only) |
| 72.194.62.7 VIP ("MAIL/NEPTUNE") | Unused IP — no rules reference it; could remove VIP or reassign |
| Cloudflare alias | Unused; could apply to restrict WAN access to CF IPs only |
| Broad pass tcp/udp any→any WAN rule | Noted, not yet addressed |
| 72.194.62.4 → NPM:18443 ("Emby on Fiber") | Verified pointing to NPM, labeled correctly |
| OwnCloud VM (172.16.3.22) | NAT rule still valid — cloud.acghosting.com lives there |
Infrastructure Reference
| Asset | Detail |
|---|---|
| pfSense | 172.16.0.1, SSH port 2248, HTTPS port 4433, admin user |
| pfSense config | /cf/conf/config.xml |
| Jupiter (Unraid) | 172.16.3.20 |
| NPM (Nginx Proxy Manager) | Jupiter:18443 (HTTPS), Jupiter:1880 (HTTP) |
| cloudflared | Stopped/removed — tunnel decommissioned |
| Primary Cox WAN | 98.181.90.163/31 — no port 443 NAT |
| Additional public IPs | 72.194.62.2–10, 70.175.28.51–57 |
Pending / Next Steps (Infrastructure)
- 72.194.62.7 VIP — decide: remove (Neptune gone) or repurpose
- Cloudflare alias — consider applying to WAN rules to restrict to CF IPs only (security hardening)
- Broad WAN pass rule — review and tighten if possible
- 22 M365 tenants — still need initial Tenant Admin consent (unchanged from earlier session)
Note for Howard
Vault + SOPS age key setup required on ACG-Tech03L before remediation-tool will work.
1. Clone the vault repo
Run in Git Bash (real terminal, not Claude Code shell):
git clone http://azcomputerguru@172.16.3.20:3000/azcomputerguru/vault.git D:/vault
Password: Gptf*77ttb123!@#-git
2. Install the SOPS age key
Create this file: C:\Users\howard\.config\sops\age\keys.txt
Content (copy exactly):
# created: 2026-03-30T13:53:19-07:00
# public key: age1qz7ct84m50u06h97artqddkj3c8se2yu4nxu59clq8rhj945jc0s5excpr
AGE-SECRET-KEY-1DE3V6V0ZLLZ45A7GA77M79CTN4LZQMTRCURP8VRGNLV6T2FSZEEQXUW2EU
3. Add vault_path to identity.json
Edit .claude/identity.json in your ClaudeTools folder, add:
"vault_path": "D:/vault"
4. Test
bash C:/claudetools/.claude/skills/remediation-tool/scripts/get-token.sh grabblaw.com investigator
Expected: JWT token starting with eyJ...
Note for Mike (Mac)
Vault + SOPS age key setup required on Mikes-MacBook-Air before remediation-tool will work.
1. Clone the vault repo
Run in a real terminal (not Claude Code shell):
git clone http://azcomputerguru@172.16.3.20:3000/azcomputerguru/vault.git ~/vault
Password: Gptf*77ttb123!@#-git
2. Install the SOPS age key
mkdir -p ~/.config/sops/age
cat > ~/.config/sops/age/keys.txt << 'AGEEOF'
# created: 2026-03-30T13:53:19-07:00
# public key: age1qz7ct84m50u06h97artqddkj3c8se2yu4nxu59clq8rhj945jc0s5excpr
AGE-SECRET-KEY-1DE3V6V0ZLLZ45A7GA77M79CTN4LZQMTRCURP8VRGNLV6T2FSZEEQXUW2EU
AGEEOF
chmod 600 ~/.config/sops/age/keys.txt
3. Add vault_path to identity.json
Edit /Users/azcomputerguru/ClaudeTools/.claude/identity.json, add:
"vault_path": "/Users/azcomputerguru/vault"
4. Test
bash ~/ClaudeTools/.claude/skills/remediation-tool/scripts/get-token.sh grabblaw.com investigator
Expected: JWT token starting with eyJ...
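The test step in both notes can be checked mechanically with a small shape test. This is a rough sketch: looks_like_jwt is a hypothetical helper, and it only checks the eyJ prefix and the presence of at least two dots, not the token's validity.

```shell
#!/usr/bin/env bash

# Hypothetical check: succeed if the argument starts with eyJ and
# contains at least two dots, as a JWT does.
looks_like_jwt() {
  case "$1" in
    eyJ*.*.*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Usage (get-token.sh path as given in the Mac test step above):
# token=$(bash ~/ClaudeTools/.claude/skills/remediation-tool/scripts/get-token.sh grabblaw.com investigator)
# looks_like_jwt "$token" && echo "token OK" || echo "unexpected output"
```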