sync: auto-sync from HOWARD-HOME at 2026-06-01 20:16:54

Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-01 20:16:54
2026-06-01 20:17:01 -07:00
parent 507897c639
commit cd0d2a3c23

@@ -76,3 +76,54 @@ Howard requested the actual extraction be held until **6:00 PM MST** and that he
- Memory: `reference_trebesch_qnp3on5.md`, `project_trebesch_pst_consolidation.md`
- Syncro ticket #31953 (address book), #32160 (threats)
- RMM API: POST /api/agents/:id/command (context=user_session for COM), poll GET /api/commands/:id
---
## Update: 20:16 MST — PST contact extraction, merge, enrichment (executed)
### Session Summary
Executed the staged PST contact consolidation on DESKTOP-QNP3ON5 after Howard's 6:00 PM MST go-ahead. Ran the two-phase pipeline (extract via Outlook COM in Owner's session -> merge as SYSTEM) twice. The first run covered the "safe set" (MaxMB=10000, which skips big unmounted mail archives) and produced 771 unique contacts. The second run covered all PSTs (MaxMB=0), adding the two 48GB Outlook2 archives (793 and 725 contacts respectively; the giants DID hold address books), for 826 unique. Each 48GB AddStore took ~9s (Outlook reads the store index, not the whole file).
Howard then asked to parse structured data out of the free-text Notes into proper Outlook columns. Sampled 40 notes to check the patterns; overall, 185/293 notes contained phone numbers, 44 contained emails, and real street addresses appeared in a clean "street, City, ST ZIP" shape. Wrote treb-enhance.ps1 (NON-DESTRUCTIVE: copies data into empty structured fields, never deletes from Notes). First enhance pass placed 431 phones, 68 addresses, 52 emails across 218 rows.
Verification then caught a real data-quality defect: the source "E-mail Address" field (Email1Address) was junk, a single letter for 666 of 695 contacts, while the real emails were scattered across E-mail Display Name and the Notes. As-is, the file would have imported garbage emails, and the original merge had keyed dedup on that junk field. Rewrote treb-merge.ps1 to reconstruct real (@-bearing) emails from all email fields and key dedup on real-email-or-name (never the junk letter). Hit a PowerShell single-element-array collapse bug ($re[0] returned a [char]); fixed with @() wrapping + [string] casts.
Re-ran the corrected pipeline: 6118 raw -> 821 unique. Investigated high "Copies merged" counts (max 28) and confirmed they are LEGITIMATE heavy duplication of single real people (e.g., Tim Gleason / tgleason@SWAPA.org x28), not over-merge: every cluster has a consistent name+email. Final verification PASSED: 821 rows round-trip, 48 columns, 689 emails all valid (@), 0 junk; 221 contacts with phones (up from 7), 69 with addresses, 293 notes preserved intact. CSV is RFC-valid (round-trips despite commas/newlines in notes) and import-ready.
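
Illustrative sketch of the round-trip check described above. The real verification was run in PowerShell and is not shown in this log; this is a Python stand-in under stated assumptions. The function name `round_trips` and the sample row are hypothetical.

```python
import csv
import io


def round_trips(rows, fieldnames):
    """Write rows to CSV text, parse them back, and compare field by field."""
    # newline="" keeps embedded \n / \r\n inside quoted fields untouched.
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=fieldnames, quoting=csv.QUOTE_MINIMAL)
    writer.writeheader()
    writer.writerows(rows)
    buf.seek(0)
    read_back = list(csv.DictReader(buf))
    return read_back == rows


# Hypothetical row with the troublemakers: a comma, quotes, and a newline in Notes.
sample = [{"First Name": "Pat", "Notes": 'Cell 555-0100, ask for "P"\nsecond line'}]
ok = round_trips(sample, ["First Name", "Notes"])
```

A row count match alone would not catch a note that was split across rows by an unquoted newline, which is why the comparison is per field.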
### Key Decisions
- Ran the safe set first, then all PSTs on Howard's "go, all PSTs"; the giants turned out to hold real address books (worth scanning).
- Notes handling is copy-not-move (non-destructive): structured fields filled FROM notes, notes left verbatim. Protects against parse errors; nothing lost.
- Address parsing restricted to high-confidence "street, City, ST(valid US) ZIP5" pattern; ambiguous "City, ST" mentions left in Notes.
- Bare 7-digit phones captured as-is (no invented area code).
- Email reconstruction: real email = any @-bearing value across the Address/DisplayName/2/3 email fields; dedup keys on real-email-or-name, never the junk single-letter Email1Address.
- High copies-merged counts (up to 28) accepted as legitimate after confirming consistent identity per cluster; the source has heavy internal + cross-PST duplication.
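
Illustrative sketch of the Notes-parsing decisions above (copy-not-move, high-confidence address shape, bare 7-digit phones kept as written). This is NOT treb-enhance.ps1, which is PowerShell and not shown here. The regexes, the `enrich` function, the partial state list, and the choice of target columns ("Business Phone", "Business Street", ...) are all assumptions for illustration.

```python
import re

# Assumption: partial list for illustration; a real check would carry every US state code.
VALID_STATES = {"AZ", "CA", "NM", "NV", "TX", "UT"}

# "street, City, ST ZIP5": the street must start with a house number.
ADDRESS_RE = re.compile(
    r"(?P<street>\d+\s+[^,\n]+),\s*(?P<city>[A-Za-z .'-]+),\s*"
    r"(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})\b"
)
# (602) 555-0100, 602-555-0100, or a bare 7-digit 555-0100.
PHONE_RE = re.compile(r"(?<!\d)(?:\(?\d{3}\)?[\s.-]?)?\d{3}[\s.-]\d{4}(?!\d)")


def enrich(contact):
    """Return a copy with EMPTY structured fields filled from Notes; Notes stays verbatim."""
    out = dict(contact)
    notes = out.get("Notes") or ""
    if not out.get("Business Phone"):
        m = PHONE_RE.search(notes)
        if m:
            out["Business Phone"] = m.group(0)  # as written; no area code invented
    if not out.get("Business Street"):
        m = ADDRESS_RE.search(notes)
        if m and m.group("state") in VALID_STATES:
            out["Business Street"] = m.group("street").strip()
            out["Business City"] = m.group("city").strip()
            out["Business State"] = m.group("state")
            out["Business Postal Code"] = m.group("zip")
    return out
```

An ambiguous mention such as "lives in Mesa, AZ" has no house number or ZIP, so it fails the pattern and stays only in Notes. A field that already holds a value is never overwritten.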
### Problems Encountered
- Source "E-mail Address" field was junk single letters (666/695); real emails in DisplayName/Notes. Original merge mis-keyed on it. Fixed by reconstructing real emails and re-keying. Verified 0 junk emails post-fix.
- PowerShell single-element-array collapse: RealEmails() returned a scalar string when there was only one email, so $re[0] indexed into the string and yielded a [char], which has no .ToLower() method. Fixed with @() wrapping and [string] casts.
- Initial 7-contacts-with-phones looked like a bug; a raw-JSON check confirmed it was real: phones live in the Notes free text, not structured fields. This drove the enrichment pass.
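
Illustrative sketch of the corrected dedup key (real email if any, otherwise name). This is NOT treb-merge.ps1, which is PowerShell and not shown here. The field names, the email regex, the lowercasing, and both function names are assumptions. Python lists do not collapse to a scalar, so the @() fix has no counterpart in this sketch.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Assumption: Outlook-style CSV headers; the actual script may use different names.
EMAIL_FIELDS = ["E-mail Address", "E-mail Display Name",
                "E-mail 2 Address", "E-mail 3 Address"]


def real_emails(contact):
    """Collect @-bearing addresses from every email field; always returns a list."""
    found = []
    for field in EMAIL_FIELDS:
        for match in EMAIL_RE.findall(contact.get(field) or ""):
            email = match.lower()
            if email not in found:
                found.append(email)
    return found


def dedup_key(contact):
    """Key on the first real email; fall back to the normalized name."""
    emails = real_emails(contact)
    if emails:
        return "email:" + emails[0]
    first = contact.get("First Name") or ""
    last = contact.get("Last Name") or ""
    return "name:" + " ".join(f"{first} {last}".split()).lower()


# The junk single letter in "E-mail Address" has no @, so it is never a key.
junk = {"First Name": "Tim", "Last Name": "Gleason",
        "E-mail Address": "t",
        "E-mail Display Name": "Tim Gleason (tgleason@SWAPA.org)"}
key = dedup_key(junk)
```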
### Configuration Changes (this update)
- Created C:\claudetools\.claude\tmp\treb-enhance.ps1 (parse phones/addresses/emails from Notes into columns, non-destructive).
- Rewrote C:\claudetools\.claude\tmp\treb-merge.ps1 (real-email reconstruction + corrected dedup key + array-collapse fix).
- Edited treb-extract.ps1: added $MaxMB cap (skip big UNMOUNTED PSTs; mounted always read) and Suggested-Contacts/cache folder exclusion + Source-folder tracking.
### Results / Deliverable
- FINAL: C:\Users\Owner\Desktop\Contacts\AT-Trebesch-Contacts-FINAL-20260601-201451.csv (385 KB, 821 contacts, Outlook native headers + audit cols Source PSTs/Source folders/Copies merged).
- 16 unique PSTs scanned (24 total, byte-identical copies deduped). Contact-bearing: addresses(793), earthlink Default(794, live/mounted), earthlink165(793), Outlook Data File(740), Outlook2 x2(793/725), Outlook1 x2(374), Outlook x2(366). archive1/backup/A_T 2/MORE AT had none.
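
Illustrative sketch of one way to detect byte-identical copies. The log does not say how the 24 -> 16 reduction was done, so the SHA-256 approach and the function names here are assumptions. Files are grouped by size first, so a large file with a unique size is never hashed.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so large files do not load into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def unique_files(paths):
    """Return one path per distinct file content, keeping the first seen of each."""
    by_size = {}
    for p in paths:
        by_size.setdefault(Path(p).stat().st_size, []).append(p)
    unique = []
    for same_size in by_size.values():
        if len(same_size) == 1:
            unique.append(same_size[0])  # unique size: cannot be a copy
            continue
        seen = set()
        for p in same_size:
            digest = file_digest(p)
            if digest not in seen:
                seen.add(digest)
                unique.append(p)
    return unique
```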
### Pending / Incomplete (this update)
- OFFERED, awaiting Howard: (1) name-cleanup pass: for contacts whose First/Last is an email handle (e.g. "badgerbd"), promote the real name from E-mail Display Name (heuristic, conservative, non-destructive); (2) clean up intermediate CSVs in the Desktop\Contacts folder, leaving only the FINAL file; (3) log time / resolution note on Syncro #31953.
- _work\ JSONs + extract.log/enhance log remain in the Desktop folder.
### Reference (this update)
- Scripts: .claude/tmp/treb-extract.ps1, treb-merge.ps1, treb-enhance.ps1
- Agent: ba173f0c-19e8-488d-834c-1b6f6dfd5699 (DESKTOP-QNP3ON5)
- Syncro #31953 (address book), customer 238740