diff --git a/clients/valleywide/session-logs/2026-05-16-source-code-recovery-from-backup-drives.md b/clients/valleywide/session-logs/2026-05-16-source-code-recovery-from-backup-drives.md
new file mode 100644
index 0000000..7c6db0c
--- /dev/null
+++ b/clients/valleywide/session-logs/2026-05-16-source-code-recovery-from-backup-drives.md
@@ -0,0 +1,117 @@
+# 2026-05-16 — VWP Source Code Recovery from Backup Drives
+
+## User
+- **User:** Mike Swanson (mike)
+- **Machine:** GURU-BEAST-ROG
+- **Role:** admin
+- **Client:** Valley Wide Plastering (VWP)
+- **Project:** `clients/valleywide/app-modernization/`
+
+---
+
+## Session Summary
+
+The session focused on recovering VB6 source code for the VWP Orders application from a set of backup rotation drives Mike had collected from VWP's previous IT. The prior session (2026-04-27) had identified only one `frmPayroll.frm` file on the live AD server — this session set out to find the full source set, and to answer the open "4-year gap" question (production `Orders_10A.exe` was being maintained in 2024 but the prior session found no source newer than 2020).
+
+Built a Python WizTree CSV analyzer (`analyze_wiztree.py`) that streams large CSV exports and categorizes files by extension (VB6 source, Access DBs, Crystal Reports, VM images, installers) and folder name keyword (Darv, Source, Orders, VWP, Denali). Mike exported a WizTree CSV for each drive in turn, swapping drives at D:.
+
+Drive 1 (8 TB, 5.3 TB used) yielded multiple complete copies of the master Darv folder (135 GB), including three VB6 project variants — `ORDERS_C.vbp` (main, latest 2020-06-09 in Kingston copy), `ORDERS_Cx.vbp` (variant), `ORDERS_I.vbp` (invoicing) — plus Darv's VirtualBox dev VM image (`VWIN7 DW.vdi`, 8.3 GB) and an XP runtime VM (`XP for ORDERS_copy.vdi`, 2.8 GB). Drive 2 (12 TB, 6.77 TB used) added Darv's `000_ASource\` active-work folder with the only `.vbp` newer than 2020-06-09 (`TEST_VWP.vbp` 2021-08-16, 810 bytes — likely a scaffold), plus an `MSSCCPRJ.SCC` file confirming Darv used Visual SourceSafe (no actual repository found yet). Drive 3 (12 TB, 43 GB free) contained the biggest finds: a 117 GB Hyper-V VHDX of the office "owner" workstation, mounted read-only and scanned in place; and a snapshot of the 97-Server's `\VWP2\` folder — Darv's actual iteration workspace on the production Orders server — with 64 historical `Orders*.exe` builds, 820 Crystal Reports, and 23 `.mdb` snapshots.
+
+The breakthrough was resolving the `.lnk` shortcuts in the VHDX's desktop `Orders Versions\` folder. All 7 pointed to `G:\VWP2\Orders*.exe` — revealing that the production Orders directory was on the 97-Server's G: drive. The drive 3 snapshot of that path contained `Orders_10A.exe` (the same production exe analyzed in the prior session) alongside dozens of dated build variants from 2021-2022. Bottom line on the 4-year gap: Darv stopped updating his `.vbp` source after 2020-06 and from 2021-2024 worked in a "rename-and-try" pattern on the compiled EXE only — there is no `.vbp` evolution trail past mid-2020. The production EXE is now committed to the repo, which means VB Decompiler Pro can be run against it without needing the original drive again.
+
+## Key Decisions
+
+- **Filtered copies instead of wholesale copies.** Robocopy'd just `.vbp/.frm/.bas/.cls/.frx/.ctl/.rpt/.ocx` + small text/config from huge source trees, skipping the multi-GB `.mdb` files and binaries. Pattern: a few hundred MB of source vs a few hundred GB of total tree.
+- **Mounted the 117 GB VHDX in place rather than copying it.** Used PowerShell `Mount-DiskImage -Access ReadOnly`, scanned in 85 seconds, dismounted. Avoided a 117 GB copy when the question was just "is there newer source inside."
+- **Resolved `.lnk` shortcuts to find the live production location.** The 7 shortcuts in the VHDX's `Orders Versions\` folder all pointed to `G:\VWP2\` — leading us to Darv's actual iteration workspace.
+- **Skipped historical `.mdb` snapshots from the VWP2 grab.** Already had 2020 production, 2022-10, 2022-12, and 2023-07 snapshots from elsewhere. Saved ~15 GB by skipping.
+- **`.vhd/.vhdx/.vmdk/.vdi/.ova/.vbox/.mdb/.accdb` all gitignored** for the app-modernization folder. Keeps the repo at ~3 GB committed instead of 23+ GB if VMs went to git.
+- **Deferred the 43 GB Darv-Win7-PC Windows Backup-and-Restore ZIPs on drive 2.** Mike's call — extractable later on demand if needed; the rotation drive set didn't justify the bulk yet.
+
+## Problems Encountered
+
+- **Multiple identical 135 GB Darv folders across drives + within "Estimating Archive" snapshots.** Resolved via WizTree byte-count + file-count comparison — when totals matched exactly, treated as duplicates and skipped.
+- **PowerShell `Mount-DiskImage` was expected to need elevation.** The first attempt succeeded when run unelevated (turns out the user already had Hyper-V privileges from a prior session). Would have fallen back to `Start-Process -Verb RunAs` if it had failed.
+- **Bash here-doc kept mangling `\v` to vertical tab in inline Python.** Repeatedly broke the analyzer scripts. Resolved by writing scripts to actual files instead of `py -c "..."` invocations.
+- **Robocopy exit codes 1 and 3 misread as failures.** They are success codes in robocopy's bit-flag scheme (1 = files copied, 2 = extra files present in destination, so 3 = both); only values of 8 or higher indicate copy failures. Resolved by reading the log file directly and checking the Copied / FAILED columns.
+- **Push rejected mid-session because DESKTOP-0O8A1RL had auto-synced new commits.** Resolved with `git pull --rebase origin main` (clean rebase, no conflicts on the source-recovery files).
+- **`.lnk` shortcut resolution required the `WScript.Shell` COM object.** PowerShell has no built-in cmdlet for reading shortcut targets. Used `$shell = New-Object -ComObject WScript.Shell; $shell.CreateShortcut($path).TargetPath`.
+
+## Files Created in Repo
+
+### `clients/valleywide/app-modernization/source-code/`
+- `Full-Project/` — 2,129 files, 124 MB — `D:\Office-Estimates\Darv\Full\Project\` filtered to source extensions (drive 1)
+- `Kingston-Project/` — 2,189 files, 130 MB — `D:\Office-Estimates\Darv\Kingston\Project\` filtered (drive 1)
+- `Source/` — 170 files, 559 MB — `D:\Office-Estimates\Darv\Source\` wholesale (drive 1)
+- `SOURCE-HOLD/` — 3 files, 1 MB — `D:\Office-Estimates\Darv\SOURCE HOLD\` wholesale (drive 1)
+- `000_ASource/` — 12 files including `TEST_VWP.vbp` 2021-08-16 and `Vwp.mdb` 2022-10-19 (drive 2)
+- `from-VHDX-VWP1/` — Orders Versions folder (2 EXEs + 7 shortcuts), `Vwp11.mdb` 2023-07-27, `Vwp.mdb` 2022-12-23, `VWP1202Fix.zip` (drive 3, via mounted VHDX)
+- `from-VWP2-97server/` — 64 Orders\*.exe versions + 820 .rpt + 2 .ocx, total 893 MB (drive 3 snapshot of production G:\VWP2\)
+- `VMs/` — `VWIN7-DW.vdi` 8.3 GB + `XP-for-ORDERS_copy.vdi` 2.8 GB (drive 1, gitignored)
+- `README.md` — provenance, project variants, what's NOT copied
+
+### `clients/valleywide/app-modernization/source-analysis/`
+- `analyze_wiztree.py` — main streaming CSV analyzer
+- `size_candidates.py` — folder-size triage
+- `find_newer_vbp.py` — list .vbp files newer than a cutoff date
+- `find_vwp2.py` — cross-CSV search for `\VWP2\` paths
+- `vwp2_summary.py` — size/type breakdown of the VWP2 folder
+- `scan_vhdx.ps1` — fast PowerShell scan of a mounted VHDX
+- `drive2_inspect.py`, `drive2_priorities.py`, `drive3_unique.py`, `debug_vwp2.py` — per-drive helpers
+- `scan-d-drive.ps1` — older recursive PowerShell scan (superseded by WizTree)
+- Three per-drive subfolders (`D-drive-2026-05-16/`, `drive2-2026-05-16/`, `drive3-2026-05-16/`) — each with SUMMARY.md, ~20 per-category CSVs, per-keyword CSVs, and copy logs
+
+### Local-only (gitignored)
+- `WizTree_20260516172207.csv`, `WizTree_20260516173603.csv`, `WizTree_20260516174356.csv` (393 MB + 922 MB + 904 MB — raw WizTree exports for drives 1/2/3)
+- All `.vdi/.vhd/.vhdx/.mdb/.accdb` files
+
+## Commits Pushed to Gitea
+
+| Commit | Description |
+|---|---|
+| `db9d3e3` | Drive 1 source recovery — 4,426 files, 1.15 GB |
+| `0f0f664` | Drive 2 — 000_ASource + analysis (39 files) |
+| `adbc960` | Drive 3 — analyzer outputs (28 files) |
+| `9a2a05e` | Drive 3 — VHDX mount findings + 97-Server\VWP2 grab (836 files) |
+
+All on `main`, pushed to `https://git.azcomputerguru.com/azcomputerguru/ClaudeTools.git`.
+
+## Infrastructure / Drives
+
+- **Drive 1:** label `Backup`, 8 TB, 5.3 TB used (1.87M files). Disconnected.
+- **Drive 2:** label `Backup`, 12 TB, 6.77 TB used (3.95M files). Disconnected.
+- **Drive 3:** label `Backup`, 12 TB, 11.99 TB used (3.6M files, 43 GB free). Disconnected.
+- All three are part of the previous IT's backup rotation set per Mike — content highly overlapping with rotation-specific deltas.
+
+## Pending / Incomplete Tasks (for tomorrow's session)
+
+- **Connect remaining backup drives** — at least one more in the set. Watch especially for: anything labeled `D:\Darv\` (Darv's personal drive), any `srcsafe.ini` file (Visual SourceSafe repository root), any `.vbp` newer than 2020-06-09 in a non-rotation drive
+- **Win7 Backup-and-Restore ZIPs** at `D:\Archive\Darv-Win7-PC\VWP64ADMIN10\Backup Set 2023-08-28 101534\` (drive 2, 43 GB of Backup files \*.zip) — deferred. Extractable with Windows native tools if no better source surfaces from remaining drives.
+- **VB Decompiler Pro (~$200) against `Orders_10A.exe`** — now in repo at `clients/valleywide/app-modernization/source-code/from-VWP2-97server/Orders_10A.exe`. Can be run without needing any of the backup drives.
+- **Schema extraction from a newer MDB** — `Vwp11_2023-07-27.mdb` is the newest. Need Access 97/2000 driver or Jet 3.x → Jet 4.x conversion to read full schema with field types.
+- **Source recovery checklist update** — once final drive is scanned, write a definitive `clients/valleywide/app-modernization/SOURCE_INVENTORY.md` documenting what was recovered and what was confirmed absent.
+
+## Quick Resume for Tomorrow
+
+When Mike connects the next drive and exports a WizTree CSV to `clients/valleywide/app-modernization/`:
+
+```bash
+# 1. Run analyzer (replace XXX with the WizTree filename suffix, YYYY-MM-DD with the scan date, and 4 with the actual drive number)
+mkdir -p clients/valleywide/app-modernization/source-analysis/drive4-YYYY-MM-DD
+py clients/valleywide/app-modernization/source-analysis/analyze_wiztree.py \
+ clients/valleywide/app-modernization/WizTree_XXX.csv \
+ clients/valleywide/app-modernization/source-analysis/drive4-YYYY-MM-DD
+
+# 2. Find net-new .vbp files newer than 2020-06-09 (after SDK noise filter)
+py clients/valleywide/app-modernization/source-analysis/find_newer_vbp.py \
+ clients/valleywide/app-modernization/source-analysis/drive4-YYYY-MM-DD
+
+# 3. If newer source surfaces, robocopy filtered. Otherwise document and move on.
+```
+
+Key search terms for the next drive:
+- `srcsafe.ini` (SourceSafe repository — not found yet on any of the three rotation drives)
+- `\Darv\` folders OUTSIDE `Office-Estimates\` (would suggest Darv's personal drive)
+- `\Darv-Win7-PC\` content (drive 2 had 43 GB of Win7 Backup ZIPs — may be elsewhere too)
+- `.vhdx` / `.vhd` files (modern VM images potentially newer than the VirtualBox VDIs)
+- Any `.vbp` mtime > 2020-06-09 excluding the known Crystal Reports SDK samples
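
The path-based search terms above could be checked in a single streaming pass over the next WizTree export. The following is a minimal sketch, not the contents of `analyze_wiztree.py`: the labels in `SEARCH_TERMS` are hypothetical, and the `File Name` column and the single preamble line before the CSV header are assumptions about the export layout that should be verified against a real file. The `.vbp` mtime check is left out because `find_newer_vbp.py` already covers it.

```python
import csv
import re
from typing import Iterable, Iterator

# Hypothetical labels; the patterns mirror the path-based search terms listed above.
SEARCH_TERMS = {
    "sourcesafe_root": re.compile(r"\\srcsafe\.ini$", re.IGNORECASE),
    "darv_outside_office_estimates": re.compile(r"\\Darv\\", re.IGNORECASE),
    "darv_win7_backup": re.compile(r"\\Darv-Win7-PC\\", re.IGNORECASE),
    "vm_image": re.compile(r"\.vhdx?$", re.IGNORECASE),
}


def match_terms(path: str) -> list[str]:
    """Return the labels of every search term that a Windows path matches."""
    hits = [label for label, pattern in SEARCH_TERMS.items() if pattern.search(path)]
    # \Darv\ only counts when it is NOT under Office-Estimates\.
    if "\\office-estimates\\" in path.lower() and "darv_outside_office_estimates" in hits:
        hits.remove("darv_outside_office_estimates")
    return hits


def scan_rows(rows: Iterable[dict], path_column: str = "File Name") -> Iterator[tuple[str, list[str]]]:
    """Yield (path, labels) for each row whose path matches at least one term."""
    for row in rows:
        path = row.get(path_column) or ""
        hits = match_terms(path)
        if hits:
            yield path, hits


def scan_csv(csv_path: str, path_column: str = "File Name", skip_lines: int = 1):
    """Stream a CSV export without loading it into memory.

    skip_lines and path_column are assumptions about the WizTree layout.
    """
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        for _ in range(skip_lines):
            next(handle, None)  # drop preamble line(s) before the header row
        yield from scan_rows(csv.DictReader(handle), path_column)
```

Iterating `scan_csv("WizTree_XXX.csv")` yields one `(path, labels)` pair per matching row, so hits can be printed or written to a per-keyword CSV as they are found. Keeping `match_terms` as a pure function makes the patterns easy to spot-check against known paths from drives 1 to 3 before the next drive is connected.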