claudetools/clients/valleywide/app-modernization/source-analysis/vwp2_summary.py
Mike Swanson 9a2a05e3cb feat(valleywide): drive 3 deep dive - VWP1.VHDX mount + 97-Server\VWP2 grab
Drive 3 yielded the biggest finds in the project so far.

VHDX mount + scan (D:\WIN7-Orders\Darv-2\VWP1.VHDX, 117 GB Hyper-V disk):
- Mounted read-only as F:, scanned in 85s, dismounted
- NOT Darv's dev box - it's the office "owner" admin workstation
- No newer source code found inside, but:
  - Vwp11.mdb (2023-07-27, 369 MB) - NEWEST DB snapshot anywhere
  - Vwp.mdb (2022-12-23, 769 MB) - on Desktop\Darv VWP
  - Orders Versions/ desktop folder with 2 EXEs + 7 shortcuts
  - The .lnk shortcuts all pointed to G:\VWP2\Orders*.exe - the
    "Aha!" that revealed where Darv's iteration happened
- Saved to source-code/from-VHDX-VWP1/

The VWP2 grab (D:\97-Server-G-Drive\g$ 2024-04-10\VWP2\):
- This is Darv's actual iteration workspace on the production
  Orders server (G: drive)
- 16.36 GB total, 1,061 files. Grabbed 886 files (~893 MB) filtered to
  *.exe, *.rpt, *.ocx, and VB6 source extensions:
  - 64 Orders*.exe versions - complete iteration history (includes the
    production Orders_10A.exe + Orders_10Z.exe variant + dozens more
    with Darv's "iterate-and-rename" naming pattern)
  - 820 Crystal Reports (.rpt)
  - 2 .ocx supporting controls
- Skipped 23 historical .mdb backups (15.8 GB) - we already have
  newer snapshots from 000_ASource and VHDX
- Skipped 6 large subfolders (HOLD, HHOLD, Pay_2021_0325, GWAC,
  20220205, 20211010 RPT) - mostly more MDBs
- Saved to source-code/from-VWP2-97server/
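
A minimal sketch of the filtered grab described above, not the command that was actually run. The VB6 source extensions in KEEP_EXTS are an assumption (the list above only says "VB6 source extensions"), and select_files is a hypothetical helper name. The skipped folder names are the six listed above.

```python
import os
from pathlib import Path

# Assumption: illustrative VB6 source extensions; the real filter may differ.
KEEP_EXTS = {'.exe', '.rpt', '.ocx', '.vbp', '.frm', '.frx', '.bas', '.cls'}
# Top-level subfolders to skip, compared case-insensitively.
SKIP_DIRS = {'hold', 'hhold', 'pay_2021_0325', 'gwac', '20220205', '20211010 rpt'}


def select_files(root, keep_exts=KEEP_EXTS, skip_dirs=SKIP_DIRS):
    """Return (path, size_in_bytes) for files under root with a wanted extension."""
    root = Path(root)
    picked = []
    for dirpath, dirnames, filenames in os.walk(root):
        if Path(dirpath) == root:
            # Prune skipped top-level subfolders in place so os.walk never enters them.
            dirnames[:] = [d for d in dirnames if d.lower() not in skip_dirs]
        for name in filenames:
            if Path(name).suffix.lower() in keep_exts:
                full = Path(dirpath) / name
                picked.append((full, full.stat().st_size))
    return picked
```

Summing the sizes of the returned pairs gives the grab total before anything is copied, which is a cheap way to confirm the filter before touching a multi-GB tree.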

What we learned about the 4-year gap:
- No source code newer than the 2020-06-09 ORDERS_C.vbp baseline was
  found on any of the three rotation drives
- The 64 EXE versions in VWP2 run through 2022 - Darv kept iterating
  and rebuilding compiled output but never updated the .vbp source
  to match. This is consistent with his "rename and try" workflow
- The production exe (Orders_10A.exe) is in this batch - now we can
  use VB Decompiler Pro on it without needing the original drive
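
One way to line up the EXE iteration history, sketched under assumptions: exe_timeline is a hypothetical helper, rows are (path, modified) pairs pulled from a file listing, and the modified strings are assumed to sort chronologically as text (for example YYYY-MM-DD HH:MM:SS).

```python
from fnmatch import fnmatchcase


def exe_timeline(rows, pattern='orders*.exe'):
    """Return (modified, path) pairs for matching EXEs, oldest first.

    rows: iterable of (path, modified) tuples using backslash-separated paths.
    """
    hits = []
    for path, modified in rows:
        # Compare only the file name, lowercased, against a lowercase pattern.
        name = path.rsplit('\\', 1)[-1].lower()
        if fnmatchcase(name, pattern):
            hits.append((modified, path))
    return sorted(hits)
```

The last entry of the result is the newest build, which is the natural first candidate to compare against the 2020-06-09 baseline.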

Helper scripts:
- scan_vhdx.ps1 - fast PowerShell scan of a mounted VHDX for source/DB
- find_vwp2.py - cross-CSV search for the VWP2 path
- vwp2_summary.py - size+type breakdown of the VWP2 folder
- debug_vwp2.py - one-off debug helper

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 18:02:47 -07:00

72 lines
2.5 KiB
Python

"""Summarize VWP2 folder on drive 3 — size, newest content, file type breakdown."""
import csv
from collections import defaultdict

WIZ = 'clients/valleywide/app-modernization/WizTree_20260516174356.csv'
# Use raw string to avoid \v -> vertical tab interpretation
SEARCH_FOLDER = r'\97-server-g-drive\g$ 2024-04-10 21;13;45\vwp2\\'.lower()

folder_sizes = {}                  # folder path -> (size, files, folders, modified)
files_by_ext = defaultdict(list)   # extension -> [(modified, size, path), ...]
all_files = []                     # (modified, size, path, ext) for every file

prefix = SEARCH_FOLDER.rstrip('\\')
print(f'DEBUG prefix: {prefix!r}')

with open(WIZ, encoding='utf-8-sig', errors='replace') as f:
    r = csv.reader(f)
    # Skip the two leading non-data rows (banner + column header)
    next(r)
    next(r)
    rows_checked = 0
    for row in r:
        if not row or len(row) < 4:
            continue
        rows_checked += 1
        p = row[0]
        pl = p.lower()
        if prefix not in pl:
            continue
        try:
            sz = int(row[1])
        except (ValueError, IndexError):
            continue
        if p.endswith('\\'):
            # Folder rows end with a backslash and carry file/folder counts
            try:
                files = int(row[5])
                folders = int(row[6])
            except (ValueError, IndexError):
                files, folders = 0, 0
            folder_sizes[p] = (sz, files, folders, row[3])
        else:
            # Extension comes from the file name only, so dots in folder names are ignored
            ext = p.rsplit('.', 1)[-1].lower() if '.' in p.rsplit('\\', 1)[-1] else '(none)'
            files_by_ext[ext].append((row[3], sz, p))
            all_files.append((row[3], sz, p, ext))

# Top-level VWP2 folder
top = next((k for k in folder_sizes if k.lower().rstrip('\\').endswith('\\vwp2')), None)
if top:
    sz, files, folders, mod = folder_sizes[top]
    print(f'=== {top} ===')
    print(f' {sz/1024/1024/1024:.2f} GB, {files:,} files, {folders} subfolders, modified {mod}')
    print()

print('=== File type breakdown ===')
totals = []
for ext, items in files_by_ext.items():
    total_mb = sum(s for _, s, _ in items) / 1024 / 1024
    totals.append((total_mb, ext, len(items)))
# Largest 15 extensions by total size
for total_mb, ext, n in sorted(totals, reverse=True)[:15]:
    print(f' .{ext:<8} {n:>5} files, {total_mb:>10.1f} MB')
print()

print('=== Newest 30 files in \\VWP2\\ tree ===')
# Tuples start with the modified string, so a reverse sort puts the newest first
all_files.sort(reverse=True)
for mod, sz, p, ext in all_files[:30]:
    print(f' {mod:<19} {sz/1024/1024:>8.1f} MB {p}')
print()

print('=== Top-level subfolders of \\VWP2\\ ===')
for path, (sz, files, folders, mod) in sorted(folder_sizes.items(), key=lambda x: -x[1][0]):
    if path.lower().rstrip('\\').endswith('\\vwp2'):
        continue
    rel = path.lower().split('\\vwp2\\', 1)[1].rstrip('\\')
    if '\\' in rel:
        continue  # only direct children
    print(f' {sz/1024/1024:>8.1f} MB {files:>5} files {mod:<19} {path}')
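
The extension rule used in the script can be pulled out and checked on its own. The sketch below restates it as a hypothetical helper, ext_of (a name not in the script); the paths in the usage note are made up for illustration.

```python
def ext_of(path):
    """Return the lowercase extension of a backslash-separated path.

    Only the final path component is inspected, so a dot in a folder
    name does not count. Files without a dot yield '(none)'.
    """
    name = path.rsplit('\\', 1)[-1]
    if '.' not in name:
        return '(none)'
    return name.rsplit('.', 1)[-1].lower()
```

For example, ext_of(r'G:\VWP2\Orders_10A.EXE') gives 'exe', while ext_of(r'G:\a.b\README') gives '(none)' because the only dot is in the folder name.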