claudetools/clients/valleywide/app-modernization/source-analysis/drive2_priorities.py
Mike Swanson 0f0f664e8e feat(valleywide): add drive 2 findings - 000_ASource + analyzer outputs
Drive 2 (label "Backup", 12 TB, 6.77 TB used) — second of N VWP
backup drives. Scanned via WizTree, analyzed with analyze_wiztree.py.

NEW source content:
- 000_ASource/ — Darv's active work-in-progress folder. Contains
  TEST_VWP.vbp (2021-08-16, only .vbp newer than the 2020-06-09 baseline),
  four frmLotInfo*.frm variants (2020-10 to 2021-08), and an
  MSSCCPRJ.SCC file confirming Darv used Visual SourceSafe.
- The accompanying Vwp.mdb (2022-10-19, 764 MB) stays on local disk
  per .gitignore — newest database snapshot we have.

Analysis CSVs:
- source-analysis/drive2-2026-05-16/ — per-category + per-keyword
  breakdown of drive 2's 3.95M files (vs drive 1's 1.87M). Categories
  largely match drive 1 but with ~2x volume.

Net findings vs drive 1:
- Confirmed 4-year gap: only 4 .vbp files newer than 2020-06-09 on
  drive 2, all the same TEST_VWP.vbp scaffold. Main ORDERS_C.vbp source
  remains 2020-06-09. Darv stopped active VB6 dev around mid-2020.
- 43 GB Win7 Backup-and-Restore set in D:\Archive\Darv-Win7-PC\ (2023)
  not copied — deferred to later drives, ZIPs extractable on demand.
- Master Darv folder is bit-for-bit duplicate of drive 1's master (135 GB,
  same file/folder counts). Skipped.

New helper scripts:
- find_newer_vbp.py — list .vbp files newer than a date, filter SDK noise
- drive2_inspect.py / drive2_priorities.py — drive-specific triage

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 17:42:19 -07:00

71 lines
2.3 KiB
Python

"""Drive 2 triage — find net-new high-value folders not present on drive 1."""
import csv

WIZ = 'clients/valleywide/app-modernization/WizTree_20260516173603.csv'

# Folders whose leaf name (last path component) we want to size
LEAVES = {
    'darv-win7-pc',  # likely Darv's actual Win7 backup
    '98_server',
    '000_asource',
    'darv',
    'kingston',
    'full',
    'virtualbox',
    'vm_vdi',
}

# Parent folder names of interest. Not referenced by the code below.
PARENT_HINTS = {
    'archive',  # D:\Archive\ (backup root)
    'office-archive',
    'office-estimates',
}
# Also collect every .vbp project file on this drive (SDK samples excluded)
hits = []
all_vbp_by_path = []
with open(WIZ, encoding='utf-8-sig', errors='replace', newline='') as f:
    r = csv.reader(f)
    # Skip the two leading non-data lines (assumed: WizTree banner, then column header)
    next(r)
    next(r)
    for row in r:
        if not row or len(row) < 7:
            continue
        p = row[0]
        if p.endswith('\\'):
            # A trailing backslash marks a folder row
            leaf = p.rstrip('\\').rsplit('\\', 1)[-1].lower()
            if leaf in LEAVES:
                try:
                    sz = int(row[1])
                    files = int(row[5])
                    folders = int(row[6])
                except (ValueError, IndexError):
                    continue
                hits.append((sz, files, folders, row[3], p))
        elif p.lower().endswith('.vbp'):
            # Capture only VWP/Orders-relevant projects (not SDK samples)
            pl = p.lower()
            if 'seagat' in pl or 'samples\\code' in pl:
                continue
            try:
                sz = int(row[1])
            except (ValueError, IndexError):
                continue
            all_vbp_by_path.append((row[3], sz, p))

hits.sort(reverse=True)  # largest folders first

print('=== Sizing key folder leaves ===')
print(f'{"GB":>8} {"Files":>8} {"Folders":>7} {"Modified":<19} Path')
for sz, files, folders, mod, p in hits[:40]:
    print(f'{sz/1024/1024/1024:>8.2f} {files:>8} {folders:>7} {mod:<19} {p}')
print()

print('=== ALL VWP/Orders .vbp files in D:\\Archive\\Darv-Win7-PC\\ ===')
# Tuples sort on the Modified string first, so reverse order lists the latest first
for mod, sz, p in sorted(all_vbp_by_path, reverse=True):
    if '\\archive\\darv-win7-pc\\' in p.lower():
        print(f'{mod} {sz:>6}b {p}')
print()

print('=== ALL .vbp files in Darv\\000_ASource\\ (any drive location) ===')
for mod, sz, p in sorted(all_vbp_by_path, reverse=True):
    if '\\darv\\000_asource\\' in p.lower():
        print(f'{mod} {sz:>6}b {p}')
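
The row handling above can be tried without the real export. The following is a minimal sketch, not part of the committed script: it pulls the leaf-name and SDK-noise checks into small functions and adds a date cutoff in the spirit of the "newer than a date" listing mentioned in the commit message. The function names, the `X:\Example\` sample rows, and the `YYYY-MM-DD HH:MM:SS` Modified format are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Made-up sample in the layout the script appears to expect:
# one banner line, one header line, then 7-column data rows.
SAMPLE = (
    'Generated by WizTree (hypothetical banner)\n'
    'File Name,Size,Allocated,Modified,Attributes,Files,Folders\n'
    'X:\\Example\\000_ASource\\,1000,4096,2021-08-16 09:00:00,16,3,0\n'
    'X:\\Example\\000_ASource\\TEST_VWP.vbp,900,4096,2021-08-16 09:00:00,32,0,0\n'
    'X:\\Example\\ORDERS_C.vbp,800,4096,2020-06-09 12:00:00,32,0,0\n'
    'X:\\Example\\SDK\\Samples\\Code\\demo.vbp,700,4096,2022-01-01 00:00:00,32,0,0\n'
    'X:\\Example\\notes.txt,10,4096,2022-01-01 00:00:00,32,0,0\n'
)

# Hypothetical cutoff: end of the 2020-06-09 baseline day.
CUTOFF = datetime(2020, 6, 9, 23, 59, 59)


def leaf_name(path):
    """Return the lower-cased last component of a folder path ending in a backslash."""
    return path.rstrip('\\').rsplit('\\', 1)[-1].lower()


def is_sdk_noise(path):
    """Same substring test the script uses to drop SDK sample projects."""
    pl = path.lower()
    return 'seagat' in pl or 'samples\\code' in pl


def vbp_newer_than(rows, cutoff):
    """Yield (modified, path) for non-SDK .vbp file rows modified after cutoff."""
    for row in rows:
        if len(row) < 7:
            continue
        path = row[0]
        if path.endswith('\\') or not path.lower().endswith('.vbp'):
            continue
        if is_sdk_noise(path):
            continue
        try:
            # Assumed timestamp format; rows that do not parse are skipped.
            modified = datetime.strptime(row[3], '%Y-%m-%d %H:%M:%S')
        except ValueError:
            continue
        if modified > cutoff:
            yield modified, path


reader = csv.reader(io.StringIO(SAMPLE))
next(reader)  # banner line
next(reader)  # column header
rows = list(reader)

newer = list(vbp_newer_than(rows, CUTOFF))
for modified, path in newer:
    print(modified.isoformat(sep=' '), path)
```

Parsing the timestamp rather than comparing strings keeps the cutoff check correct even if an export uses a date format that does not sort lexically, at the cost of silently skipping rows that do not match the assumed format.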