claudetools/clients/valleywide/app-modernization/source-analysis/drive3_unique.py
Mike Swanson adbc9601bf feat(valleywide): drive 3 analysis - 117 GB Hyper-V VHDX lead identified
Drive 3 (12 TB, 11.99 TB used, only 43 GB free) — third VWP backup
rotation drive. Per Mike, all three drives are rotation copies; content
largely overlaps.

Net-new content vs drives 1 and 2:
- D:\WIN7-Orders\Darv-2\VWP1.VHDX (117 GB, 2023-09-01) — Hyper-V disk
  named "VWP1" in a Darv-2 folder. Likely Darv's later workstation.
  Strongest candidate for finding any 2021-2023 source code that
  postdates our 2020-06-09 ORDERS_C.vbp baseline. Not copied.
- D:\WIN7-Orders\WindowsImageBackup\VWIN7-PC\...vhd (22 GB) — Windows
  Image Backup of the VWIN7-PC machine, dated 2023-08-31.
- D:\VWP-FIN\ (~44 GB) — Finance machine backups + RAR archives. Not
  relevant to Orders modernization but useful for QuickBooks context.
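
One way a net-new list like the one above could be derived, assuming comparable listings exist for the other rotation drives, is a set difference keyed on drive-independent path plus size. This is only a sketch: `entry_key`, `net_new`, and the sample listings are hypothetical, and the commit does not say this is how the comparison was actually done.

```python
import ntpath

def entry_key(path, size):
    """Key an entry by lowercase path without the drive letter, plus size."""
    _drive, tail = ntpath.splitdrive(path)
    return tail.lower(), size

def net_new(this_drive, other_drives):
    """Return (path, size) entries of this_drive missing from every other drive."""
    seen = {entry_key(p, s) for drive in other_drives for p, s in drive}
    return [(p, s) for p, s in this_drive if entry_key(p, s) not in seen]

# Hypothetical, heavily reduced listings; paths and sizes are placeholders.
drive1 = [('E:\\Shared\\common.dat', 100)]
drive3 = [('D:\\Shared\\common.dat', 100), ('D:\\Only\\here.vhdx', 200)]
result = net_new(drive3, [drive1])
print(result)
```

Keying on size as well as path treats a same-named file with a different size as new, which errs on the side of flagging too much rather than too little.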

SourceSafe search:
- 1224 SourceSafe-related matches, but all are Visual Studio install
  directories (Microsoft Visual Studio\Common\VSS\) and .SCC sentinel
  files. No srcsafe.ini (the marker of an actual repository) anywhere
  on this drive. The SourceSafe repo presumably lives on a different
  drive (likely Darv's personal drive, not in the office rotation).
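
The distinction drawn above between install directories, .SCC sentinels, and a real repository can be sketched as a small classifier. The function name `classify_ss_hit`, the category labels, and the sample paths are hypothetical and are not part of the committed script.

```python
from collections import Counter

def classify_ss_hit(path):
    """Label a path as 'repository', 'scc-sentinel', 'vs-install', or 'other'."""
    pl = path.lower()
    if pl.endswith('\\srcsafe.ini'):
        return 'repository'      # srcsafe.ini marks an actual SourceSafe database
    if pl.endswith('.scc'):
        return 'scc-sentinel'    # per-folder source-control marker file
    if '\\microsoft visual studio\\common\\vss\\' in pl:
        return 'vs-install'      # SourceSafe client shipped with Visual Studio
    return 'other'

# Hypothetical sample paths, not taken from the drive 3 export.
samples = [
    'D:\\Example\\Microsoft Visual Studio\\Common\\VSS\\win32\\SSEXP.EXE',
    'D:\\Example\\Project\\vssver.scc',
    'D:\\Example\\Project\\readme.txt',
]
counts = Counter(classify_ss_hit(p) for p in samples)
print(counts)
```

A zero count for `'repository'` over a full export would correspond to the "no srcsafe.ini" finding.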

Source code:
- No project .vbp newer than the 2020-06-09 baseline. The only later
  .vbp is the same TEST_VWP.vbp scaffold from drive 2 (2021-08-16,
  810 bytes), present here too.
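
A minimal sketch of how the baseline check above could be run against a WizTree-style CSV. It assumes the Modified column uses the `YYYY-MM-DD HH:MM:SS` form; the helper `vbp_newer_than`, the sample folder, the timestamps' time-of-day, and the ORDERS_C.vbp size are hypothetical.

```python
import csv
import io
from datetime import datetime

BASELINE = datetime(2020, 6, 9).date()  # ORDERS_C.vbp baseline date

def vbp_newer_than(rows, baseline=BASELINE):
    """Yield (modified, path) for .vbp files modified after the baseline date."""
    for row in rows:
        if len(row) < 4 or not row[0].lower().endswith('.vbp'):
            continue
        try:
            # Assumed timestamp format; adjust if the export differs.
            modified = datetime.strptime(row[3], '%Y-%m-%d %H:%M:%S')
        except ValueError:
            continue
        if modified.date() > baseline:
            yield row[3], row[0]

# Hypothetical two-file sample standing in for the real export.
sample = io.StringIO(
    'File Name,Size,Allocated,Modified\n'
    '"D:\\Example\\ORDERS_C.vbp",2048,4096,2020-06-09 10:00:00\n'
    '"D:\\Example\\TEST_VWP.vbp",810,4096,2021-08-16 09:00:00\n'
)
newer = list(vbp_newer_than(csv.reader(sample)))
print(newer)
```

On this sample only the TEST_VWP.vbp scaffold is reported, so any hit still needs a manual look at size and content before it counts as real source.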

Updated .gitignore: added *.vhd (was missing — only had *.vhdx).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 17:46:38 -07:00

48 lines
1.8 KiB
Python

"""Drive 3 — find SourceSafe artifacts + size up the VWIN7 and VWP-FIN top-level dirs that weren't on drive 1."""
import csv
WIZ = 'clients/valleywide/app-modernization/WizTree_20260516174356.csv'
ss_hits = []
top_level_sizes = {} # path -> (size, files, folders, mod)
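# Plain (non-raw) strings: a raw string cannot end in a single backslash, and
# r'D:\VWIN7\\' would carry two trailing backslashes and never match a path.
TARGETS = {
    'D:\\VWIN7\\',
    'D:\\VWP-FIN\\',
    'D:\\Archive\\',
    'D:\\WIN7-Orders\\Darv-2\\',
}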
with open(WIZ, encoding='utf-8-sig', errors='replace') as f:
    r = csv.reader(f)
    next(r); next(r)  # skip the WizTree banner line and the column header
    for row in r:
        if not row or len(row) < 4:
            continue
        p = row[0]
        pl = p.lower()
        if pl.endswith('srcsafe.ini') or pl.endswith('.scc'):
            try: sz = int(row[1])
            except (ValueError, IndexError): sz = 0
            ss_hits.append((row[3], sz, p))
        # elif, so a .scc file inside a VSS folder is recorded only once
        elif '\\sourcesafe' in pl or '\\vss\\' in pl:
            try: sz = int(row[1])
            except (ValueError, IndexError): sz = 0
            ss_hits.append((row[3], sz, p))
        # Folder rows end in a backslash; keep totals for the target dirs
        if p.endswith('\\') and p in TARGETS:
            try:
                sz = int(row[1]); files = int(row[5]); folders = int(row[6])
                top_level_sizes[p] = (sz, files, folders, row[3])
            except (ValueError, IndexError):
                pass
print('=== Top-level dirs unique-ish to drive 3 ===')
print(f'{"GB":>8} {"Files":>8} {"Folders":>7} {"Modified":<19} Path')
for path, (sz, files, folders, mod) in sorted(top_level_sizes.items()):
    print(f'{sz/1024/1024/1024:>8.2f} {files:>8} {folders:>7} {mod:<19} {path}')
print(f'\n=== SourceSafe artifacts ({len(ss_hits)} matches, top 50 unique) ===')
ss_hits.sort(reverse=True)
seen_paths = set()
for mod, sz, p in ss_hits:
    if len(seen_paths) == 50:
        break
    # Dedupe by leaf path (snapshots often duplicate)
    pl = p.lower()
    leaf_key = pl.split('\\estimating archive')[-1] if 'estimating archive' in pl else pl
    if leaf_key in seen_paths:
        continue
    seen_paths.add(leaf_key)
    print(f'{mod} {sz:>8}b {p}')