claudetools/projects/dataforth-dos/datasheet-pipeline/upload_all_for_web.py
Mike Swanson dd5c5afd4b Session log + DFWDS Node port + Hoffman API uploader pipeline
Built the missing piece between the test datasheet pipeline and Dataforth's
new product API. End-to-end:

- Pulled DFWDS (Dataforth Web Datasheet System) VB6 source from
  AD1\Engineering\ENGR\ATE\Test Datasheets\DFWDS to local for analysis
- Decoded its filename validation: an A-J prefix decodes to a number
  (A=10..J=19), an all-numeric WO# is valid (no leading 0), and anything
  else is bad
- Ported the validation + move logic to Node (dfwds-process.js)
- Built bulk uploader (upload-delta.js) for Hoffman's Swagger API
  (POST /api/v1/TestReportDataFiles/bulk with OAuth client_credentials)
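
The validation rule in the list above can be sketched in Python. This is an illustration only, not the DFWDS or dfwds-process.js logic. The function name `classify_stem` and the three result labels are hypothetical. So is the assumption that a letter prefix is followed by digits and is replaced by its two-digit value.

```python
import re

# Hypothetical mapping built from the rule "A=10..J=19".
PREFIX_VALUES = {chr(ord("A") + i): str(10 + i) for i in range(10)}

def classify_stem(stem):
    """Return ('valid', stem), ('renamed', new_stem) or ('bad', None)."""
    # All-numeric work order number with no leading zero: accepted as-is.
    if re.fullmatch(r"[1-9][0-9]*", stem):
        return ("valid", stem)
    # Assumed form: one letter A-J followed by digits; the letter expands to 10..19.
    m = re.fullmatch(r"([A-J])([0-9]+)", stem)
    if m:
        return ("renamed", PREFIX_VALUES[m.group(1)] + m.group(2))
    # Everything else is rejected.
    return ("bad", None)

for stem in ["12345", "A2345", "012345", "K2345"]:
    print(stem, classify_stem(stem))
```

Under these assumptions `A2345` maps to `102345`, while `012345` and `K2345` are rejected. The real tool may treat a decoded prefix differently.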

Sanitized 3 prior reference scripts (fetch-server-inventory, test-scenarios,
test-upload-two) to read CF_* env vars instead of hardcoded creds.

Live drain results:
- 897 files moved Test_Datasheets -> For_Web (all valid, no renames, no
  bad), DFWDS port summary in 1.1s
- Pushed entire For_Web (7,061 files) to Hoffman API in 49.7s @ 142/s:
  Created=803 Updated=114 Unchanged=6,144 Errors=0
- Server count: 489,579 -> 490,382 (+803 net new)
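
The drain figures above can be cross-checked with a few lines of Python. The numbers are copied from the results list. The variable names are illustrative, and the check assumes that only Created items raise the server count.

```python
# Figures copied from the live drain results.
created, updated, unchanged, errors = 803, 114, 6144, 0
pushed, seconds = 7061, 49.7
before, after = 489_579, 490_382

# Every pushed file should land in exactly one outcome bucket.
assert created + updated + unchanged + errors == pushed
# Assumption: only Created items add to the server count.
assert after - before == created

rate = pushed / seconds
print(f"{rate:.0f} files/s")
```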

Also:
- Added clients/dataforth/.gitignore to exclude plaintext Oauth.txt note
- Added clients/instrumental-music-center/docs/2026-04-13-ticket-notes.md
  (ticket write-up of 2026-04-11/12/13 IMC1 RDS removal/SQL migration work)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 21:06:50 -07:00

80 lines
3.7 KiB
Python

"""Build delta = entire For_Web folder, upload via upload-delta.js.
Server is idempotent; already-present items return Unchanged, new ones Created.
Avoids the slow full server-inventory pull when we just want to drain new content.
"""
import base64, paramiko, subprocess, time, threading, yaml
# Decrypt the AD2 password and the API OAuth settings with sops.
ad2_pwd = yaml.safe_load(subprocess.run(['sops','-d','D:/vault/clients/dataforth/ad2.sops.yaml'],
    capture_output=True, text=True, timeout=30, check=True).stdout)['credentials']['password'].replace('\\','')
api = yaml.safe_load(subprocess.run(['sops','-d','D:/vault/clients/dataforth/api-oauth.sops.yaml'],
    capture_output=True, text=True, timeout=30, check=True).stdout)
REMOTE_DIR = 'C:/Users/sysadmin/Documents/dataforth-uploader'
DELTA_NAME = 'delta_for_web_all.txt'
c = paramiko.SSHClient(); c.set_missing_host_key_policy(paramiko.AutoAddPolicy())
c.connect('192.168.0.6', username='sysadmin', password=ad2_pwd,
          timeout=30, banner_timeout=45, look_for_keys=False, allow_agent=False)
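def ps_b64(cmd, to=300):
    """Run a PowerShell command over SSH; return (stdout, stderr) as text."""
    # -EncodedCommand expects base64 of the UTF-16LE command text.
    enc = base64.b64encode(cmd.encode('utf-16-le')).decode()
    _, o, e = c.exec_command(f'powershell -NoProfile -EncodedCommand {enc}', timeout=to)
    return o.read().decode('utf-8','replace'), e.read().decode('utf-8','replace')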
print('[1] enumerate For_Web on AD2')
# Write one line per file: name without extension|full path|length|last-write time ("o" format).
ps = (
    f'$out = "{REMOTE_DIR}/{DELTA_NAME}"; '
    r'Get-ChildItem "C:\Shares\webshare\For_Web" -File -Filter *.TXT | '
    r'ForEach-Object { '
    r' $sn = [System.IO.Path]::GetFileNameWithoutExtension($_.Name); '
    r' "$sn|$($_.FullName)|$($_.Length)|$($_.LastWriteTime.ToString("o"))" '
    r'} | Set-Content -Path $out -Encoding ASCII; '
    r'(Get-Content $out).Count'
)
out, err = ps_b64(ps, to=180)
print(f' delta entries: {out.strip()}')
if err.strip() and 'CLIXML' not in err: print('[stderr]', err[:300])
print('\n[2] upload all via upload-delta.js (idempotent)')
ps_upload = (
    f'$env:CF_TOKEN_URL = "{api["endpoints"]["token-url"]}"; '
    f'$env:CF_API_BASE = "{api["endpoints"]["api-base"]}"; '
    f'$env:CF_CLIENT_ID = "{api["credentials"]["client-id"]}"; '
    f'$env:CF_CLIENT_SECRET = "{api["credentials"]["client-secret"]}"; '
    f'$env:CF_SCOPE = "{api["credentials"]["scope"]}"; '
    f'cd "{REMOTE_DIR}"; '
    f'& node upload-delta.js --delta "{REMOTE_DIR}/{DELTA_NAME}" --batch 100 2>&1'
)
enc = base64.b64encode(ps_upload.encode('utf-16-le')).decode()
stdin, stdout, stderr = c.exec_command(f'powershell -NoProfile -EncodedCommand {enc}', timeout=3600)
def reader(s):
    # Echo the remote stream line by line until EOF (readline() returns '').
    for line in iter(lambda: s.readline(), ''):
        if not line: break
        print(line.rstrip(), flush=True)
t = threading.Thread(target=reader, args=(stdout,), daemon=True); t.start()
t2 = threading.Thread(target=reader, args=(stderr,), daemon=True); t2.start()
t0 = time.time()
while time.time() - t0 < 3600:
    if stdout.channel.exit_status_ready(): break
    time.sleep(1)
t.join(timeout=5); t2.join(timeout=5)
rc = stdout.channel.recv_exit_status() if stdout.channel.exit_status_ready() else -1
print(f'\n[upload rc={rc}]')
print('\n[3] post-upload server stats')
import json, urllib.request, urllib.parse
body = urllib.parse.urlencode({'grant_type':api['credentials']['grant-type'],
                               'client_id':api['credentials']['client-id'],
                               'client_secret':api['credentials']['client-secret'],
                               'scope':api['credentials']['scope']}).encode()
req = urllib.request.Request(api['endpoints']['token-url'], data=body, method='POST',
                             headers={'Content-Type':'application/x-www-form-urlencoded'})
tok = json.loads(urllib.request.urlopen(req,timeout=30).read())['access_token']
req = urllib.request.Request(f'{api["endpoints"]["api-base"]}/api/v1/TestReportDataFiles/stats',
                             headers={'Authorization':f'Bearer {tok}'})
print(json.dumps(json.loads(urllib.request.urlopen(req,timeout=30).read()), indent=2))
c.close()
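
The script passes every remote command through `powershell -EncodedCommand`, which takes the base64 of the command's UTF-16LE bytes. The standalone sketch below shows that encoding and its inverse, which can help when inspecting what was actually sent. The helper names `encode_ps` and `decode_ps` are hypothetical and are not part of the script above.

```python
import base64

def encode_ps(cmd):
    """Base64 of the UTF-16LE bytes, the form -EncodedCommand accepts."""
    return base64.b64encode(cmd.encode("utf-16-le")).decode("ascii")

def decode_ps(enc):
    """Inverse of encode_ps, for inspecting an encoded payload."""
    return base64.b64decode(enc).decode("utf-16-le")

cmd = 'Write-Output "a|b" | Measure-Object'
enc = encode_ps(cmd)
print(enc)

# Round trip: decoding returns the original command text.
assert decode_ps(enc) == cmd
# The payload holds only base64 characters, so quotes and pipes in the
# command never pass through the remote shell's own parsing.
assert all(ch.isalnum() or ch in "+/=" for ch in enc)
```

Encoding avoids quoting problems on the SSH command line. It does not escape values interpolated into the PowerShell text itself, such as the `CF_*` assignments in step 2.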