claudetools/projects/dataforth-dos/datasheet-pipeline/fetch-server-inventory.py
Mike Swanson dd5c5afd4b Session log + DFWDS Node port + Hoffman API uploader pipeline
Built the missing piece between the test datasheet pipeline and Dataforth's
new product API. End-to-end:

- Pulled DFWDS (Dataforth Web Datasheet System) VB6 source from
  AD1\Engineering\ENGR\ATE\Test Datasheets\DFWDS to local for analysis
- Decoded its filename validation: A-J prefix decodes (A=10..J=19), all-
  numeric WO# valid (no leading 0), anything else bad
- Ported the validation + move logic to Node (dfwds-process.js)
- Built bulk uploader (upload-delta.js) for Hoffman's Swagger API
  (POST /api/v1/TestReportDataFiles/bulk with OAuth client_credentials)
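The filename rule described above can be sketched in Python. This is not the VB6 or dfwds-process.js code; the function name `decode_work_order` is hypothetical, and the sketch assumes that a letter prefix is replaced by its two-digit value, that only uppercase A-J counts, and that the rest of the WO# must be ASCII digits:

```python
# Hypothetical sketch of the DFWDS work-order check described in the commit
# message. The real VB6 / Node logic may differ in details.

# A=10, B=11, ... J=19
PREFIX_MAP = {letter: str(10 + i) for i, letter in enumerate("ABCDEFGHIJ")}


def _ascii_digits(text):
    """True only for a non-empty run of the characters 0-9."""
    return text.isascii() and text.isdigit()


def decode_work_order(wo):
    """Return the numeric WO# as a string, or None if the name is bad."""
    if not wo:
        return None
    head, rest = wo[0], wo[1:]
    if head in PREFIX_MAP:
        # Assumption: a letter prefix must be followed by at least one digit.
        return PREFIX_MAP[head] + rest if _ascii_digits(rest) else None
    # All-numeric WO# is valid as long as it has no leading zero.
    if _ascii_digits(wo) and head != "0":
        return wo
    return None  # anything else is bad


print(decode_work_order("A1234"))   # letter prefix decoded
print(decode_work_order("012345"))  # leading zero, rejected
```

Returning None for bad names keeps the caller's move/skip decision separate from the decoding itself.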

Sanitized 3 prior reference scripts (fetch-server-inventory, test-scenarios,
test-upload-two) to read CF_* env vars instead of hardcoded creds.

Live drain results:
- 897 files moved Test_Datasheets -> For_Web (all valid, no renames, no
  bad), DFWDS port summary in 1.1s
- Pushed entire For_Web (7,061 files) to Hoffman API in 49.7s @ 142/s:
  Created=803 Updated=114 Unchanged=6,144 Errors=0
- Server count: 489,579 -> 490,382 (+803 net new)

Also:
- Added clients/dataforth/.gitignore to exclude plaintext Oauth.txt note
- Added clients/instrumental-music-center/docs/2026-04-13-ticket-notes.md
  (ticket write-up of 2026-04-11/12/13 IMC1 RDS removal/SQL migration work)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 21:06:50 -07:00

81 lines
2.7 KiB
Python

"""Paginate the API to fetch every server-side SerialNumber and write them to a file."""
import json
import os
import sys
import time
import urllib.parse
import urllib.request

# Credentials are read from CF_* environment variables; only the URLs and scope have defaults.
TOKEN_URL = os.environ.get("CF_TOKEN_URL", "https://login.dataforth.com/connect/token")
API_BASE = os.environ.get("CF_API_BASE", "https://www.dataforth.com") + "/api/v1"
CLIENT_ID = os.environ.get("CF_CLIENT_ID", "")
CLIENT_SECRET = os.environ.get("CF_CLIENT_SECRET", "")
SCOPE = os.environ.get("CF_SCOPE", "dataforth.web")
if not CLIENT_ID or not CLIENT_SECRET:
    sys.exit("set CF_CLIENT_ID + CF_CLIENT_SECRET (vault: clients/dataforth/api-oauth.sops.yaml)")

OUTFILE = r"C:\Users\guru\AppData\Local\Temp\server_inventory.txt"
PAGE_SIZE = 1000  # try 1000 first; smaller if server complains
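

def get_token():
    """Request an OAuth client_credentials access token."""
    data = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "scope": SCOPE,
    }).encode()
    # Supplying data makes urllib send a POST; the timeout keeps a stalled login from hanging the run.
    with urllib.request.urlopen(urllib.request.Request(TOKEN_URL, data=data), timeout=30) as r:
        return json.loads(r.read())["access_token"]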


def main():
    token = get_token()
    headers = {"Authorization": f"Bearer {token}"}
    total = 0
    page = 1
    cursor = None
    t_start = time.time()
    with open(OUTFILE, "w") as f:
        while True:
            params = {"page": page, "pageSize": PAGE_SIZE}
            if cursor:
                params["afterSerialNumber"] = cursor
            url = f"{API_BASE}/TestReportDataFiles?" + urllib.parse.urlencode(params)
            req = urllib.request.Request(url, headers=headers)
            try:
                with urllib.request.urlopen(req, timeout=30) as r:
                    obj = json.loads(r.read())
            except Exception as e:
                print(f" ERROR at page {page}: {e}")
                # Refresh token and retry once; a second failure propagates and aborts the run
                token = get_token()
                headers = {"Authorization": f"Bearer {token}"}
                with urllib.request.urlopen(urllib.request.Request(url, headers=headers), timeout=30) as r:
                    obj = json.loads(r.read())
            items = obj.get("Items", [])
            if not items:
                break
            for it in items:
                f.write(it["SerialNumber"] + "\n")
            total += len(items)
            cursor = obj.get("NextCursor")
            # Progress line on the first page and every 50th page after that
            if page % 50 == 0 or page == 1:
                rate = total / max(1, time.time() - t_start)
                print(f" page {page}: total={total} rate={rate:.0f}/s cursor={cursor!r}")
            if not cursor:
                break
            page += 1
    elapsed = time.time() - t_start
    print(f"\nDONE: {total} serials written to {OUTFILE} in {elapsed:.1f}s")


if __name__ == "__main__":
    main()
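
The pagination loop in main() can also be written as a generator that takes the HTTP call as a parameter, which lets the stop conditions (empty `Items`, missing `NextCursor`) be exercised without touching the server. The sketch below is not part of the committed script: `iter_serials`, `fake_fetch`, and the `S1`..`S5` serials are hypothetical, and the stub's cursor behavior (return the records after `afterSerialNumber`) is an assumption about how the real API responds.

```python
def iter_serials(fetch_page, page_size=1000):
    """Yield SerialNumber values until the API returns no items or no cursor.

    fetch_page(params) must return a dict shaped like the API response:
    {"Items": [{"SerialNumber": ...}, ...], "NextCursor": ... or None}.
    """
    page, cursor = 1, None
    while True:
        params = {"page": page, "pageSize": page_size}
        if cursor:
            params["afterSerialNumber"] = cursor
        obj = fetch_page(params)
        items = obj.get("Items", [])
        if not items:
            return
        for it in items:
            yield it["SerialNumber"]
        cursor = obj.get("NextCursor")
        if not cursor:
            return
        page += 1


# Stand-in for the server: five made-up serials, served by cursor only.
_FAKE_SERIALS = ["S1", "S2", "S3", "S4", "S5"]


def fake_fetch(params):
    after = params.get("afterSerialNumber")
    start = _FAKE_SERIALS.index(after) + 1 if after else 0
    chunk = _FAKE_SERIALS[start:start + params["pageSize"]]
    more = start + len(chunk) < len(_FAKE_SERIALS)
    return {
        "Items": [{"SerialNumber": s} for s in chunk],
        "NextCursor": chunk[-1] if more else None,
    }


serials = list(iter_serials(fake_fetch, page_size=2))
print(serials)
```

In the real script, `fetch_page` would wrap the `urllib.request.urlopen` call together with the refresh-and-retry handling, leaving main() to write the yielded serials to OUTFILE.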