Session Log: 2026-04-13 — Dataforth
Summary
Continuation of the test datasheet pipeline work. Prior session (2026-04-12) confirmed PostgreSQL migration complete; Hoffman provided the new Swagger API URL; awaiting OAuth credentials. Today: reviewed the full API spec, prepared a structured question list for a Zoom call with Hoffman, and discussed architecture options (raw file upload vs. structured record push vs. direct DB).
Also helped user triage an unrelated Neptune Exchange mail-flow issue (tsorensen → external bounce). User resolved on their own before I got into it.
Work completed
API spec review
Pulled https://www.dataforth.com/swagger/v1/swagger.json and mapped endpoints.
Base URL: https://www.dataforth.com (presumed; Swagger UI at /swagger/index.html)
Authentication (IdentityServer-style)
- Flow: OAuth2 Authorization Code + PKCE
- Authorization URL: https://login.dataforth.com/connect/authorize
- Token URL: https://login.dataforth.com/connect/token
- Scopes: openid, profile, dataforth.web
- Swagger's own test client: client_id = dataforth.swagger (NOT for our use)
- OIDC discovery expected at: https://login.dataforth.com/.well-known/openid-configuration
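For reference, a minimal sketch of the client side of the Authorization Code + PKCE flow listed above, using only the Python standard library. It assumes the server accepts the S256 challenge method (typical for IdentityServer, but not confirmed in the spec review). The client_id and redirect_uri in the usage note are hypothetical placeholders, since our real client has not been issued yet.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode

AUTHORIZE_URL = "https://login.dataforth.com/connect/authorize"
SCOPES = "openid profile dataforth.web"  # scopes from the Swagger spec, space-separated


def make_pkce_pair():
    """Return (code_verifier, code_challenge) for the S256 method (RFC 7636)."""
    # 64 random bytes -> 86 URL-safe characters, inside the 43-128 allowed range.
    verifier = secrets.token_urlsafe(64)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # Challenge is base64url(sha256(verifier)) with the '=' padding removed.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorize_url(client_id, redirect_uri, code_challenge, state):
    """Build the browser URL that starts the authorization code flow."""
    params = {
        "response_type": "code",
        "client_id": client_id,        # hypothetical until Hoffman issues one
        "redirect_uri": redirect_uri,  # hypothetical; must match the registered URI
        "scope": SCOPES,
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",  # assumption: server accepts S256
    }
    return AUTHORIZE_URL + "?" + urlencode(params)
```

Usage would be along the lines of `verifier, challenge = make_pkce_pair()` followed by `build_authorize_url("example-client", "http://127.0.0.1:8400/callback", challenge, secrets.token_urlsafe(16))`. The verifier is kept locally and sent later with the code exchange at the token URL. This flow needs an interactive browser login, which is the reason for asking Hoffman about client_credentials for the unattended AD2 uploader.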
All endpoints
| Path | Method |
|---|---|
| /api/v1/Admin/refresh-cache | POST |
| /api/v1/Admin/cache-status | GET |
| /api/v1/Categories | GET |
| /api/v1/Categories/{id} | GET |
| /api/v1/Categories/by-catalog-node/{catalogNodeId} | GET |
| /api/v1/OrderableProducts/{orderableProductId}/Attributes | POST |
| /api/v1/OrderableProducts/{orderableProductId}/Attributes/{attributeId} | PUT/DELETE |
| /api/v1/Products, /{id}, /by-part-number/{partNumber} | GET |
| /api/v1/product-series, /{id}, /by-designation/{designation}, /by-catalog-node/{catalogNodeId} | GET |
| /api/v1/ProductType, /{productTypeId}/products | GET |
| /api/v1/TestReportDataFiles | POST (single upload) |
| /api/v1/TestReportDataFiles | GET (paginated list) |
| /api/v1/TestReportDataFiles/bulk | POST (batch upload) |
| /api/v1/TestReportDataFiles/{serialNumber} | GET / DELETE |
| /api/v1/TestReportDataFiles/stats | GET |
TestReportDataFiles payload shapes
- POST single: { SerialNumber: string(max 50), Content: string(min 1) } → { SerialNumber, ContentHash, Created }
- POST bulk: { Items: [CreateTestReportRequest, ...] } → { TotalReceived, Created, Updated, Unchanged, Errors[] }
- GET single: { SerialNumber, Content, CreatedAtUtc, UpdatedAtUtc }
- GET stats: { TotalCount, LatestCreatedAtUtc, LatestUpdatedAtUtc }
- Server handles dedup via ContentHash → client doesn't need to pre-check.
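A small sketch of building the single and bulk request bodies with the length limits from the spec enforced client-side. Two assumptions: CreateTestReportRequest has the same two fields as the single POST body, and the JSON property names use the casing shown in these notes (the wire format may turn out to be camelCase). The function names are ours, not part of the API.

```python
MAX_SERIAL_LEN = 50  # from the spec: SerialNumber string(max 50)


def make_single_payload(serial_number, content):
    """Body for POST /api/v1/TestReportDataFiles (single upload)."""
    if not serial_number or len(serial_number) > MAX_SERIAL_LEN:
        raise ValueError(f"SerialNumber must be 1-{MAX_SERIAL_LEN} chars: {serial_number!r}")
    if len(content) < 1:
        # Spec says Content is string(min 1); reject empty files before sending.
        raise ValueError(f"Content is empty for serial {serial_number!r}")
    return {"SerialNumber": serial_number, "Content": content}


def make_bulk_payload(reports):
    """Body for POST /api/v1/TestReportDataFiles/bulk.

    `reports` is an iterable of (serial_number, content) pairs.
    Assumption: each item has the same shape as the single-upload body.
    """
    return {"Items": [make_single_payload(s, c) for s, c in reports]}
```

Validating locally keeps one bad file from failing a whole batch request. Whether the server rejects the entire batch or reports per-item failures in Errors[] is one of the things to confirm on the call.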
Architecture discussion
Three options for delivering datasheets:
- A: Raw file blob via current API. Works today, zero new API work, simple client code
- B: Structured records via new endpoints. Cleaner long-term; we already have parsed data in AD2's PostgreSQL TestDataDB (2.8M records post-2026-04-12 migration). Requires Hoffman to add endpoints
- C: Direct DB access. Rejected (coupling, security, DBA nightmare)
Preferred path: whichever is less work for Hoffman. Frame it as offering flexibility — we can send raw text, structured JSON, or even CSV.
Questions prepared for John Hoffman Zoom call
Produced a prioritized list (MUST / SHOULD / NICE) covering:
- Batch size + payload size + rate limits (MUST)
- Idempotency + dedup semantics (MUST)
- Cutover plan from old DataforthWebShare path (MUST)
- Request: enable client_credentials grant on a new client for the AD2 uploader (SHOULD)
- Staging endpoint availability (SHOULD)
- PDF handling (X:\For_Web_PDF): same endpoint or different? (SHOULD)
- Product linkage: does a TestReport need to link to a Product/Series record? (SHOULD)
- Monitoring + error visibility on his side (NICE)
- SLA / escalation contact (NICE)
Pending from Hoffman (as of end-of-session 2026-04-13)
- OAuth credentials (he said "today")
- Clarification on client_credentials grant support
- Answers to the MUST questions above after the Zoom
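If Hoffman does enable client_credentials, the token call and a first upload/verify round trip could look roughly like the sketch below. It only builds the HTTP requests; nothing is sent until `send_json` is called. Assumptions, none confirmed yet: the token endpoint accepts HTTP Basic client authentication, the dataforth.web scope is valid for this grant, and the API takes a standard Bearer token. All function names are ours.

```python
import base64
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

TOKEN_URL = "https://login.dataforth.com/connect/token"
API_BASE = "https://www.dataforth.com"  # presumed base URL from the spec review


def build_token_request(client_id, client_secret, scope="dataforth.web"):
    """Token request for the client_credentials grant (assumes Basic client auth)."""
    body = urlencode({"grant_type": "client_credentials", "scope": scope}).encode("ascii")
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {basic}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return Request(TOKEN_URL, data=body, headers=headers, method="POST")


def build_upload_request(access_token, serial_number, content):
    """POST one test report to /api/v1/TestReportDataFiles."""
    body = json.dumps({"SerialNumber": serial_number, "Content": content}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    return Request(f"{API_BASE}/api/v1/TestReportDataFiles", data=body, headers=headers, method="POST")


def build_verify_request(access_token, serial_number):
    """GET the report back by serial number to confirm it was stored."""
    url = f"{API_BASE}/api/v1/TestReportDataFiles/{quote(serial_number, safe='')}"
    return Request(url, headers={"Authorization": f"Bearer {access_token}"}, method="GET")


def send_json(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

The intended order is: `send_json(build_token_request(...))["access_token"]`, then the upload request, then the verify request, comparing the returned Content with what was sent. Keeping request construction separate from sending makes the pieces easy to inspect before real credentials arrive.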
Pipeline context (unchanged from 2026-04-12)
Current state
- Stage 1: DOS test stations → D2TESTNAS (192.168.0.9, rsync daemon, module "test" → /data/test) ✓
- Stage 2: NAS → AD2 via Sync-FromNAS-rsync.ps1 scheduled every 15 min ✓
- Stage 3: DFWDS.exe validates + renames. Config wiped in crypto attack; C:\DFWDS\DFWDS_NAMES.TXT missing. Check Haubner D: for backup.
- Stage 4: Website upload. BROKEN; this is what we're rebuilding via the new API
- Stage 5: PDF generation. ~4,773 PDFs in X:\For_Web_PDF, origin unclear
Data locations
- Incoming: X:\Test_Datasheets (staging)
- Validated: X:\For_Web (~501K files) ← uploader source
- PDFs: X:\For_Web_PDF (~4.7K files)
- Rejected: X:\Bad_Datasheets (~18K)
- DFWDS logs: X:\Datasheets_Log

X: = \\ad2\webshare
Datasheet format
Plain text, ~50 lines. Header: Dataforth address/phone. Fields: Date, Model (e.g. SCM5B41-03), SN (e.g. 178439-1), accuracy test table, final test results. Filename: {SN}.txt (e.g. 178439-1.txt).
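Since the filename carries the serial number, the uploader can derive SerialNumber from the file name and send the file body as Content. A minimal sketch, assuming every file in the validated folder follows the {SN}.txt convention. The latin-1 encoding is a placeholder that never fails to decode; the real encoding of the DOS-generated files still needs to be checked.

```python
from pathlib import Path


def serial_from_path(path):
    """Derive the serial number from a datasheet file name, e.g. 178439-1.txt -> 178439-1."""
    p = Path(path)
    if p.suffix.lower() != ".txt":
        raise ValueError(f"not a datasheet file: {p.name}")
    return p.stem


def iter_datasheets(folder, encoding="latin-1"):
    """Yield (serial_number, content) for each .txt/.TXT file in `folder`, sorted by path.

    Assumption: `encoding` is a stand-in until the real file encoding is confirmed.
    """
    for p in sorted(Path(folder).iterdir()):
        # Compare the suffix case-insensitively; DOS-era files may be upper-case.
        if p.is_file() and p.suffix.lower() == ".txt":
            yield serial_from_path(p), p.read_text(encoding=encoding)
```

On AD2 this would be pointed at X:\For_Web, e.g. `for serial, content in iter_datasheets(r"X:\For_Web"): ...`. With ~501K files in one directory a generator avoids holding every file body in memory, although the sorted directory listing itself is still built up front.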
Credentials used/referenced
- Old upload path (being replaced): DataforthWebShare / Data6277
- New API: OAuth client credentials pending from Hoffman
- Neptune Exchange (for today's mail triage): ACG\administrator / Gptf*77ttb## (requires VPN)
Next session plan
- Receive OAuth creds from Hoffman (client_id + client_secret, ideally client_credentials grant enabled)
- Store credentials in D:\vault\clients\dataforth\dataforth-api-oauth.sops.yaml
- Stand up a one-page POC: get token, POST one test report, verify via GET
- If POC works → implement full uploader on AD2:
  - Language: PowerShell (fits existing scripts) or Python (already used in projects/dataforth-dos/datasheet-pipeline/implementation/)
  - State tracking: local manifest (serial → hash + last-upload-time) or use server's ContentHash response
  - Use /bulk endpoint in batches (size TBD with Hoffman)
  - Scheduled task on AD2, 15-min or hourly cadence
  - Initial backfill script for 501K files, run off-hours
- Parallel-run with old webshare path until confident, then retire old path
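A rough sketch of the local-manifest option for state tracking, combined with batching for the /bulk endpoint. The manifest layout (serial → hash + upload time in a JSON file) and the use of SHA-256 are our own assumptions. The server's ContentHash algorithm is unknown, so the local hash is only compared against itself and never against the server's value. The batch size is a parameter because the real limit is still TBD with Hoffman.

```python
import hashlib
import json
from pathlib import Path


def content_hash(content):
    """Local change-detection hash (SHA-256; not assumed to match the server's ContentHash)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def load_manifest(path):
    """Load the serial -> {hash, uploaded} map, or start empty on first run."""
    p = Path(path)
    return json.loads(p.read_text(encoding="utf-8")) if p.exists() else {}


def save_manifest(path, manifest):
    Path(path).write_text(json.dumps(manifest, indent=2), encoding="utf-8")


def pending_reports(reports, manifest):
    """Yield (serial, content) pairs that are new or changed since the last upload."""
    for serial, content in reports:
        entry = manifest.get(serial)
        if entry is None or entry["hash"] != content_hash(content):
            yield serial, content


def chunked(items, size):
    """Group an iterable into lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def mark_uploaded(manifest, batch, uploaded_at):
    """Record a successfully uploaded batch; call only after the server accepts it."""
    for serial, content in batch:
        manifest[serial] = {"hash": content_hash(content), "uploaded": uploaded_at}
```

Each scheduled run would load the manifest, walk `chunked(pending_reports(...), size)`, POST each batch, call `mark_uploaded` on success, and save the manifest. Since the server already dedups via ContentHash, the manifest mainly saves bandwidth and request count. The initial 501K-file backfill can use the same loop with an empty manifest.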
Reference URLs
- Swagger UI: https://www.dataforth.com/swagger/index.html
- Swagger JSON: https://www.dataforth.com/swagger/v1/swagger.json
- Authorization URL: https://login.dataforth.com/connect/authorize
- Token URL: https://login.dataforth.com/connect/token
- Expected OIDC discovery: https://login.dataforth.com/.well-known/openid-configuration