Mike Swanson 5169936cfc Session log: IMC SQL move + DISM repair attempt, VWP RDWeb brute-force incident, Dataforth API planning

- IMC: document 716 GB SQL backup cleanup, retention scheduled task, DB move C:->S:, sysadmin grant via single-user recovery, parked RDS removal after KB5075999 apply rolled back on ETW manifest error
- Valleywide: document RDWeb brute-force incident on VWP-QBS, UDM port forward closure, 30-day audit showing no breach, lockout policy restoration
- Dataforth: capture Swagger API review and Hoffman Zoom call prep

2026-04-13 15:40:43 -07:00

Session Log: 2026-04-13 — Dataforth

Summary

Continuation of the test datasheet pipeline work. Prior session (2026-04-12) confirmed PostgreSQL migration complete; Hoffman provided the new Swagger API URL; awaiting OAuth credentials. Today: reviewed the full API spec, prepared a structured question list for a Zoom call with Hoffman, and discussed architecture options (raw file upload vs. structured record push vs. direct DB).

Also helped user triage an unrelated Neptune Exchange mail-flow issue (tsorensen → external bounce). User resolved on their own before I got into it.

Work completed

API spec review

Pulled https://www.dataforth.com/swagger/v1/swagger.json and mapped endpoints.

Base URL: https://www.dataforth.com (presumed; Swagger UI at /swagger/index.html)

Authentication (IdentityServer-style)

Flow: OAuth2 Authorization Code + PKCE
Authorization URL: https://login.dataforth.com/connect/authorize
Token URL: https://login.dataforth.com/connect/token
Scopes: openid, profile, dataforth.web
Swagger's own test client: client_id = dataforth.swagger (NOT for our use)
OIDC discovery expected at: https://login.dataforth.com/.well-known/openid-configuration
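
A minimal sketch of the client side of the Authorization Code + PKCE flow listed above, using only the Python standard library. The authorization URL and scopes are the ones from the spec; the `client_id` and `redirect_uri` values are placeholders, since the real client is still pending from Hoffman (and `dataforth.swagger` is not for our use). Nothing here contacts the server.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode

AUTHORIZE_URL = "https://login.dataforth.com/connect/authorize"
SCOPES = "openid profile dataforth.web"


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method (RFC 7636)."""
    # 64 random bytes -> 86 URL-safe characters, inside the 43-128 limit.
    verifier = secrets.token_urlsafe(64)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # Base64url without padding, as RFC 7636 requires.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorize_url(client_id: str, redirect_uri: str,
                        state: str, challenge: str) -> str:
    """Build the URL the user agent is sent to for the authorization step."""
    params = {
        "response_type": "code",
        "client_id": client_id,        # placeholder until Hoffman issues one
        "redirect_uri": redirect_uri,  # placeholder; must match the registered URI
        "scope": SCOPES,
        "state": state,
        "code_challenge": challenge,
        "code_challenge_method": "S256",
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"
```

The verifier stays on the client and is sent later with the token request. This flow needs an interactive login, which is why the question list asks for a `client_credentials` grant for the unattended AD2 uploader.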

All endpoints

Path	Method
`/api/v1/Admin/refresh-cache`	POST
`/api/v1/Admin/cache-status`	GET
`/api/v1/Categories`	GET
`/api/v1/Categories/{id}`	GET
`/api/v1/Categories/by-catalog-node/{catalogNodeId}`	GET
`/api/v1/OrderableProducts/{orderableProductId}/Attributes`	POST
`/api/v1/OrderableProducts/{orderableProductId}/Attributes/{attributeId}`	PUT / DELETE
`/api/v1/Products`, `/{id}`, `/by-part-number/{partNumber}`	GET
`/api/v1/product-series`, `/{id}`, `/by-designation/{designation}`, `/by-catalog-node/{catalogNodeId}`	GET
`/api/v1/ProductType`, `/{productTypeId}/products`	GET
`/api/v1/TestReportDataFiles`	POST (single upload)
`/api/v1/TestReportDataFiles`	GET (paginated list)
`/api/v1/TestReportDataFiles/bulk`	POST (batch upload)
`/api/v1/TestReportDataFiles/{serialNumber}`	GET / DELETE
`/api/v1/TestReportDataFiles/stats`	GET

TestReportDataFiles payload shapes

POST single: { SerialNumber: string(max 50), Content: string(min 1) } → { SerialNumber, ContentHash, Created }
POST bulk: { Items: [CreateTestReportRequest, ...] } → { TotalReceived, Created, Updated, Unchanged, Errors[] }
GET single: { SerialNumber, Content, CreatedAtUtc, UpdatedAtUtc }
GET stats: { TotalCount, LatestCreatedAtUtc, LatestUpdatedAtUtc }
Server handles dedup via ContentHash → client doesn't need to pre-check.
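
The payload shapes above could be built client-side roughly like this. It is a sketch, not the uploader: field names and length limits come from the spec as noted above, while the function names are made up for illustration and the empty-serial check is an assumption (the spec only states a maximum length).

```python
import json

MAX_SERIAL_LEN = 50  # spec: SerialNumber is string(max 50)


def make_item(serial_number: str, content: str) -> dict:
    """Build one CreateTestReportRequest-shaped dict, enforcing the spec limits."""
    if not serial_number:
        # Assumption: the spec gives no minimum length; reject empty serials anyway.
        raise ValueError("SerialNumber is empty")
    if len(serial_number) > MAX_SERIAL_LEN:
        raise ValueError(f"SerialNumber longer than {MAX_SERIAL_LEN} chars")
    if len(content) < 1:
        raise ValueError("Content must be at least 1 character")  # spec: string(min 1)
    return {"SerialNumber": serial_number, "Content": content}


def make_bulk_body(items: list) -> str:
    """Serialize the body for POST /api/v1/TestReportDataFiles/bulk."""
    return json.dumps({"Items": items})


def bulk_errors(response: dict) -> list:
    """Return the Errors[] list from a bulk response, or an empty list."""
    return list(response.get("Errors") or [])
```

Validating locally keeps obviously bad records out of a batch. How the server treats one invalid item inside a bulk request is still an open question for Hoffman.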

Architecture discussion

Three options for delivering datasheets:

A: Raw file blob via current API — works today, zero new API work, simple client code
B: Structured records via new endpoints — cleaner long-term; we already have parsed data in AD2's PostgreSQL TestDataDB (2.8M records post-2026-04-12 migration). Requires Hoffman to add endpoints
C: Direct DB access — rejected (coupling, security, DBA nightmare)

Preferred path: whichever is less work for Hoffman. Frame it as offering flexibility — we can send raw text, structured JSON, or even CSV.

Questions prepared for John Hoffman Zoom call

Produced a prioritized list (MUST / SHOULD / NICE) covering:

Batch size + payload size + rate limits (MUST)
Idempotency + dedup semantics (MUST)
Cutover plan from old DataforthWebShare path (MUST)
Request: enable client_credentials grant on a new client for the AD2 uploader (SHOULD)
Staging endpoint availability (SHOULD)
PDF handling (X:\For_Web_PDF) — same endpoint or different? (SHOULD)
Product linkage — does a TestReport need to link to a Product/Series record? (SHOULD)
Monitoring + error visibility on his side (NICE)
SLA / escalation contact (NICE)

Pending from Hoffman (as of end-of-session 2026-04-13)

OAuth credentials (he said "today")
Clarification on client_credentials grant support
Answers to the MUST questions above after the Zoom

Pipeline context (unchanged from 2026-04-12)

Current state

Stage 1: DOS test stations → D2TESTNAS (192.168.0.9, rsync daemon, module "test" → /data/test) ✓
Stage 2: NAS → AD2 via Sync-FromNAS-rsync.ps1 scheduled every 15 min ✓
Stage 3: DFWDS.exe validates + renames — config wiped in crypto attack; C:\DFWDS\DFWDS_NAMES.TXT missing. Check Haubner D: for backup.
Stage 4: Website upload — BROKEN; this is what we're rebuilding via the new API
Stage 5: PDF generation — ~4,773 PDFs in X:\For_Web_PDF, origin unclear

Data locations

Incoming: X:\Test_Datasheets (staging)
Validated: X:\For_Web (~501K files) ← uploader source
PDFs: X:\For_Web_PDF (~4.7K files)
Rejected: X:\Bad_Datasheets (~18K)
DFWDS logs: X:\Datasheets_Log
X: = \\ad2\webshare

Datasheet format

Plain text, ~50 lines. Header: Dataforth address/phone. Fields: Date, Model (e.g. SCM5B41-03), SN (e.g. 178439-1), accuracy test table, final test results. Filename: {SN}.txt (e.g. 178439-1.txt).
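
Since the filename is `{SN}.txt`, the uploader can take the serial number straight from the path. A small sketch of that mapping follows; the function name is illustrative, and the serial format is not validated because only one example is documented.

```python
from pathlib import PureWindowsPath


def serial_from_filename(path: str) -> str:
    """Map a datasheet path such as X:\\For_Web\\178439-1.txt to its serial number."""
    p = PureWindowsPath(path)  # parses Windows paths on any OS
    if p.suffix.lower() != ".txt":
        raise ValueError(f"not a datasheet text file: {path}")
    return p.stem


print(serial_from_filename(r"X:\For_Web\178439-1.txt"))  # 178439-1
```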

Credentials used/referenced

Old upload path (being replaced): DataforthWebShare / Data6277
New API: OAuth client credentials pending from Hoffman
Neptune Exchange (for today's mail triage): ACG\administrator / Gptf*77ttb## — requires VPN

Next session plan

Receive OAuth creds from Hoffman (client_id + client_secret, ideally client_credentials grant enabled)
Store credentials in D:\vault\clients\dataforth\dataforth-api-oauth.sops.yaml
Stand up a one-page POC: get token, POST one test report, verify via GET
If POC works → implement full uploader on AD2:
- Language: PowerShell (fits existing scripts) or Python (already used in projects/dataforth-dos/datasheet-pipeline/implementation/)
- State tracking: local manifest (serial → hash + last-upload-time) or use server's ContentHash response
- Use /bulk endpoint in batches (size TBD with Hoffman)
- Scheduled task on AD2, 15-min or hourly cadence
- Initial backfill script for 501K files — run off-hours
Parallel-run with old webshare path until confident, then retire old path
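
A rough sketch of the local-manifest option and the batching step from the plan above. Assumptions: the manifest is a JSON file mapping serial to a locally computed SHA-256 of the content (the algorithm behind the server's `ContentHash` is not known, so the two hashes are not compared), and the batch size is a parameter because the real limit is still to be agreed with Hoffman. All function names are illustrative.

```python
import hashlib
import json
from pathlib import Path
from typing import Iterator


def local_hash(content: str) -> str:
    """Local change-detection hash; not assumed to match the server's ContentHash."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def load_manifest(path: Path) -> dict[str, str]:
    """Read the serial -> hash manifest, or start empty on the first run."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_manifest(path: Path, manifest: dict[str, str]) -> None:
    path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")


def pending_items(sheets: dict[str, str], manifest: dict[str, str]) -> list[dict]:
    """Return payload items for sheets that are new or changed since the last upload.

    `sheets` maps serial number -> file content.
    """
    return [
        {"SerialNumber": serial, "Content": content}
        for serial, content in sorted(sheets.items())
        if manifest.get(serial) != local_hash(content)
    ]


def batched(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items for the /bulk endpoint."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

The manifest would be updated only after a batch is accepted, so a failed run retries the same files. If the server-side dedup proves reliable, the manifest becomes an optimization that avoids re-sending 501K unchanged files rather than a correctness requirement.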

Reference URLs

Swagger UI: https://www.dataforth.com/swagger/index.html
Swagger JSON: https://www.dataforth.com/swagger/v1/swagger.json
Authorization URL: https://login.dataforth.com/connect/authorize
Token URL: https://login.dataforth.com/connect/token
Expected OIDC discovery: https://login.dataforth.com/.well-known/openid-configuration
