sync: auto-sync from DESKTOP-0O8A1RL at 2026-05-13 13:36:15

Author: Mike Swanson Machine: DESKTOP-0O8A1RL Timestamp: 2026-05-13 13:36:15
2026-05-13 13:36:16 -07:00
parent 777b679bac
commit 22bbe676b1
2 changed files with 268 additions and 1 deletions
--- a/clients/grabb-durando/ai-demand-review/GND-technical-limitations.txt
+++ b/clients/grabb-durando/ai-demand-review/GND-technical-limitations.txt
@@ -0,0 +1,267 @@
 GRABB & DURANDO — AI DEMAND LETTER PROJECT
 Technical Limitations of a Fully Automated Approach
 ================================================================================
 Prepared by: Arizona Computer Guru LLC
 Date: 2026-05-13
 ================================================================================
 OVERVIEW
 ================================================================================
 AI can read documents, extract facts, and produce polished legal prose. What it
 cannot yet do is all of those things reliably on a real case folder without
 careful engineering and structured human checkpoints. This document explains
 the specific technical constraints that drive the timeline, and why skipping
 the intermediate steps creates risk rather than saving time.
 The goal is not to slow things down. It is to get to a production-ready tool
 that you can trust on every case, not just the clean ones.
 LIMITATION 1 — SCANNED DOCUMENTS
 ================================================================================
 The problem:
  Roughly half the documents in a typical G&D case folder are image-only
  scans: they contain pictures of text, not actual text. Without an OCR or
  image-reading step, the AI receives no usable text from these pages.
  The documents consistently stored as image-only scans include:
    - Police reports (TPD, PCSD, DPS)
    - Intake forms / client questionnaires
    - Handwritten notes
    - Older medical records and some hospital discharge summaries
    - Many insurance correspondence letters
  Text-readable documents (where the AI can work reliably) include:
    - Jeff's Notes (always a clean PDF export)
    - The Eval Sheet (Excel file — fully readable)
    - Most records-and-bills PDFs from larger providers
    - Electronic EOBs and adjuster correspondence
 What this means for the app:
  An AI that reads a case folder and silently skips a scanned police report
  will produce a fact sheet with a missing liability narrative and no A.R.S.
  citation. It will look complete. It won't be.
  The Phase 1 fact sheet solves this explicitly: it lists every document it
  could not read by name. Staff sees "TPD Report — UNREADABLE (image scan)"
  and knows to manually pull the liability facts before the demand is
  generated. The AI is not guessing; it is being honest about its limits.
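  A minimal sketch of that unreadable-document check, in Python. It assumes
  an upstream step has already attempted text extraction and hands over the
  text found on each page. The function names, the 20-character threshold,
  and the status wording are illustrative assumptions, not the app's actual
  implementation.

```python
from typing import Dict, List

# Assumption: a page with fewer than this many non-whitespace characters of
# extracted text is treated as an image-only page.
MIN_CHARS_PER_PAGE = 20


def is_image_only(page_texts: List[str]) -> bool:
    """Return True when no page yields a meaningful amount of extracted text."""
    return all(
        len("".join(text.split())) < MIN_CHARS_PER_PAGE for text in page_texts
    )


def unreadable_report(folder: Dict[str, List[str]]) -> List[str]:
    """List every document the reader could not extract text from, by name."""
    return [
        f"{name}: UNREADABLE (image scan)"
        for name, pages in sorted(folder.items())
        if is_image_only(pages)
    ]


# Hypothetical case folder: document name -> extracted text per page.
case_folder = {
    "TPD Report": ["", " "],
    "Jeff's Notes": ["Client was rear-ended at a red light on Speedway."],
}
for line in unreadable_report(case_folder):
    print(line)
```

  A document with a mix of text pages and scanned pages would pass this
  check, so a stricter version could report unreadable pages individually.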
  OCR preprocessing (running a text-conversion pass on scanned PDFs before
  sending them to the AI) can recover most of this text, but OCR accuracy on
  low-quality scans, handwriting, and two-column police report layouts is
  imperfect, and a preprocessing step takes time to build and tune.
  Bottom line: the fact sheet review step is not bureaucratic overhead.
  It is the mechanism that catches the gap between what the AI read and
  what is actually in the folder.
 LIMITATION 2 — ATTORNEY JUDGMENT RULES ARE NOT WRITTEN DOWN ANYWHERE
 ================================================================================
 The problem:
  When you write a demand letter, you make dozens of small judgment calls
  that no document in the case folder records:
    - Whether to mention the unverified DUI suspicion in the police report
    - Whether to include the injury the client reported at intake but that
      no treating provider documented
    - Whether to invoke the Eggshell Plaintiff doctrine given a pre-existing
      condition and the severity of this accident
    - Whether this case warrants a full-narrative letter or a short two-pager
    - Which medical charges to gross-bill vs. which to note as reduced
    - Whether to include the pregnancy that appears in one record but that
      you chose not to use
  The AI has none of this. Out of the box, it will include everything it
  finds, or exclude everything uncertain, or do something inconsistent
  between cases. None of those defaults match how you actually practice.
 What this means for the app:
  These rules have to be extracted from you, beginning in Phase 1 and
  Phase 2 and continuing through Phase 3, and encoded into the prompt as
  explicit instructions. This cannot be done without running real cases and
  comparing AI output to what you would have actually written. It is a
  calibration process, not a configuration process.
  Every prompt rule we add (e.g., "if an injury appears in intake notes but
  not in any medical record, flag it and exclude it from the demand") has to
  be tested against multiple cases to confirm it does not create an
  unintended side effect on a different case type.
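  The example rule above can be written down as a deterministic check, so
  the expected outcome is stated once and re-run against several case types
  whenever the prompt changes. This is a sketch only: the rule itself lives
  in the prompt, and the class, field names, and test cases below are
  invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class CaseFacts:
    """Hypothetical per-case extraction result (field names are assumptions)."""
    intake_injuries: Set[str]
    documented_injuries: Set[str]  # injuries found in any medical record


def apply_intake_only_rule(case: CaseFacts) -> Tuple[List[str], List[str]]:
    """Split injuries into (include_in_demand, flagged_and_excluded).

    Mirrors the example rule: an injury that appears in intake notes but in
    no medical record is flagged for review and left out of the demand.
    """
    include = sorted(case.documented_injuries)
    flagged = sorted(case.intake_injuries - case.documented_injuries)
    return include, flagged


# Invented test cases: each new rule is re-run against several case types.
regression_cases = {
    "soft-tissue": CaseFacts({"neck strain", "headaches"}, {"neck strain"}),
    "fully documented": CaseFacts({"wrist fracture"}, {"wrist fracture"}),
    "no intake form": CaseFacts(set(), {"concussion"}),
}
for label, facts in regression_cases.items():
    include, flagged = apply_intake_only_rule(facts)
    print(f"{label}: include={include} flagged={flagged}")
```

  Comparing the model's injury list for each test case against the
  `include` list is one way to notice when a new prompt rule changes
  behavior on a case type it was not written for.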
  Bottom line: the AI does not know your practice. You have to teach it,
  and teaching it requires reviewing real output on real cases. Phase 3
  (live refinement) exists entirely for this reason.
 LIMITATION 3 — ACCURACY RISK IS DIFFERENT IN LEGAL DOCUMENTS
 ================================================================================
 The problem:
  In most applications, a tool that is 90% accurate is useful. In a demand
  letter, 90% accuracy is a liability.
  Specific failure modes the AI is prone to:
  Hallucination — The AI may state a fact with confidence that is not
    supported by any document. Example: it may produce a specials table
    showing $18,400 from Desert Radiology when the actual billing is
    $14,200, because it interpolated between two ambiguous line items.
    The letter goes to the adjuster. The adjuster pulls the records and
    finds the discrepancy. Your credibility on that claim is now reduced.
  Date errors — Medical chronology depends on getting every date right.
    A treatment date transposed by one digit changes the narrative. The AI
    can make these errors more often than a careful paralegal would.
  Attribution errors — In cases with multiple providers and multiple
    accident-related conditions, the AI can assign the wrong diagnosis to
    the wrong provider or the wrong date of service.
  Document conflict — If Jeff's Notes says liability is clear and the police
    report narrative (once OCR'd) says contributory negligence was noted, the
    AI may silently favor one over the other without flagging the conflict.
 What this means for the app:
  The fact sheet review step is the error-catching checkpoint. The AI
  outputs a structured table of every fact it extracted — dates, providers,
  amounts, liability narrative, injury list — and a staff member verifies it
  against the source documents before a single word of the demand is
  generated.
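  One way to picture the fact table is as rows that each carry their source
  document, which also makes document conflicts detectable instead of
  silent. The sketch below assumes a simple row schema; the class, field
  names, and sample rows are invented, and the real fact sheet format may
  differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Fact:
    """One extracted fact plus the document it came from (assumed schema)."""
    name: str    # e.g. "liability" or "date of loss"
    value: str
    source: str  # document name


def find_conflicts(facts: List[Fact]) -> Dict[str, List[Fact]]:
    """Return fact names for which different sources report different values."""
    by_name: Dict[str, List[Fact]] = defaultdict(list)
    for fact in facts:
        by_name[fact.name].append(fact)
    return {
        name: rows
        for name, rows in by_name.items()
        if len({row.value for row in rows}) > 1
    }


# Invented rows for illustration only.
fact_sheet = [
    Fact("liability", "clear", "Jeff's Notes"),
    Fact("liability", "contributory negligence noted", "TPD Report (OCR)"),
    Fact("date of loss", "2025-03-04", "Jeff's Notes"),
    Fact("date of loss", "2025-03-04", "TPD Report (OCR)"),
]
for name, rows in find_conflicts(fact_sheet).items():
    print(f"CONFLICT in {name}:")
    for row in rows:
        print(f"  {row.source}: {row.value}")
```

  Exact string comparison is deliberately strict here. It will over-report
  differences in wording, which suits a step whose output goes to a human
  reviewer.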
  This is not a step that can be removed to save time. Removing it means
  the errors go directly into the demand letter.
  Bottom line: the tool is only as reliable as the review step that precedes
  it. A faster tool with no review step is not faster — it is slower, because
  you are now editing a letter for accuracy instead of reviewing a fact table.
 LIMITATION 4 — OUTPUT FORMAT REQUIRES CALIBRATION
 ================================================================================
 The problem:
  A demand letter is not just accurate facts assembled into paragraphs. It
  has a specific structure, tone, citation style, and internal logic that are
  particular to this firm and to the way Robert Grabb practices.
  The AI out of the box will produce a demand letter. It will not produce a
  G&D demand letter. Specific gaps:
    - The firm uses specific A.R.S. section citations for traffic violations.
      The AI may use the correct statute or may use a related but wrong one.
    - The two-style distinction (full-narrative vs. short-form) is not obvious
      to the AI. Without explicit rules, it will default to one style
      regardless of case type.
    - The tone and framing of the liability section in the Nichols-style
      full-narrative letter differ from those of the Swailieh/Ortega short
      form. This is a learned skill, not a template.
    - Specials presentation format — gross billed vs. reduced, how liens are
      noted, whether MedPay is broken out — varies by case and by audience
      (BI adjuster vs. UIM adjuster).
 What this means for the app:
  Phase 2 includes a comparison step: run the AI on 10-15 closed cases where
  we have the actual demand letter that was sent. Compare the AI output to
  the real letter. Identify every formatting, tone, and structural deviation.
  Adjust the prompt. Repeat.
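  Part of that comparison can be automated as a first pass before a human
  reads both letters. The sketch below uses Python's standard `difflib`
  module and assumes section headings are all-caps lines. That convention,
  the function names, and the sample letters are assumptions made for
  illustration.

```python
import difflib
from typing import List


def section_headings(letter: str) -> List[str]:
    """Treat all-caps lines as section headings (an assumed convention)."""
    lines = (line.strip() for line in letter.splitlines())
    return [line for line in lines if line.isupper()]


def compare_letters(ai_letter: str, sent_letter: str) -> dict:
    """Summarize structural and textual deviation from the letter sent."""
    ai_heads = section_headings(ai_letter)
    sent_heads = section_headings(sent_letter)
    similarity = difflib.SequenceMatcher(
        None, ai_letter.split(), sent_letter.split()
    ).ratio()
    return {
        "missing_sections": [h for h in sent_heads if h not in ai_heads],
        "extra_sections": [h for h in ai_heads if h not in sent_heads],
        "word_similarity": round(similarity, 2),
    }


# Invented, heavily shortened letters for illustration.
sent = (
    "FACTS\nOur client was rear-ended.\n"
    "LIABILITY\nYour insured failed to stop.\n"
    "SPECIALS\nTotal billed: $14,200"
)
draft = "FACTS\nOur client was rear-ended.\nDAMAGES\nTotal billed: $14,200"
print(compare_letters(draft, sent))
```

  A score like this only points a reviewer at the largest deviations. Tone
  and framing still need a person comparing the two letters side by side.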
  This is the only reliable way to calibrate the output format. There is no
  shortcut. The good news: once calibrated against those 10-15 cases, the
  format holds. It does not need to be redone for every new case.
  Bottom line: format calibration against real closed-case letters is a
  one-time cost that cannot be skipped. It is built into the Phase 2 timeline.
 LIMITATION 5 — THE DOCUMENT VOLUME PROBLEM
 ================================================================================
 The problem:
  A typical G&D case folder contains 20-60+ individual files, most of them
  PDFs. Sending all of them to the AI in a single request is technically
  possible for smaller cases, but creates problems at scale:
    - Cost: Each page sent to the AI costs money. A large case with 400
      pages of records and bills can cost $2-5 per run at current API rates.
      Running the tool on 161 archived cases for calibration is not free.
    - Context limits: The AI has a maximum amount of text it can process in
      one request. A very large case (hospitalization, long treatment course,
      multiple providers) may exceed the practical limit, causing the AI to
      cut off documents or lose accuracy on earlier content.
    - Relevance: Not every document in the folder is relevant to the demand.
      The AI benefits from document triage — knowing which files to prioritize
      (Jeff's Notes, Eval Sheet, R&B files) vs. which to skim (adjuster
      correspondence, PD settlement docs).
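  The cost and context figures above can be turned into a quick estimate.
  The sketch below uses only the numbers stated in this document ($2-5 per
  run for a large case, 161 archived cases). The tokens-per-page figure and
  token budget in the context check are placeholder assumptions, not
  measured values or any provider's published limit.

```python
from typing import Tuple

# Figures taken from the text above.
COST_PER_LARGE_RUN_USD = (2.0, 5.0)  # low and high estimate per run
ARCHIVED_CASES = 161


def batch_cost_range(cases: int, runs_per_case: int = 1) -> Tuple[float, float]:
    """Upper-bound cost range if every case were as large as a 400-page case."""
    low, high = COST_PER_LARGE_RUN_USD
    return cases * runs_per_case * low, cases * runs_per_case * high


def exceeds_budget(page_count: int, tokens_per_page: int, token_budget: int) -> bool:
    """Rough context check; both token figures are caller-supplied assumptions."""
    return page_count * tokens_per_page > token_budget


low, high = batch_cost_range(ARCHIVED_CASES)
print(f"One pass over {ARCHIVED_CASES} cases: ${low:,.0f} to ${high:,.0f}")

# Placeholder values: 500 tokens per page and a 150,000-token working budget.
print("400-page case over budget:", exceeds_budget(400, 500, 150_000))
```

  Under these figures a single pass over all 161 archived cases comes to at
  most $322 to $805, and each prompt revision that requires a full re-run
  adds the same amount again. Most cases are smaller than 400 pages, so the
  real total would be lower.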
 What this means for the app:
  The app needs document routing logic: a pre-processing step that identifies
  document types by folder name and filename pattern, assigns priority, and
  sends documents to the AI in a structured order rather than a raw file
  dump. This logic is straightforward to build for a well-organized case
  folder (G&D's naming conventions help significantly) but it is a distinct
  engineering task that adds to Phase 1 scope.
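  A rough sketch of what such routing logic could look like. The filename
  patterns, priority numbers, and sample filenames are hypothetical
  stand-ins. G&D's real naming conventions would replace them.

```python
import re
from typing import List, Tuple

# Hypothetical patterns and priorities (1 = send first). First match wins.
ROUTING_RULES: List[Tuple[str, str, int]] = [
    (r"jeff'?s?[ _-]*notes", "Jeff's Notes", 1),
    (r"eval[ _-]*sheet", "Eval Sheet", 1),
    (r"R&B|records[ _-]*and[ _-]*bills", "Records and bills", 2),
    (r"police|TPD|PCSD|DPS", "Police report", 2),
    (r"adjuster|correspondence", "Adjuster correspondence", 3),
    (r"PD[ _-]*settlement", "PD settlement", 3),
]
UNKNOWN = ("Unclassified", 4)


def classify(filename: str) -> Tuple[str, int]:
    """Return (document type, priority) for the first matching pattern."""
    for pattern, doc_type, priority in ROUTING_RULES:
        if re.search(pattern, filename, flags=re.IGNORECASE):
            return doc_type, priority
    return UNKNOWN


def route(filenames: List[str]) -> List[Tuple[int, str, str]]:
    """Order files by priority, then by name, instead of a raw file dump."""
    rows = []
    for name in filenames:
        doc_type, priority = classify(name)
        rows.append((priority, doc_type, name))
    return sorted(rows, key=lambda row: (row[0], row[2]))


# Invented filenames for illustration.
folder = [
    "Adjuster letter 2.pdf",
    "Hospital R&B.pdf",
    "Jeff's Notes.pdf",
    "misc scan 004.pdf",
    "Eval Sheet.xlsx",
    "TPD Report.pdf",
]
for priority, doc_type, name in route(folder):
    print(priority, doc_type, name)
```

  Files that match no pattern are kept and sorted last rather than dropped,
  so an unexpected filename still reaches a reviewer.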
  Bottom line: document routing is not optional for production reliability.
  It is built into Phase 1 as part of the folder-reading logic.
 WHAT THE TIMELINE REFLECTS
 ================================================================================
 Each phase exists because of the limitations above, not because of artificial
 caution:
  Phase 1 (3-4 weeks) — Build the document reader with OCR awareness and
    document routing. Run it on 5-10 closed test cases. Produce structured
    fact sheets. Identify every gap between what the AI reads and what a
    paralegal would have caught. This phase answers the question: "Can the
    AI reliably read G&D case folders?"
  Phase 2 (3-4 weeks) — Build the demand generator. Run it against 10-15
    closed cases where we have the real demand letters. Calibrate the output
    format, style trigger rules, and specials presentation. This phase
    answers the question: "Does the AI produce output that matches how G&D
    practices?"
  Phase 3 (2-3 weeks) — Run on real active cases. Attorney reviews each
    output. Tune the omission rules and judgment call encoding based on
    actual attorney feedback. This phase answers the question: "Is the
    output reliable enough to send to an adjuster after a staff review?"
  Total: 8-11 weeks to a production-ready demand letter tool.
  What can be done sooner: Phase 1 produces something reviewable within
  3-4 weeks. The fact sheet output is the first concrete artifact — you will
  see exactly what the AI is reading out of a real case folder, and the
  staff review process will begin then. The demand letter comes after.
 WHAT AN "ALL-AI" APPROACH WITHOUT THESE PHASES PRODUCES
 ================================================================================
 A tool built without Phase 1 validation and Phase 2 calibration will:
  - Silently skip scanned police reports and produce demand letters with
    no liability narrative or the wrong A.R.S. citation
  - Include unverified injuries and unconfirmed allegations because it has
    no omission rules
  - Produce specials tables with dollar amounts that do not match the source
    records — not by large margins, but enough to matter in negotiation
  - Use a generic letter format that does not match the firm's established
    style and structure
  - Produce output that requires more attorney time to correct than a
    paralegal-drafted letter would have required in the first place
 The phased timeline exists to prevent exactly that outcome. The first month
 of work is not delay — it is the foundation that makes everything after it
 reliable.
 ================================================================================
 Arizona Computer Guru LLC | Grabb & Durando Project | 2026-05-13
 ================================================================================
--- a/projects/msp-tools/guru-rmm
+++ b/projects/msp-tools/guru-rmm