feat(human-flow): add elevate (polish & redesign) heuristics layer

New `elevate` mode that goes beyond friction to make a UI top-notch and flags when to redesign rather than patch. references/polish-and-redesign.md holds 12 heuristics (hierarchy, signature moment, action gravity, narrative, lonely states, density, rhythm, type, tokens, depth/finish, motion, redesign triggers) synthesized from three independent model passes (Claude + Gemini + Grok). Adds an Elevation Index (0-10), a Redesign Urgency score (>=4 leads with a Structural Audit), and Opportunity-ranked Quick Wins / Elevations / Redesign Candidates tiers. SKILL.md: command + mode section + extend note. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 17:43:18 -07:00
parent 47496ac432
commit 2402566782
2 changed files with 164 additions and 0 deletions
--- a/.claude/skills/human-flow/SKILL.md
+++ b/.claude/skills/human-flow/SKILL.md
@@ -40,6 +40,7 @@ Run via natural language ("human-flow scan the sessions table", "run human-flow
 |---------------------|-------------|
 | `scan [target]`     | AST-powered scan of files/directories for workflow friction. Produces a 0-10 Friction Index report. |
 | `audit [target]`    | Deeper pass: combines AST analysis, component review, and state-flow audit. |
+| `elevate [target]`  | **Polish & redesign pass.** Goes beyond friction to make a UI top-notch: information hierarchy, signature moment, action gravity, lonely states, density, rhythm, type, tokens, depth/finish, motion — and flags when a screen should be **redesigned, not patched**. Produces an Elevation Index + prioritized tiers (Quick Wins / Elevations / Redesign Candidates). Add `--redesign` to emphasize structural restructuring. See `references/polish-and-redesign.md`. |
 | `fix [target]`      | **DISABLED (advisory only for now).** Auto-apply is off — the AST code generator reprints whole files and produces noisy diffs. Use the scan/report output and have an agent apply the fixes surgically. Will be revisited with a surgical (string-splice) editor. |
 | `fancy [target]`    | **"Fancy as fuck" mode** — elegance pass with a calibrated Restraint-o-Meter. |
 | `report [target]`   | Generate a formatted markdown report with the Friction Index rubric. |
@@ -120,6 +121,33 @@ The scanner is **opinionated toward making the happy path for a human operator f

 See `references/report-template.md` for the full structure.

+## "Elevate" Mode (`elevate`) — Polish & Redesign
+
+Where `scan` finds what *hurts*, `elevate` finds what's *missing to be excellent* — and
+decides when a screen is beyond polishing and should be **restructured**. It exists because
+the maintainer is not a designer: after an `elevate` pass, the UI should feel/look/act as if
+a senior product designer + UI expert + UX team planned it.
+
+It is primarily an **agent judgment pass** seeded by static signals — read the component,
+understand the user's task, score each dimension 1–5, then prescribe the concrete better
+version (a tweak, or a sketched redesign). The 12 heuristics, the scoring model, and the
+output shape live in `references/polish-and-redesign.md`. In brief:
+
+- **12 heuristics:** Hierarchy & Visual Anchors · Signature Moment · Action Gravity ·
+  Narrative Coherence · Lonely States (empty/error/loading/success) · Progressive
+  Disclosure & Density · Spacing Rhythm · Typographic Scale · Token Fidelity · Surface/
+  Depth/Finish · Intentional Motion · Redesign Triggers.
+- **Elevation Index (0–10):** weighted score, with Hierarchy / Signature / Action Gravity /
+  Narrative weighted heaviest.
+- **Redesign Urgency (0–5):** if ≥ 4, lead with a Structural Audit ("restructure, don't
+  polish") and a sketched alternative layout/component tree.
+- **Prioritized, not dumped:** `Opportunity = ImpactWeight × (5 − score)`; present the top
+  5–7 as **Quick Wins / Elevations / Redesign Candidates**, each citing file + signal +
+  exact replacement.
+
+Recommended sequence: `scan` (kill friction) → `elevate` (reach top-notch / decide redesign)
+→ `fancy` (calibrated delight on top).
+
 ## "Fancy as Fuck" Mode (`fancy`)

 This is a deliberate second (or standalone) pass focused on **beauty, refinement, and elegant interaction**.
@@ -157,6 +185,7 @@ The output of a `fancy` pass should live in its own section of the report (or a

 - Add new heuristics to `references/mouse-keyboard-heuristics.md` (with detection hints and "better human workflow" examples).
 - Add fancy/delights ideas to `references/fancy-as-fuck.md`.
+- Add polish/redesign heuristics to `references/polish-and-redesign.md` (the `elevate` layer).
 - Update the scanner script for new static patterns (fancy detection is intentionally more qualitative).
 - The skill is designed to be extended — new categories of mouse/keyboard friction **and** opportunities for tasteful elegance are welcome.

--- a/.claude/skills/human-flow/references/polish-and-redesign.md
+++ b/.claude/skills/human-flow/references/polish-and-redesign.md
@@ -0,0 +1,135 @@
+# Human-Flow Heuristics: Polish & Redesign (the "Elevate" layer)
+
+The friction heuristics (`mouse-keyboard-heuristics.md`) find what *hurts*. This layer
+finds what's *missing to be excellent* — and decides when a screen is beyond polishing
+and should be **restructured**. Synthesized from three independent model passes (Claude,
+Gemini, Grok), which converged hard on this set.
+
+**The bar.** The maintainer is not a designer. After an `elevate` pass, the UI should
+feel/look/act as if a senior product designer + UI expert + UX team planned it. So this
+layer is *prescriptive*: don't just flag — propose the concrete better version, and when
+warranted, sketch a redesign.
+
+**How to run it.** `elevate` is primarily an **agent judgment pass** seeded by static
+signals — read the component, understand the user's task, then score and prescribe. The
+scanner can surface signals (heading counts, raw magic-number styles, missing state
+branches, animation imports) but the call is the agent's.
+
+Each heuristic below gives: **what it evaluates** (static signal vs. judgment), the
+**top-notch bar**, and the **prescription** (the move to recommend — tweak *or* redesign).
+
+---
+
+## 1. Hierarchy & Visual Anchors
+- **Evaluates:** Does visual weight follow importance? *Static:* multiple `<h1>`, repeated identical font-size/weight across unrelated text, div-soup without `section`/`article`/landmark structure, 4+ equally-weighted blocks. *Judgment:* does the dominant thing on screen match the primary user goal?
+- **Top-notch:** Passes the squint test — one clear primary message, 2–3 supporting levels max, scannable in under 3 seconds without reading every line.
+- **Prescription:** Consolidate to one `h1` + two supporting levels; promote the key value/data to a large semantic heading, demote metadata to caption/secondary. If 4+ blocks compete, restructure to "primary panel + supporting stack."
+
+## 2. Signature Moment (First 5 Seconds)
+- **Evaluates:** Quality of the initial viewport / hero / primary card. *Static:* generic header+content vs. a dedicated orientation block; headline present without supporting microcopy or a primary action. *Judgment:* does it answer "what is this and why act now?"
+- **Top-notch:** Instant orientation plus a functional or emotional hook — headline + one supporting line + primary action, above the fold, zero ambiguity.
+- **Prescription:** Replace the generic title with a Signature block (outcome-focused headline, one-sentence value, single primary CTA). On an operator tool, make it a task-oriented "what you can do right now" panel.
+
+## 3. Action Gravity
+- **Evaluates:** Is the primary action unmistakable? *Static:* count of primary-styled buttons (>1 is a smell), generic copy ("Submit", "Save"), key actions buried in menus. *Judgment:* is the most important action for *this* context the most salient?
+- **Top-notch:** Exactly one primary action (or a small, clearly-ranked set), visually and positionally dominant; everything else is secondary/tertiary.
+- **Prescription:** Elevate the one true primary (size, accent, top-right or sticky action bar). Demote the rest to ghost/icon actions. Rewrite labels to outcome verbs ("Publish changes", "Apply filters"). If two actions are genuinely co-primary for different users, put a segmented choice up front.
+
+## 4. Narrative Coherence
+- **Evaluates:** Does the screen tell one logical story matching the task sequence? *Static:* JSX section order vs. logical order, competing CTAs at equal weight, unrelated concerns interleaved. *Judgment:* can a user with the stated goal follow a path without backtracking or "why is this here?"
+- **Top-notch:** Layout order matches the user's decision/task sequence — one primary thread, clear branches, no random panels.
+- **Prescription:** Reorder to Context → Decision → Confirmation. Move "related items" into a collapsible tray. If the screen serves two unrelated tasks/roles, split into two focused views.
+
+## 5. The Lonely States (Empty / Zero / Loading / Error / Success)
+- **Evaluates:** Are non-happy-path states *designed*? *Static:* conditional rendering with only a happy branch, inline "no data" text, no skeleton/spinner, no `<EmptyState>`/`<ErrorState>`. *Judgment:* does each state reduce anxiety and offer the next action?
+- **Top-notch:** Every state is designed, not defaulted. Empty states explain *why* and offer the most relevant action; errors are specific + retry + support; success confirms without blocking.
+- **Prescription:** Add a first-class EmptyState (illustration slot + outcome copy + primary CTA). Replace "no results" with a contextual suggestion (closest useful filter or a create flow). Surface the real error reason + retry, never a bare "something went wrong."
+
+## 6. Progressive Disclosure & Density Tuning
+- **Evaluates:** Is density managed, secondary info deferred? *Static:* many visible fields/columns/metrics at once, hover-only details, no accordion/tabs/"show more", long flat tables without grouping. *Judgment:* can the primary task complete without seeing ~80% of the content?
+- **Top-notch:** Critical path is sparse; secondary detail is one expansion/click away; the user controls the level of detail.
+- **Prescription:** Collapse the bottom ~60% of a long form into an "Advanced" disclosure. Turn a 12-column table into summary cards + "View details" side panel. For dashboards, add tiered views (Summary / Standard / Full) with a per-user density toggle.
+
+## 7. Spacing Rhythm & Grid
+- **Evaluates:** Consistent spatial system? *Static:* hardcoded odd values (`margin: 13px`, `p-[17px]`), inconsistent sibling padding, no spacing scale, cramped containers. *Judgment:* does whitespace group related things and separate unrelated ones (proximity)?
+- **Top-notch:** Strict 4px/8px grid; related elements closer than unrelated ones; generous, intentional breathing room.
+- **Prescription:** Normalize all raw px to the nearest spacing token; apply a consistent container `gap`; increase separation between unrelated groups so the eye chunks the layout.
+
+## 8. Typographic Scale & Readability
+- **Evaluates:** Is text comfortable and contrasted? *Static:* body < 14px, missing `line-height` on prose, grey-on-grey, raw hex colors vs. WCAG AA. *Judgment:* is long-form content actually pleasant to read?
+- **Top-notch:** Body 16px+, line-height ~1.5–1.6, a clear primary vs. de-emphasized text system, AA contrast minimum.
+- **Prescription:** Raise body size/line-height; establish a small type scale (display / heading / body / caption); fix low-contrast pairings to meet AA.
+
+## 9. System Consistency (Token Fidelity)
+- **Evaluates:** Do styles come from the system, not one-offs? *Static:* raw numbers in classes (`text-[13px]`, `#3f2a1b`), inline styles, 3 near-identical button/card variants, magic numbers in layout. *Judgment:* justified exception vs. drift?
+- **Top-notch:** 95%+ of visual decisions come from tokens; new components are *composed*, not invented; rare one-offs are commented.
+- **Prescription:** Replace raw values with the nearest token. Consolidate near-duplicate components into one with size/emphasis variants. Extract recurring patterns (e.g. section header) into a reusable component that enforces rhythm.
+
+## 10. Surface, Depth & the Finish Layer (Trust Cues)
+- **Evaluates:** Does it feel finished and trustworthy? *Static:* no elevation shadows on floating elements, no `:active`/press feedback, generic button text, misaligned numbers, missing timestamps/ownership. *Judgment:* does it feel *crafted* or *assembled*?
+- **Top-notch:** Subtle shadows convey a Z-axis; interactive elements give a tactile press; numbers right-align; actions read as outcomes; small reassurances ("Last synced 2m ago", "Changes saved automatically") remove doubt.
+- **Prescription:** Add a consistent surface/chrome to the primary area; `active:` press transform on buttons; right-align numerics; add "last updated" context; unify card/section treatment so widgets read as one product.
+
+## 11. Intentional Motion & Choreography
+- **Evaluates:** Does motion serve comprehension, not decoration? *Static:* transition/framer-motion usage without variants, multiple simultaneous transforms on load, >300ms on non-modal elements, no `prefers-reduced-motion`. *Judgment:* does each animation have a purpose (reveal, state change, spatial relationship)?
+- **Top-notch:** Sparse, purposeful, choreographed; entrances/exits respect spatial relationships; honors reduced-motion. (Calibrate intensity with the Restraint-o-Meter in `fancy-as-fuck.md`.)
+- **Prescription:** Keep only optimistic state transitions (120–180ms ease), modal/drawer enter-exit with backdrop fade, and at most one staggered reveal that aids scanning. Add a reduced-motion guard. If motion is hiding a weak layout, fix the layout first.
+
+## 12. Redesign Triggers (Beyond Polishing)
+- **Evaluates:** Can local polish even fix this? *Static + structural:* >6 competing top-level sections; conditional rendering producing 4+ distinct layouts in one file; component >300 lines or >3 nested ternaries; deep nesting to model a simple hierarchy; "TODO: redesign" comments; a view that accreted 3+ features without re-architecture. *Judgment:* is the screen's conceptual model fundamentally broken?
+- **Top-notch:** One defensible conceptual model; adding the next feature wouldn't need another special case.
+- **Prescription:** Declare the redesign threshold crossed. Define the information model first (e.g. "Workspace → Item → Activity"), then a master-detail / two-pane layout that absorbs future features. Provide a sketched component tree + data shape. Do **not** patch the 14-section accordion.
+
+---
+
+## Scoring: the Elevation Index
+
+Score each heuristic **1–5** (1 = absent/harmful, 3 = competent, 5 = senior-designer execution).
+
+- **Elevation Index** (0–10): weighted average of the 12, scaled to 10. Weight the
+  high-user-impact dimensions heaviest — **Hierarchy, Signature Moment, Action Gravity,
+  Narrative Coherence** (×1.5); the rest ×1.0.
+- **Redesign Urgency** (0–5): a *separate* score driven mainly by **#12 Redesign
+  Triggers**, reinforced by low **Narrative Coherence** and **Progressive Disclosure**.
+  Urgency ≥ 4 ⇒ lead the report with a **Structural Audit** ("this screen has exceeded
+  patch capacity — restructure, don't polish") and a sketched alternative.
+
+### Prioritize, don't dump
+For each finding compute **Opportunity = ImpactWeight × (5 − CurrentScore)**, sort
+descending, and present the top **5–7** concrete moves (rest in an appendix). Group into
+three tiers:
+
+| Tier | Contains | Cost |
+|---|---|---|
+| **Quick Wins** | spacing, type, token fidelity, finish details | low effort, high return |
+| **Elevations** | hierarchy, states, motion, depth, disclosure, action gravity | structural/component-level |
+| **Redesign Candidates** | Redesign Urgency ≥ 4, or multiple high-impact structural heuristics failing | re-architecture |
+
+Every recommendation must cite the **file/component**, the **signal that triggered it**,
+and the **exact replacement pattern or new component/layout shape** — not a vague "improve this."
+
+---
+
+## Output shape (elevate)
+```markdown
+## Human-Flow Elevate: <target>
+
+**Elevation Index:** 6.4/10   **Redesign Urgency:** 2/5
+
+[If Urgency >= 4: Structural Audit block first — why patching won't work + sketched redesign]
+
+### Quick Wins (do this sprint)
+1. <file:line> — <signal> -> <concrete token/type/spacing/finish fix>
+
+### Elevations
+1. <file:component> — <signal> -> <hierarchy/state/motion/disclosure restructure>
+
+### Redesign Candidates (plan)
+1. <file/view> — <triggers> -> <new information model + component tree sketch>
+
+### Scorecard
+| Heuristic | Score | Top opportunity |
+```
+
+Pair with `scan` (fix friction first) and `fancy` (then calibrated delight). `elevate`
+is the bridge between "no longer painful" and "genuinely top-notch."