diff --git a/.claude/skills/memory-dream/SKILL.md b/.claude/skills/memory-dream/SKILL.md
index a943c2d..54eaaa6 100644
--- a/.claude/skills/memory-dream/SKILL.md
+++ b/.claude/skills/memory-dream/SKILL.md
@@ -77,6 +77,21 @@ NEVER auto-applied. These stay in the report's `## PROPOSED` section.
 The rationale isn't "never delete" any more (the fleet-wide additive safety net was dropped 2026-06-02; see `feedback_memory_sync_destructive_ok.md`) — it's that merges and dedups require human judgment about which file is canonical and how to combine content.
 Profile-side deletion DOES happen automatically — but in `sync-memory.sh`, not here.
 
+### The operator MUST execute consolidations — do not just propose and leave
+
+The script is additive-only **by design** (auto-merging would corrupt the store — see the trap below), but the SESSION running `/memory-dream` is NOT additive-only. **Executing the `## PROPOSED` consolidations is the expected work of a dream run, not a someday-maybe.** Parking proposals indefinitely is exactly what causes the fleet drift this skill exists to prevent. Each run, after reading the report:
+
+1. **Triage every `[MERGE?]` cluster with judgment** (the detector is a coarse net):
+   - **Intentional `X` + `X_history` (or `_archive`/`_detail`) splits are NOT duplicates** — current-state vs on-demand archive, cross-linked in frontmatter. Leave them. (The detector skips these as of 2026-06-11; older reports may still list them.)
+   - **Topically-clustered but distinct facts** (e.g. several `reference_gitea_*`) — merge into one topic file ONLY if genuinely redundant or it reads better as one; otherwise leave.
+   - **True duplicates / superseded files** — merge: pick the canonical file, fold in the others' unique content, `git rm` the retired ones, fix `MEMORY.md`.
+2. **Deletions are first-class.** `git rm` of a retired memory is correct and propagates to every machine's profile store via `sync-memory.sh` mirror mode (repo authoritative). This is the fleet-consistency mechanism — use it.
+3. **Commit + sync** so the consolidated store reaches the fleet.
+
+If a cluster genuinely needs a decision you can't make, leave it AND say so explicitly in your summary — don't silently skip the whole PROPOSED section.
+
+**The auto-merge trap (why the script stays additive):** blindly merging the detector's clusters destroys deliberate structure — verified 2026-06-11 when the flagged `project_*` / `project_*_history` pairs turned out to be intentional splits and the `gitea` / `syncro` clusters were distinct facts, not copies. Consolidation needs the operator's judgment; the script must never do it unattended.
+
 ## Running it
 
 This machine's Python launcher is `py` (per identity.json); the script also
diff --git a/.claude/skills/memory-dream/scripts/memory_dream.py b/.claude/skills/memory-dream/scripts/memory_dream.py
index 6278c74..d3d0c33 100644
--- a/.claude/skills/memory-dream/scripts/memory_dream.py
+++ b/.claude/skills/memory-dream/scripts/memory_dream.py
@@ -432,6 +432,20 @@ def jaccard(a: set[str], b: set[str]) -> float:
     return inter / union if union else 0.0
 
 
+# Suffixes that denote an INTENTIONAL current/archive split (e.g. project_cascades
+# + project_cascades_history). These are deliberately separate files — current
+# state vs on-demand detail, cross-linked in frontmatter — NOT duplicates. They
+# must not be flagged as merge candidates.
+ARCHIVE_SUFFIXES = ("_history", "_archive", "_detail", "_log", "_rationale")
+
+
+def strip_archive_suffix(slug: str) -> str:
+    for suf in ARCHIVE_SUFFIXES:
+        if slug.endswith(suf):
+            return slug[: -len(suf)]
+    return slug
+
+
 def cluster_overlaps(mems: list[Memory], threshold: float = 0.34):
     """
     Within each type, find pairs with token-overlap >= threshold, then union
@@ -466,6 +480,7 @@ def cluster_overlaps(mems: list[Memory], threshold: float = 0.34):
                 parent[rx] = ry
 
         files = [m.filename for m in group]
+        slug_of = {m.filename: m.slug for m in group}
         slug_prefix = {}
         for m in group:
             parts = m.slug.split("_")
@@ -480,6 +495,11 @@ def cluster_overlaps(mems: list[Memory], threshold: float = 0.34):
                     and len(slug_prefix[fi].split("_")) >= 2
                 )
                 if sim >= threshold or same_prefix:
+                    # Don't flag intentional current/archive splits (X + X_history):
+                    # deliberately separate files, cross-linked in frontmatter, not dupes.
+                    si, sj = slug_of[fi], slug_of[fj]
+                    if si != sj and strip_archive_suffix(si) == strip_archive_suffix(sj):
+                        continue
                     union(fi, fj)
 
         groups: dict[str, list[str]] = {}
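
Review note: the pair-skip condition added to `cluster_overlaps` can be exercised in isolation. The sketch below copies `ARCHIVE_SUFFIXES` and `strip_archive_suffix` from the patch and wraps the inline condition in a helper named `is_intentional_split`. That helper name is hypothetical and does not appear in the diff. Apart from `project_cascades`, which the patch comment mentions, the example slugs are made up for illustration.

```python
# Copied from the patch: suffixes marking an intentional current/archive split.
ARCHIVE_SUFFIXES = ("_history", "_archive", "_detail", "_log", "_rationale")


def strip_archive_suffix(slug: str) -> str:
    # Removes at most one suffix; the first match in tuple order wins.
    for suf in ARCHIVE_SUFFIXES:
        if slug.endswith(suf):
            return slug[: -len(suf)]
    return slug


def is_intentional_split(si: str, sj: str) -> bool:
    # Hypothetical wrapper around the inline check in cluster_overlaps:
    # two different slugs that share a base once the suffix is stripped.
    return si != sj and strip_archive_suffix(si) == strip_archive_suffix(sj)


# Illustrative slugs (assumptions, not taken from a real memory store).
pairs = [
    ("project_cascades", "project_cascades_history"),
    ("project_cascades_history", "project_cascades_archive"),
    ("reference_gitea_api", "reference_gitea_tokens"),
    ("change", "change_log"),
]

for a, b in pairs:
    verdict = "skip" if is_intentional_split(a, b) else "may cluster"
    print(f"{a} + {b}: {verdict}")
```

Two behaviors may be worth confirming with the author. Two archive files of the same base (for example `X_history` and `X_archive`) are also skipped, not only `X` paired with `X_history`. A slug whose own name happens to end in a listed suffix, such as the made-up `change_log` above, is treated as the archive half of `change`.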