sync: auto-sync from GURU-5070 at 2026-06-11 08:29:58
Author: Mike Swanson Machine: GURU-5070 Timestamp: 2026-06-11 08:29:58
@@ -77,6 +77,21 @@ NEVER auto-applied.

These stay in the report's `## PROPOSED` section. The rationale is no longer "never delete" (the fleet-wide additive safety net was dropped 2026-06-02; see `feedback_memory_sync_destructive_ok.md`); it is that merges and dedups require human judgment about which file is canonical and how to combine content. Profile-side deletion DOES happen automatically, but in `sync-memory.sh`, not here.

### The operator MUST execute consolidations: do not just propose and leave

The script is additive-only **by design** (auto-merging would corrupt the store; see the trap below), but the SESSION running `/memory-dream` is NOT additive-only. **Executing the `## PROPOSED` consolidations is the expected work of a dream run, not a someday-maybe.** Parking proposals indefinitely is exactly what causes the fleet drift this skill exists to prevent. Each run, after reading the report:

1. **Triage every `[MERGE?]` cluster with judgment** (the detector is a coarse net):
   - **Intentional `X` + `X_history` (or `_archive`/`_detail`/`_log`/`_rationale`) splits are NOT duplicates.** They separate current state from an on-demand archive and are cross-linked in frontmatter. Leave them. (The detector skips these as of 2026-06-11; older reports may still list them.)
   - **Topically clustered but distinct facts** (e.g. several `reference_gitea_*`): merge into one topic file ONLY if they are genuinely redundant or read better as one; otherwise leave them.
   - **True duplicates / superseded files**: merge. Pick the canonical file, fold in the others' unique content, `git rm` the retired ones, and fix `MEMORY.md`.
2. **Deletions are first-class.** `git rm` of a retired memory is correct and propagates to every machine's profile store via `sync-memory.sh` mirror mode (repo authoritative). This is the fleet-consistency mechanism; use it.
3. **Commit + sync** so the consolidated store reaches the fleet.

If a cluster genuinely needs a decision you can't make, leave it AND say so explicitly in your summary; don't silently skip the whole PROPOSED section.

**The auto-merge trap (why the script stays additive):** blindly merging the detector's clusters destroys deliberate structure — verified 2026-06-11 when the flagged `project_*` / `project_*_history` pairs turned out to be intentional splits and the `gitea` / `syncro` clusters were distinct facts, not copies. Consolidation needs the operator's judgment; the script must never do it unattended.
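
As a rough illustration of the distinction the operator is asked to make, the sketch below applies the same suffix test the detector uses to two example pairs. The suffix tuple and `strip_archive_suffix` are copied from the script; `is_intentional_split` and the two `reference_gitea_*` file names are hypothetical, chosen only for the example.

```python
# Suffix list copied from the script's ARCHIVE_SUFFIXES.
ARCHIVE_SUFFIXES = ("_history", "_archive", "_detail", "_log", "_rationale")


def strip_archive_suffix(slug: str) -> str:
    """Return the slug without a trailing archive suffix, if it has one."""
    for suf in ARCHIVE_SUFFIXES:
        if slug.endswith(suf):
            return slug[: -len(suf)]
    return slug


def is_intentional_split(a: str, b: str) -> bool:
    """Hypothetical helper: True when a and b differ only by an archive suffix."""
    return a != b and strip_archive_suffix(a) == strip_archive_suffix(b)


# Example pairs; the reference_gitea_* names are made up for illustration.
pairs = [
    ("project_cascades", "project_cascades_history"),
    ("reference_gitea_api", "reference_gitea_tokens"),
]

for a, b in pairs:
    if is_intentional_split(a, b):
        verdict = "intentional split, leave"
    else:
        verdict = "needs operator judgment"
    print(f"{a} + {b}: {verdict}")
```

The suffix test only rules out the current/archive case. Whether the remaining pairs are true duplicates or distinct facts is still a call for the operator, which is why the script does not merge them itself.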

## Running it

This machine's Python launcher is `py` (per identity.json); the script also
@@ -432,6 +432,20 @@ def jaccard(a: set[str], b: set[str]) -> float:
    return inter / union if union else 0.0


# Suffixes that denote an INTENTIONAL current/archive split (e.g. project_cascades
# + project_cascades_history). These are deliberately separate files (current
# state vs on-demand detail, cross-linked in frontmatter), NOT duplicates. They
# must not be flagged as merge candidates.
ARCHIVE_SUFFIXES = ("_history", "_archive", "_detail", "_log", "_rationale")


def strip_archive_suffix(slug: str) -> str:
    """Return slug without its trailing archive suffix, or unchanged if it has none."""
    for suf in ARCHIVE_SUFFIXES:
        if slug.endswith(suf):
            return slug[: -len(suf)]
    return slug


def cluster_overlaps(mems: list[Memory], threshold: float = 0.34):
    """
    Within each type, find pairs with token-overlap >= threshold, then union
@@ -466,6 +480,7 @@ def cluster_overlaps(mems: list[Memory], threshold: float = 0.34):
                parent[rx] = ry

        files = [m.filename for m in group]
        slug_of = {m.filename: m.slug for m in group}
        slug_prefix = {}
        for m in group:
            parts = m.slug.split("_")
@@ -480,6 +495,11 @@ def cluster_overlaps(mems: list[Memory], threshold: float = 0.34):
                    and len(slug_prefix[fi].split("_")) >= 2
                )
                if sim >= threshold or same_prefix:
                    # Don't flag intentional current/archive splits (X + X_history):
                    # deliberately separate files, cross-linked in frontmatter, not dupes.
                    si, sj = slug_of[fi], slug_of[fj]
                    if si != sj and strip_archive_suffix(si) == strip_archive_suffix(sj):
                        continue
                    union(fi, fj)

        groups: dict[str, list[str]] = {}
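
One property of the guard in the last hunk may be worth checking. The `continue` only skips the direct `union(fi, fj)` call, so an `X` / `X_history` pair could still end up in the same cluster if both files are joined to a third file. The toy below illustrates this under stated assumptions: `cluster_pairs`, the sample slugs, and the similarity table are all hypothetical, and the real `cluster_overlaps` may behave differently in code this diff does not show.

```python
from itertools import combinations

# Suffix list copied from the script's ARCHIVE_SUFFIXES.
ARCHIVE_SUFFIXES = ("_history", "_archive", "_detail", "_log", "_rationale")


def strip_archive_suffix(slug: str) -> str:
    for suf in ARCHIVE_SUFFIXES:
        if slug.endswith(suf):
            return slug[: -len(suf)]
    return slug


def cluster_pairs(slugs, similar, threshold=0.34):
    """Toy clustering (hypothetical): union slugs whose similarity meets the
    threshold, skipping direct current/archive pairs like the guard does."""
    parent = {s: s for s in slugs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[rx] = ry

    for a, b in combinations(slugs, 2):
        sim = similar.get(frozenset((a, b)), 0.0)
        if sim >= threshold:
            if a != b and strip_archive_suffix(a) == strip_archive_suffix(b):
                continue  # direct current/archive pair: do not union
            union(a, b)

    groups = {}
    for s in slugs:
        groups.setdefault(find(s), []).append(s)
    # Only groups with more than one member count as clusters.
    return [sorted(g) for g in groups.values() if len(g) > 1]


# Made-up similarity scores keyed by unordered slug pair.
similar = {
    frozenset(("project_x", "project_x_history")): 0.9,
    frozenset(("project_x", "project_x_notes")): 0.5,
    frozenset(("project_x_history", "project_x_notes")): 0.5,
}

# The pair alone is not clustered...
print(cluster_pairs(["project_x", "project_x_history"], similar))
# ...but a third file similar to both pulls them into one cluster.
print(cluster_pairs(["project_x", "project_x_history", "project_x_notes"], similar))
```

If that transitive grouping is unwanted, one option might be to filter archive-suffixed files out of each group after clustering. Whether the real script needs this depends on code outside the diff.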