Add memory-dream skill + additive cross-machine memory sync

memory-dream: read-only memory lint/consolidation analyzer (index, backlinks,
stale refs, dup clusters, profile drift); additive-only --apply-safe, all
merges/deletes are proposals. sync-memory.sh: additive repo<->harness-profile
union (no delete/overwrite, conflicts surfaced), wired to a SessionStart hook.
Migrates the useful profile-only memories into the synced repo store.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 15:21:56 -07:00
parent a00069a020
commit 2a1ccfac73
24 changed files with 1875 additions and 0 deletions

@@ -29,6 +29,7 @@
- [Howard Enos](user_howard.md) — Mike's brother, technician, full access. Machines: ACG-TECH03L, Howard-Home (authoritative in users.json).
## Feedback
- [Scheduling = coord todo, not schedulers](feedback_scheduling_via_coord_todo.md) — Defer future work as a coord todo (POST /api/coord/todos; needs text + created_by_user + created_by_machine) for a later session to pick up. NOT /schedule remote CCR agents (no vault/creds there) or local scheduled tasks.
- [Identify RMM agent by IP](feedback_rmm_identify_by_ip.md) — When the target machine is known by external IP, match the IP to find the agent; don't recon every candidate. (GuruRMM doesn't store agent IPs yet — todo 7459428e.)
- [Attribution is read, never inferred](feedback_attribution_from_identity.md) — Who-did-what (user+machine) comes ONLY from identity.json + users.json + git authorship. Never infer from hostname patterns, the userEmail hint, or memory. The "5070" box is Mike's. sync.sh reconciles git config to identity.json; /save renders the User block via whoami-block.sh.
- [GuruRMM agent parity rule](feedback_gururmm_agent_parity.md) — "Add feature X to the agent" = Windows + Linux + macOS in the same change, no exceptions. Stub + TODO if real impl not feasible.
@@ -79,6 +80,7 @@
- [Mac gururmm setup pending](project_mac_gururmm_setup_pending.md) — ACTION REQUIRED: run `bash scripts/install-hooks.sh` in gururmm repo on Mikes-MacBook-Air before any RMM work
## Project
- [Automate memory consolidation/lint (phased)](project_memory_consolidation_automation.md) — Eventually auto-run /memory-dream; lint+additive fixes can automate early, merges/deletes stay human-approved. Engine: .claude/skills/memory-dream/ + .claude/scripts/sync-memory.sh.
- [RMM webhook docs-only build guard](project_rmm_webhook_docs_guard.md) — RMM build webhook skips docs-only pushes (host guard in /opt/gururmm/webhook-handler.py, SPEC-020 Phase 0); repo copy is stale, don't redeploy it
- [GuruConnect v2 direction](project_guruconnect_v2_direction.md) — v2 re-architecture (SPEC-002, 2026-05-29): greenfield-salvage-cores, NATIVE-first (full key fidelity Win+R/Ctrl+Alt+Del + bidirectional file cut/paste/drag are Mike's headline must-haves; WebRTC fallback only), standalone-first + RMM contract, hardened single-tenant but tenancy-ready schema. Willing to scrap v1 entirely.
- [Apple MDM + Developer certs (GuruRMM mobile)](project_apple_mdm_certs.md) — ACG holds both Apple Developer+signing and Apple MDM Push certs (acquired 2026-05-29) for SPEC-017 mobile support. MDM push cert RENEWS ANNUALLY on the same Apple ID or all enrolled iOS devices break. Capture Apple ID + expiry.

@@ -0,0 +1,16 @@
---
name: Client communication tone
description: How to write client-facing Syncro comments — expert partner, not intake questionnaire
type: feedback
originSessionId: 4ccedc24-2f39-497e-9a89-ca09aba03982
---
Write client comments from the position of a senior MSP that has managed the client for years. State findings, state what we did or are doing, ask only for the one specific thing we genuinely don't know.
**Why:** ACG has managed clients like GlazTech for 10-15 years. We know their locations, key staff, infrastructure, and service accounts. Comments that ask "can you tell us about your setup?" or list basic discovery questions make us look like we just walked in the door.
**How to apply:**
- Lead with what we found and what we already know
- Frame questions as targeted confirmations, not open-ended discovery ("Is FaxFinder authenticating via SMTP basic auth, or has that been migrated to OAuth?" — not "What does the FaxFinder account do?")
- Never ask the client to explain their own infrastructure to us unless Mike explicitly says we don't have context
- Steve Eastman (seastman@glaztech.com) is GlazTech's internal IT person — desktop-level tech, guides technical direction, ~200 users across 9 locations. We implement what he directs. Treat him as a peer, not an end user.
- If we're missing context (IPs, staff roles, auth methods), check session logs and vault first. Ask Mike privately before asking the client.

@@ -0,0 +1,20 @@
---
name: Add Mike as owner on all Entra apps
description: Apps created via management SP have no user owner — must add Mike manually or publisher verification fails
type: feedback
originSessionId: 045c6ef2-5711-4aca-b86f-55506c9b6ada
---
After creating any Entra app registration via the ComputerGuru-Management service principal, always add Mike (f34ebe40-9565-4135-af4c-2e808df57a25) as an owner immediately.
**Why:** Apps created via client credentials have no user owner. Microsoft requires a user owner to perform publisher verification (MPN badge). Without this step, the portal shows "A verified publisher cannot be added to this application."
**How to apply:** After every `POST /v1.0/applications` call, immediately run:
```bash
curl -sk -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  "https://graph.microsoft.com/v1.0/applications/{APP_OBJ_ID}/owners/\$ref" \
  -d '{"@odata.id":"https://graph.microsoft.com/v1.0/directoryObjects/f34ebe40-9565-4135-af4c-2e808df57a25"}'
```
Mike's user object ID: `f34ebe40-9565-4135-af4c-2e808df57a25`

@@ -0,0 +1,14 @@
---
name: feedback-gururmm-builds
description: "GuruRMM builds must go through the Gitea webhook pipeline, never run manually via SSH"
metadata:
node_type: memory
type: feedback
originSessionId: 541d4004-8c45-4290-89f5-0ba9ee4e64a9
---
Never run `build-agents.sh` directly via SSH. All builds go through the normal Gitea webhook pipeline (push to main triggers the build automatically).
**Why:** Manual runs execute as the SSH user (`guru`) instead of root, breaking log writes, artifact cleanup, and service restarts. The pipeline exists precisely to handle this correctly.
**How to apply:** To trigger a build, push a commit to the gururmm main branch on Gitea. If a test build is needed without a real change, use an empty commit: `git commit --allow-empty -m "chore: trigger build"`.

@@ -0,0 +1,16 @@
---
name: No TOML/config file approach for endpoints
description: User explicitly prohibits TOML or config-file-based endpoint configuration — this will never be approved
type: feedback
originSessionId: 50d853e9-1d2f-4094-9b7b-f509fb95891f
---
Never propose storing endpoint URLs, server addresses, API targets, or connection parameters in TOML files, config files, INI files, or any file-based config approach when it comes to deployed agents or endpoints.
**Why:** User stated directly: "I cannot stand the toml/config file approach to anything when it comes to endpoints" and "that approach will never be approved by me/the user."
**How to apply:** When designing agent deployment, enrollment, or configuration:
- Embed endpoint/server data directly in the binary (compile-time constants, build flags, or baked into the installer)
- Use registry keys (Windows) for anything that must be configurable post-install
- Use MSI properties for install-time configuration
- Never write agent.toml, config.toml, settings.ini, or equivalent files containing server URLs or connection endpoints
- This applies to GuruRMM agent and any future agent/endpoint projects

@@ -0,0 +1,11 @@
---
name: Python on Windows — use py launcher
description: Windows Store python/python3 aliases disabled; always use py or jq on DESKTOP-0O8A1RL
type: feedback
originSessionId: bdd13bc7-44b1-4e16-aba8-a3332b0c8b8e
---
Always use `py` (Windows launcher) for Python on this machine, never `python3` or bare `python`.
**Why:** Windows Store app execution aliases for python.exe and python3.exe were disabled (2026-04-20). `python3` now fails cleanly (command not found). `py` at `C:\Windows\py.exe` is the correct entry point and reliably finds Python 3.14 at `C:\Program Files\Python314\python.exe`.
**How to apply:** In any script, doc, or inline command that needs Python: use `py`. For simple JSON extraction from curl output, prefer `jq` (available at `C:\Users\guru\AppData\Local\Microsoft\WinGet\Links\jq.exe`) — no Python needed at all.

@@ -0,0 +1,12 @@
---
name: feedback_scheduling_via_coord_todo
description: Defer/schedule future work as a coord todo for a later session to pick up - NOT remote CCR routines or local scheduled tasks
metadata:
type: feedback
---
When something needs to happen later ("check this tomorrow", "verify in 24-48h", "follow up next week"), create a **coord todo** (`POST /api/coord/todos`) assigned to the right user/project. A future ClaudeTools session picks it up at session-start. Do NOT use the `/schedule` skill (remote anthropic_cloud CCR agents) or local OS scheduled tasks.
**Why:** Mike's directive 2026-06-01. Remote cloud routines have NO access to the local SOPS vault, the age key, B2/Discord/API creds, or identity.json - so any credentialed task fails there. Local scheduled tasks need the box powered on and run blind. The coord API is the team's shared source of truth and every session already reads pending todos on startup, so a todo is the reliable, credential-available, multi-machine handoff.
**How to apply:** `POST http://172.16.3.30:8001/api/coord/todos`. REQUIRED fields the CLAUDE.md doc omits: `text` (the todo body - NOT `title`/`description`), `created_by_user`, `created_by_machine` (read user+machine from `.claude/identity.json`). Optional: `project_key`, `assigned_to_user`, `auto_created`, `source_context`, `parent_id`. Put enough context in `text` (commands, exact targets, the when) that a cold session can act without re-deriving. See [[reference_coord_messages_api_shape]].
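A minimal sketch of building the request body described above. It assumes `.claude/identity.json` exposes the user and machine under keys named `user` and `machine`; those key names and the function name `build_coord_todo` are hypothetical, so check the real file before relying on them. The sketch only constructs and validates the payload; sending it to `POST /api/coord/todos` is left to the caller.

```python
import json

# Fields the coord API requires, per the note above.
REQUIRED = ("text", "created_by_user", "created_by_machine")


def build_coord_todo(text, identity, project_key=None, assigned_to_user=None):
    """Return a JSON-ready body for POST /api/coord/todos.

    `identity` is the parsed identity.json. The "user" and "machine"
    key names are assumptions made for illustration.
    """
    payload = {
        "text": text,  # the todo body, not title/description
        "created_by_user": identity.get("user"),
        "created_by_machine": identity.get("machine"),
    }
    if project_key:
        payload["project_key"] = project_key
    if assigned_to_user:
        payload["assigned_to_user"] = assigned_to_user

    missing = [key for key in REQUIRED if not payload.get(key)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return payload


if __name__ == "__main__":
    ident = {"user": "example-user", "machine": "EXAMPLE-BOX"}  # placeholders
    print(json.dumps(build_coord_todo("Verify backup job in 24h", ident), indent=2))
```

Failing fast on an empty `text` or a missing identity field keeps an incomplete todo from reaching the API at all.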

@@ -0,0 +1,16 @@
---
name: Syncro - preview all comments before posting
description: Every Syncro comment must be previewed and confirmed before posting, no exceptions
type: feedback
originSessionId: 4ccedc24-2f39-497e-9a89-ca09aba03982
---
**Rule:** ALWAYS show the full comment text to Mike and wait for explicit confirmation before posting ANY comment to a Syncro ticket. No exceptions — not for billing comments, not for resolution notes, not for client-facing messages, not for internal notes.
**Why:** Mike has called this out multiple times. Comments posted without preview have had wrong tone, missing context, or incorrect content. Once posted they can't be deleted via API and require manual GUI cleanup.
**How to apply:**
- Draft the comment, show it in chat as a formatted block
- Say "Good to post?" or similar and wait for a yes
- Only then call POST /tickets/{id}/comment
- This applies to every single comment regardless of how routine it seems
- Also always ask for minutes + labor type before logging any time entry — never assume a default

@@ -0,0 +1,20 @@
---
name: Syncro duplicate prevention — tickets AND comments
description: Never retry ANY Syncro POST (ticket create or comment) without first GETting to confirm the action didn't already succeed — Syncro has no idempotency on any endpoint
type: feedback
originSessionId: 7034be43-1464-4085-b765-dc1226b1f8e0
---
Never retry a POST /comment to Syncro without first doing GET /tickets/{id} to confirm the comment did not already post. The server has no idempotency — one POST always creates one comment, regardless of whether the client saw an error.
**ALSO: Always show the full comment draft to the user and wait for explicit confirmation before posting ANY comment — including internal/hidden notes.** This rule has been violated twice. There are no exceptions.
**ALSO: This applies to ticket CREATION too — not just comments.** When a POST /tickets response looks wrong (null fields, jq error, etc.), do GET /customers/{id}/tickets BEFORE retrying. The response wrapper is `{"ticket": {...}}` — always use `.ticket.id` not `.id`. Duplicate tickets were created twice by retrying a succeeded POST. Violated 2026-04-22.
**Why:** A comment was duplicated on ticket #32185 because the first POST succeeded but jq threw a parse error on the response (em-dash in subject caused shell interpolation issue), making the request look failed. A retry posted a second copy. Comments cannot be deleted via API — duplicates require manual GUI removal.
**How to apply:**
- Always write comment payloads to a temp file (`/tmp/syncro_comment.json`) before posting — avoids shell quoting/encoding failures that produce misleading errors
- If any POST /comment tool call returns an error or ambiguous result, immediately GET /tickets/{id} and check `.ticket.comments` for the subject/timestamp before retrying
- A jq parse error, curl error, or timeout on the response does NOT mean the POST failed — verify first
- **CRITICAL — jq path:** POST /comment response is `{"comment": {...}}` — ALWAYS use `.comment.id`, `.comment.created_at` etc. Using `.id` returns null and looks like failure even when the comment landed. This caused a duplicate on 2026-04-23 (#32142). When GETting to verify, check ALL comments not just `[-3:]` — the new comment may not be the most recent if other activity occurred.
- When GETting to verify after an ambiguous POST, search by subject: `.ticket.comments[] | select(.subject == "...")`
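The verify-before-retry check above can be sketched as a pure function over the parsed `GET /tickets/{id}` response. This assumes the response has already been decoded into a dict with the `{"ticket": {"comments": [...]}}` shape noted above; the function names are hypothetical.

```python
def find_comments_by_subject(ticket_response, subject):
    """Return every comment whose subject matches exactly.

    Mirrors the jq filter `.ticket.comments[] | select(.subject == "...")`
    and scans all comments, not only the most recent few.
    """
    if "ticket" not in ticket_response:
        # An unexpected shape is not evidence that the comment is absent.
        raise ValueError("response has no 'ticket' wrapper; cannot verify")
    comments = ticket_response["ticket"].get("comments") or []
    return [c for c in comments if c.get("subject") == subject]


def safe_to_retry(ticket_response, subject):
    """True only when no comment with this subject has already landed."""
    return not find_comments_by_subject(ticket_response, subject)
```

Raising on a response without the `ticket` wrapper is a deliberate choice in this sketch: an ambiguous GET should stop the retry, not permit it.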

@@ -0,0 +1,17 @@
---
name: Syncro comment HTML formatting
description: Use <br> for line breaks in Syncro comments, not <ul>/<li> — list tags don't render
type: feedback
originSessionId: b39e319c-ac3e-49f5-afb6-755e08f1fd82
---
Use `<br>` for line breaks in Syncro comment bodies. Do NOT use `<ul>`, `<li>`, or other block-level list tags — Syncro's renderer collapses them into a single line with no spacing.
**Why:** Posted a comment with `<ul><li>` items and they all ran together on one line in the ticket view. Had to post a corrected duplicate.
**How to apply:** For any bulleted list in a Syncro comment, use:
```
- Item one<br>
- Item two<br>
- Item three
```
wrapped in a `<p>` tag. Never use `<ul>/<li>`.
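A small helper can produce that shape consistently. This is a sketch with a hypothetical function name; escaping `&`, `<`, and `>` in the item text is an assumption about what Syncro expects, not something confirmed above.

```python
from html import escape


def syncro_bullets(items):
    """Render items as '- item' lines joined by <br>, wrapped in one <p>."""
    lines = ["- " + escape(str(item), quote=False) for item in items]
    return "<p>" + "<br>\n".join(lines) + "</p>"
```

Joining with `<br>` (instead of appending it to every line) leaves no trailing break after the last item, matching the example above.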

@@ -0,0 +1,14 @@
---
name: feedback-syncro-labor-tax
description: Labor is never taxable in Arizona — always set taxable=false on labor line items in Syncro
metadata:
node_type: memory
type: feedback
originSessionId: d91f202e-ddd5-46d7-b674-f848eb78aa8e
---
Always pass `"taxable": false` explicitly on labor line items via `add_line_item`.
**Why:** Labor products are configured with `taxable: false` in Syncro, but the `add_line_item` API endpoint does not inherit the product's taxable setting — it posts the line item as `taxable: true` regardless of the product config.
**How to apply:** Include `"taxable": false` in every `add_line_item` payload for labor products (remote, onsite, in-shop, emergency, prepaid). The product itself is correct; the API just doesn't carry it through.

@@ -0,0 +1,24 @@
---
name: feedback_syncro_line_items
description: Correct Syncro API endpoint for adding labor/product line items to tickets
metadata:
node_type: memory
type: feedback
originSessionId: 282e0176-1bdb-49b7-8c15-faf152774d7e
---
Use `POST /api/v1/tickets/{internal_ticket_id}/add_line_item` to add line items to tickets. Both `name` and `description` fields are required (422 if either missing). Never use timers.
**Why:** `/line_item`, `/line_items`, and PUT `line_items_attributes` all 404. The correct endpoint was found via Syncro Swagger spec at api-docs.syncromsp.com. Mike has explicitly said never use timers.
**How to apply:**
- Path uses internal ticket ID (e.g., 111387456), not ticket number (32339)
- Required fields: `name`, `description`, `quantity`, `price`, `taxable` (and `product_id` if catalog item)
- Response is a flat object — parse `.id` directly (not `.line_item.id`)
- For testing/practice, use internal ACG account only (customer ID 15353550)
Example:
```
POST /api/v1/tickets/111387456/add_line_item
{"product_id":1049360,"name":"Labor- Warranty work","description":"...","quantity":1,"price":0.0,"taxable":false}
```
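A minimal sketch of assembling that request, assuming the caller already knows the internal ticket ID. The function names are hypothetical, and the sketch hardcodes `taxable` to false because it is meant for labor items only.

```python
def line_item_path(internal_ticket_id):
    """Path for add_line_item; takes the internal ID, not the ticket number."""
    return f"/api/v1/tickets/{internal_ticket_id}/add_line_item"


def labor_line_item(name, description, quantity, price, product_id=None):
    """Build an add_line_item payload for a labor line.

    Both name and description are required by the endpoint (422 otherwise),
    so empty values are rejected here before any request is made.
    """
    if not name or not description:
        raise ValueError("name and description are both required")
    payload = {
        "name": name,
        "description": description,
        "quantity": quantity,
        "price": price,
        "taxable": False,  # always sent explicitly for labor
    }
    if product_id is not None:
        payload["product_id"] = product_id  # only for catalog items
    return payload
```

The response to this call is a flat object, so the new line item's ID is read from `.id` directly.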

@@ -0,0 +1,18 @@
---
name: feedback-syncro-live-rates
description: Always fetch Syncro labor rates live from the API — never use hardcoded rate table
metadata:
node_type: memory
type: feedback
originSessionId: d91f202e-ddd5-46d7-b674-f848eb78aa8e
---
Always fetch `price_retail` live from `GET /products/<id>` (jq path `.product.price_retail`) before billing any Syncro line item. Never use the rate table in the skill as a source of truth for dollar amounts.
**Why:** The hardcoded rate table was proven wrong on 2026-05-20 (ticket #32304, Cascades) when Labor - Remote Business was listed at $150/hr but the correct rate was $175/hr. Rates vary by contract and change over time.
**How to apply:** In any billing workflow, fetch the rate immediately after selecting the product_id:
```bash
RATE=$(curl -s "${BASE}/products/${PRODUCT_ID}?api_key=${API_KEY}" | jq -r '.product.price_retail')
```
Use this `$RATE` value for the Ollama draft prompt, the preview shown to the user, and the `price_retail` field in all payloads. The product ID table in the skill is still valid — just not the rate column.

@@ -0,0 +1,11 @@
---
name: ACG Website Hosting
description: azcomputerguru.com is hosted on IX Web Hosting via cPanel
type: project
originSessionId: 045c6ef2-5711-4aca-b86f-55506c9b6ada
---
azcomputerguru.com is hosted on **IX Web Hosting** (ixwebhosting.com), managed via **cPanel**.
**Why:** Core ACG company website. Files deploy to `/public_html/` via cPanel File Manager or FTP.
**How to apply:** When uploading files to azcomputerguru.com, use cPanel on IX. Login via ixwebhosting.com → My Hosting → cPanel. Credentials should be vaulted at `clients/azcomputerguru/cpanel.sops.yaml` (pending).

@@ -0,0 +1,14 @@
---
name: project-cascades-billing
description: "Cascades of Tucson Syncro billing — prepaid block customer, rate TBD"
metadata:
node_type: memory
type: project
originSessionId: d91f202e-ddd5-46d7-b674-f848eb78aa8e
---
Cascades of Tucson (Syncro customer_id: 20149445) is a prepaid block customer. As of 2026-05-20 the block had ~37.5 hrs remaining (38.5 minus 1hr for ticket #32304).
**Block rate:** Not yet confirmed — $175/hr is the standard non-block remote rate, NOT the Cascades block rate. Ask Mike before billing future Cascades tickets.
**How to apply:** Always check prepay_hours before billing. Invoices post at $0.00 with hours deducted by quantity. Confirm block rate with Mike before setting price_retail.

@@ -0,0 +1,13 @@
---
name: Dataforth email infrastructure
description: Dataforth uses M365 for email; the Exchange server on 172.16.x.x / neptune.acghosting.com is NOT Dataforth's — it belongs to ACG's own infrastructure
type: project
originSessionId: 7034be43-1464-4085-b765-dc1226b1f8e0
---
Dataforth's email runs on Microsoft 365 (sysadmin@dataforth.com, tenant in vault at `clients/dataforth/m365.sops.yaml`).
The Exchange server at `neptune.acghosting.com` / `67.206.163.124` listed in the vault under `clients/dataforth/neptune-exchange.sops.yaml` is **not** part of Dataforth's infrastructure — do not use it for Dataforth email workflows.
**Why:** Mike corrected this during pipeline notification work (2026-04-22). The Exchange entry is an ACG-side server, not Dataforth's.
**How to apply:** For any Dataforth email sending, SMTP basic auth is disabled on the tenant. Must use OAuth2 — either XOAUTH2 over SMTP or (preferred) Microsoft Graph API `POST /v1.0/users/sysadmin@dataforth.com/sendMail` with a client_credentials token. Entra app is in vault at `clients/dataforth/m365.sops.yaml` under `credentials.entra-app`. Verify `Mail.Send` application permission is granted before use.
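A minimal sketch of the Graph `sendMail` request for the preferred path. It only builds the URL, headers, and JSON body; acquiring the client_credentials token is out of scope here, and the function name is hypothetical.

```python
def graph_sendmail_request(sender, to_addrs, subject, body_text, token):
    """Return (url, headers, payload) for Microsoft Graph sendMail.

    `token` is assumed to be a client_credentials access token for an app
    that has the Mail.Send application permission.
    """
    url = f"https://graph.microsoft.com/v1.0/users/{sender}/sendMail"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    payload = {
        "message": {
            "subject": subject,
            "body": {"contentType": "Text", "content": body_text},
            "toRecipients": [
                {"emailAddress": {"address": addr}} for addr in to_addrs
            ],
        }
    }
    return url, headers, payload
```

Keeping the request construction separate from the HTTP call makes it easy to preview the exact payload before anything is sent.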

@@ -0,0 +1,14 @@
---
name: project_memory_consolidation_automation
description: Goal - automate the memory lint/consolidation (/memory-dream) once the additive proposals prove trustworthy; phased rollout
metadata:
type: project
---
Mike wants the memory consolidation/lint process automated eventually (stated 2026-06-01). Phased so we never auto-wipe useful data:
1. **Now:** `/memory-dream` runs report-only by default; `--apply-safe` does additive-only fixes (append missing index lines, migrate profile-only memories in). Merges/dedups/deletions are PROPOSED only, human-approved. Build trust in the proposals first.
2. **Next - automate the read-only half:** the lint pass + additive `--apply-safe` are non-destructive, so safe to run unattended via a hook (e.g. Stop/SessionStart runs the lint and, on findings, drops a coord todo - NOT a cron/remote scheduler, per [[feedback_scheduling_via_coord_todo]]).
3. **Later - automate consolidation with guardrails:** auto-apply only high-confidence merges; low-confidence stays a proposal; deletion stays gated regardless. Earn the automation.
Engine already exists: skill `.claude/skills/memory-dream/` (+ `scripts/memory_dream.py`). Cross-machine sync of the repo memory store is handled by `.claude/scripts/sync-memory.sh` (additive union repo<->harness-profile, SessionStart hook). See also [[feedback_memory_repo_not_profile]] if/when written re: the two-store model (repo `.claude/memory/` syncs via Gitea; harness `~/.claude/projects/<slug>/memory/` is machine-local and auto-injected).
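The additive union between the two stores can be sketched as a planning step that touches nothing. This is an illustration only, not the logic of `sync-memory.sh` itself: the function name is hypothetical, and comparing only top-level regular files is an assumption.

```python
import filecmp
from pathlib import Path


def plan_union(repo_dir, profile_dir):
    """Plan an additive union of two directories without modifying either.

    Returns (to_profile, to_repo, conflicts) as sorted lists of file names.
    Nothing is deleted or overwritten; files that differ are only reported.
    """
    repo_dir, profile_dir = Path(repo_dir), Path(profile_dir)
    repo = {p.name for p in repo_dir.iterdir() if p.is_file()}
    profile = {p.name for p in profile_dir.iterdir() if p.is_file()}

    to_profile = sorted(repo - profile)   # present only in the repo store
    to_repo = sorted(profile - repo)      # present only in the profile store
    conflicts = sorted(
        name
        for name in repo & profile
        # shallow=False compares file contents, not just size and mtime
        if not filecmp.cmp(repo_dir / name, profile_dir / name, shallow=False)
    )
    return to_profile, to_repo, conflicts
```

Separating the plan from the copy step is what makes a dry run trivial: print the three lists and stop.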

@@ -0,0 +1,18 @@
---
name: project-pluto-build-server
description: "Pluto Windows build server — location, role, and access details"
metadata:
node_type: memory
type: project
originSessionId: 541d4004-8c45-4290-89f5-0ba9ee4e64a9
---
Pluto (`PLUTO`, 172.16.3.36) is a Windows Server 2019 VM hosted on Jupiter (Unraid primary).
**Why:** It is the primary Windows build server for GuruRMM — builds all Windows agent variants (amd64, x86, legacy, debug), runs WiX 4 MSI builds, and signs binaries via Azure Trusted Signing.
**Credentials:** Administrator / `Paper123!@#` (set 2026-05-15). SSH key: `guru@gururmm-build` (ed25519, `Q+ivqd/...`) must be in `C:\ProgramData\ssh\administrators_authorized_keys` with icacls `/inheritance:r` and ASCII encoding (not UTF-16).
**How to apply:** When Pluto is unreachable or SSH auth fails, check Jupiter's VM console first (not physical machine). SSH key file must be ASCII-encoded — PowerShell `>` writes UTF-16 and breaks auth silently. Use `[System.IO.File]::WriteAllText(..., [System.Text.Encoding]::ASCII)` to write the key.
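Because the encoding failure is silent, a quick check of the raw key-file bytes can help when SSH auth breaks. This is a sketch with a hypothetical function name; it inspects bytes already read from the file and does not touch the file itself.

```python
def key_file_problem(data):
    """Describe an encoding problem in raw authorized_keys bytes, or return None."""
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "UTF-16 byte-order mark found"
    if b"\x00" in data:
        return "NUL bytes found (likely UTF-16 without a byte-order mark)"
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return "non-ASCII bytes found"
    return None
```

A return value of `None` means the bytes are plain ASCII; anything else names the first problem detected.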
**GuruRMM agent:** Installed but historically runs old versions (was on 0.6.3 as of 2026-05-15). Update it after any Pluto maintenance.

@@ -0,0 +1,32 @@
---
name: Gitea Internal API Access
description: git.azcomputerguru.com is NOT behind Cloudflare — it's the office Cox IP NAT'd to NPM (openresty) on Jupiter. Prefer internal 172.16.3.20:3000 for reliability (bypasses NPM SSL-renewal reload blips)
type: reference
originSessionId: 511840e9-1aba-40e6-a81e-4905bac958ec
---
**CORRECTED 2026-05-27** (prior note claimed "behind Cloudflare / curl gets a JS challenge" — that is WRONG/outdated).
`git.azcomputerguru.com` resolves to a **direct public A record `72.194.62.10`** (an ACG-office Cox static IP, adjacent to ix at .5 — `wsip-72-194-62-10.ph.ph.cox.net`). NOT Cloudflare-proxied (same answer from 1.1.1.1; no CF edge IP). Path: `.10` → office firewall NAT → **NPM (Nginx Proxy Manager = openresty) on Jupiter `172.16.3.20`** → Gitea container `:3000`. The NPM proxy host is `/data/nginx/proxy_host/4.conf`. `curl`/HTTPS works fine and returns `200` (Server: openresty) — there is no challenge page.
**Why prefer the internal address for API/git on-network:** the external path goes through NPM, which periodically renews its SSL certs and reloads openresty — that briefly drops external `:443` (observed 2026-05-27: ~7-9 min TCP-timeout window, self-recovered when renewal completed). The internal address bypasses NPM, so it's faster and immune to those renewal blips. It is NOT about Cloudflare.
Use the internal LAN/Tailscale address:
```
http://172.16.3.20:3000/api/v1/...
```
Works when on LAN or when Tailscale is connected. Requires the API token from vault:
```bash
bash D:/vault/scripts/vault.sh get-field services/gitea.sops.yaml credentials.api.api-token
# 9b1da4b79a38ef782268341d25a4b6880572063f
```
Example issue creation:
```bash
TOKEN="9b1da4b79a38ef782268341d25a4b6880572063f"
curl -s -X POST "http://172.16.3.20:3000/api/v1/repos/azcomputerguru/gururmm/issues" \
  -H "Authorization: token $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"title": "...", "body": "..."}'
```

@@ -0,0 +1,333 @@
#!/bin/sh
# sync-memory.sh -- additive union between the REPO memory store and the
# machine-local HARNESS PROFILE memory store.
#
# WHY: There are TWO memory stores on every machine:
#   REPO store    : <root>/.claude/memory/ (git-tracked, synced via Gitea -- SOURCE OF TRUTH)
#   PROFILE store : $HOME/.claude/projects/<slug>/memory/ (machine-local, NOT in git;
#                   the harness auto-injects THIS into the prompt)
# They drift. This script keeps the auto-injected PROFILE store in step with the
# synced REPO store, while preserving any memories authored only on this machine.
#
# ADDITIVE UNION -- NO DELETES, NO DESTRUCTIVE OVERWRITES:
#   * file in REPO but not in PROFILE -> copy REPO -> PROFILE
#   * file in PROFILE but not in REPO -> copy PROFILE -> REPO (capture local-only for git sync)
#   * file in BOTH, identical         -> nothing
#   * file in BOTH, content differs   -> DO NOT overwrite; log "CONFLICT (manual review)"
#   * never deletes from either side.
#
# Idempotent and safe to run repeatedly on every machine (Windows Git Bash,
# macOS, Linux). ASCII output only. No hardcoded drive paths.
#
# Flags:
#   --dry-run   show what WOULD be copied; copy nothing, create nothing.
#
# Exit: 0 normally (including when conflicts are present -- they are reported,
#       not fatal). Non-zero only on a hard setup error (no repo, no memory dir).
set -u
DRY_RUN=0
for arg in "$@"; do
    case "$arg" in
        --dry-run) DRY_RUN=1 ;;
        -h|--help)
            echo "Usage: sync-memory.sh [--dry-run]"
            echo " Additive union of the repo and harness-profile memory stores."
            echo " --dry-run print planned copies; change nothing."
            exit 0
            ;;
        *)
            echo "[ERROR] unknown argument: $arg" >&2
            echo "Usage: sync-memory.sh [--dry-run]" >&2
            exit 2
            ;;
    esac
done
# --- Resolve CLAUDETOOLS_ROOT ---
# Order: env var, else walk up from the script dir; identity.json may then
# override either result; git toplevel is the last resort.
# Absolute dir of this script, POSIX-portable (no readlink -f on macOS).
script_path="$0"
case "$script_path" in
    /*) : ;;
    *) script_path="$(pwd)/$script_path" ;;
esac
SCRIPT_DIR="$(CDPATH= cd -- "$(dirname -- "$script_path")" && pwd)"
ROOT=""
if [ -n "${CLAUDETOOLS_ROOT:-}" ] && [ -d "${CLAUDETOOLS_ROOT}" ]; then
    ROOT="$CLAUDETOOLS_ROOT"
fi
# Walk up from the script dir to find a dir containing .claude/ (covers the
# normal layout <root>/.claude/scripts/sync-memory.sh).
if [ -z "$ROOT" ]; then
    d="$SCRIPT_DIR"
    while [ "$d" != "/" ] && [ -n "$d" ]; do
        if [ -d "$d/.claude" ]; then
            ROOT="$d"
            break
        fi
        parent="$(dirname -- "$d")"
        [ "$parent" = "$d" ] && break
        d="$parent"
    done
fi
# identity.json may override with claudetools_root (authoritative per machine).
if [ -n "$ROOT" ] && [ -f "$ROOT/.claude/identity.json" ]; then
    ID_ROOT="$(
        PYBIN=""
        for c in py python3 python; do
            if command -v "$c" >/dev/null 2>&1; then PYBIN="$c"; break; fi
        done
        if [ -n "$PYBIN" ]; then
            "$PYBIN" - "$ROOT/.claude/identity.json" <<'PYEOF' 2>/dev/null
import json, os, sys
try:
    with open(sys.argv[1]) as f:
        d = json.load(f)
    r = d.get("claudetools_root")
    if r and os.path.isdir(r):
        # Write without a newline so Windows text-mode output adds no CR.
        sys.stdout.write(r)
except Exception:
    pass
PYEOF
        fi
    )"
    if [ -n "$ID_ROOT" ] && [ -d "$ID_ROOT" ]; then
        ROOT="$ID_ROOT"
    fi
fi
# Last resort: git toplevel.
if [ -z "$ROOT" ]; then
    ROOT="$(git -C "$SCRIPT_DIR" rev-parse --show-toplevel 2>/dev/null || true)"
fi
if [ -z "$ROOT" ] || [ ! -d "$ROOT/.claude" ]; then
    echo "[ERROR] could not resolve CLAUDETOOLS_ROOT (no .claude dir found)." >&2
    exit 1
fi
REPO_MEM="$ROOT/.claude/memory"
if [ ! -d "$REPO_MEM" ]; then
    echo "[ERROR] repo memory dir not found: $REPO_MEM" >&2
    exit 1
fi
# --- Derive the harness profile memory dir ---------------------------------
# Slug = absolute project path with every run of non-alphanumeric chars -> '-'.
# Must match memory_dream.py's derivation. Prefer CLAUDE_PROJECT_DIR if set.
HOME_DIR="${HOME:-$(cd ~ 2>/dev/null && pwd)}"
PROJECT_DIR="${CLAUDE_PROJECT_DIR:-$ROOT}"
# Absolutize PROJECT_DIR.
case "$PROJECT_DIR" in
    /*) : ;;
    *) PROJECT_DIR="$(CDPATH= cd -- "$PROJECT_DIR" 2>/dev/null && pwd || echo "$PROJECT_DIR")" ;;
esac
# Compute the slug as a single-dash collapse: every run of non-alphanumeric
# chars becomes one '-'. Prefer Python (exact, matches the analyzer) when
# available; otherwise fall back to sed.
SLUG_SINGLE=""
PYBIN=""
for c in py python3 python; do
    if command -v "$c" >/dev/null 2>&1; then PYBIN="$c"; break; fi
done
if [ -n "$PYBIN" ]; then
    SLUG_SINGLE="$(
        "$PYBIN" - "$PROJECT_DIR" <<'PYEOF' 2>/dev/null
import os, re, sys
# Write without a newline so Windows text-mode output adds no CR.
sys.stdout.write(re.sub(r"[^A-Za-z0-9]+", "-", os.path.abspath(sys.argv[1])))
PYEOF
    )"
fi
if [ -z "$SLUG_SINGLE" ]; then
    SLUG_SINGLE="$(printf '%s' "$PROJECT_DIR" | sed 's/[^A-Za-z0-9][^A-Za-z0-9]*/-/g')"
fi
# The Claude Code harness maps a Windows drive colon to '--' (so
# "D:\claudetools" -> "D--claudetools"), while the single-dash collapse above
# produces "D-claudetools". Reproduce the harness rule by doubling a leading
# "<drive>-" into "<drive>--".
SLUG_DOUBLE="$(printf '%s' "$SLUG_SINGLE" | sed 's/^\([A-Za-z]\)-/\1--/')"
# Resolve the profile dir. Try the EXACT candidate slugs in priority order
# (harness double-dash first, then single-dash collapse); use the first whose
# profile memory dir actually exists.
PROJ_ROOT="$HOME_DIR/.claude/projects"
SLUG=""
PROFILE_MEM=""
SKIP_PROFILE=0
for cand_slug in "$SLUG_DOUBLE" "$SLUG_SINGLE"; do
    [ -n "$cand_slug" ] || continue
    # Accept the slug as soon as its project dir exists. If the nested
    # memory/ dir is still missing, it is created later (apply mode only).
    if [ -d "$PROJ_ROOT/$cand_slug" ]; then
        SLUG="$cand_slug"
        PROFILE_MEM="$PROJ_ROOT/$cand_slug/memory"
        break
    fi
done
# ONLY if no exact candidate exists, fall back to a case-insensitive tail-scan
# of $HOME/.claude/projects/*/memory for a dir whose slug ends with the repo's
# basename. If MORE THAN ONE dir matches, do NOT guess -- skip the profile side
# entirely to avoid cross-project contamination.
if [ -z "$PROFILE_MEM" ]; then
# Default to the harness (double-dash) slug for messaging / dir creation.
SLUG="$SLUG_DOUBLE"
PROFILE_MEM="$PROJ_ROOT/$SLUG/memory"
if [ -d "$PROJ_ROOT" ]; then
BASE="$(basename -- "$ROOT")"
BASE_SLUG="$(printf '%s' "$BASE" | sed 's/[^A-Za-z0-9][^A-Za-z0-9]*/-/g' | tr 'A-Z' 'a-z')"
MATCH_COUNT=0
MATCH_LIST=""
MATCH_DIR=""
for cand in "$PROJ_ROOT"/*/memory; do
[ -d "$cand" ] || continue
cand_parent="$(basename -- "$(dirname -- "$cand")")"
cand_lc="$(printf '%s' "$cand_parent" | tr 'A-Z' 'a-z')"
case "$cand_lc" in
*"$BASE_SLUG")
MATCH_COUNT=$((MATCH_COUNT + 1))
MATCH_DIR="$cand"
if [ -z "$MATCH_LIST" ]; then
MATCH_LIST="$cand_parent"
else
MATCH_LIST="$MATCH_LIST, $cand_parent"
fi
;;
esac
done
if [ "$MATCH_COUNT" -gt 1 ]; then
echo "[WARNING] multiple profile dirs matched ($MATCH_LIST); skipping profile sync to avoid cross-project contamination"
SKIP_PROFILE=1
elif [ "$MATCH_COUNT" -eq 1 ]; then
PROFILE_MEM="$MATCH_DIR"
SLUG="$(basename -- "$(dirname -- "$MATCH_DIR")")"
fi
fi
fi
echo "[INFO] sync-memory.sh"
if [ "$DRY_RUN" -eq 1 ]; then
echo "[INFO] MODE: DRY-RUN (no files will be copied or created)"
else
echo "[INFO] MODE: APPLY (additive union)"
fi
echo "[INFO] repo store : $REPO_MEM"
if [ "$SKIP_PROFILE" -eq 1 ]; then
echo "[INFO] profile store : (skipped -- ambiguous match)"
echo "[INFO] profile slug : (skipped -- ambiguous match)"
echo "[INFO] ----- summary -----"
echo "[INFO] profile sync SKIPPED: multiple candidate profile dirs matched; refusing to guess."
echo "[INFO] additive-only: no file was deleted or overwritten on either side."
exit 0
fi
echo "[INFO] profile store : $PROFILE_MEM"
echo "[INFO] profile slug : $SLUG"
# Create the profile dir (apply mode only).
if [ ! -d "$PROFILE_MEM" ]; then
if [ "$DRY_RUN" -eq 1 ]; then
echo "[INFO] would create profile dir: $PROFILE_MEM"
else
mkdir -p "$PROFILE_MEM" || {
echo "[ERROR] could not create profile dir: $PROFILE_MEM" >&2
exit 1
}
echo "[OK] created profile dir: $PROFILE_MEM"
fi
fi
# --- Build the union of *.md basenames (excluding MEMORY.md) ----------------
# MEMORY.md (the human index) is intentionally NOT synced -- the repo index is
# authoritative and the profile index is regenerated by the harness; copying it
# either way would create a guaranteed perpetual "conflict".
list_md() {
# $1 = dir ; prints basenames of *.md except MEMORY.md, one per line
dir="$1"
[ -d "$dir" ] || return 0
for f in "$dir"/*.md; do
[ -e "$f" ] || continue
b="$(basename -- "$f")"
case "$b" in
MEMORY.md|memory.md) continue ;;
esac
printf '%s\n' "$b"
done
}
# Collect names into a deduped, sorted list using a temp file (portable).
NAMES_TMP="$(mktemp 2>/dev/null || echo "${TMPDIR:-/tmp}/sync-memory-names.$$")"
trap 'rm -f "$NAMES_TMP"' EXIT INT TERM
{
list_md "$REPO_MEM"
[ -d "$PROFILE_MEM" ] && list_md "$PROFILE_MEM"
} | sort -u > "$NAMES_TMP"
copied_r2p=0
copied_p2r=0
conflicts=0
identical=0
# Copy helper honoring dry-run.
do_copy() {
src="$1"; dst="$2"; label="$3"
if [ "$DRY_RUN" -eq 1 ]; then
echo "[DRY-RUN] would copy $label : $(basename -- "$src")"
else
# Belt-and-suspenders no-clobber: even though the caller only invokes
# do_copy when the destination is absent, re-check here to narrow the
# TOCTOU window and protect against future refactors. ADDITIVE-ONLY:
# never overwrite an existing destination.
if [ -e "$dst" ]; then
echo "[SKIP] dst appeared, not overwriting: $(basename -- "$dst")"
return 0
fi
# cp -p preserves mtime; portable across BSD/GNU.
if cp -p "$src" "$dst" 2>/dev/null || cp "$src" "$dst"; then
echo "[OK] copied $label : $(basename -- "$src")"
else
echo "[ERROR] failed to copy $label : $(basename -- "$src")" >&2
return 1
fi
fi
}
while IFS= read -r name; do
[ -n "$name" ] || continue
r="$REPO_MEM/$name"
p="$PROFILE_MEM/$name"
if [ -f "$r" ] && [ ! -f "$p" ]; then
do_copy "$r" "$p" "REPO->PROFILE" && copied_r2p=$((copied_r2p + 1))
elif [ ! -f "$r" ] && [ -f "$p" ]; then
do_copy "$p" "$r" "PROFILE->REPO" && copied_p2r=$((copied_p2r + 1))
elif [ -f "$r" ] && [ -f "$p" ]; then
if cmp -s "$r" "$p"; then
identical=$((identical + 1))
else
echo "[CONFLICT] $name : repo and profile differ -- left untouched (manual review)"
conflicts=$((conflicts + 1))
fi
fi
done < "$NAMES_TMP"
echo "[INFO] ----- summary -----"
echo "[INFO] copied REPO -> PROFILE : $copied_r2p"
echo "[INFO] copied PROFILE -> REPO : $copied_p2r"
echo "[INFO] identical (no action) : $identical"
echo "[INFO] conflicts (manual review): $conflicts"
if [ "$DRY_RUN" -eq 1 ]; then
echo "[INFO] DRY-RUN: nothing was changed."
fi
echo "[INFO] additive-only: no file was deleted or overwritten on either side."
exit 0

@@ -29,6 +29,17 @@
}
]
}
],
"SessionStart": [
{
"hooks": [
{
"type": "command",
"command": "bash -c '[ -f \"${CLAUDE_PROJECT_DIR}/.claude/scripts/sync-memory.sh\" ] && bash \"${CLAUDE_PROJECT_DIR}/.claude/scripts/sync-memory.sh\" || true'",
"timeout": 30
}
]
}
]
}
}

@@ -0,0 +1,131 @@
---
name: memory-dream
description: >-
Memory lint + consolidation analyzer for the ClaudeTools REPO memory store
(.claude/memory/). Audits the index, backlinks, referenced file paths,
duplicate/overlap clusters, stale dated facts, and drift against the
machine-local harness profile memory store. ADDITIVE-ONLY: the default run is
read-only and mutates nothing; --apply-safe performs only non-destructive
additive fixes (append missing index lines, migrate profile-only memories
into the repo). It NEVER deletes, merges, truncates, or overwrites -- every
destructive idea is surfaced as a PROPOSED action for a human to approve.
Invoke for: "memory dream", "consolidate memory", "memory lint", "clean up
memory", "memory errors", "dedupe memory".
---
# Memory Dream
A read-only-by-default analyzer that keeps the shared memory store healthy
without ever risking the data in it.
## The two-store model (important)
There are TWO separate memory stores on every machine:
- REPO store -- `.claude/memory/` (88+ `*.md` files + `MEMORY.md` index).
Tracked in git, syncs to all machines via Gitea. **This is the source of
truth.** `CLAUDE.md` mandates writing here.
- HARNESS PROFILE store -- `$HOME/.claude/projects/<slug>/memory/`. Machine
local, NOT in git, NOT synced. This is the store the Claude Code harness
auto-injects into the system prompt at session start.
The two drift over time. `memory-dream` reports that drift; the companion infra
script `.claude/scripts/sync-memory.sh` reconciles it (additively).
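
As a rough illustration of how the `<slug>` in the profile path can be derived, here is a minimal sketch. It assumes the rule described in the scripts (collapse every non-alphanumeric run to `-`, then double a leading `<drive>-` to mimic the harness on Windows); `candidate_slugs` is a hypothetical helper name, not a function in this repo, and the example paths are made up.

```python
import re


def candidate_slugs(project_path: str) -> list[str]:
    """Return candidate profile-dir slugs, harness (double-dash) form first."""
    # Collapse every run of non-alphanumeric characters into a single '-'.
    single = re.sub(r"[^A-Za-z0-9]+", "-", project_path)
    # Assumed harness rule: a leading "<drive>-" becomes "<drive>--".
    double = re.sub(r"^([A-Za-z])-", r"\1--", single)
    # Deduplicate while keeping priority order (POSIX paths yield one slug).
    return list(dict.fromkeys([double, single]))


print(candidate_slugs("D:\\claudetools"))
print(candidate_slugs("/home/mike/claudetools"))
```

A Windows path produces two candidates to probe under `$HOME/.claude/projects/`, while a POSIX path produces only one because no drive letter is involved.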
## What it checks
`scripts/memory_dream.py` runs six READ-ONLY analyses over the REPO store:
1. INDEX RECONCILE -- orphan files (no `MEMORY.md` line), index lines whose
target file is missing, and frontmatter `name:` vs filename signals.
2. BACKLINKS -- `[[name]]` references in bodies whose target slug has no file.
3. REFERENCED-ARTIFACT VALIDITY -- conservatively extracts repo-relative file
paths / script names from each body (backtick-wrapped single tokens only)
and flags ones not found in the repo. Reported as **verify**, never delete
(many are legitimately server-side or in sibling repos).
4. DUPLICATE / OVERLAP CLUSTERS -- groups memories by type + token-overlap /
shared slug-prefix and lists candidate mergeable clusters (e.g. the many
`feedback_syncro_*` files). **Proposes** merges; never performs them.
5. STALE DATED FACTS -- flags `project`-type memories whose body carries a
dated claim ("as of <date>" or any other ISO `YYYY-MM-DD` date) older than
~6 months for re-verification.
6. DRIFT vs PROFILE STORE -- locates the harness profile memory dir for this
project and reports profile-only files (candidates to migrate INTO the repo)
and repo-only files (candidates to push OUT to profile). Report only.
The report ends with a `## PROPOSED (needs human approval)` section that is
NEVER auto-applied.
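
To make check 4 more concrete, the token-overlap idea can be sketched as below. This is a simplified illustration, not the analyzer itself: the stopword list is shortened, the slug-prefix bias is left out, and the file names and descriptions are invented for the example.

```python
import re

# Shortened, illustrative stopword list (the real one is longer).
STOPWORDS = {"the", "and", "for", "with", "not", "via", "use"}


def tokenize(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, minus stopwords and very short words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def overlap_pairs(descriptions: dict[str, str], threshold: float = 0.34):
    """Return filename pairs whose token overlap meets the threshold."""
    names = sorted(descriptions)
    sigs = {n: tokenize(descriptions[n]) for n in names}
    pairs = []
    for i, left in enumerate(names):
        for right in names[i + 1:]:
            if jaccard(sigs[left], sigs[right]) >= threshold:
                pairs.append((left, right))
    return pairs


# Hypothetical memories, for illustration only.
memories = {
    "feedback_syncro_billing.md": "Syncro billing invoices need manual review",
    "feedback_syncro_invoices.md": "Review Syncro invoices before billing runs",
    "reference_vpn_setup.md": "VPN setup steps for remote sites",
}
print(overlap_pairs(memories))
```

Pairs found this way would only ever be listed as merge candidates; nothing in the sketch touches a file.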
## Modes
- Default (no flag) -- **REPORT ONLY. Mutates nothing.** Writes a timestamped
report to `.claude/memory/_reports/YYYY-MM-DD-HHMM-dream.md` (created if
missing) and prints it to stdout.
- `--apply-safe` -- performs ONLY additive, non-destructive fixes and prints
each action:
- (a) append missing index lines to `MEMORY.md` for orphan files, under the
correct `## <Type>` header, never reordering or removing existing lines;
- (b) copy profile-only memory files INTO the repo store (additive
migration). If a same-named repo file already exists it is SKIPPED and the
conflict is reported -- it is never overwritten.
- `--no-file` -- print to stdout only; skip writing the `_reports/` file.
- `--report-file <path>` -- write the report to an explicit path.
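
The migration rule in (b) boils down to a four-way classification of file names across the two stores. A minimal sketch, assuming each store has already been read into a name-to-bytes mapping (`classify` and the sample names are hypothetical):

```python
def classify(repo: dict[str, bytes], profile: dict[str, bytes]) -> dict[str, list[str]]:
    """Sort memory file names into repo-only, profile-only, identical, conflict."""
    result = {"repo_only": [], "profile_only": [], "identical": [], "conflict": []}
    for name in sorted(set(repo) | set(profile)):
        if name.lower() == "memory.md":
            continue  # the index is never migrated in either direction
        if name not in profile:
            result["repo_only"].append(name)
        elif name not in repo:
            result["profile_only"].append(name)  # safe to copy INTO the repo
        elif repo[name] == profile[name]:
            result["identical"].append(name)
        else:
            result["conflict"].append(name)      # reported, never overwritten
    return result


repo_store = {"a.md": b"1", "both.md": b"x", "same.md": b"s", "MEMORY.md": b"idx"}
profile_store = {"b.md": b"2", "both.md": b"y", "same.md": b"s"}
print(classify(repo_store, profile_store))
```

Only the `profile_only` bucket would be acted on by an additive migration; `conflict` entries stay untouched on both sides.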
### The guarantee
`memory-dream` NEVER:
- deletes a memory file (either store),
- removes or reorders an index line,
- overwrites a file whose content differs,
- performs a proposed merge or dedup.
All of the above stay in the report as **proposals** for a human to action.
This is deliberate: "additive at first so we don't wipe useful data."
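
One way to enforce "never overwrite" at the file level is exclusive-create mode. The sketch below is an alternative technique shown for illustration, not how the scripts in this commit do it (they check for the destination before copying); `copy_no_clobber` is a hypothetical helper.

```python
import os
import tempfile


def copy_no_clobber(src: str, dst: str) -> bool:
    """Copy src to dst only if dst does not exist; return True if copied."""
    with open(src, "rb") as fin:
        data = fin.read()
    try:
        # Mode "xb" is exclusive create: it raises FileExistsError if dst exists,
        # so an existing destination can never be truncated by this call.
        with open(dst, "xb") as fout:
            fout.write(data)
    except FileExistsError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "profile_note.md")
    dst = os.path.join(tmp, "repo_note.md")
    with open(src, "w", encoding="utf-8") as fh:
        fh.write("from profile\n")
    print(copy_no_clobber(src, dst))  # True: dst was created
    print(copy_no_clobber(src, dst))  # False: dst already exists, left alone
```

Because the existence check and the create happen in one system call, there is no window in which a file that appears late could be clobbered.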
## Running it
This machine's Python launcher is `py` (per identity.json); the script also
runs under `python` / `python3`. Stdlib only -- no pip deps.
```bash
# REPORT ONLY (default) -- writes _reports/<stamp>-dream.md and prints it
py "$CLAUDETOOLS_ROOT/.claude/skills/memory-dream/scripts/memory_dream.py"
# report to stdout only, write nothing
py "$CLAUDETOOLS_ROOT/.claude/skills/memory-dream/scripts/memory_dream.py" --no-file
# additive-only fixes (append orphan index lines, migrate profile-only files)
py "$CLAUDETOOLS_ROOT/.claude/skills/memory-dream/scripts/memory_dream.py" --apply-safe
```
`CLAUDETOOLS_ROOT` resolves from the env var, else `claudetools_root` in
`.claude/identity.json`, else the repo root derived from the script's own
location -- no hardcoded drive letters.
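
The resolution order can be sketched as follows. This is a simplified illustration under the assumptions stated in the paragraph above; `resolve_root` is a hypothetical name, and the environment is passed in as a plain mapping so the behavior is easy to exercise.

```python
import json
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def resolve_root(env: Mapping[str, str], script_path: Path) -> Optional[Path]:
    """Resolve the repo root: env var, then identity.json, then script location."""
    # 1. An explicit environment variable wins if it names a real directory.
    env_root = env.get("CLAUDETOOLS_ROOT")
    if env_root and Path(env_root).is_dir():
        return Path(env_root).resolve()
    # 2./3. Walk up from the script to the first ancestor that holds .claude/.
    for parent in script_path.resolve().parents:
        if not (parent / ".claude").is_dir():
            continue
        ident = parent / ".claude" / "identity.json"
        root = None
        if ident.is_file():
            try:
                data = json.loads(ident.read_text(encoding="utf-8"))
            except (OSError, ValueError):
                data = None
            if isinstance(data, dict):
                root = data.get("claudetools_root")
        if isinstance(root, str) and Path(root).is_dir():
            return Path(root).resolve()  # identity.json override
        return parent.resolve()          # derived from the script location
    return None


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "repo" / ".claude" / "skills" / "memory-dream" / "scripts" / "memory_dream.py"
    script.parent.mkdir(parents=True)
    script.touch()
    print(resolve_root({}, script).name)  # repo
```

In real use the first argument would be `os.environ` and the second the path of the running script.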
## Cleanup / approve workflow
1. Run with no flag. Read the report (stdout or `_reports/<stamp>-dream.md`).
2. Run `--apply-safe` to take the safe additive wins: orphan index lines get
added, profile-only memories get migrated into the repo (conflicts skipped
and reported).
3. Work the `## PROPOSED` section by hand:
- `[MERGE?]` -- decide whether to consolidate a cluster. If yes, author a new
combined memory, keep the originals or retire them deliberately, and update
`MEMORY.md`. Never bulk-delete.
- `[REVERIFY?]` -- confirm the dated fact still holds; update the body and
its date if it changed.
- `[STALE-REF?]` -- confirm the referenced path moved/renamed; repoint or
annotate. Many are legitimately server-side (`.service` units, `/opt/...`).
- `[INDEX-CLEANUP?]` / `[DRIFT-RESOLVE?]` -- human picks the winner.
4. Commit the repo store changes so they sync to the fleet via Gitea.
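
When working a `[REVERIFY?]` item, it can help to see how a date ends up flagged. A minimal sketch, assuming the same "6 months is roughly 180 days" approximation the analyzer uses; `stale_dates` is a hypothetical helper and the sample sentence is invented.

```python
import datetime
import re

ISO_DATE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")
STALE_DAYS = 6 * 30  # six months, approximated as 30-day months


def stale_dates(body: str, today: datetime.date) -> list[tuple[str, int]]:
    """Return (date string, age in days) for each ISO date older than the cutoff."""
    hits = []
    # dict.fromkeys drops repeated dates while keeping first-seen order.
    for text in dict.fromkeys(ISO_DATE.findall(body)):
        try:
            when = datetime.date.fromisoformat(text)
        except ValueError:
            continue  # looks like a date but is not a valid calendar day
        age = (today - when).days
        if age > STALE_DAYS:
            hits.append((text, age))
    return hits


today = datetime.date(2026, 6, 1)
print(stale_dates("Server live as of 2025-01-15; updated 2026-05-20.", today))
```

A flagged date only means "check this again"; the fix is to confirm the fact and update the body and its date by hand.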
## Self-test
`scripts/selftest.py` runs the analyzer against a synthetic fixture memory
store in a temp dir and asserts each detector fires (orphan, missing target,
broken backlink, stale path, cluster, profile drift) and that `--apply-safe` is
strictly additive (no deletions, no overwrites). Run:
```bash
py "$CLAUDETOOLS_ROOT/.claude/skills/memory-dream/scripts/selftest.py"
```
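
For intuition, a "strictly additive" assertion can be expressed as a before/after comparison of the store. The sketch below is an assumption about what such a check could look like, not the contents of `selftest.py`; it treats `MEMORY.md` as append-only and every other file as immutable, and both helper names are hypothetical.

```python
def lines_preserved(old_text: str, new_text: str) -> bool:
    """True if every old line still appears in new_text, in the same order."""
    remaining = iter(new_text.splitlines())
    # `line in remaining` consumes the iterator, so this is a subsequence test.
    return all(line in remaining for line in old_text.splitlines())


def strictly_additive(before: dict[str, str], after: dict[str, str],
                      append_only: frozenset = frozenset({"MEMORY.md"})) -> bool:
    """Compare two name->text snapshots of a memory store."""
    for name, old_text in before.items():
        if name not in after:
            return False  # a file was deleted
        if name in append_only:
            if not lines_preserved(old_text, after[name]):
                return False  # an index line was removed or reordered
        elif after[name] != old_text:
            return False  # a memory body was overwritten
    return True


before = {"MEMORY.md": "## Feedback\n- [A](a.md)\n", "a.md": "alpha"}
after = {"MEMORY.md": "## Feedback\n- [A](a.md)\n- [B](b.md)\n", "a.md": "alpha", "b.md": "beta"}
print(strictly_additive(before, after))
```

New files and appended index lines pass; any deletion, overwrite, or index reordering fails.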

@@ -0,0 +1,903 @@
#!/usr/bin/env python3
"""
memory_dream.py -- memory lint + consolidation analyzer for the ClaudeTools REPO
memory store (.claude/memory/).
ADDITIVE-ONLY by design. The default run is READ-ONLY and mutates nothing.
The only mutating mode is --apply-safe, which performs ONLY additive,
non-destructive actions:
* append missing index lines to MEMORY.md for orphan memory files
* copy profile-only memory files INTO the repo store (never overwriting)
It NEVER deletes a file, NEVER removes an index line, NEVER overwrites differing
content, and NEVER performs a proposed merge. Every destructive idea stays in
the report as a PROPOSED action for a human to approve.
Stdlib only. Python launcher on Windows fleet is `py`; also runs under
python3/python.
Usage:
py memory_dream.py # REPORT ONLY (default)
py memory_dream.py --apply-safe # additive-only fixes + report
py memory_dream.py --no-file # report to stdout only, skip _reports/ file
py memory_dream.py --report-file X # write report to an explicit path
"""
from __future__ import annotations
import argparse
import datetime
import os
import re
import shutil
import sys
from pathlib import Path
# Windows consoles default to cp1252; memory bodies contain Unicode (arrows,
# em dashes). Force UTF-8 stdout/stderr with replacement so printing never
# crashes regardless of the active code page.
for _stream in (sys.stdout, sys.stderr):
try:
_stream.reconfigure(encoding="utf-8", errors="replace")
except Exception:
pass
# --------------------------------------------------------------------------
# Path resolution -- no hardcoded drive letters.
# --------------------------------------------------------------------------
STALE_MONTHS = 6 # project facts older than this (in "as of <date>") -> re-verify
def _read_identity_root(repo_guess: Path) -> str | None:
"""Best-effort read of claudetools_root from .claude/identity.json."""
ident = repo_guess / ".claude" / "identity.json"
if not ident.is_file():
return None
try:
import json
data = json.loads(ident.read_text(encoding="utf-8"))
root = data.get("claudetools_root")
if root and Path(root).is_dir():
return root
except Exception:
return None
return None
def resolve_claudetools_root() -> Path:
"""
Resolve CLAUDETOOLS_ROOT:
1. env CLAUDETOOLS_ROOT
2. .claude/identity.json claudetools_root (found by walking up from script)
3. derive from this script's location (.../.claude/skills/memory-dream/scripts/)
"""
env_root = os.environ.get("CLAUDETOOLS_ROOT")
if env_root and Path(env_root).is_dir():
return Path(env_root).resolve()
# Walk up from this file looking for a .claude dir.
here = Path(__file__).resolve()
derived = None
for parent in here.parents:
if (parent / ".claude").is_dir():
derived = parent
break
if derived is not None:
ident_root = _read_identity_root(derived)
if ident_root:
return Path(ident_root).resolve()
return derived.resolve()
# Last resort: assume scripts/ -> memory-dream/ -> skills/ -> .claude/ -> ROOT
# (script is at ROOT/.claude/skills/memory-dream/scripts/memory_dream.py)
return here.parents[4].resolve()
def profile_memory_dir(repo_root: Path) -> Path | None:
"""
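Derive the harness profile memory dir for this project.
Slug: take the absolute project path and replace every run of non-alphanumeric
chars with '-'; the harness variant additionally doubles a leading "<drive>-"
into "<drive>--". Look under $HOME/.claude/projects/<slug>/memory/.
Prefers CLAUDE_PROJECT_DIR if set; falls back to repo_root.
Returns the memory dir of the first candidate slug whose project dir exists
(the memory/ subdir itself may not exist yet), else None.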
"""
home = Path(os.environ.get("HOME") or os.path.expanduser("~"))
project_dir = os.environ.get("CLAUDE_PROJECT_DIR") or str(repo_root)
abspath = str(Path(project_dir).resolve())
projects_root = home / ".claude" / "projects"
# The single-dash collapse: replace every run of non-alphanumeric chars with
# a single '-'. This is the historical/POSIX-style derivation.
slug_single = re.sub(r"[^A-Za-z0-9]+", "-", abspath)
# The Claude Code harness maps a Windows drive colon to '--' (so
# "D:\\claudetools" -> "D--claudetools"), but the single-dash collapse above
# produces "D-claudetools". Reproduce the harness rule by doubling a leading
# "<drive>-" into "<drive>--".
slug_double = re.sub(r"^([A-Za-z])-", r"\1--", slug_single)
# Try the EXACT candidate slugs in priority order; use the first whose
# profile memory dir actually exists. The double-dash (harness) variant is
# primary; the single-dash collapse is the secondary exact candidate.
seen: set[str] = set()
for slug in (slug_double, slug_single):
if slug in seen:
continue
seen.add(slug)
base = projects_root / slug
for candidate in (base / "memory", base):
if candidate.is_dir():
# If the slug dir itself was matched (no nested memory/), use the
# conventional memory subdir under it.
return (base / "memory") if candidate == base else candidate
# ONLY if none of the exact candidates exist, fall back to a case-insensitive
# tail-scan of $HOME/.claude/projects/*/memory for a dir whose slug "looks
# like" this repo (tail match on the last path component). If MORE THAN ONE
# dir matches, do NOT guess -- report the ambiguity and skip.
if projects_root.is_dir():
tail = re.sub(r"[^A-Za-z0-9]+", "-", repo_root.name).lower()
matches: list[Path] = []
for child in sorted(projects_root.iterdir()):
if not child.is_dir():
continue
if child.name.lower().endswith(tail):
mem = child / "memory"
if mem.is_dir():
matches.append(mem)
if len(matches) > 1:
names = ", ".join(str(m.parent.name) for m in matches)
print(
f"[WARNING] multiple profile dirs matched ({names}); "
"skipping profile drift analysis to avoid cross-project contamination"
)
return None
if len(matches) == 1:
return matches[0]
return None
# --------------------------------------------------------------------------
# Frontmatter / memory file parsing
# --------------------------------------------------------------------------
class Memory:
def __init__(self, path: Path):
self.path = path
self.filename = path.name
self.slug = path.stem
self.name: str | None = None
self.description: str | None = None
self.type: str | None = None
self.body: str = ""
self._parse()
def _parse(self) -> None:
text = self.path.read_text(encoding="utf-8", errors="replace")
lines = text.splitlines()
if not lines or lines[0].strip() != "---":
# No frontmatter; whole file is body.
self.body = text
return
# Find closing fence.
end = None
for i in range(1, len(lines)):
if lines[i].strip() == "---":
end = i
break
if end is None:
self.body = text
return
fm = lines[1:end]
self.body = "\n".join(lines[end + 1 :])
self._parse_frontmatter(fm)
def _parse_frontmatter(self, fm_lines: list[str]) -> None:
"""
Tolerant YAML-ish parse. Handles:
name: X
description: X (or '>-' folded block following)
type: X (top-level)
metadata:
type: X (nested)
"""
i = 0
in_metadata = False
while i < len(fm_lines):
raw = fm_lines[i]
line = raw.rstrip("\n")
stripped = line.strip()
indent = len(line) - len(line.lstrip())
if not stripped:
i += 1
continue
if stripped == "metadata:":
in_metadata = True
i += 1
continue
# Detect leaving the metadata block (a top-level key reappears).
if in_metadata and indent == 0 and ":" in stripped:
in_metadata = False
m = re.match(r"^([A-Za-z_][\w\-]*):\s*(.*)$", stripped)
if not m:
i += 1
continue
key, val = m.group(1), m.group(2)
# Folded/literal block scalar -> capture following more-indented lines.
if val in (">-", ">", "|", "|-", "|+"):
block_lines = []
j = i + 1
base_indent = indent
while j < len(fm_lines):
nxt = fm_lines[j]
nxt_indent = len(nxt) - len(nxt.lstrip())
if nxt.strip() == "" or nxt_indent > base_indent:
block_lines.append(nxt.strip())
j += 1
else:
break
val = " ".join(x for x in block_lines if x)
i = j
else:
val = val.strip().strip('"').strip("'")
i += 1
if key == "name" and not in_metadata:
self.name = val
elif key == "description":
self.description = val
elif key == "type":
# Both top-level and metadata.type land here.
self.type = (val or "").lower() or None
else:
continue
# --------------------------------------------------------------------------
# Index (MEMORY.md) parsing
# --------------------------------------------------------------------------
INDEX_LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")
# Body backlinks like [[some-name]]
BACKLINK_RE = re.compile(r"\[\[([^\]]+)\]\]")
# "as of <date>" style dated claims.
DATE_RE = re.compile(
r"(?:as of|updated|corrected|lesson|fixed|live)\s+"
r"(\d{4}-\d{2}-\d{2})",
re.IGNORECASE,
)
ISO_DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")
# Type -> index header. Headers are singular except `user`, which maps to "Users".
TYPE_HEADER = {
"reference": "Reference",
"feedback": "Feedback",
"project": "Project",
"user": "Users",
}
def parse_index(index_path: Path):
"""
Returns:
links: list of (title, target, lineno, raw_line)
headers: dict header-name -> lineno
lines: original file lines (no newline)
"""
links = []
headers = {}
if not index_path.is_file():
return links, headers, []
text = index_path.read_text(encoding="utf-8", errors="replace")
lines = text.split("\n")
for idx, line in enumerate(lines):
hm = re.match(r"^##\s+(.+?)\s*$", line)
if hm:
headers[hm.group(1).strip()] = idx
continue
if line.lstrip().startswith("- "):
m = INDEX_LINK_RE.search(line)
if m:
links.append((m.group(1), m.group(2), idx, line))
return links, headers, lines
# --------------------------------------------------------------------------
# Referenced-artifact extraction (conservative)
# --------------------------------------------------------------------------
# Referenced-artifact extraction is intentionally CONSERVATIVE: it only inspects
# backtick-wrapped spans (`...`) and only treats a span as a repo path when the
# whole span is a single path-like token. Extensions are ordered longest-first
# so `identity.json` is never truncated to `identity.js`. We do NOT scan bare
# prose -- too many false positives.
PATHISH_RE = re.compile(r"`([^`\n]+?)`")
# The extension alternation is anchored to end-of-token, so `identity.json` is
# matched as `.json` and never truncated to `.js`. The tuple is kept
# longest-first to match the note above.
KNOWN_EXTS = (
    "service", "json", "yaml", "toml",
    "tsx", "yml", "sql", "ps1",
    "py", "sh", "rs", "ts", "js", "md",
)
EXT_RE = re.compile(r"\.(?:" + "|".join(KNOWN_EXTS) + r")$", re.IGNORECASE)
# Vault-style secret paths live in the SEPARATE vault repo, not claudetools.
VAULT_HINT_RE = re.compile(r"\.sops\.ya?ml$", re.IGNORECASE)
# Tokens we never treat as repo paths.
ABS_PREFIXES = ("/api/", "/home/", "/var/", "/opt/", "/etc/", "/tmp/",
"/proc/", "/dev/", "/data/", "/usr/")
def looks_like_repo_path(token: str) -> bool:
    token = token.strip()
    if not token:
        return False
    # Reject anything with whitespace, glob/placeholder/url/colon characters --
    # those are descriptions or templates, not concrete repo paths.
    if any(c in token for c in (" ", "<", ">", "*", "?", ":", "|", "\\")):
        return False
    if token.startswith(("http://", "https://", "//", "git@", "vault:")):
        return False
    if token.startswith(ABS_PREFIXES):
        return False  # server absolute paths, not repo-relative
    # Vault secret refs belong to the vault repo -- not a staleness signal here.
    if VAULT_HINT_RE.search(token):
        return False
    # Must end in a recognized extension (anchored at end-of-token). Anything
    # that survives the filters above is accepted, whether it is a
    # repo-relative path with a slash or a bare filename.
    return bool(EXT_RE.search(token))
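def extract_referenced_paths(body: str) -> list[str]:
    found = set()
    for m in PATHISH_RE.finditer(body):
        span = m.group(1).strip()
        # A backtick span counts only if the ENTIRE span is one token (a path).
        # Spans with spaces are commands/prose -> skip (avoids `cmd args` noise).
        if not span or " " in span:
            continue
        # Strip only a literal "./" prefix. str.lstrip("./") would also eat the
        # dot of ".claude/..." and the leading slash of "/opt/...", which would
        # defeat the ABS_PREFIXES check in looks_like_repo_path().
        token = span
        while token.startswith("./"):
            token = token[2:]
        if looks_like_repo_path(token):
            found.add(token)
    return sorted(found)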
def repo_path_exists(repo_root: Path, token: str) -> bool:
    # Drop a literal "./" prefix and any leading "/" so the token stays
    # repo-relative, without eating the dot of dot-directories like ".claude/".
    while token.startswith("./"):
        token = token[2:]
    token = token.lstrip("/")
    # Try repo-relative.
    if (repo_root / token).exists():
        return True
    # Bare filename -> search anywhere in repo (cheap, bounded).
    if "/" not in token:
        try:
            return any(True for _ in repo_root.rglob(token))
        except OSError:
            return False
    # Also try matching just the tail (last 2 segments) anywhere, since memories
    # often cite paths relative to a subproject root.
    parts = token.split("/")
    if len(parts) >= 2:
        tail = "/".join(parts[-2:])
        try:
            for p in repo_root.rglob(parts[-1]):
                if str(p).replace("\\", "/").endswith(tail):
                    return True
        except OSError:
            return False
    return False
# --------------------------------------------------------------------------
# Similarity / duplicate clustering (token-overlap heuristic)
# --------------------------------------------------------------------------
# tokenize() splits on non-alphanumerics, so "don't" arrives as "don" + "t";
# list "don" rather than the contraction, which could never match.
STOPWORDS = {
    "the", "a", "an", "and", "or", "to", "of", "in", "on", "for", "with",
    "is", "are", "be", "not", "via", "use", "used", "uses", "no", "never",
    "always", "only", "from", "by", "at", "as", "it", "this", "that",
    "when", "if", "then", "do", "don", "we", "our", "you", "your",
}
def tokenize(text: str) -> set[str]:
toks = re.findall(r"[a-z0-9]+", (text or "").lower())
return {t for t in toks if t not in STOPWORDS and len(t) > 2}
def jaccard(a: set[str], b: set[str]) -> float:
if not a or not b:
return 0.0
inter = len(a & b)
union = len(a | b)
return inter / union if union else 0.0
def cluster_overlaps(mems: list[Memory], threshold: float = 0.34):
"""
Within each type, find pairs with token-overlap >= threshold, then union
them into clusters. Returns list of (type, [filenames]) for clusters >1.
"""
clusters_out = []
by_type: dict[str, list[Memory]] = {}
for m in mems:
by_type.setdefault(m.type or "untyped", []).append(m)
for typ, group in by_type.items():
# token signature per memory: name + description + slug words
sigs = {}
for m in group:
base = " ".join(
filter(None, [m.name, m.description, m.slug.replace("_", " ")])
)
sigs[m.filename] = tokenize(base)
# Also bias by shared slug prefix (e.g. feedback_syncro_*).
parent = {m.filename: m.filename for m in group}
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(x, y):
rx, ry = find(x), find(y)
if rx != ry:
parent[rx] = ry
files = [m.filename for m in group]
slug_prefix = {}
for m in group:
parts = m.slug.split("_")
slug_prefix[m.filename] = "_".join(parts[:2]) if len(parts) >= 2 else m.slug
for i in range(len(files)):
for j in range(i + 1, len(files)):
fi, fj = files[i], files[j]
sim = jaccard(sigs[fi], sigs[fj])
same_prefix = (
slug_prefix[fi] == slug_prefix[fj]
and len(slug_prefix[fi].split("_")) >= 2
)
if sim >= threshold or same_prefix:
union(fi, fj)
groups: dict[str, list[str]] = {}
for f in files:
groups.setdefault(find(f), []).append(f)
for members in groups.values():
if len(members) > 1:
clusters_out.append((typ, sorted(members)))
return clusters_out
# --------------------------------------------------------------------------
# Stale dated facts
# --------------------------------------------------------------------------
def find_stale_dates(mem: Memory, today: datetime.date):
"""Return list of (date_str, age_days) for dated claims older than STALE_MONTHS."""
hits = []
seen = set()
for rx in (DATE_RE, ISO_DATE_RE):
for m in rx.finditer(mem.body):
ds = m.group(1)
if ds in seen:
continue
seen.add(ds)
try:
d = datetime.date.fromisoformat(ds)
except ValueError:
continue
age = (today - d).days
if age > STALE_MONTHS * 30:
hits.append((ds, age))
return hits
# --------------------------------------------------------------------------
# Report
# --------------------------------------------------------------------------
class Report:
def __init__(self):
self.lines: list[str] = []
def add(self, s: str = ""):
self.lines.append(s)
def __str__(self):
return "\n".join(self.lines)
def slugify_link_target(target: str) -> str:
return Path(target).stem
def run(args) -> int:
repo_root = resolve_claudetools_root()
mem_dir = repo_root / ".claude" / "memory"
index_path = mem_dir / "MEMORY.md"
if not mem_dir.is_dir():
print(f"[ERROR] memory dir not found: {mem_dir}", file=sys.stderr)
return 2
today = datetime.date.today()
rpt = Report()
rpt.add("# Memory Dream Report")
rpt.add(f"Generated: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M')}")
rpt.add(f"Repo root: {repo_root}")
rpt.add(f"Memory store: {mem_dir}")
rpt.add(f"Mode: {'APPLY-SAFE (additive)' if args.apply_safe else 'REPORT-ONLY'}")
rpt.add("")
# Load memories.
mem_files = sorted(p for p in mem_dir.glob("*.md") if p.name.lower() != "memory.md")
mems = [Memory(p) for p in mem_files]
mem_by_file = {m.filename: m for m in mems}
rpt.add(f"Loaded {len(mems)} memory files (excluding MEMORY.md).")
rpt.add("")
# ----- 1. INDEX RECONCILE -----
links, headers, index_lines = parse_index(index_path)
indexed_targets = {slugify_link_target(t): (title, t, ln)
for (title, t, ln, _raw) in links}
rpt.add("## 1. INDEX RECONCILE")
rpt.add("")
orphans = [] # files with no index line
for m in mems:
if m.slug not in indexed_targets:
orphans.append(m)
rpt.add(f"### Orphan files (no index line): {len(orphans)}")
for m in orphans:
rpt.add(f"- [INFO] {m.filename} (type={m.type or '?'})")
rpt.add("")
missing_targets = [] # index lines whose file is missing
for title, target, ln, _raw in links:
# Only consider links that look like local memory files.
tgt = target.strip()
if tgt.startswith(("http://", "https://")):
continue
resolved = (mem_dir / tgt).resolve()
if not resolved.is_file():
missing_targets.append((title, target, ln))
rpt.add(f"### Index lines pointing at missing files: {len(missing_targets)}")
for title, target, ln in missing_targets:
rpt.add(f"- [WARNING] line {ln + 1}: [{title}]({target}) -> file not found")
rpt.add("")
name_mismatches = [] # frontmatter name vs filename slug
for m in mems:
if m.name is None:
name_mismatches.append((m.filename, "(no name in frontmatter)"))
continue
# The convention is loose: name may be a title, not the slug. Only flag
# when name itself looks like a slug AND differs from the filename slug.
name = m.name.strip()
if re.fullmatch(r"[a-z0-9_]+", name) and name != m.slug:
name_mismatches.append((m.filename, f"name='{m.name}' != slug='{m.slug}'"))
rpt.add(f"### Frontmatter name vs filename signals: {len(name_mismatches)}")
for fn, note in name_mismatches:
rpt.add(f"- [INFO] {fn}: {note}")
rpt.add("")
# ----- 2. BACKLINKS -----
rpt.add("## 2. BACKLINKS ([[name]] references)")
rpt.add("")
known_slugs = {m.slug for m in mems}
broken_backlinks = []
for m in mems:
for bm in BACKLINK_RE.finditer(m.body):
ref = bm.group(1).strip()
ref_slug = slugify_link_target(ref)
if ref_slug not in known_slugs and ref not in known_slugs:
broken_backlinks.append((m.filename, ref))
rpt.add(f"### Broken backlinks: {len(broken_backlinks)}")
for fn, ref in broken_backlinks:
rpt.add(f"- [WARNING] {fn}: [[{ref}]] has no matching memory file")
if not broken_backlinks:
rpt.add("- [OK] no broken backlinks found")
rpt.add("")
# ----- 3. REFERENCED-ARTIFACT VALIDITY -----
rpt.add("## 3. REFERENCED-ARTIFACT VALIDITY (conservative; 'verify', not 'delete')")
rpt.add("")
artifact_flags = []
for m in mems:
for tok in extract_referenced_paths(m.body):
if not repo_path_exists(repo_root, tok):
artifact_flags.append((m.filename, tok))
rpt.add(f"### Referenced paths not found in repo: {len(artifact_flags)}")
for fn, tok in artifact_flags:
rpt.add(f"- [VERIFY] {fn}: `{tok}` not found under repo (may be server-side "
f"or renamed -- verify, do not auto-delete)")
if not artifact_flags:
rpt.add("- [OK] no clearly-stale repo paths detected")
rpt.add("")
# ----- 4. DUPLICATE / OVERLAP CLUSTERS -----
rpt.add("## 4. DUPLICATE / OVERLAP CLUSTERS (PROPOSED merges -- never auto-applied)")
rpt.add("")
clusters = cluster_overlaps(mems)
clusters.sort(key=lambda c: (-len(c[1]), c[0]))
rpt.add(f"### Candidate clusters: {len(clusters)}")
for typ, members in clusters:
rpt.add(f"- [{typ}] {len(members)} related memories:")
for f in members:
mm = mem_by_file.get(f)
desc = (mm.description or mm.name or "") if mm else ""
desc = desc[:90]
rpt.add(f" - {f} -- {desc}")
if not clusters:
rpt.add("- [OK] no overlap clusters above threshold")
rpt.add("")

    # ----- 5. STALE DATED FACTS -----
    rpt.add(f"## 5. STALE DATED FACTS (project-type, dated > {STALE_MONTHS} months)")
    rpt.add("")
    stale_hits = []
    for m in mems:
        if (m.type or "") != "project":
            continue
        hits = find_stale_dates(m, today)
        if hits:
            stale_hits.append((m.filename, hits))
    rpt.add(f"### Project memories with stale dated claims: {len(stale_hits)}")
    for fn, hits in stale_hits:
        for ds, age in hits:
            rpt.add(f"- [VERIFY] {fn}: dated {ds} (~{age} days old) -- re-verify")
    if not stale_hits:
        rpt.add("- [OK] no stale dated project facts")
    rpt.add("")

    # ----- 6. DRIFT vs PROFILE STORE -----
    rpt.add("## 6. DRIFT vs HARNESS PROFILE STORE")
    rpt.add("")
    prof_dir = profile_memory_dir(repo_root)
    profile_only = []
    repo_only = []
    conflicts = []
    if prof_dir is None:
        rpt.add("- [INFO] profile memory dir not found; skipping drift check.")
    else:
        rpt.add(f"Profile store: {prof_dir}")
        rpt.add("")
        prof_files = {p.name for p in prof_dir.glob("*.md") if p.name != "MEMORY.md"}
        repo_files = {m.filename for m in mems}
        for pf in sorted(prof_files - repo_files):
            profile_only.append(pf)
        for rf in sorted(repo_files - prof_files):
            repo_only.append(rf)
        # Same filename on both sides: any content difference is a conflict.
        for both in sorted(prof_files & repo_files):
            a = (prof_dir / both).read_text(encoding="utf-8", errors="replace")
            b = (mem_dir / both).read_text(encoding="utf-8", errors="replace")
            if a != b:
                conflicts.append(both)
        rpt.add(f"### Profile-only (candidates to MIGRATE INTO repo): {len(profile_only)}")
        for f in profile_only:
            rpt.add(f"- [INFO] {f}")
        rpt.add("")
        rpt.add(f"### Repo-only (candidates to PUSH OUT to profile): {len(repo_only)}")
        for f in repo_only:
            rpt.add(f"- [INFO] {f}")
        rpt.add("")
        rpt.add(f"### Present in BOTH but differing (CONFLICT -- human review): "
                f"{len(conflicts)}")
        for f in conflicts:
            rpt.add(f"- [WARNING] {f}: content differs between repo and profile")
    rpt.add("")

    # ----- APPLY-SAFE ACTIONS (additive-only) -----
    actions_taken = []
    if args.apply_safe:
        rpt.add("## APPLY-SAFE ACTIONS PERFORMED (additive-only)")
        rpt.add("")
        # (a) Append missing index lines for orphan files.
        if orphans and index_path.is_file():
            appended = append_index_lines(index_path, orphans, index_lines, headers)
            for line, hdr in appended:
                actions_taken.append(f"INDEX += [{hdr}] {line}")
                rpt.add(f"- [OK] appended index line under ## {hdr}: {line}")
        elif orphans:
            rpt.add("- [WARNING] orphans exist but MEMORY.md missing; nothing appended")
        # (b) Copy profile-only files INTO repo (never overwrite).
        if prof_dir is not None:
            for f in profile_only:
                src = prof_dir / f
                dst = mem_dir / f
                if dst.exists():
                    rpt.add(f"- [SKIP] {f}: already exists in repo (not overwriting)")
                    continue
                shutil.copy2(src, dst)
                actions_taken.append(f"COPIED profile->repo: {f}")
                rpt.add(f"- [OK] copied profile-only file into repo: {f}")
        if not actions_taken:
            rpt.add("- [INFO] no additive actions were necessary")
        rpt.add("")

    # ----- SUMMARY -----
    rpt.add("## SUMMARY")
    rpt.add("")
    rpt.add(f"- memory files: {len(mems)}")
    rpt.add(f"- orphan files (no index): {len(orphans)}")
    rpt.add(f"- index -> missing file: {len(missing_targets)}")
    rpt.add(f"- name/filename signals: {len(name_mismatches)}")
    rpt.add(f"- broken backlinks: {len(broken_backlinks)}")
    rpt.add(f"- stale referenced paths: {len(artifact_flags)}")
    rpt.add(f"- overlap clusters: {len(clusters)}")
    rpt.add(f"- stale dated project facts: {len(stale_hits)}")
    rpt.add(f"- profile-only files: {len(profile_only)}")
    rpt.add(f"- repo-only files: {len(repo_only)}")
    rpt.add(f"- repo<->profile conflicts: {len(conflicts)}")
    if args.apply_safe:
        rpt.add(f"- additive actions performed: {len(actions_taken)}")
    rpt.add("")

    # ----- PROPOSED (report-only; nothing here is ever applied) -----
    rpt.add("## PROPOSED (needs human approval -- NEVER auto-applied)")
    rpt.add("")
    n_prop = 0
    for typ, members in clusters:
        n_prop += 1
        rpt.add(f"- [MERGE?] consolidate {len(members)} '{typ}' memories: "
                f"{', '.join(members)}")
    for fn, _hits in stale_hits:
        n_prop += 1
        rpt.add(f"- [REVERIFY?] {fn} (dated facts) -- confirm still true, then update")
    for fn, tok in artifact_flags:
        n_prop += 1
        rpt.add(f"- [STALE-REF?] {fn} references `{tok}` -- confirm/repoint or note moved")
    for _title, target, ln in missing_targets:
        n_prop += 1
        rpt.add(f"- [INDEX-CLEANUP?] MEMORY.md line {ln + 1} points at missing "
                f"{target} -- human decides keep/remove")
    if prof_dir is not None:
        for f in conflicts:
            n_prop += 1
            rpt.add(f"- [DRIFT-RESOLVE?] {f} differs repo vs profile -- human picks "
                    f"winner (sync-memory.sh leaves both untouched)")
    if n_prop == 0:
        rpt.add("- [OK] nothing proposed; memory store is clean")
    rpt.add("")

    out = str(rpt)
    print(out)

    # Write report file unless suppressed.
    if not args.no_file:
        reports_dir = mem_dir / "_reports"
        reports_dir.mkdir(parents=True, exist_ok=True)
        if args.report_file:
            rpath = Path(args.report_file)
        else:
            stamp = datetime.datetime.now().strftime("%Y-%m-%d-%H%M")
            rpath = reports_dir / f"{stamp}-dream.md"
        rpath.write_text(out + "\n", encoding="utf-8")
        print(f"\n[INFO] report written: {rpath}")
    return 0


def append_index_lines(index_path: Path, orphans, index_lines, headers):
    """
    Additive only: append a '- [Name](file.md) -- description' line for each
    orphan under the correct '## <Header>' section. Never reorders or removes
    existing lines. If a header doesn't exist, append it at end of file.
    Returns list of (line_text, header_used).
    """
    text = index_path.read_text(encoding="utf-8", errors="replace")
    lines = text.split("\n")
    appended = []

    # Group orphans by target header.
    by_header: dict[str, list[Memory]] = {}
    for m in orphans:
        hdr = TYPE_HEADER.get(m.type or "", None)
        if hdr is None:
            hdr = "Project"  # safe default bucket; human can recategorize
        by_header.setdefault(hdr, []).append(m)

    def build_line(m: Memory) -> str:
        title = m.name or m.slug
        hook = (m.description or "").strip()
        if hook:
            return f"- [{title}]({m.filename}) -- {hook}"
        return f"- [{title}]({m.filename})"

    for hdr, members in by_header.items():
        # Find header line index.
        hidx = None
        for i, ln in enumerate(lines):
            hm = re.match(r"^##\s+(.+?)\s*$", ln)
            if hm and hm.group(1).strip() == hdr:
                hidx = i
                break
        new_lines = [build_line(m) for m in members]
        if hidx is None:
            # Append a fresh section at end of file.
            if lines and lines[-1].strip() != "":
                lines.append("")
            lines.append(f"## {hdr}")
            lines.extend(new_lines)
            appended.extend((nl, hdr) for nl in new_lines)
            continue
        # Find end of this section: next '## ' or EOF.
        end = len(lines)
        for j in range(hidx + 1, len(lines)):
            if re.match(r"^##\s+", lines[j]):
                end = j
                break
        # Insert after the last non-blank line of the section.
        insert_at = end
        while insert_at - 1 > hidx and lines[insert_at - 1].strip() == "":
            insert_at -= 1
        for off, nl in enumerate(new_lines):
            lines.insert(insert_at + off, nl)
            appended.append((nl, hdr))

    index_path.write_text("\n".join(lines), encoding="utf-8")
    return appended


def main() -> int:
    ap = argparse.ArgumentParser(
        description="Memory lint + consolidation analyzer (additive-only)."
    )
    ap.add_argument(
        "--apply-safe",
        action="store_true",
        help="Perform ONLY additive fixes (append index lines, copy profile-only "
             "files into repo). Never deletes/overwrites/merges.",
    )
    ap.add_argument(
        "--no-file",
        action="store_true",
        help="Print report to stdout only; do not write a _reports/ file.",
    )
    ap.add_argument(
        "--report-file",
        default=None,
        help="Explicit path for the report file (overrides _reports/ default).",
    )
    args = ap.parse_args()
    try:
        return run(args)
    except KeyboardInterrupt:
        print("[ERROR] interrupted")
        return 130


if __name__ == "__main__":
    sys.exit(main())
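
The section-insert rule used by `append_index_lines` can be tried in isolation. The following is a minimal sketch, not the analyzer's own code: `insert_under_header` is a hypothetical helper that works on an in-memory list of lines and returns a copy, so nothing is written to disk.

```python
import re

# Matches a level-2 markdown header only ("## Name"), not "#" or "###".
HEADER_RE = re.compile(r"^##\s+(.+?)\s*$")


def insert_under_header(lines, header, new_line):
    """Return a copy of `lines` with `new_line` added at the end of '## header'.

    Hypothetical helper for illustration; existing lines are never reordered
    or removed. A missing header is appended as a new section at end of file.
    """
    out = list(lines)
    hidx = None
    for i, ln in enumerate(out):
        m = HEADER_RE.match(ln)
        if m and m.group(1) == header:
            hidx = i
            break
    if hidx is None:
        if out and out[-1].strip():
            out.append("")
        out += [f"## {header}", new_line]
        return out
    # The section ends at the next level-2 header, or at end of file.
    end = len(out)
    for j in range(hidx + 1, len(out)):
        if re.match(r"^##\s+", out[j]):
            end = j
            break
    # Step back over trailing blank lines so the new entry joins the list.
    insert_at = end
    while insert_at - 1 > hidx and out[insert_at - 1].strip() == "":
        insert_at -= 1
    out.insert(insert_at, new_line)
    return out


index = [
    "# Memory Index",
    "",
    "## Reference",
    "- [Alpha](reference_alpha.md)",
    "",
    "## Feedback",
    "- [A](a.md)",
]
result = insert_under_header(index, "Reference", "- [New](new.md)")
print("\n".join(result))
```

The new entry lands directly after `- [Alpha](reference_alpha.md)` and before the blank line that separates the section from `## Feedback`, so the blank-line spacing between sections is preserved.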

@@ -0,0 +1,195 @@
#!/usr/bin/env python3
"""
selftest.py -- exercises memory_dream.py against a synthetic fixture store.

Builds a throwaway repo + profile memory store in a temp dir, runs the analyzer
in both report-only and --apply-safe modes (as a subprocess, with
CLAUDETOOLS_ROOT / HOME / CLAUDE_PROJECT_DIR pointed at the fixtures), and
asserts:
  * each detector fires (orphan, missing index target, broken backlink, stale
    referenced path, overlap cluster, stale dated project fact, profile drift),
  * --apply-safe is strictly additive (no file deleted, no file overwritten,
    orphan index line appended, profile-only file migrated, differing file
    skipped not clobbered).

Stdlib only. Exit 0 on success, 1 on any failed assertion.
"""
from __future__ import annotations

import os
import re
import subprocess
import sys
import tempfile
from pathlib import Path

SCRIPT = Path(__file__).resolve().with_name("memory_dream.py")
PY = sys.executable or "python"
FAILURES: list[str] = []


def check(cond: bool, msg: str) -> None:
    status = "[OK]" if cond else "[ERROR]"
    print(f"{status} {msg}")
    if not cond:
        FAILURES.append(msg)


def write(path: Path, text: str) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")


def build_fixture(root: Path) -> tuple[Path, Path, Path, Path]:
    """Create repo + profile fixture stores.

    Returns (repo_root, project_dir, home, prof), where prof is the profile
    memory directory.
    """
    repo_root = root / "repo"
    mem = repo_root / ".claude" / "memory"
    mem.mkdir(parents=True, exist_ok=True)
    # A real script the memory can reference (exists -> must NOT be flagged).
    write(repo_root / ".claude" / "scripts" / "real.sh", "#!/bin/sh\necho hi\n")

    # --- memory files ---
    # indexed + clean
    write(mem / "reference_alpha.md",
          "---\nname: Alpha\ndescription: alpha thing\nmetadata:\n type: reference\n---\n"
          "Uses `.claude/scripts/real.sh` which exists.\n")
    # orphan (no index line) + broken backlink + stale referenced path
    write(mem / "feedback_orphan.md",
          "---\nname: Orphan Feedback\ndescription: an orphan\ntype: feedback\n---\n"
          "See [[no_such_memory]] and `scripts/ghost_missing.py` which is gone.\n")
    # two overlapping feedback memories (same slug prefix -> cluster)
    write(mem / "feedback_syncro_aaa.md",
          "---\nname: Syncro AAA\ndescription: syncro billing rule one\ntype: feedback\n---\nbody\n")
    write(mem / "feedback_syncro_bbb.md",
          "---\nname: Syncro BBB\ndescription: syncro billing rule two\ntype: feedback\n---\nbody\n")
    # stale dated project fact
    write(mem / "project_old.md",
          "---\nname: Old Project\ndescription: ancient\ntype: project\n---\n"
          "Migration completed as of 2019-01-01 and never touched since.\n")

    # --- MEMORY.md index ---
    # references reference_alpha + a MISSING target; omits feedback_orphan (orphan)
    write(mem / "MEMORY.md",
          "# Memory Index\n\n"
          "## Reference\n"
          "- [Alpha](reference_alpha.md) -- alpha thing\n"
          "- [Ghost](reference_ghost.md) -- points at a missing file\n\n"
          "## Feedback\n"
          "- [Syncro AAA](feedback_syncro_aaa.md) -- syncro billing rule one\n"
          "- [Syncro BBB](feedback_syncro_bbb.md) -- syncro billing rule two\n\n"
          "## Project\n"
          "- [Old Project](project_old.md) -- ancient\n")

    # --- profile store ---
    # slug derivation mirrors memory_dream.profile_memory_dir
    project_dir = repo_root  # we set CLAUDE_PROJECT_DIR to repo_root
    home = root / "home"
    slug = re.sub(r"[^A-Za-z0-9]+", "-", str(project_dir.resolve()))
    prof = home / ".claude" / "projects" / slug / "memory"
    prof.mkdir(parents=True, exist_ok=True)
    # profile-only file (candidate to migrate INTO repo)
    write(prof / "feedback_profile_only.md",
          "---\nname: Profile Only\ndescription: lives only in profile\ntype: feedback\n---\nkeep me\n")
    # same-named in BOTH but DIFFERING content (must be skipped, not overwritten)
    write(prof / "reference_alpha.md",
          "---\nname: Alpha\ndescription: alpha thing\nmetadata:\n type: reference\n---\n"
          "PROFILE VERSION -- different content.\n")
    return repo_root, project_dir, home, prof


def run_analyzer(repo_root: Path, project_dir: Path, home: Path, *extra) -> str:
    """Run memory_dream.py against the fixture; return stdout + stderr combined."""
    env = dict(os.environ)
    env["CLAUDETOOLS_ROOT"] = str(repo_root)
    env["CLAUDE_PROJECT_DIR"] = str(project_dir)
    env["HOME"] = str(home)
    env["PYTHONIOENCODING"] = "utf-8"
    cmd = [PY, str(SCRIPT), "--no-file", *extra]
    res = subprocess.run(cmd, env=env, capture_output=True, text=True,
                         encoding="utf-8", errors="replace")
    return res.stdout + "\n" + res.stderr


def main() -> int:
    with tempfile.TemporaryDirectory() as td:
        root = Path(td)
        repo_root, project_dir, home, prof = build_fixture(root)
        mem = repo_root / ".claude" / "memory"

        # ---- report-only run ----
        out = run_analyzer(repo_root, project_dir, home)
        check("Mode: REPORT-ONLY" in out, "default run is report-only")
        check("feedback_orphan.md" in out and "Orphan files (no index line): 1" in out,
              "detects the orphan file")
        check("reference_ghost.md" in out and "missing files: 1" in out,
              "detects index line pointing at missing file")
        check("[[no_such_memory]]" in out, "detects broken backlink")
        check("ghost_missing.py" in out, "flags stale referenced path")
        # Isolate section 3 up to the next level-2 header. Splitting on a bare
        # "##" would stop at the "### ..." subheading, before the flagged list,
        # and make this check pass vacuously.
        artifact_section = (out.split("REFERENCED-ARTIFACT")[-1].split("\n## ")[0]
                            if "REFERENCED-ARTIFACT" in out else "")
        check("real.sh" not in artifact_section,
              "does NOT flag an existing referenced path (real.sh)")
        check("feedback_syncro_aaa.md" in out and "feedback_syncro_bbb.md" in out
              and "CLUSTER" in out.upper(), "detects overlap cluster")
        check("project_old.md" in out and "2019-01-01" in out,
              "detects stale dated project fact")
        check("feedback_profile_only.md" in out
              and "MIGRATE INTO repo" in out, "detects profile-only drift")
        check("reference_alpha.md" in out and "differs between repo and profile" in out,
              "detects repo<->profile content conflict")
        check("PROPOSED (needs human approval" in out, "emits PROPOSED section")

        # ---- snapshot repo state before apply-safe ----
        before = {p.name: p.read_text(encoding="utf-8") for p in mem.glob("*.md")}

        # ---- apply-safe run (additive only) ----
        out2 = run_analyzer(repo_root, project_dir, home, "--apply-safe")
        after = {p.name: p.read_text(encoding="utf-8") for p in mem.glob("*.md")}

        # No file deleted.
        check(set(before).issubset(set(after)), "apply-safe deleted no repo file")
        # Every pre-existing memory file other than the index is unchanged.
        for fn, content in before.items():
            if fn == "MEMORY.md":
                continue
            check(after.get(fn) == content,
                  f"apply-safe did not alter memory body: {fn}")
        # MEMORY.md grew (orphan appended) and kept all old lines.
        idx_before = before["MEMORY.md"]
        idx_after = after["MEMORY.md"]
        check("feedback_orphan.md" in idx_after,
              "apply-safe appended orphan index line")
        check(all(line in idx_after for line in idx_before.splitlines() if line.strip()),
              "apply-safe preserved every existing index line")
        # Profile-only migrated INTO repo.
        check("feedback_profile_only.md" in after,
              "apply-safe migrated profile-only file into repo")
        # Differing same-named file was SKIPPED, not overwritten.
        check(after["reference_alpha.md"] == before["reference_alpha.md"],
              "apply-safe did NOT overwrite differing repo file (skipped)")
        # The differing same-named file is surfaced as a drift conflict, not a
        # copy target -- apply-safe leaves it for human review.
        check("reference_alpha.md" in out2
              and "differs between repo and profile" in out2,
              "apply-safe reported the differing file as a conflict (not overwritten)")
        # Profile store itself untouched by dream (dream only writes repo side).
        check((prof / "feedback_profile_only.md").exists(),
              "profile-only source still present after migration")

    print()
    if FAILURES:
        print(f"[ERROR] {len(FAILURES)} self-test assertion(s) failed:")
        for f in FAILURES:
            print(f"  - {f}")
        return 1
    print("[SUCCESS] all self-test assertions passed")
    return 0


if __name__ == "__main__":
    sys.exit(main())
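
The profile directory in `build_fixture` is located by turning the project path into a slug. The sketch below shows that rule on its own, assuming the fixture's regex is the whole derivation: `profile_slug` is a hypothetical helper name and the sample paths are made up. Whether the real harness names its profile directories exactly this way is not confirmed here; the fixture only mirrors `memory_dream.profile_memory_dir`.

```python
import re


def profile_slug(project_dir: str) -> str:
    """Collapse every run of non-alphanumeric characters into a single '-'.

    Hypothetical helper: the same regex the fixture applies to the resolved
    project path.
    """
    return re.sub(r"[^A-Za-z0-9]+", "-", project_dir)


# Sample paths are illustrative only.
posix_slug = profile_slug("/home/mike/claudetools")
windows_slug = profile_slug("C:\\Users\\mike\\claudetools")
print(posix_slug)    # -home-mike-claudetools
print(windows_slug)  # C-Users-mike-claudetools
```

Because runs are collapsed, `:\` on a Windows path becomes one dash rather than two, and distinct paths can map to the same slug (for example `a/b` and `a-b`).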