feat: add wiki knowledge layer (Phase 0 + Phase 1 seed)

Implements LLM-compiled wiki layer between raw session logs and live
CONTEXT.md, inspired by Karpathy's knowledge base workflow. Adds wiki/
directory structure, article templates, spec docs, and seeds first two
articles (Cascades of Tucson, GuruRMM) from 60+ session logs.

Updates CLAUDE.md to check wiki first on all context-loading triggers.
Captures verified ACG IP/hostname map and Neptune physical-location
clarification (Dataforth D2, subnet overlap TODO) in memory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-24 15:42:38 -07:00
parent 435e921300
commit cd80f5e447
19 changed files with 1533 additions and 6 deletions


@@ -119,12 +119,15 @@ Load context **before responding** when any trigger fires. Never ask for info th
| Trigger | Action |
|---------|--------|
| Client name mentioned | Read `wiki/clients/<slug>.md` FIRST, then `clients/<name>/session-logs/` for recent detail |
| GuruRMM / Dataforth / project keywords | Read `wiki/projects/<slug>.md` FIRST, then `projects/<project>/CONTEXT.md`, query coord API status + components |
| Server/hostname/IP mentioned | Read `wiki/systems/<slug>.md` FIRST for synthesized knowledge |
| "continue", "resume", "back to", "finish" | Read project wiki article + CONTEXT.md, check coord API for locks + unread messages |
| Servers, IPs, credentials, deploy questions | Check wiki/systems first, then CONTEXT.md — answer from it, never ask |
| Uncertainty >5% about infra or recent work | Check wiki first, then CONTEXT.md before asking the user |
CONTEXT.md locations: `projects/msp-tools/guru-rmm/CONTEXT.md`, `projects/dataforth-dos/CONTEXT.md`, `CONTEXT.md` (root).
Wiki location: `wiki/` (root) — `wiki/clients/`, `wiki/projects/`, `wiki/systems/`, `wiki/patterns/`. Index: `wiki/index.md`.
---
@@ -212,6 +215,7 @@ Also scan session logs pulled during `/sync` for legacy `## Note for <user>` sec
## Context Recovery
When user references previous work, use `/context` command. Never ask for info in:
- `wiki/` — **Check first.** LLM-compiled synthesized knowledge by client/project/system. Index: `wiki/index.md`
- `credentials.md` — Infrastructure reference (being migrated to SOPS vault)
- `session-logs/` — Daily work logs (also in `projects/*/session-logs/` and `clients/*/session-logs/`)
- **Coordination API** — current locks, component states, workflows, messages: `GET http://172.16.3.30:8001/api/coord/status`
@@ -246,7 +250,9 @@ Vault structure: `infrastructure/`, `clients/`, `services/`, `projects/`, `msp-t
|---------|---------|
| `/checkpoint` | Dual checkpoint: git commit + database context |
| `/save` | Comprehensive session log |
| `/context` | Search wiki first, then session logs, credentials.md, and 1Password |
| `/wiki-compile` | Compile session logs into wiki articles for a client/project/system/all |
| `/wiki-lint` | Health-check wiki for stale IPs, broken backlinks, orphaned articles |
| `/1password` | 1Password secrets management |
| `/sync` | Sync config from Gitea repository |
| `/create-spec` | Create app specification for AutoCoder |


@@ -20,6 +20,7 @@
| GuruRMM Session Logs | RMM work | `session-logs/YYYY-MM-DD-session.md` (root — NOT in gururmm submodule) |
| General Session Logs | Mixed work | `session-logs/YYYY-MM-DD-session.md` |
| Credentials | All credentials | `credentials.md` (root - shared) |
| **Wiki articles** | Compiled knowledge | `wiki/clients/`, `wiki/projects/`, `wiki/systems/`, `wiki/patterns/` — LLM-maintained, do not edit manually |
---


@@ -27,4 +27,11 @@ ACG office LAN is 172.16.0.0/22, routed via Tailscale through pfSense node `pfse
**VMs on Jupiter (virsh):** GuruRMM, Unifi, OwnCloud, Claude-Builder (running); Windows 7, Windows Server 2016, Windows Server 2016_Template (shut off).
**Neptune (ACG infra, physically at Dataforth D2):**
- neptune.acghosting.com | internal 172.16.3.11 | external 67.206.163.124
- Exchange Server 2016 — active mail server for multiple ACG-hosted clients
- Physically colocated at Dataforth's D2 facility, NOT at ACG office
- Access from ACG office: must route through D2TESTNAS (192.168.0.9) because Dataforth's UDM runs a subnet overlapping ACG office LAN (both use 172.16.x.x range), making direct routing ambiguous
- **TODO:** Resubnet Dataforth UDM to a non-overlapping range to fix routing and simplify Neptune access
**Why:** How to apply: see [[power-failure-runbook]] for full post-outage recovery steps. Neptune is NOT on ACG office LAN despite the 172.16.x.x IP — always route via D2TESTNAS or Dataforth VPN.
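The routing ambiguity described above can be illustrated with Python's standard `ipaddress` module. This is a minimal sketch: the ACG office LAN (172.16.0.0/22) and Neptune's internal address come from these notes, but the Dataforth UDM subnet is not stated anywhere in them, so the `/16` below is a placeholder assumption, and `routing_is_ambiguous` is a hypothetical helper name.

```python
import ipaddress

# From the notes above: ACG office LAN and Neptune's internal address.
ACG_OFFICE_LAN = ipaddress.ip_network("172.16.0.0/22")
NEPTUNE_INTERNAL = ipaddress.ip_address("172.16.3.11")

# ASSUMPTION: the real Dataforth UDM subnet is not documented here;
# this /16 is only a stand-in for "some overlapping 172.16.x.x range".
DATAFORTH_UDM_EXAMPLE = ipaddress.ip_network("172.16.0.0/16")


def routing_is_ambiguous(local_lan, remote_lan, host):
    """Return True when the two LANs overlap and the host also falls in the local range."""
    return local_lan.overlaps(remote_lan) and host in local_lan


if __name__ == "__main__":
    print(routing_is_ambiguous(ACG_OFFICE_LAN, DATAFORTH_UDM_EXAMPLE, NEPTUNE_INTERNAL))
```

Because 172.16.3.11 sits inside 172.16.0.0/22, a machine on the ACG office LAN treats it as a local address. The same check returning `False` for a candidate replacement range would be one way to sanity-check the resubnet TODO.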


@@ -0,0 +1,197 @@
# Wiki Layer — Implementation Plan
**Status:** Phase 1 in progress (Cascades + GuruRMM seeded 2026-05-24)
---
## Phase 0 — Structure (no code, low effort, high foundation value)
**Goal:** Create the directory skeleton and article templates. Nothing else runs yet.
### Tasks
- [ ] Create `wiki/` root with `index.md` stub
- [ ] Create `wiki/clients/`, `wiki/projects/`, `wiki/systems/`, `wiki/patterns/` directories
- [ ] Write article templates (see `standards.md`)
- [ ] Add `.gitkeep` files so empty dirs commit
- [ ] Verify GrepAI watcher picks up `wiki/` automatically (it should — it watches the repo root)
- [ ] Add `wiki/` mention to `CLAUDE.md` context-loading section (read `wiki/<topic>.md` when topic is known)
**Done when:** Directory exists, templates written, GrepAI indexes it.
---
## Phase 1 — Seed Pass (high effort, immediate ROI)
**Goal:** Compile first-pass wiki articles for the two highest-value targets: GuruRMM and Cascades.
### Approach
Run `/wiki-compile` manually, read and correct output, commit. This is a supervised pass — we're validating the compilation quality before automating.
### Tasks
- [ ] Compile `wiki/projects/gururmm.md` from `projects/msp-tools/guru-rmm/session-logs/` + root session logs mentioning GuruRMM
- [ ] Compile `wiki/clients/cascades-tucson.md` from `clients/cascades-tucson/session-logs/` + root logs mentioning Cascades
- [ ] Compile `wiki/systems/` entries for Neptune, Jupiter, Pluto, Uranus, and gururmm-build (from session logs + credentials.md); Saturn is decommissioned
- [ ] Mike reviews each article, corrects factual errors, commits
- [ ] Run second seed pass for remaining active clients and projects
**Done when:** All active clients and systems have wiki articles. Mike has reviewed at least Cascades + GuruRMM.
---
## Phase 2 — `/wiki-compile` Command
**Goal:** Implement the compile command as a repeatable skill.
### Command behavior
```
/wiki-compile [scope]
Scopes:
  client:<name>    Compile from clients/<name>/session-logs/ + matching root logs
  project:<name>   Compile from projects/<name>/session-logs/ + matching root logs
  system:<name>    Compile from all logs mentioning the system name or hostname
  all              Full pass — all articles, oldest-first by last_compiled date
Without scope: prompt user to select
```
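The scope grammar above could be parsed along the following lines. This is an illustrative sketch only: `Scope` and `parse_scope` are hypothetical names, and the real command is a Claude skill file, not a Python program.

```python
from typing import NamedTuple, Optional

# Scope kinds that require a ":<name>" suffix, per the usage text above.
NAMED_KINDS = {"client", "project", "system"}


class Scope(NamedTuple):
    kind: str            # "client", "project", "system", or "all"
    name: Optional[str]  # None for "all"


def parse_scope(arg: str) -> Optional[Scope]:
    """Parse a /wiki-compile argument; None means 'prompt the user to select'."""
    arg = arg.strip()
    if not arg:
        return None
    if arg == "all":
        return Scope("all", None)
    kind, sep, name = arg.partition(":")
    if sep and kind in NAMED_KINDS and name:
        return Scope(kind, name)
    raise ValueError(f"unrecognized scope: {arg!r}")


if __name__ == "__main__":
    print(parse_scope("client:cascades-tucson"))
```

Returning `None` for an empty argument keeps the "prompt user to select" branch separate from genuinely malformed input, which raises.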
### Steps Claude takes when `/wiki-compile` is invoked
1. Identify scope from argument
2. Find relevant session logs (by directory + GrepAI keyword search)
3. Read existing wiki article if it exists (note `last_compiled` date)
4. Read session logs newer than `last_compiled` (or all if new article)
5. Read relevant CONTEXT.md and memory entries for cross-reference
6. Call Ollama (qwen3:8b) to generate/update the article
   - Summarize new information, merge with existing
   - Update backlinks section
   - Update `last_compiled` frontmatter
7. Present diff to user for review before writing
8. On approval: write article, update `wiki/index.md`
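Step 4 (read only logs newer than `last_compiled`) could be sketched as below. It assumes log dates are taken from the `YYYY-MM-DD-session.md` filename convention used elsewhere in this repo; `logs_to_read` is a hypothetical helper, not part of any existing command.

```python
import re
from datetime import date
from typing import Iterable, List, Optional

# Matches the trailing "YYYY-MM-DD-session.md" part of a session log path.
LOG_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})-session\.md$")


def logs_to_read(paths: Iterable[str], last_compiled: Optional[date]) -> List[str]:
    """Return logs dated after last_compiled (all logs if the article is new), oldest first."""
    selected = []
    for path in paths:
        match = LOG_DATE.search(path)
        if not match:
            continue  # not a dated session log; skip it
        log_date = date(*map(int, match.groups()))
        if last_compiled is None or log_date > last_compiled:
            selected.append((log_date, path))
    return [path for _, path in sorted(selected)]


if __name__ == "__main__":
    candidates = [
        "session-logs/2026-05-20-session.md",
        "clients/cascades-tucson/session-logs/2026-05-25-session.md",
    ]
    print(logs_to_read(candidates, date(2026, 5, 24)))
```

One design question this leaves open: a strict "newer than" comparison skips a log written later on the same day as the last compile, so a real implementation may prefer `>=` or a timestamp.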
### Tasks
- [ ] Write `.claude/commands/wiki-compile.md` (the skill file)
- [ ] Write Ollama prompt template for article compilation
- [ ] Implement index.md update logic
- [ ] Test with GuruRMM and Cascades
**Done when:** `/wiki-compile client:cascades-tucson` runs end-to-end, produces correct article, updates index.
---
## Phase 3 — `/context` Integration
**Goal:** `/context` checks wiki before raw logs.
### Updated `/context` flow
```
1. Search wiki/index.md for topic mentions
2. Read matching wiki/<type>/<name>.md articles (synthesized knowledge)
3. THEN search session logs for recent/specific detail not yet compiled
4. Check credentials.md for access details
5. Return synthesized answer: wiki content + relevant log excerpts
```
### Tasks
- [ ] Update `.claude/commands/context.md` with wiki-first search step
- [ ] Add "Check wiki/index.md" as Step 1 in context command
- [ ] Add note: if topic has a wiki article, read it first; use logs only for recency
**Done when:** Running `/context cascades` returns the wiki article summary + any recent-session additions.
---
## Phase 4 — `/wiki-lint` Command
**Goal:** Health-check the wiki for staleness and integrity issues.
### Lint rules
| Rule | What It Checks | Action |
|---|---|---|
| Stale IPs | IPs in wiki vs CONTEXT.md + credentials.md | Flag discrepancies |
| Rate conflicts | Billing rates in wiki vs memory entries | Flag conflicts |
| Orphaned articles | Wiki articles with no session log activity in 90+ days | Flag for review/archive |
| Missing articles | Clients/projects mentioned in recent logs with no wiki article | Suggest compile |
| Broken backlinks | `[[links]]` pointing to non-existent articles | Flag |
| Memory conflicts | Memory entries contradicting wiki facts | Flag with both versions |
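As one example of how a lint rule might work, the "Broken backlinks" check can be sketched as a pure function over already-loaded article text. The function name and the article-key format (`clients/cascades-tucson`) are assumptions for illustration; the actual lint command is not yet written.

```python
import re
from typing import Dict, List

# Captures the target of [[target]], stopping at "]", "|" or "#".
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def broken_backlinks(articles: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each article key to the [[links]] in it that match no known article.

    `articles` maps keys such as "clients/cascades-tucson" to Markdown text.
    A link resolves if it equals a full key or a bare slug (last path segment).
    """
    known_keys = set(articles)
    known_slugs = {key.split("/")[-1] for key in known_keys}
    report: Dict[str, List[str]] = {}
    for key, text in articles.items():
        targets = {target.strip() for target in WIKI_LINK.findall(text)}
        missing = sorted(t for t in targets if t not in known_keys and t not in known_slugs)
        if missing:
            report[key] = missing
    return report


if __name__ == "__main__":
    sample = {
        "clients/cascades-tucson": "See [[projects/gururmm]] and [[systems/neptune]].",
        "projects/gururmm": "Used by [[cascades-tucson]].",
    }
    print(broken_backlinks(sample))
```

Note that template placeholders such as `[[projects/name]]` inside HTML comments would also be flagged by this sketch, so a real check may want to strip comments first.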
### Tasks
- [ ] Write `.claude/commands/wiki-lint.md`
- [ ] Implement lint checks
- [ ] Decide: run as scheduled cron or manual-only? (recommend: manual initially, cron after 30 days)
**Done when:** `/wiki-lint` runs, produces a report with actionable issues.
---
## Phase 5 — `/save` Integration (optional, lower priority)
After writing session log, detect which clients/projects/systems were mentioned, prompt:
```
Session log saved. Detected topics: GuruRMM, Cascades.
Update wiki for these topics? [y/n]
```
If y: run `/wiki-compile` for each detected topic before sync.
### Tasks
- [ ] Add topic detection to `/save` (regex on session log content)
- [ ] Add post-save wiki update prompt
- [ ] Wire through to `/wiki-compile`
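The regex-based topic detection in the first task might look something like this. The topic-to-scope map is a hypothetical stand-in (a real one would presumably be derived from `wiki/index.md`), and the function names are illustrative.

```python
import re
from typing import Dict, List

# ASSUMPTION: illustrative topic -> /wiki-compile scope map, not a real config.
TOPIC_SCOPES: Dict[str, str] = {
    "GuruRMM": "project:gururmm",
    "Cascades": "client:cascades-tucson",
}


def detect_topics(log_text: str, topic_scopes: Dict[str, str] = TOPIC_SCOPES) -> List[str]:
    """Return topics whose name appears as a whole word in the log (case-insensitive)."""
    found = []
    for topic in topic_scopes:
        if re.search(rf"\b{re.escape(topic)}\b", log_text, re.IGNORECASE):
            found.append(topic)
    return found


def build_prompt(topics: List[str]) -> str:
    """Format the post-save prompt shown above for the detected topics."""
    return (
        f"Session log saved. Detected topics: {', '.join(topics)}.\n"
        "Update wiki for these topics? [y/n]"
    )


if __name__ == "__main__":
    topics = detect_topics("Fixed gururmm agent enrollment; reviewed Cascades billing.")
    print(build_prompt(topics))
```

Matching on whole words keeps short client names from firing on unrelated substrings, at the cost of missing abbreviations or misspellings in the log.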
---
## Migration Strategy
| What | Risk | How |
|---|---|---|
| Session logs | None | Not touched, wiki reads from them |
| CONTEXT.md | None | Not touched, wiki reads from it |
| Memory entries | Low | Wiki links to memory; memory links to wiki |
| `/context` command | Low | Additive — wiki step added before log search |
| GrepAI index | None | Automatic — picks up `wiki/` files on next index cycle |
| Cross-machine sync | None | Wiki is tracked in Git, syncs via normal `sync.sh` |
**No existing functionality is removed or broken.** All changes are additive.
---
## Recommended Start Order
1. ~~Phase 0 (structure)~~ — DONE 2026-05-24
2. Phase 1 seed:
- [x] Cascades of Tucson — `wiki/clients/cascades-tucson.md` — DONE 2026-05-24
- [x] GuruRMM — `wiki/projects/gururmm.md` — DONE 2026-05-24
- [ ] Systems: gururmm-build, jupiter, pluto, uranus, neptune (saturn is decommissioned)
- [ ] `wiki/overview.md` — compile AFTER systems are seeded (reads other wiki articles)
3. Phase 2 command — implement and test — 2-3 hours
4. Phase 3 context integration — 30 minutes
5. Phase 4 lint — schedule for after 30 days of real wiki use
6. Phase 5 save integration — nice-to-have, do if Phase 1-4 prove value
## Overview Article (added 2026-05-24)
`wiki/overview.md` — scope `/wiki-compile overview`
The "cold-start" doc. Compiles from other wiki articles (not raw logs directly), so it is cheap to maintain.
Sections:
- **Team** — Mike (admin/owner), Howard (tech/employee), machines, roles
- **Active Clients** — one-liner per client with billing type, hours, primary open project, link to wiki article
- **Active Projects** — one-liner per project with version/status, link to wiki article
- **Key Infrastructure** — what runs where (Neptune, Jupiter, Pluto, Uranus, gururmm-build, plus client-side infra)
- **Tooling Stack** — coord API, GrepAI, Ollama, SOPS vault, Gitea, Syncro PSA, GuruRMM
- **State of the Business** — current snapshot: active projects, open billing, near-term priorities
This article is the one Howard reads on a new machine before touching anything else.


@@ -0,0 +1,75 @@
# Wiki Layer — References
## Files This Feature Reads From
| File / Path | Role |
|---|---|
| `session-logs/*.md` | Raw source — root general logs |
| `projects/*/session-logs/*.md` | Raw source — project-specific logs |
| `clients/*/session-logs/*.md` | Raw source — client logs |
| `projects/*/CONTEXT.md` | Cross-reference for live state facts |
| `.claude/memory/*.md` | Cross-reference for discrete facts |
| `credentials.md` | Cross-reference for access details |
## Files This Feature Writes To
| File / Path | Role |
|---|---|
| `wiki/index.md` | LLM-maintained master index |
| `wiki/clients/<slug>.md` | Per-client synthesized knowledge |
| `wiki/projects/<slug>.md` | Per-project synthesized knowledge |
| `wiki/systems/<slug>.md` | Per-system synthesized knowledge |
| `wiki/patterns/<slug>.md` | Cross-cutting pattern articles |
## Commands This Feature Touches
| Command | Change |
|---|---|
| `.claude/commands/context.md` | Add wiki-first search step (Phase 3) |
| `.claude/commands/save.md` | Add post-save wiki update prompt (Phase 5) |
| `.claude/commands/wiki-compile.md` | New — created in Phase 2 |
| `.claude/commands/wiki-lint.md` | New — created in Phase 4 |
## Integration Points
### GrepAI
- `wiki/` is under the repo root — GrepAI watcher picks it up automatically
- No config change needed
- Wiki articles get full semantic search immediately after creation
- CLI: `D:/claudetools/grepai.exe search "cascades billing rate" --json -c -n 5`
### Ollama (Compilation Engine)
- Model: `qwen3:8b` on DESKTOP-0O8A1RL (86 tok/s, fits in 12GB VRAM)
- Model: `qwen3:14b` on other machines
- Prompt style: structured-format, YAML frontmatter first (matching the article templates in `standards.md`), then article body
- Endpoint: `http://localhost:11434` (DESKTOP-0O8A1RL) | `http://100.92.127.64:11434` (others)
- See `.claude/OLLAMA.md` for full routing table
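A compile call to the local model could be issued roughly as follows. This sketch assumes Ollama's standard `/api/generate` endpoint with streaming disabled; the helper names and prompt handling are hypothetical, and `.claude/OLLAMA.md` remains the authority on routing.

```python
import json
import urllib.request

# Endpoint and model for DESKTOP-0O8A1RL, taken from the list above.
OLLAMA_LOCAL = "http://localhost:11434"
MODEL_LOCAL = "qwen3:8b"


def build_compile_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_compile(request: urllib.request.Request, timeout: float = 300.0) -> str:
    """Send the request and return the generated text (requires a running Ollama server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    req = build_compile_request(OLLAMA_LOCAL, MODEL_LOCAL, "Summarize the session logs below...")
    print(req.full_url)
```

Separating request construction from sending makes the payload inspectable before anything touches the network, which fits the "present diff to user for review before writing" spirit of the compile flow.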
### Git / Gitea Sync
- `wiki/` is tracked in Git — syncs via `sync.sh` on every `/save` and `/sync`
- Cross-machine sync is automatic: Mike compiles on DESKTOP, Howard gets it on next pull
- No special handling needed
### Coord API
- Wiki does NOT replace coord API — they serve different purposes
- Coord API: real-time state (locks, messages, component state)
- Wiki: synthesized historical knowledge
- `/wiki-compile` may read coord API component state for currency checks but does not write to it
## Existing Relevant Docs
| Doc | Why Relevant |
|---|---|
| `.claude/CLAUDE.md` | Auto-context-loading section needs updating to reference wiki |
| `.claude/OLLAMA.md` | Compilation uses Ollama — routing table and examples |
| `.claude/FILE_PLACEMENT_GUIDE.md` | `wiki/` placement is new — add section |
| `.claude/memory/MEMORY.md` | Memory entries should link to wiki articles where applicable |
| `CONTEXT.md` (root) | Add wiki to "Where to Find Things" section |
## Karpathy Reference
Original post: https://x.com/karpathy/status/2039805659525644595
Posted: 2026-04-02
Key insight adopted: raw data → LLM-compiled wiki → Q&A against wiki → outputs filed back into wiki
Key insight NOT adopted: Obsidian IDE, RAG pipelines, finetuning (unnecessary at our scale)
Key divergence: We already have GrepAI (replaces his naive search engine) and coord API (real-time comms he lacks)


@@ -0,0 +1,70 @@
# Wiki Layer — Shape Spec
**Project:** ClaudeTools Internal Tooling
**Author:** Mike Swanson
**Date:** 2026-05-24
**Status:** Specced, awaiting implementation
---
## Problem
Session logs accumulate as raw chronological records but are never synthesized. Every session start requires re-reading 5-15 logs to reconstruct institutional knowledge about a client, project, or system. This is:
1. Slow — GrepAI helps but synthesis still happens in-session
2. Lossy — patterns and connections that span multiple logs don't surface naturally
3. Non-durable — synthesized understanding lives only in the current context window
4. Howard-unfriendly — he can't benefit from what Mike's sessions discovered without reading raw logs
The coord API handles live state (locks, messages). CONTEXT.md handles live project state. Memory handles isolated facts. Nothing handles **synthesized knowledge** — the compiled, connected, queryable layer between raw logs and live state.
## Solution: A Wiki Layer
A `wiki/` directory of LLM-compiled Markdown articles organized by topic (clients, projects, systems, patterns). The LLM writes and maintains these articles from session logs. Humans rarely touch them directly.
Key properties:
- Plain `.md` files — no new tooling, syncs via Git, indexed by GrepAI automatically
- LLM-maintained — compiled from session logs, not written by hand
- Cross-linked — backlinks between clients, projects, systems
- Queryable — `/context` checks wiki first; much faster signal than raw logs
- Auditable — compilation metadata tracks what was read and when
## Analogy to Current System
| Layer | What It Is | Purpose |
|---|---|---|
| `session-logs/` | Raw chronological records | Audit trail, raw source of truth |
| `wiki/` | **[NEW]** LLM-compiled knowledge articles | Synthesized, queryable knowledge |
| `.claude/memory/` | Isolated discrete facts | Fast-access facts (links into wiki) |
| `CONTEXT.md` | Live project state | Current state, not knowledge |
| Coord API | Real-time inter-session comms | Locks, messages, component state |
## Scope
### In Scope (Phase 1)
- `wiki/` directory structure and templates
- `/wiki-compile` command — compiles session logs into wiki articles for a given scope
- `/context` update — checks wiki before raw logs
- Seed pass: GuruRMM + Cascades as pilot (most logs, highest value)
### In Scope (Phase 2)
- `/wiki-lint` command — health-checks wiki for stale IPs, broken backlinks, orphaned articles
- `/save` integration — prompt after save: "Update wiki for [detected topics]?"
- Additional seeds: all active clients, all known systems
### Out of Scope
- Obsidian or any special viewer — editors + GrepAI are sufficient
- RAG pipelines — flat files at our scale, GrepAI handles search
- Automated compilation without review — human spot-check required
- Finetuning on wiki content — not worth the complexity
- Replacing session logs, CONTEXT.md, memory, or coord API — additive only
## Success Criteria
- Starting work on GuruRMM or Cascades takes one wiki read instead of 10 log reads
- Howard can get full context on any active client by reading one file
- A stale IP or rate change gets caught by lint before causing a session failure
- `/context` returns synthesized answers in seconds, not minutes of log spelunking


@@ -0,0 +1,243 @@
# Wiki Layer — Standards & Templates
## Article Frontmatter (all types)
```yaml
---
type: client | project | system | pattern
name: <slug>
display_name: <Human Readable Name>
last_compiled: YYYY-MM-DD
compiled_by: <session_id>
sources:
- session-logs/YYYY-MM-DD-session.md
- clients/<name>/session-logs/YYYY-MM-DD-session.md
backlinks:
- projects/<name>
- systems/<name>
---
```
`last_compiled` is used by `/wiki-compile` to find session logs newer than this date. Never edit manually.
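Reading `last_compiled` back out of an article needs only a small amount of parsing. The sketch below uses the standard library and handles just this one field; `read_last_compiled` is a hypothetical name, and a full implementation would more likely use a YAML parser.

```python
import re
from datetime import date
from typing import Optional

LAST_COMPILED = re.compile(r"last_compiled:\s*(\d{4}-\d{2}-\d{2})\s*")


def read_last_compiled(article_text: str) -> Optional[date]:
    """Return the last_compiled date from the leading frontmatter block, if present."""
    lines = article_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no frontmatter at all
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter; do not scan the article body
        match = LAST_COMPILED.fullmatch(line)
        if match:
            return date.fromisoformat(match.group(1))
    return None


if __name__ == "__main__":
    article = "---\ntype: client\nname: cascades-tucson\nlast_compiled: 2026-05-24\n---\n# Cascades of Tucson\n"
    print(read_last_compiled(article))
```

An unfilled template (literal `YYYY-MM-DD`) yields `None`, which a caller can treat the same as a brand-new article and read all logs.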
---
## Article Template: Client
File: `wiki/clients/<slug>.md`
```markdown
---
[frontmatter]
---
# <Client Display Name>
## Profile
- **Contract type:** Managed / Break-fix / Project / Prepaid block
- **Key contacts:** Name (title, email/phone)
- **Billing rate:** $X/hr | notes on exceptions
- **Hours remaining (if prepaid):** N hrs as of YYYY-MM-DD
- **Ticket system:** Syncro ticket #XXXXX (most recent active)
## Infrastructure
### Servers & Services
| Host | IP | Role | OS | Notes |
|---|---|---|---|---|
### Email & Identity
- **M365 tenant:** tenant.onmicrosoft.com
- **MX / mail flow:** ...
- **MFA status:** ...
### Network
- **ISP / WAN:** ...
- **Firewall:** ...
- **VPN:** ...
## Access
- SSH: `ssh user@IP` (key in vault: `clients/<name>/...`)
- RDP: IP:port
- Admin URL: ...
- Vault path: `clients/<name>/`
## Patterns & Known Issues
<!-- Recurring ticket types, common failure modes, things that always come up -->
## Active Work
<!-- Current open projects or tickets — brief, link to CONTEXT.md or ticket# for detail -->
## History Highlights
<!-- Major incidents, big projects, key decisions — one-liners with dates -->
## Backlinks
<!-- [[projects/name]] [[systems/name]] -->
```
---
## Article Template: Project
File: `wiki/projects/<slug>.md`
```markdown
---
[frontmatter]
---
# <Project Display Name>
## Summary
<!-- What it is, current maturity, who uses it -->
## Architecture
### Components
| Component | Location | Tech | State |
|---|---|---|---|
### Key Files & Repos
- Repo: gitea link
- Config: path
- Logs: path
## Development
### Current Focus
<!-- Active dev areas, recent decisions -->
### Patterns & Anti-Patterns
<!-- Code patterns enforced, anti-patterns discovered, reasons -->
### Build & Deploy
<!-- How to build, how to deploy, what to watch -->
## Active State
<!-- Brief — link to CONTEXT.md for live state detail -->
## History Highlights
<!-- Major milestones, pivots, incident resolutions -->
## Backlinks
<!-- [[clients/name]] [[systems/name]] -->
```
---
## Article Template: System
File: `wiki/systems/<slug>.md`
```markdown
---
[frontmatter]
---
# <Hostname>
## Identity
- **Hostname:** ...
- **IP:** ...
- **Role:** ...
- **Location:** Physical / VM on <host>
- **OS:** ...
## Specs
<!-- CPU, RAM, disk, NIC — or VM config -->
## Services
| Service | Port | Notes |
|---|---|---|
## Access
- SSH: `ssh user@IP`
- RDP: ...
- Console: ...
- Vault: `infrastructure/<name>/`
## Known Issues & Quirks
<!-- Historical problems, workarounds, things that surprised us -->
## Backlinks
<!-- [[projects/name]] [[clients/name]] -->
```
---
## Article Template: Pattern
File: `wiki/patterns/<slug>.md`
```markdown
---
[frontmatter]
---
# <Pattern Name>
## Rule
<!-- One-sentence statement of the pattern -->
## Why
<!-- Why this rule exists — incident, constraint, strong preference -->
## How to Apply
<!-- When and where this applies; edge cases -->
## Examples
<!-- Session log references where this played out -->
## Backlinks
<!-- [[projects/name]] [[clients/name]] -->
```
---
## Index File: `wiki/index.md`
LLM-maintained. Do not edit manually except to bootstrap.
```markdown
# Wiki Index
Last updated: YYYY-MM-DD
## Clients
| Article | Summary | Last Compiled |
|---|---|---|
| [Cascades of Tucson](clients/cascades-tucson.md) | Prepaid block, $175/hr, ~37.5 hrs remaining | 2026-05-24 |
## Projects
| Article | Summary | Last Compiled |
|---|---|---|
| [GuruRMM](projects/gururmm.md) | RMM platform, Rust/Axum, active development | 2026-05-24 |
## Systems
| Article | Summary | Last Compiled |
|---|---|---|
| [Neptune](systems/neptune.md) | Exchange 2016 mail server, 172.16.3.11, physically at Dataforth D2 | 2026-05-24 |
## Patterns
| Article | Summary | Last Compiled |
|---|---|---|
## Cross-Reference
<!-- Which clients use which systems, which projects run on which systems -->
```
---
## Naming Conventions
- Slugs: lowercase, hyphens, no spaces (`az-computer-guru`, `gururmm`, `cascades-tucson`)
- System slugs match hostname exactly (`neptune`, `jupiter`, `pluto`)
- Pattern slugs describe the rule (`no-mock-db-tests`, `labor-not-taxable`)
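The slug rules above are simple enough to check mechanically. The following is a sketch with hypothetical helper names; note that `slugify` only normalizes a display name, whereas the canonical slug for an existing client follows its folder name (for example `cascades-tucson`).

```python
import re

# Lowercase alphanumeric words joined by single hyphens, no spaces.
SLUG = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_slug(slug: str) -> bool:
    """Check a slug against the naming convention above."""
    return SLUG.fullmatch(slug) is not None


def slugify(display_name: str) -> str:
    """Lowercase the name and collapse every non-alphanumeric run into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", display_name.lower()).strip("-")


if __name__ == "__main__":
    print(slugify("AZ Computer Guru"), is_valid_slug("cascades-tucson"))
```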
## Linking Convention
Use `[[slug]]` for cross-references within wiki. Full relative path in frontmatter `backlinks` field.
## What the LLM Should NOT Put in Wiki Articles
- Credentials or passwords (vault paths only)
- Full session log transcripts (summaries and highlights only)
- Speculative or uncertain facts (if one must be recorded, mark it `[unverified]`)
- Real-time state (IPs are OK, but current ticket status belongs in coord API)


@@ -0,0 +1,113 @@
# Session Log — 2026-05-24 Wiki Layer Implementation
## User
- **User:** Mike Swanson (mike)
- **Machine:** DESKTOP-0O8A1RL
- **Role:** admin
- **Session span:** Afternoon
---
## Session Summary
Mike shared a tweet from Andrej Karpathy describing an LLM knowledge base workflow: raw source documents compiled by an LLM into a markdown wiki, then queried via LLM agents. Mike asked whether this approach was applicable to ClaudeTools and, if beneficial, how to implement it.
Analysis confirmed strong applicability. ClaudeTools already had the raw data layer (session logs) and live state layer (CONTEXT.md, coord API), but was missing a compiled knowledge layer in between. Every session start required re-synthesizing context from 5-15 raw logs — wasteful, lossy, and Howard-hostile. The wiki layer fills exactly that gap.
A full spec was written (.claude/specs/wiki-layer/), covering shape, phased implementation plan, article standards/templates, and all integration references. Phase 0 (directory structure, templates, CLAUDE.md wiring) was completed in-session. Phase 1 (seed pass) was also completed in-session for the two highest-value targets: Cascades of Tucson and GuruRMM.
Two parallel agents read all available session logs, memory files, CONTEXT.md, and infrastructure docs for each target and synthesized full wiki articles. Both articles were reviewed and written to disk. During review, several hostname/IP assumption errors were caught through user correction — Neptune is the old ACG mail server (physically at Dataforth D2, not the ACG office), Jupiter is 172.16.3.20 (not .30), and the machine at 172.16.3.30 is the GuruRMM VM (hostname: gururmm-build). A full IP/hostname audit was done against credentials.md, pluto.md, and infra_office_network.md to produce a verified map. Neptune's unusual situation (ACG infrastructure at a client's physical location, with a subnet overlap TODO) was captured in both the memory file and wiki index.
Mike also proposed a top-level overview.md article (cold-start orientation doc: team, all clients, all projects, key infra, tooling). Added to spec plan as Phase 1 final step, compiling from other wiki articles rather than raw logs.
---
## Key Decisions
- **Wiki compiles from wiki, not raw logs, for `overview.md`** — overview article reads other wiki articles as its source material, making it cheap to maintain and always internally consistent. Goes last in the seed queue, after systems are compiled.
- **`/wiki-compile overview` scope added** — distinct scope that reads compiled wiki articles rather than session logs directly.
- **Slug: `cascades-tucson` not `cascades`** — canonical client folder is `clients/cascades-tucson/`; wiki article named to match.
- **System slug: `gururmm-build`** — the GuruRMM VM at 172.16.3.30 uses this slug throughout the wiki, matching its SSH hostname.
- **Neptune is active, not decommissioned** — Exchange 2016, active mail for multiple ACG-hosted clients, physically at Dataforth D2. Distinct from Saturn (decommissioned) and from ACG office infrastructure.
- **Dataforth UDM subnet overlap flagged as TODO** — Dataforth UDM uses overlapping 172.16.x.x addressing with ACG office LAN, forcing Neptune access through D2TESTNAS. Captured as a TODO in memory and wiki index for eventual resubnetting.
- **credentials.md NPM proxy is stale** — entry shows `rmm-api.azcomputerguru.com → 172.16.3.20:3001` but GuruRMM API is on 172.16.3.30 (migrated from Jupiter container to own VM). Stale entry noted in wiki backlinks; NPM should be updated on Jupiter.
---
## Problems Encountered
- **Multiple hostname/IP assumption errors** — Initial wiki entries incorrectly associated Neptune with 172.16.3.30, then Jupiter with 172.16.3.30. Caught by user through two rounds of correction. Root cause: Claude synthesized from session logs without consulting the authoritative machines/ and memory/ files first. Resolution: read credentials.md, pluto.md, and infra_office_network.md to produce verified map before writing any system references.
- **Cascades client folder path** — Agent correctly identified that `clients/cascades/` does not exist; canonical path is `clients/cascades-tucson/`. Wiki article named accordingly.
- **Saturn in GuruRMM fleet** — Agent found "Saturn" listed as an enrolled GuruRMM agent but Saturn is decommissioned (IP 172.16.3.21 reused by Uranus, Apr 2026). Flagged in wiki article as possibly stale or actually Uranus.
---
## Configuration Changes
**Created:**
- `.claude/specs/wiki-layer/shape.md` — problem, solution, scope, success criteria
- `.claude/specs/wiki-layer/plan.md` — phased implementation plan (Phases 0-5) + overview article spec
- `.claude/specs/wiki-layer/standards.md` — article templates (client, project, system, pattern), index format, naming rules
- `.claude/specs/wiki-layer/references.md` — all files read/written, integration points, Karpathy reference
- `wiki/index.md` — master wiki index with compilation queue
- `wiki/clients/.gitkeep`
- `wiki/projects/.gitkeep`
- `wiki/systems/.gitkeep`
- `wiki/patterns/.gitkeep`
- `wiki/_templates/client.md`
- `wiki/_templates/project.md`
- `wiki/_templates/system.md`
- `wiki/_templates/pattern.md`
- `wiki/clients/cascades-tucson.md` — full synthesized client article (25+ session logs, 7 memory files, 5 infra docs)
- `wiki/projects/gururmm.md` — full synthesized project article (37 session logs, multiple memory/doc files)
**Modified:**
- `.claude/CLAUDE.md` — auto-context-loading table updated (wiki checked first for client/project/system mentions); Context Recovery section updated; Commands table updated with `/wiki-compile` and `/wiki-lint`
- `.claude/FILE_PLACEMENT_GUIDE.md` — wiki row added to quick reference table
- `.claude/memory/infra_office_network.md` — Neptune section added (physical location at Dataforth D2, subnet overlap TODO, routing note)
---
## Credentials & Secrets
None created or discovered this session.
---
## Infrastructure & Servers
**Verified IP/hostname map (from credentials.md + pluto.md + infra_office_network.md):**
| IP | Hostname | Role |
|---|---|---|
| 172.16.0.1 | pfSense | Router, DNS, Tailscale subnet router |
| 172.16.3.20 | Jupiter | Unraid NAS; virsh host for all VMs; Docker: Gitea, NPM, Seafile |
| 172.16.3.21 | Uranus | OwnCloud additional storage only; Dell R730xd; formerly Saturn's IP |
| 172.16.3.30 | gururmm-build (GuruRMM VM) | Linux VM on Jupiter; GuruRMM API 3001, ClaudeTools API 8001, Coord API, MariaDB, PostgreSQL, build pipeline |
| 172.16.3.36 | Pluto / Claude-Builder | Windows Server 2019 virsh VM on Jupiter; Windows MSI/cargo build server |
| 172.16.3.11 | neptune.acghosting.com | Exchange 2016; ACG infra physically at Dataforth D2; 67.206.163.124 external |
| (none; formerly 172.16.3.21) | Saturn | DECOMMISSIONED; IP reused by Uranus Apr 2026 |
**Stale entry to fix:** NPM on Jupiter still proxies `rmm-api.azcomputerguru.com → 172.16.3.20:3001`. Should point to 172.16.3.30:3001.
---
## Pending / Incomplete Tasks
- **Wiki seed — systems:** `system:gururmm-build`, `system:jupiter`, `system:pluto`, `system:uranus` — all in compilation queue in wiki/index.md
- **Wiki seed — overview:** `wiki/overview.md` — compile AFTER systems seeded; reads other wiki articles
- **wiki/commands/wiki-compile.md** — command skill not yet implemented (Phase 2)
- **wiki/commands/wiki-lint.md** — command skill not yet implemented (Phase 4)
- **NPM stale proxy:** Update `rmm-api.azcomputerguru.com` proxy on Jupiter from 172.16.3.20:3001 → 172.16.3.30:3001
- **Dataforth UDM resubnet:** TODO captured in memory and wiki index; eliminates Neptune routing workaround via D2TESTNAS
- **Saturn GuruRMM agent:** Verify whether enrolled "Saturn" agent is stale or is actually Uranus; clean up if stale
---
## Reference Information
- Wiki spec: `.claude/specs/wiki-layer/`
- Wiki root: `wiki/`
- Karpathy post: https://x.com/karpathy/status/2039805659525644595 (2026-04-02)
- Cascades wiki article: `wiki/clients/cascades-tucson.md` (last_compiled: 2026-05-24)
- GuruRMM wiki article: `wiki/projects/gururmm.md` (last_compiled: 2026-05-24)

56
wiki/_templates/client.md Normal file

@@ -0,0 +1,56 @@
---
type: client
name: <slug>
display_name: <Human Readable Name>
last_compiled: YYYY-MM-DD
compiled_by: <session_id>
sources: []
backlinks: []
---
# <Client Display Name>
## Profile
- **Contract type:** Managed | Break-fix | Project | Prepaid block
- **Key contacts:** Name (title, email/phone)
- **Billing rate:** $X/hr
- **Hours remaining (if prepaid):** N hrs as of YYYY-MM-DD
- **Active ticket:** Syncro #XXXXX
## Infrastructure
### Servers & Services
| Host | IP | Role | OS | Notes |
|---|---|---|---|---|
### Email & Identity
- **M365 tenant:** tenant.onmicrosoft.com
- **MX / mail flow:** ...
- **MFA status:** ...
### Network
- **ISP / WAN:** ...
- **Firewall:** ...
- **VPN:** ...
## Access
- SSH: `ssh user@IP` (key in vault: `clients/<name>/...`)
- RDP: IP:port
- Admin URL: ...
- Vault path: `clients/<name>/`
## Patterns & Known Issues
*(Recurring ticket types, common failure modes, things that always come up)*
## Active Work
*(Current open projects or tickets — brief, link to CONTEXT.md or ticket# for detail)*
## History Highlights
*(Major incidents, big projects, key decisions — one-liners with dates)*
## Backlinks
*(Other wiki articles related to this client)*

wiki/_templates/pattern.md Normal file

@@ -0,0 +1,31 @@
---
type: pattern
name: <slug>
display_name: <Pattern Name>
last_compiled: YYYY-MM-DD
compiled_by: <session_id>
sources: []
backlinks: []
---
# <Pattern Name>
## Rule
*(One-sentence statement of the pattern — the thing to always do or never do)*
## Why
*(Why this rule exists — the incident, constraint, or hard-learned lesson behind it)*
## How to Apply
*(When and where this applies. Edge cases. What "good" looks like vs. what to avoid)*
## Examples
*(Session log references where this played out — dates and brief context)*
## Backlinks
*(Projects, clients, or systems where this pattern is especially relevant)*

wiki/_templates/project.md Normal file

@@ -0,0 +1,50 @@
---
type: project
name: <slug>
display_name: <Human Readable Name>
last_compiled: YYYY-MM-DD
compiled_by: <session_id>
sources: []
backlinks: []
---
# <Project Display Name>
## Summary
*(What it is, current maturity, who uses it, what problem it solves)*
## Architecture
### Components
| Component | Location | Tech | State |
|---|---|---|---|
### Key Files & Repos
- **Repo:** gitea link
- **Config:** path
- **Logs:** path
- **API:** URL
## Development
### Current Focus
*(Active dev areas, recent decisions, in-flight work)*
### Patterns & Anti-Patterns
*(Code patterns enforced, anti-patterns discovered, reasons — reference memory entries where applicable)*
### Build & Deploy
*(How to build, how to deploy, what to watch for, rollback procedure)*
## Active State
*(Brief current state — link to CONTEXT.md for live detail; do not duplicate live state here)*
## History Highlights
*(Major milestones, pivots, incident resolutions — one-liners with dates)*
## Backlinks
*(Other wiki articles related to this project)*

42
wiki/_templates/system.md Normal file

@@ -0,0 +1,42 @@
---
type: system
name: <hostname>
display_name: <Hostname>
last_compiled: YYYY-MM-DD
compiled_by: <session_id>
sources: []
backlinks: []
---
# <Hostname>
## Identity
- **Hostname:** ...
- **IP:** ...
- **Role:** ...
- **Location:** Physical | VM on <host>
- **OS:** ...
- **Tailscale IP:** ... (if applicable)
## Specs
*(CPU, RAM, disk, NIC — or VM config: vCPU, vRAM, virtual disk)*
## Services
| Service | Port | Notes |
|---|---|---|
## Access
- **SSH:** `ssh user@IP` (key in vault: `infrastructure/<name>/...`)
- **RDP:** IP:port (if applicable)
- **Console:** iDRAC / Proxmox / Unraid UI / etc.
- **Vault path:** `infrastructure/<name>/`
## Known Issues & Quirks
*(Historical problems, workarounds, things that have surprised us — the stuff not in any doc)*
## Backlinks
*(Projects running on this system, clients whose infra lives here)*

0
wiki/clients/.gitkeep Normal file

wiki/clients/cascades-tucson.md Normal file

@@ -0,0 +1,258 @@
---
type: client
name: cascades-tucson
display_name: Cascades of Tucson
last_compiled: 2026-05-24
compiled_by: DESKTOP-0O8A1RL/claude-main
sources:
- session-logs/2026-03-24-session.md
- session-logs/2026-03-31-session.md
- session-logs/2026-04-01-session.md
- session-logs/2026-04-16-session.md
- session-logs/2026-04-16-howard-client-docs-import.md
- session-logs/2026-04-17-session.md
- session-logs/2026-04-17-howard-session.md
- session-logs/2026-04-18-session.md
- session-logs/2026-04-20-session.md
- session-logs/2026-04-20-mac-session.md
- session-logs/2026-04-21-mac-vault-setup.md
- session-logs/2026-04-21-howard-remediation-vault-gap.md
- session-logs/2026-04-28-session.md
- session-logs/2026-04-29-session.md
- session-logs/2026-04-30-session.md
- session-logs/2026-05-01-session.md
- session-logs/2026-05-01-howard-syncro-billing-batch-and-tmp-path-incident.md
- session-logs/2026-05-10-session.md
- session-logs/2026-05-18-session.md
- session-logs/2026-05-18-howard-billing-review-and-ticket-updates.md
- session-logs/2026-05-20-session.md
- session-logs/2026-05-21-session.md
- session-logs/2026-05-23-session.md
- session-logs/2026-05-24-GURU-KALI-session.md
- clients/cascades-tucson/session-logs/2026-05-22-session.md
- clients/cascades-tucson/docs/overview.md
- clients/cascades-tucson/docs/network/topology.md
- clients/cascades-tucson/docs/network/vlans.md
- clients/cascades-tucson/docs/servers/cs-server.md
- clients/cascades-tucson/docs/billing-log.md
- .claude/memory/project_cascades_admin_accounts.md
- .claude/memory/project_cascades_ca_phased_rollout.md
- .claude/memory/project_cascades_pilot_cleanup.md
- .claude/memory/feedback_syncro_cascades_contact.md
- .claude/memory/feedback_cascades_user_security_group.md
- .claude/memory/project-cascades-migration-plan.md
- .claude/memory/feedback_cascades_folder_redirect.md
backlinks:
- projects/gururmm
---
# Cascades of Tucson
Senior living / assisted living facility in Tucson, AZ. Single 6-floor building plus a MemCare (Memory Care) wing on floors 5-6. ACG took over from a previous MSP. Primary compliance driver is HIPAA. Active multi-phase migration project ongoing as of 2026-05-24.
---
## Profile
- **Contract type:** Prepaid hour block
- **Key contacts:**
- Winter — front desk / billing; handles invoice processing and prepaid block purchases
- Meredith Kuhn — Assistant Manager (ASSISTMAN-PC); internal billing contact. **NEVER set her as ticket contact in Syncro** — she is the wrong default that keeps being selected.
- John Trozzi — Maintenance staff, Mac at 201cascades@gmail.com (shared facility account)
- Lauren Hasselman — Accounting
- Crystal Rodriguez — staff
- Sharon Edwards — Life Enrichment Assistant (DESKTOP-DLTAGOI)
- Ashley Jensen — Accountant (DESKTOP-U2DHAP0)
- Shelby Trozzi — MemCare Director (MDIRECTOR-PC)
- **Billing rate:** $175/hr all labor (prepaid block customer)
- **Hours remaining:** ~37.5 hrs as of 2026-05-20. Always live-check via `GET /customers/20149445` before billing — balance is unreliable across sessions. [verify]
- **Syncro customer ID:** 20149445
- **Active tickets:**
- #110680053 — Dept-by-dept domain migration (primary active project; plan: `C:\Users\Howard\.claude\plans\wise-discovering-panda.md`)
- #109412123 — Entra setup project (may be invoiced as of 2026-05-18; verify status)
- #109225085 — Yealink phone inventory
- #109035475 — John Trozzi desktop WiFi upgrade (billed)
---
## Infrastructure
### Servers & Services
| Host | IP | Role | OS | Notes |
|---|---|---|---|---|
| CS-SERVER | 192.168.2.254 | DC, DNS, DHCP (no scopes), File Server, Hyper-V host, Print Server | Windows Server 2019 Standard | Dell PowerEdge R610 (~2009 hardware, 16+ years old). **Single DC — CRITICAL risk. No backup.** GuruRMM agent ID: `6766e973-e703-47c1-be56-76950290f87c` |
| CS-SERVER iDRAC | 192.168.2.65 | Out-of-band management | — | Dell OOB interface |
| CS-QB (Hyper-V VM on CS-SERVER) | 192.168.2.228 | VoIP server | — | Phones go down if R610 dies |
| cascadesDS (Synology NAS) | 192.168.0.120 | NAS / legacy file storage | DSM | Port 5000 HTTP. Workgroup name is "CASCADES" — same as AD short name, causing Kerberos auth failures from domain-joined machines. Slated to become backup-only. |
| pfSense Firewall | 192.168.0.1 | Perimeter firewall, inter-VLAN routing | pfSense 24.0 | Dual-WAN. All DHCP served here (CS-SERVER DHCP role has no scopes). MAC: 00:f1:f5:34:b3:4a |
**[WARNING] CS-SERVER hardware:** Dell R610 with mixed SATA laptop drives (OS array, no hot spare) and enterprise SAS drives from 2015-2016. No backup exists. No second DC. Hardware will fail — DC migration is urgent.
**[WARNING] HIPAA violation:** No backup for CS-SERVER (§164.308(a)(7)). Synology Active Backup for Business is blocked (ext4 filesystem, not Btrfs).
### Email & Identity
- **M365 tenant:** cascadestucson.com | Tenant ID: `207fa277-e9d8-4eb7-ada1-1064d2221498`
- **M365 license:** Business Standard (34 seats). Business Premium upgrade proposed (net -$56.50/mo savings after shared mailbox cleanup). 31 SPB seats reportedly free as of 2026-05-22 — relicensing time-sensitive.
- **On-prem AD domain:** cascades.local | UPN suffix: cascadestucson.com (added 2026-04-13 for Entra Connect SSO readiness)
- **MX / mail flow:** Exchange Online (M365). SPF strict (`-all`). DKIM: both M365 selectors published. DMARC: `p=none` (monitoring only) — **action needed: upgrade to `p=quarantine`**. DMARC reports to `info@cascadestucson.com` (unmonitored).
- **MFA:** CA policy "Require MFA for all users" is enabled. Caregiver bypass pilot in progress — caregivers cannot satisfy MFA (no personal device), so three scoped CA policies use BLOCK instead. See Patterns section.
- **Entra Connect:** Installed on CS-SERVER in staging mode as of 2026-04-25. **Not yet exited staging.** Exit from staging is a pending task.
- **Break-glass accounts:** Two planned (`breakglass1-csc@cascadestucson.com`, `breakglass2-csc@cascadestucson.com`). FIDO2 YubiKeys ordered. Vault entries not yet created. [unverified — check if YubiKeys arrived and accounts created]
- **Admin accounts:**
- `admin@cascadestucson.com` — Mike's working admin (cloud-only, Connect-excluded by design)
- `sysadmin@cascadestucson.com` — Howard's working admin (cloud-only, Connect-excluded by design)
- **ALIS (clinical SaaS):** https://www.go-alis.com/ — Entra SSO configured but **BLOCKED on Medtelligent enabling it** on Cascades tenant. App registration values ready in vault: `clients/cascades-tucson/alis-sso-app-registration.sops.yaml`.
- **Yealink SDM:** 16 SIP-T54W phones via YMCS portal. SDM token success 2026-05-08. ~30 phones still to roll as of 2026-05-10. [unverified — check current count]
- **Audit retention:** Approved 2026-04-29. Azure Log Analytics (90d) + Storage Account (6yr) in ACG subscription `e507e953-2ce9-4887-ba96-9b654f7d3267`, RG `rg-audit-cascadestucson`. **Not yet built.** Runbook: `.claude/skills/remediation-tool/references/audit-retention-runbook.md`.
### Network
- **ISP / WAN:** Dual-WAN Cox Fiber (primary, static `184.191.143.62/30`, gateway `184.191.143.61`) + Cox Coax (secondary, DHCP `72.211.21.217`). Both WAN IPs added as Cascades Named Location in Entra (ID: `061c6b06-b980-40de-bff9-6a50a4071f6f`).
- **Firewall:** pfSense 24.0 at 192.168.0.1. All DHCP. Inter-VLAN routing. 236 resident room VLANs (per-room /28, `10.[floor].[room].0/28`). Staff/infra VLAN 20 (`10.0.20.0/24`, gateway `10.0.20.1`). Guest VLAN 50 (`10.0.50.0/24`, RFC1918 blocked).
- **Switching:** Full UniFi. 82 APs + 5 managed switches (1st Floor USW-48 PoE core; floors 2-4 USW-Pro-24-PoE; MemCare USW-Pro-24-PoE; USW Lite 8 PoE; USW-16-PoE VoIP switch). Floors 2/3/4 switches pending hardware replacement.
- **WiFi SSIDs:**
- CSCNet — staff, VLAN 20
- CSC ENT — legacy SSID, main LAN (192.168.0.0/22), being deprecated as migration proceeds
- Guest — isolated, VLAN 50
- **VoIP:** AudioCodes phones (8 units) on USW-16-PoE. CS-QB VM at 192.168.2.228. Not MSP-managed but infra must stay static.
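The per-room addressing scheme above (`10.[floor].[room].0/28`) can be illustrated with a short sketch. It assumes the floor and room values map directly onto the second and third octets, which the notes imply but do not spell out; `room_subnet` and `collides_with_shared_vlans` are hypothetical helper names.

```python
import ipaddress

# Shared VLANs as documented in the Network section.
STAFF_VLAN = ipaddress.ip_network("10.0.20.0/24")
GUEST_VLAN = ipaddress.ip_network("10.0.50.0/24")


def room_subnet(floor, room):
    """Return the /28 for one resident room (assumed octet mapping)."""
    if not (0 <= floor <= 255 and 0 <= room <= 255):
        raise ValueError("floor and room must each fit in one octet")
    return ipaddress.ip_network(f"10.{floor}.{room}.0/28")


def collides_with_shared_vlans(net):
    """True if a room subnet would overlap the staff or guest VLAN."""
    return net.overlaps(STAFF_VLAN) or net.overlaps(GUEST_VLAN)


if __name__ == "__main__":
    net = room_subnet(3, 12)
    # A /28 holds 16 addresses, 14 of them usable by hosts.
    print(net, net.num_addresses, len(list(net.hosts())))
    print(collides_with_shared_vlans(net))
```

A check like this is mainly useful when adding rooms: any floor/room pair that lands inside `10.0.20.0/24` or `10.0.50.0/24` would collide with the staff or guest VLAN.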
---
## Access
- **CS-SERVER:** Via ScreenConnect or GuruRMM (agent ID: `6766e973-e703-47c1-be56-76950290f87c`)
- **CS-SERVER iDRAC:** 192.168.2.65
- **pfSense admin:** https://192.168.0.1 — vault: `clients/cascades-tucson/pfsense-firewall.sops.yaml`
- **Synology DSM:** http://192.168.0.120:5000 — vault: `clients/cascades-tucson/` (existing entry)
- **M365 admin:** admin@cascadestucson.com — vault: `clients/cascades-tucson/m365-admin.sops.yaml`
- **M365 sysadmin:** sysadmin@cascadestucson.com — vault: `clients/cascades-tucson/m365-sysadmin.sops.yaml`
- **WiFi CSCNet:** vault: `clients/cascades-tucson/wifi-cscnet.sops.yaml`
- **MDM service account:** vault: `clients/cascades-tucson/mdm-service-account.sops.yaml`
- **ALIS SSO app registration:** vault: `clients/cascades-tucson/alis-sso-app-registration.sops.yaml`
- **GuruRMM — RECEPTIONIST-PC:** agent ID `9c91d324-1073-449c-8cc0-45c5bccfc218` (flaky WebSocket, may lag fleet updates)
- **Yealink YMCS portal:** https://us.ymcs.yealink.com/manager/login — vault: `infrastructure/voip-phones.sops.yaml`
- **Remediation tool:** Still on old app `fabb3421` (ComputerGuru - AI Remediation) as of 2026-04-20. New tiered app suite not yet consented. [unverified — check if consented since then]
- **Vault root:** `clients/cascades-tucson/` in vault repo
---
## Patterns & Known Issues
### Syncro / Billing
- **NEVER set a contact on Cascades tickets.** Leave `contact_id` blank. Blank routes notifications to the correct distribution emails. Setting any contact (Meredith Kuhn is the recurring wrong default) overrides distribution. Source: `feedback_syncro_cascades_contact.md`.
- **Billing product for prepaid block draw:** Use a real labor type (Remote, Onsite, etc.) — NOT "Prepaid project labor" (exempt, won't decrement the block).
- **Always live-check hours before billing:** `GET /customers/20149445` in Syncro. The 2026-05-01 invoice debit may not have fired correctly — treat all cached hour counts as approximate.
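A minimal sketch of the live-check rule above, assuming the Syncro REST API is reached at `https://<subdomain>.syncromsp.com/api/v1`, that the key is sent as a bearer token, and that the customer payload exposes the block balance as `customer.prepay_hours`. All three are assumptions to confirm against the Syncro API docs; the `acg` subdomain and the helper names are hypothetical.

```python
import json
import urllib.request


def build_customer_request(subdomain, customer_id, api_key):
    """Build (but do not send) the GET /customers/<id> request."""
    url = f"https://{subdomain}.syncromsp.com/api/v1/customers/{customer_id}"
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed auth style
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def prepaid_hours(payload):
    """Extract the remaining block hours (assumed field name)."""
    return float(payload["customer"]["prepay_hours"])


def can_bill(payload, hours_to_bill):
    """Refuse to bill more than the live balance shows."""
    return prepaid_hours(payload) >= hours_to_bill


def fetch_customer(request):
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Typical use would be `payload = fetch_customer(build_customer_request("acg", 20149445, key))` followed by `can_bill(payload, hours)`, so the billing decision never relies on a balance cached in a session log.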
### Active Directory / User Management
- **Security group assignment is always explicit.** When creating or adding any Cascades user, always ask which security group(s). OU → group auto-mirror was explicitly declined 2026-05-14. OU placement controls Entra Connect sync scope; group membership controls CA policy — two separate deliberate decisions. Source: `feedback_cascades_user_security_group.md`.
- **New user mandatory order (folder redirection):**
  1. Create AD user
  2. Run `New-HomeFolder -Username "<sam>"` on CS-SERVER (creates root + Desktop/Documents/Downloads/Music/Pictures with correct ACL)
  3. Add to SG-FolderRedirect
  4. THEN first domain logon
  - Skipping step 2 causes fdeploy to cache a failure silently and never retry. Source: `feedback_cascades_folder_redirect.md`.
- **Folder redirect recovery:** If fdeploy cached a failure ("No changes detected"), run `clients/cascades-tucson/scripts/fix-shell-redirect.ps1` via GuruRMM while user is logged in. Must set both GUID-based and legacy-name registry keys. Folders must already exist on server.
- **fdeploy1.ini flags:** Changed from `Flags=1211` (included `Grant Exclusive Rights` bit 0x400, causing WRITE_DAC failures on new subfolders) to `Flags=187`. File at `{512B43A4-F049-4CE5-BFAC-860AD13E92BE}\User\Documents & Settings\fdeploy1.ini` on CS-SERVER.
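The flag change above is plain bit arithmetic: clearing the `0x400` bit from 1211 gives 187. The sketch below only checks that arithmetic; the constant name is illustrative, and the bit's meaning is taken from the note above rather than from any Microsoft reference.

```python
# Bit identified in the notes above as "Grant Exclusive Rights".
GRANT_EXCLUSIVE_RIGHTS = 0x400

OLD_FLAGS = 1211  # value before the fix
NEW_FLAGS = 187   # value after the fix


def clear_flag(flags, bit):
    """Return flags with the given bit cleared."""
    return flags & ~bit


if __name__ == "__main__":
    print(hex(OLD_FLAGS), hex(NEW_FLAGS))
    print(clear_flag(OLD_FLAGS, GRANT_EXCLUSIVE_RIGHTS) == NEW_FLAGS)
```

Because the two values differ in that single bit, every other redirection setting encoded in `Flags` is unchanged.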
### Conditional Access / Caregiver Pilot
- **Phased rollout — never tenant-wide.** CA policies for caregivers target `SG-Caregivers-Pilot` only (then `SG-Caregivers` after Entra Connect exits staging). The legacy "Require MFA for all users" policy stays in place. Expansion to other departments uses PATCH on `excludeGroups`, never replace. Source: `project_cascades_ca_phased_rollout.md`.
- **Caregiver CA policy set:**
  - PATCH legacy MFA-all-users: add `SG-Caregivers-Pilot` to excludeGroups
  - CREATE `CSC - Block caregivers off Cascades network` (BLOCK if location not Cascades)
  - CREATE `CSC - Block caregivers on non-compliant device` (BLOCK if device non-compliant)
  - CREATE `CSC - Caregiver sign-in frequency 8h`
- **GDAP exclusion:** CA policy 3 must exclude "Service provider users" (GDAP foreign principals) + `SG-External-Signin-Allowed` + `SG-Break-Glass`, otherwise ACG partner admins lose access at CA cutover.
- **Pilot cleanup required when done:** Delete `pilot.test@cascadestucson.com`, clean up `howard.enos@cascadestucson.com`, remove `SG-Caregivers-Pilot` from CA policy targets and delete the group. Source: `project_cascades_pilot_cleanup.md`.
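The "PATCH on `excludeGroups`, never replace" rule can be sketched as a merge step: read the current policy, append the new group ID to the existing exclusions, and send only that merged list. The nesting `conditions.users.excludeGroups` follows the Microsoft Graph conditional access policy resource. Whether Graph accepts a partial `conditions` object in a PATCH should be verified before use, and the helper name and group IDs below are hypothetical.

```python
def build_exclude_patch(policy, group_id):
    """Build a PATCH body that adds group_id without dropping exclusions.

    `policy` is the policy as previously fetched (a dict); it is not mutated.
    """
    users = policy.get("conditions", {}).get("users", {})
    merged = list(users.get("excludeGroups", []))
    if group_id not in merged:
        merged.append(group_id)
    return {"conditions": {"users": {"excludeGroups": merged}}}


if __name__ == "__main__":
    current = {
        "displayName": "Require MFA for all users",
        "conditions": {"users": {"excludeGroups": ["id-break-glass"]}},
    }
    print(build_exclude_patch(current, "id-caregivers-pilot"))
```

Sending a body with only the new group would overwrite the array and silently drop earlier exclusions such as a break-glass group, which is the failure the rule guards against.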
### Security Incidents (historical)
- **Megan Hiatt (2026-04-16):** Active credential-stuffing — 126 failed sign-ins, bursts from Belfast GB, Hamburg DE. Password reset and SMTP AUTH disable were action items. Mailbox was clean (not breached).
- **John Trozzi (2026-04-16, 2026-04-20):** Investigated twice — both times NO BREACH. First: credential stuffing flag (clean). Second: inbound phishing email (clean). Reports in `clients/cascades-tucson/reports/`.
- **Crystal Rodriguez (2026-04-19):** Phishing investigation. Report: `clients/cascades-tucson/reports/2026-04-19-crystal-rodriguez-phish-investigation.md`.
- **Canva email delivery (2026-05-20):** Alma Montt not receiving Canva invites. Resolved by adding canva.com domains to AllowedSenderDomains in EOP policies.
- **dunedolly21@gmail.com:** External guest invited 2026-04-14 by Lauren Hasselman from mobile. Status unknown — confirm with Lauren. [unverified]
### HIPAA Compliance
- **Primary objective.** Cascades stores PHI on CS-SERVER and uses ALIS for clinical records.
- **Critical open gaps:** No backup (§164.308(a)(7)); no audit logging on D:\Homes (§164.312(b)); Object Access auditing disabled; no SMB encryption on homes share; no file access auditing.
- **Restored 7 deleted mailboxes (2026-04-25)** to satisfy the 7-year retention policy adopted under HIPAA §164.316(b)(2) (the rule itself sets a 6-year minimum for required documentation).
- **Termination policy established:** Convert to shared mailbox, hide from GAL, retain 7 years.
---
## Active Work
Primary active project as of 2026-05-24: dept-by-dept domain migration (Syncro #110680053).
**Migration phase status (approx. as of 2026-05-22):**
| Machine / User | Status |
|---|---|
| Sharon Edwards (DESKTOP-DLTAGOI) | Domain-joined, folder redirect working via registry workaround |
| Ashley Jensen (DESKTOP-U2DHAP0) | Domain-joined, folder redirect incomplete (manually fixed) |
| RECEPTIONIST-PC (frontdesk) | Domain-joined 2026-05-22; loopback Replace mode, no folder redirect by design |
| NURSESTATION-PC | Domain-joined, folder redirect complete |
| Lauren Hasselman | Passwords didn't work 2026-05-21, machine not accessible — pending |
| DESKTOP-KQSL232, CHEF-PC, SALES4-PC, MDIRECTOR-PC | Not yet started |
**Blocking issues / pending:**
- Entra Connect: exit staging (requires OU=Administrative UPN changes + cascadestucson.com UPN suffix for that OU)
- M365 relicensing: 31 Business Standard → Business Premium (time-sensitive, 31 SPB seats reportedly free)
- ALIS SSO: blocked on Medtelligent
- Break-glass accounts: not created
- Audit retention infra: not built
- RECEPTIONIST-PC GuruRMM agent (9c91d324): flaky WebSocket, lagging fleet
---
## History Highlights
| Date | Event |
|---|---|
| 2026-03-06 | ACG onboarding begins. Initial audit (CS-SERVER Dell R610, pfSense, UniFi, Synology). 19 machines. No backup, no HIPAA compliance. |
| 2026-03-09 | AD security hardening: Monica Ramirez removed from Domain Admins, lockout policy fixed, AD Recycle Bin enabled, MachineAccountQuota set to 0. |
| 2026-03-31 | Cascades onboarded to remediation tool. Tenant ID documented. 50 users, Secure Score 34%. |
| 2026-04-13 | Major onsite: 13 stale AD accounts deleted, OU structure cleaned, UPNs migrated to cascadestucson.com, Homes share created, Folder Redirection GPO deployed (registry workaround), first domain joins. |
| 2026-04-14 | Sandra Fish global admin revoked. ALIS SSO confirmed. Business Premium proposal created. |
| 2026-04-16 | Breach checks: Megan Hiatt (credential stuffing, not breached; password reset). John Trozzi (clean). Crystal Rodriguez phish. /remediation-tool skill built. |
| 2026-04-17 | Howard onsite: folder redirect Sharon Edwards diagnosis. John Trozzi WiFi (TP-Link + UniFi roaming instability). |
| 2026-04-25 | Entra Connect installed on CS-SERVER (staging mode). 7 deleted mailboxes restored for HIPAA. Dual-WAN discovered. |
| 2026-04-28-29 | CA policy reconciliation. Audit retention architecture (ACG-billed, LAW 90d + Storage 6yr). Break-glass design (2 accounts, YubiKeys). Caregiver pilot scope corrected (phased only). |
| 2026-04-30 | CA rollout (Report-only mode): 3 caregiver policies created. SDM bootstrap. |
| 2026-05-01 | Howard billed 33.5 hrs against prepaid block on Entra project ticket #32214 ($0 invoice). |
| 2026-05-07-08 | SDM phone provisioning. SDM token success. ALIS SSO app registration values captured to vault. |
| 2026-05-14-16 | Caregiver AD accounts created. Security groups always deliberate (no OU→group automation). Wireless diagnostic. |
| 2026-05-18 | Billing review. 39.5 hrs remaining before session. 7 hrs billed separately. |
| 2026-05-20 | Canva email delivery resolved (canva.com domains added to EOP). |
| 2026-05-21 | Lauren Hasselman + Crystal Rodriguez domain join attempted — passwords didn't work. Comment posted to migration ticket. |
| 2026-05-22 | Ashley Jensen domain-joined. RECEPTIONIST-PC domain-joined. GPO ILT fixes (FrontDesk printer + R: drive). cascadesDS auth failure diagnosed (workgroup collision) and deferred. |
| 2026-05-24 | RECEPTIONIST-PC GuruRMM agent noted as 0.6.37 straggler while fleet at 0.6.38. Flaky WebSocket. |
---
## Compilation Notes
**Session logs read:** 25 root session logs + client-specific logs in `clients/cascades-tucson/session-logs/` + 7 memory files + 5 structured docs. Date range: 2026-03-06 through 2026-05-24.
**Client folder:** `clients/cascades-tucson/` (NOT `clients/cascades/` — that directory does not exist).
**Open items flagged as unverified:**
- Hour balance — always live-check; 2026-05-01 invoice debit may not have fired correctly
- New tiered remediation app suite — Cascades still on old `fabb3421` as of 2026-04-20; unknown if consented since
- DMARC p=none — action item from 2026-04-20, no evidence of resolution
- Break-glass accounts + YubiKeys — decision 2026-04-29, no evidence of execution
- Audit retention infra — approved 2026-04-29, not yet built
- dunedolly21@gmail.com guest invite — confirm with Lauren
## Backlinks
- [[projects/gururmm]] — RECEPTIONIST-PC enrolled (site CascadesTucson); CS-SERVER enrolled

63
wiki/index.md Normal file

@@ -0,0 +1,63 @@
# Wiki Index
Last updated: 2026-05-24
Compiled by: DESKTOP-0O8A1RL/claude-main
This wiki is LLM-maintained. Do not edit articles manually — run `/wiki-compile` to update.
Run `/wiki-lint` to check for stale entries and broken backlinks.
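`/wiki-lint` is not implemented yet, so the following is only a sketch of the two checks named above (stale entries and broken backlinks). It assumes the flat front matter shown in the article templates, resolves each backlink as `<wiki root>/<backlink>.md`, and uses a 30-day staleness threshold; the function names and the threshold are hypothetical.

```python
from datetime import date
from pathlib import Path


def parse_front_matter(text):
    """Parse the simple key/value and key/list front matter of an article."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta, key = {}, None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break
        if stripped.startswith("- ") and isinstance(meta.get(key), list):
            meta[key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            # An empty value or "[]" starts a list; anything else is a scalar.
            meta[key] = [] if value in ("", "[]") else value
    return meta


def lint_article(text, wiki_root, today, max_age_days=30):
    """Return a list of human-readable problems for one article."""
    meta = parse_front_matter(text)
    problems = []
    links = meta.get("backlinks", [])
    if not isinstance(links, list):
        links = [links]
    for link in links:
        if not (Path(wiki_root) / f"{link}.md").is_file():
            problems.append(f"broken backlink: {link}")
    compiled = meta.get("last_compiled")
    if isinstance(compiled, str):
        try:
            age = (today - date.fromisoformat(compiled)).days
        except ValueError:
            problems.append(f"unparseable last_compiled: {compiled}")
        else:
            if age > max_age_days:
                problems.append(f"stale: compiled {age} days ago")
    return problems
```

Run against the seeded Cascades article, a check like this would report `systems/*` backlinks as broken until the system articles in the compilation queue exist.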
---
## Overview
| Article | Summary | Last Compiled |
|---|---|---|
| [Overview](overview.md) | State of the business: team, all clients, all projects, key infra, tooling — cold-start orientation doc | *(not yet compiled — run `/wiki-compile overview`)* |
## Clients
| Article | Summary | Last Compiled |
|---|---|---|
| [Cascades of Tucson](clients/cascades-tucson.md) | Prepaid block $175/hr, ~37.5 hrs remaining; senior living; active domain migration + HIPAA compliance project; single DC on aging R610 hardware | 2026-05-24 |
## Projects
| Article | Summary | Last Compiled |
|---|---|---|
| [GuruRMM](projects/gururmm.md) | RMM platform, Rust/Axum server + React dashboard + cross-platform agent; v0.6.38; 55 enrolled agents; active development | 2026-05-24 |
## Systems
| Article | Summary | Last Compiled |
|---|---|---|
| *(run `/wiki-compile system:gururmm-build`, `system:jupiter`, `system:pluto`, `system:uranus`, `system:neptune`)* | | |
## Patterns
| Article | Summary | Last Compiled |
|---|---|---|
| *(none yet — patterns will be extracted during system/project compilation passes)* | | |
---
## Cross-Reference
| Client | Systems | Projects |
|---|---|---|
| Cascades of Tucson | CS-SERVER (192.168.2.254), pfSense (192.168.0.1), cascadesDS (192.168.0.120) | GuruRMM (RECEPTIONIST-PC + CS-SERVER enrolled) |
| ACG Internal | gururmm-build (172.16.3.30), Jupiter (172.16.3.20), Pluto (172.16.3.36), Uranus (172.16.3.21) | GuruRMM server + ClaudeTools API on gururmm-build; Windows MSI builds on Pluto; Gitea/NPM/Seafile on Jupiter. Saturn DECOMMISSIONED. |
---
## Compilation Queue
| Scope | Priority | Notes |
|---|---|---|
| `overview` | High | Compile after systems are seeded; reads other wiki articles, not raw logs |
| `system:gururmm-build` | High | 172.16.3.30 — Linux VM on Jupiter; GuruRMM API 3001, ClaudeTools API 8001, Coord API, MariaDB 3306, PostgreSQL 5432, build pipeline (webhook-handler.py); Gitea also accessible here via SSH port forward from .30:3000 |
| `system:jupiter` | High | 172.16.3.20 — Unraid NAS; virsh hosts: GuruRMM VM (.30), Unifi, OwnCloud, Claude-Builder/Pluto (.36); Docker: NPM, Seafile, Gitea (port 3000); iptables PREROUTING: :443 → NPM container; iDRAC at 172.16.1.73 |
| `system:pluto` | Medium | 172.16.3.36 — Windows Server 2019 virsh VM on Jupiter (Claude-Builder); Windows MSI + cargo builds for GuruRMM |
| `system:uranus` | Medium | 172.16.3.21 — OwnCloud additional storage only; Dell R730xd; formerly Saturn's IP (reused Apr 2026); NOT a proxy or general host |
| `system:neptune` | Low | neptune.acghosting.com, 172.16.3.11 internal / 67.206.163.124 external — Exchange Server 2016; ACG infrastructure physically colocated at Dataforth D2 facility; active mail server for multiple ACG-hosted clients; internal access requires routing through D2TESTNAS because Dataforth UDM runs a subnet that duplicates/overlaps ACG office LAN (172.16.x.x) — TODO: resubnet Dataforth UDM to eliminate overlap |
| `client:birthbiologic` | Medium | GuruRMM enrolled (site BRIGHT-PEAK-5980) |
| `client:key-paul` | Low | GuruRMM enrolled (KEY-MEDIA) |

0
wiki/patterns/.gitkeep Normal file

0
wiki/projects/.gitkeep Normal file

315
wiki/projects/gururmm.md Normal file

@@ -0,0 +1,315 @@
---
type: project
name: gururmm
display_name: GuruRMM
last_compiled: 2026-05-24
compiled_by: DESKTOP-0O8A1RL/claude-main
sources:
- projects/msp-tools/guru-rmm/CONTEXT.md
- projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md
- projects/msp-tools/guru-rmm/docs/UI_GAPS.md
- projects/msp-tools/guru-rmm/docs/ARCHITECTURE_DECISIONS.md
- projects/msp-tools/guru-rmm/docs/tech-stack.md
- projects/msp-tools/guru-rmm/docs/DESIGN.md
- .claude/memory/reference_gururmm_server.md
- .claude/memory/reference_gururmm_api.md
- .claude/memory/gururmm-development-principles.md
- .claude/memory/feedback_gururmm_agent_parity.md
- .claude/memory/reference_pluto_build_server.md
- .claude/memory/project_mac_gururmm_setup_pending.md
- credentials.md
- session-logs/2025-12-15-session.md
- session-logs/2025-12-20-session.md
- session-logs/2026-04-19-session.md
- session-logs/2026-04-21-session.md
- session-logs/2026-04-29-session.md
- session-logs/2026-05-12-guru-rmm-macos-agent-phase1.md
- session-logs/2026-05-15-session.md
- session-logs/2026-05-16-session.md
- session-logs/2026-05-17-session.md
- session-logs/2026-05-19-gururmm-backup-fixes.md
- session-logs/2026-05-19-session.md
- session-logs/2026-05-21-session.md
- session-logs/2026-05-23-session.md
- session-logs/2026-05-24-session.md
- session-logs/2026-05-24-GURU-KALI-session.md
backlinks:
- clients/cascades-tucson
- systems/gururmm-build
- systems/jupiter
- systems/pluto
---
# GuruRMM
## Summary
GuruRMM is a Remote Monitoring & Management platform built by Arizona Computer Guru LLC for internal MSP operations and eventual productization. The server (Rust/Axum) and dashboard (React/TypeScript) are production-deployed at https://rmm.azcomputerguru.com with approximately 55 enrolled agents across multiple client sites. The agent runs on managed Windows, Linux, and macOS endpoints.
**Current version:** 0.6.38 (as of 2026-05-24; fleet converged within ~10 minutes of publish)
**Repo:** `azcomputerguru/gururmm` on Gitea (internal: http://172.16.3.20:3000). The copy at `D:\claudetools\projects\msp-tools\guru-rmm` is a stale reference submodule — do NOT develop there; all real work happens in the Gitea repo.
**Goal:** Full-featured MSP platform rivaling commercial RMMs, with a companion PSA (GuruPSA, separate future repo) designed as a truly integrated unified system — not bolted-together products.
---
## Architecture
### Components
| Component | Location | Tech | State |
|---|---|---|---|
| Server | 172.16.3.30:3001, systemd `gururmm-server`, binary `/usr/local/bin/gururmm-server` | Rust, Axum | deployed, production |
| Dashboard | https://rmm.azcomputerguru.com, nginx at `/var/www/gururmm/dashboard/` | React + TypeScript + Vite, shadcn/ui, Tailwind CSS v4 | deployed, production |
| Agent (Windows) | Endpoints, installed as `GuruRMMAgent` Windows service via WiX MSI | Rust, Windows MSVC | deployed, fleet on 0.6.38 |
| Agent (Linux) | Endpoints, systemd `gururmm-agent`, binary `/usr/local/bin/gururmm-agent` | Rust, musl static | deployed |
| Agent (macOS) | Endpoints, LaunchDaemon `com.azcomputerguru.gururmm-agent.plist` | Rust, aarch64/x86_64 | Phase 1 deployed 2026-05-12; code signing issue on Apple Silicon |
| Tray (Windows) | System tray, named pipe IPC | Rust | deployed |
| Tray (Linux) | System tray, Unix socket IPC, libappindicator/GTK | Rust, GTK | deployed 2026-05-24 (PR #13+#14 merged) |
| Tray (macOS) | Menu bar | Rust | stub/TODO (issue #18) |
| PostgreSQL DB | localhost:5432 on 172.16.3.30, database `gururmm` | PostgreSQL | deployed |
| Coord API | 172.16.3.30:8001/api/coord | FastAPI (part of ClaudeTools API) | deployed |
| Build pipeline | 172.16.3.30:9000 webhook + `/opt/gururmm/` scripts | Python (webhook-handler.py), Bash | deployed; split into per-platform scripts 2026-05-24 |
| Pluto (Windows build VM) | 172.16.3.36, Windows Server 2019 VM on Jupiter (Unraid) | Rust MSVC, WiX v4 | operational |
### Key Files & Repos
- **Active repo:** `azcomputerguru/gururmm` — http://172.16.3.20:3000/azcomputerguru/gururmm
- **Reference clone:** `D:\claudetools\projects\msp-tools\guru-rmm` — stale submodule, do not develop here
- **Server binary:** `/usr/local/bin/gururmm-server` on 172.16.3.30
- **Agent binary (Linux):** `/usr/local/bin/gururmm-agent`
- **Agent config (Linux/macOS):** `/etc/gururmm/agent.toml` (root, mode 600); macOS uses `/usr/local/etc/gururmm/site.plist`
- **Agent registry (Windows):** `HKLM\SOFTWARE\GuruRMM\SiteId` (written by MSI)
- **Windows service name:** `GuruRMMAgent` (NOT `gururmm-agent`)
- **Downloads dir:** `/var/www/gururmm/downloads/` on 172.16.3.30
- **Webhook handler:** `/opt/gururmm/webhook-handler.py` (port 9000, systemd `gururmm-webhook`)
- **Build scripts:** `/opt/gururmm/build-shared.sh`, `build-linux.sh`, `build-windows.sh`, `build-mac.sh` (split 2026-05-24; `build-agents.sh` is now a compat wrapper)
- **Server build script:** `/opt/gururmm/build-server.sh` (separate pipeline — manual trigger required for server code changes)
- **Per-platform SHA tracking:** `/opt/gururmm/last-built-commit-{linux,windows,mac}`
- **Pluto known-hosts:** `/opt/gururmm/pluto_known_hosts` (pinned SSH keys; installed 2026-05-24)
- **Build log (Linux):** `/var/log/gururmm-build-linux.log`
- **Build log (Windows):** `/var/log/gururmm-build-windows.log`
- **API (internal):** http://172.16.3.30:3001
- **API (external):** https://rmm-api.azcomputerguru.com (Cloudflare)
- **Dashboard:** https://rmm.azcomputerguru.com
- **DB URL:** `postgres://gururmm:43617ebf7eb242e814ca9988cc4df5ad@localhost:5432/gururmm`
- **Vault path:** `infrastructure/gururmm-server.sops.yaml`
### Repo Structure
```
gururmm/
├── agent/ Rust agent (managed endpoints)
│ └── src/
│ ├── ipc.rs Unix socket IPC (Linux); Windows named pipe
│ ├── tunnel/ TunnelManager state machine
│ ├── metrics/ sysinfo-based collection (temp NOT yet wired — BUG-001)
│ ├── registry_ops/ Windows registry read/write
│ ├── updater/ Self-update handler
│ └── main.rs systemd unit template generation
├── server/ Rust/Axum API server
│ └── src/
│ ├── api/ REST handlers
│ ├── db/ Database layer (sqlx)
│ ├── ws/ WebSocket handler
│ └── mspbackups/ MSP360 backup integration
├── tray/ System tray binary
├── installer/ WiX v4 MSI (gururmm-agent.wxs)
├── scripts/ Build/ops scripts
└── docs/ FEATURE_ROADMAP.md, UI_GAPS.md, ARCHITECTURE_DECISIONS.md, tech-stack.md, DESIGN.md, specs/
```
---
## Development
### Current Focus
As of 2026-05-24 (v0.6.38):
- **Tray IPC + peer authorization** — Linux tray merged (PR #13+#14). Open: Windows peer authz (#16), logind console-user resolution (#17), macOS tray (#18), subscriber broadcast (#19).
- **Agent self-update hardening** — ProtectSystem=strict needs `ReadWritePaths=/var/log /usr/local/bin /etc/gururmm` and `RuntimeDirectory=gururmm`. Fixed in PR #21.
- **Auto-update reliability** — BB-SERVER and RECEPTIONIST-PC (Cascades) miss dispatch windows due to flaky WebSockets. Re-querying pending updates on reconnect: incomplete as of 2026-05-24.
- **Watchdog alerts UI** — backend complete but `PUT /watchdog-alerts/:id/resolve` and `DELETE /watchdog-alerts/:id` routes missing on server (found in 2026-05-23 audit).
- **MSP360 backup integration** — Phase 1 complete (monitoring, alerts, mapping, storage thresholds). Phase 2 (management) not started.
- **Security audit backlog:** `credentials/:id/reveal` horizontal privilege escalation (HIGH), `internal_err()` raw DB errors at ~130 call sites (HIGH).
### Patterns & Anti-Patterns
**Anti-patterns — never repeat:**
| Pattern | What Went Wrong |
|---|---|
| `useMemo` with stable deps for data-dependent values | queryClient is stable, memo never recomputes after queries resolve. Use `useQuery` instead. |
| CSS variable text colors inside the sidebar | Sidebar bg is hardcoded dark; CSS vars flip in light mode. Use `text-white` explicitly inside sidebar. |
| Deploying without stopping the server first | "text file busy" kernel error. Always `systemctl stop` before `cp`. |
| Building without `DATABASE_URL` | sqlx compile-time macros fail. `DATABASE_URL` is in `/home/guru/.cargo/env`. |
| DB migrations without inserting into `_sqlx_migrations` | Server crashes on start. Must insert SHA-384 checksum manually. |
| WiX MSI builds on Linux | WiX requires `msi.dll`. MSI must be built on Pluto (Windows). |
| Manual builds via SSH | All builds go through `webhook-handler.py`. Never SSH and run `cargo build` + artifact copy manually. |
| TOML/config for agent endpoint or site_id | Server URL compiled into binary, site_id baked into MSI. No runtime config files for these values. |
| `path.find('\')` in `#[cfg(windows)]` files | Compiles on Linux silently, fails on Pluto MSVC with unterminated char literal. Use `'\\'`. |
| `STATUS_BADGE_CLASSES` Record const | Vite/Rollup may optimize away the lookup. Use explicit `getStatusBadgeClass()` if/else function. |
| SSH heredoc for TypeScript edits | Shell strips double-quote characters. Edit locally in submodule, push to Gitea, pull on server. |
| `Restart-Service GuruRMMAgent -Force` in command scripts | Kills agent before it can report result. Commands stay forever `running`. Use scheduled task with delay instead. |
| Running `git` as root in systemd build context | git rejects the repo as dubious ownership when running as root on a guru-owned repo. Use `safe.directory` config or `sudo -u guru git`. |
| Self-updating running bash script | bash reads line-by-line from disk; replacing mid-execution silently skips remaining blocks. |
| `+1.77` legacy builds without `--ignore-rust-version` | Fail MSRV check after adding `rust-version` to Cargo.toml. Add `--ignore-rust-version` to legacy build lines only. |
| `StrictHostKeyChecking=no` for Pluto SSH | Replaced with pinned known-hosts at `/opt/gururmm/pluto_known_hosts`. MITM would compromise build artifacts. |
| CRLF line endings in migration files | sqlx SHA-384 checksum mismatch causes server crash on start. `.gitattributes` + `core.autocrlf=false` + pre-commit hook prevents this. |
| Dead WebSocket write half | WS write fails, send task dies, receive loop keeps agent in `ConnectedAgents` with dead write half. Commands silently fail. Fix: `tokio::select!` monitoring both tasks. |
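
The dead write-half fix in the last row uses `tokio::select!` on the Rust server. The same supervision idea can be sketched as an `asyncio` analogue in Python. This is not the server's code: `supervise_connection`, `send_loop`, `recv_loop`, and `on_disconnect` are hypothetical names, and the sketch assumes the whole connection should be torn down as soon as either half ends.

```python
import asyncio
from typing import Awaitable, Callable


async def supervise_connection(
    send_loop: Awaitable[None],
    recv_loop: Awaitable[None],
    on_disconnect: Callable[[], None],
) -> None:
    """Run both halves; tear the connection down when either one ends."""
    send_task = asyncio.ensure_future(send_loop)
    recv_task = asyncio.ensure_future(recv_loop)
    done, pending = await asyncio.wait(
        {send_task, recv_task}, return_when=asyncio.FIRST_COMPLETED
    )
    # Cancel the surviving half so it cannot keep a half-dead session registered.
    for task in pending:
        task.cancel()
    # Wait for cancellation to finish and consume any error from the ended half.
    await asyncio.gather(*pending, *done, return_exceptions=True)
    # Hypothetical hook: e.g. remove the agent from the connected-agents map.
    on_disconnect()
```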
**Good patterns:**
- **Platform parity rule** — any agent feature goes on Windows + Linux + macOS in the same commit. If a real implementation isn't feasible, add a working stub + `// TODO(platform): <os> — <reason>`. No silent no-ops.
- **Per-platform last-built-commit tracking** — Linux builds succeed and record progress independently of Windows builds.
- **Holistic feature development** — every feature ships backend + API + dashboard UI + docs together. Backend-only features are rejected.
- **sqlx offline mode** — compile-time query validation requires DB reachable or offline cache present.
- **`RuntimeDirectory=gururmm` in systemd unit** — systemd-native way to give agent writable `/run/gururmm/` for IPC socket.
- **Registry-first path resolution** — read `HKLM:\SOFTWARE\GuruRMM` for install dir, fall back to service PathName, then hardcoded default.
- **`interrupt_running_commands()` at reconnect** — flips all `status='running'` commands for reconnecting agent to `status='interrupted'`.
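
The reconnect rule in the last bullet can be modelled in a few lines. This is a minimal in-memory sketch, assuming each command record carries an `agent_id` and a `status` field. The real server presumably does this with a database update, and the record layout here is hypothetical.

```python
from typing import Dict, List


def interrupt_running_commands(commands: List[Dict[str, str]], agent_id: str) -> int:
    """Flip this agent's 'running' commands to 'interrupted'; return how many changed."""
    changed = 0
    for cmd in commands:
        # Only touch commands owned by the reconnecting agent that never reported back.
        if cmd["agent_id"] == agent_id and cmd["status"] == "running":
            cmd["status"] = "interrupted"
            changed += 1
    return changed
```

Commands belonging to other agents, and commands already in a terminal state, are left untouched, so calling this on every reconnect is safe to repeat.
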
### Build & Deploy
**CRITICAL: Never trigger builds manually via SSH. All builds go through the webhook pipeline.**
```
Gitea push to main
-> webhook-handler.py (172.16.3.30:9000, parallel threads per platform)
-> build-shared.sh (auto-version bump, git sync — runs once)
-> build-linux.sh (cargo build on .30; log: /var/log/gururmm-build-linux.log)
-> build-windows.sh (SSH -> Pluto 172.16.3.36 via pinned known-hosts
cargo build --release x64 MSVC + i686 MSVC
+1.77 legacy builds with --ignore-rust-version
WiX MSI build for site-specific base
sign-windows.sh (jsign + Azure Trusted Signing)
SCP artifacts back; log: /var/log/gururmm-build-windows.log)
-> build-mac.sh (stub — no build machine configured yet)
-> artifacts -> /var/www/gururmm/downloads/ with sha256 + -latest symlinks
-> per-platform last-built-commit files updated
-> systemctl restart gururmm-agent (local agent on .30)
```
**Auto-version:** `build-shared.sh` diffs `agent/`, `server/`, `dashboard/` against last built SHA. For each changed component, bumps patch version in `Cargo.toml` or `package.json`, commits `[ci-version-bump]`, pushes. Webhook skips builds where all commits are version bumps.
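
The two rules in the auto-version paragraph (patch bump per changed component, skip when every commit is a version bump) can be sketched as follows. The real logic lives in `build-shared.sh` and `webhook-handler.py`; the function names below are hypothetical, and the sketch assumes plain `major.minor.patch` version strings and that the `[ci-version-bump]` marker appears in the commit message.

```python
from typing import Iterable

CI_BUMP_MARKER = "[ci-version-bump]"


def bump_patch(version: str) -> str:
    """Return the version with its patch component incremented by one."""
    major, minor, patch = version.split(".")
    return f"{major}.{minor}.{int(patch) + 1}"


def should_skip_build(commit_messages: Iterable[str]) -> bool:
    """Skip only when there is at least one commit and all of them are CI bumps."""
    messages = list(commit_messages)
    return bool(messages) and all(CI_BUMP_MARKER in msg for msg in messages)
```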
**Server code changes** — separate manual step, NOT in agent pipeline:
```bash
sudo /opt/gururmm/build-server.sh
```
**Dashboard deploy** — also separate:
```bash
cd /home/guru/gururmm/dashboard && sudo -u guru npm run build
sudo rsync -av --delete /home/guru/gururmm/dashboard/dist/ /var/www/gururmm/dashboard/
```
**DB migrations** — manual; must insert SHA-384 checksum into `_sqlx_migrations` or server crashes on start.
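
Both the manual `_sqlx_migrations` insert and the CRLF anti-pattern hinge on the SHA-384 checksum. A minimal sketch shows why a line-ending change alone is enough to cause a mismatch. It assumes the checksum is taken over the raw bytes of the migration file; `migration_checksum` and the sample SQL are illustrative, not part of the repo.

```python
import hashlib


def migration_checksum(sql_bytes: bytes) -> str:
    """Hex SHA-384 digest of a migration file's raw bytes."""
    return hashlib.sha384(sql_bytes).hexdigest()


# Same statement, different line endings.
lf_sql = b"CREATE TABLE example (id INT);\n"
crlf_sql = lf_sql.replace(b"\n", b"\r\n")

# The digests differ, so a checksum recorded for one form will not match the other.
checksums_match = migration_checksum(lf_sql) == migration_checksum(crlf_sql)
```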
**Pluto (172.16.3.36):**
- Windows Server 2019 VM on Jupiter (Unraid)
- SSH: `ssh -o UserKnownHostsFile=/opt/gururmm/pluto_known_hosts Administrator@172.16.3.36`
- Rust stable 1.95.0 + 1.77 pinned for legacy builds
- VS Build Tools (MSVC), sccache at `C:\sccache`, WiX v4, Gitea clone at `C:\gururmm\`
**Auto-update delivery:**
- Server scans every 300s; dispatches update command on agent heartbeat
- Gated on effective policy `auto_update` (default on when policy is null)
- Agent: downloads to PrivateTmp, verifies SHA-256, replaces binary, restarts service
- Force-trigger: `POST /api/agents/:id/update`
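
The verify-then-replace step in the list above can be sketched in Python. The agent itself is written in Rust, so this is only an illustration: `sha256_of` and `install_if_verified` are hypothetical names, the service restart is omitted, and `os.replace` assumes the download and the target sit on the same filesystem (otherwise the file would need to be copied next to the target first).

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 so large binaries are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def install_if_verified(downloaded: Path, target: Path, expected_sha256: str) -> bool:
    """Replace target only when the download matches the expected digest."""
    if sha256_of(downloaded) != expected_sha256.lower():
        downloaded.unlink()  # discard the bad download, leave the old binary in place
        return False
    os.replace(downloaded, target)  # atomic rename on the same filesystem
    return True
```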
---
## Active State
**Fleet (as of 2026-05-24 12:33 MST):**
- ~55 enrolled agents total; ~39 online
- 37 of 39 online agents on 0.6.38
- Laggards on 0.6.37: BB-SERVER (BirthBiologic, `6c02baa7-...`) and RECEPTIONIST-PC (Cascades of Tucson, `9c91d324-...`) — flaky WebSockets, miss dispatch windows. Force-update via API when their WS is up.
**Known enrolled clients/sites:**
- Cascades of Tucson — site CascadesTucson (RECEPTIONIST-PC, CS-SERVER)
- BirthBiologic — site BRIGHT-PEAK-5980
- Paul Key — site IRON-WOLF-5819
- ACG internal — AD2, DESKTOP-0O8A1RL, GURU-KALI, and one agent labeled "Saturn" [Saturn is DECOMMISSIONED as of Apr 2026; IP 172.16.3.21 reused by Uranus — this agent entry may be stale or may actually be Uranus]
**API auth:**
- `POST /api/auth/login` → JWT (~24h)
- Creds: vault `infrastructure/gururmm-server.sops.yaml` → `credentials.gururmm-api.admin-email` / `admin-password`
- Key endpoints: `GET /api/agents`, `POST /api/agents/:id/command`, `GET /api/commands/:id`, `POST /api/agents/:id/update`
- Command fields: `command_type` (powershell/shell/exec), `command` (script text, JSON-encoded). Windows agent runs as LocalSystem.
- Response: `stdout`, `stderr`, `exit_code`, `status` (running/completed/failed/timeout/interrupted)
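
Because a command stays `running` until the agent reports back, callers typically poll `GET /api/commands/:id` until a terminal status appears. A minimal sketch of that loop, assuming a hypothetical `fetch_command` callable that wraps the HTTP call and returns the decoded response; the poll interval and retry limit are illustrative defaults, not documented values.

```python
import time
from typing import Any, Callable, Dict

# Every documented status except "running" is treated as terminal.
TERMINAL_STATUSES = {"completed", "failed", "timeout", "interrupted"}


def wait_for_command(
    fetch_command: Callable[[str], Dict[str, Any]],
    command_id: str,
    poll_seconds: float = 2.0,
    max_polls: int = 150,
) -> Dict[str, Any]:
    """Poll until the command leaves 'running', then return the final response."""
    for _ in range(max_polls):
        result = fetch_command(command_id)
        if result["status"] in TERMINAL_STATUSES:
            return result
        time.sleep(poll_seconds)
    raise TimeoutError(f"command {command_id} still running after {max_polls} polls")
```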
**Dashboard — complete and working:**
Agents management, Clients/Sites CRUD, Commands execution + terminal, Logs + AI analysis, Alerts, Metrics (CPU/RAM/disk/network, process drill-down modal), Auto-update triggering, Network state, Entra ID SSO, Policies Dashboard (all tabs), Registry editor, MSP360 backup status card.
**Dashboard — incomplete (see UI_GAPS.md):**
- Temperature monitoring (BUG-001) — UI ready, agent-side collection never wired
- Enrollment management UI (revoke keys, audit log, duplicate hostname warnings)
- Watchdog alerts UI — blocked on 2 missing server routes
- MSPBackups management UI — backend complete, no frontend
- Organizations management UI — multi-tenancy backend done, no frontend
- Tunnel session management (interactive terminal — backend skeleton, not production-ready)
**Open Gitea issues:**
- #15 — Pipeline tray build (publish tray binary to downloads)
- #16 — Windows IPC peer authz
- #17 — logind console user resolution
- #18 — macOS tray
- #19 — subscriber broadcast
**Security backlog (HIGH):**
- `credentials/:id/reveal` — horizontal privilege escalation (no ownership scope check)
- `internal_err()` — ~130 call sites returning raw DB errors to callers
---
## Key Architecture Decisions (LOCKED)
These decisions are locked. Do not reverse without explicit user approval.
1. **Per-agent enrollment keys** — MSI contains server URL + site_id only. Agent calls `POST /api/enroll` on first run; server issues unique per-agent key stored hashed. Enables revocation, clone detection, audit trail.
2. **Site-specific MSI generation** — Universal base MSI from CI; dashboard endpoint generates site-specific MSI with site_id baked in via WiX property → `HKLM\SOFTWARE\GuruRMM\SiteId`.
3. **No TOML/config for endpoints** — Server URL compiled into binary. No runtime config files for server URL or site_id.
4. **Policy inheritance chain** — global → site → client → agent. Server computes merged effective policy and pushes via `ConfigUpdate` WebSocket message.
5. **Platform parity rule** — Any agent feature ships on Windows, Linux, and macOS in the same change. Stub + TODO required if a real implementation is not yet feasible.
6. **Watchdog as separate process** — Main agent cannot reliably restart itself after a crash.
7. **Build pipeline is the only path to production** — Enforces signing, checksum generation, consistent artifact layout.
8. **Multi-tenancy identity model (ADR-001)** — Dev team with partner impersonation. Three levels: Dev → Partner → Client. Computer Guru is partner #1.
9. **Holistic feature development (DESIGN.md)** — Every feature requires backend + API + dashboard UI + documentation. Backend-only features are rejected.
10. **AI-optional operation** — GuruRMM must be fully functional without AI. AI features are enhancements, not requirements.
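
Decision 4 (policy inheritance) can be illustrated with a small merge function. This is a sketch under stated assumptions, not the server's implementation: it assumes a shallow, key-by-key override in the order global, site, client, agent, treats `None` values as "inherit", and extends the documented "default on when policy is null" rule for `auto_update` to a missing key as well. The function names are hypothetical.

```python
from typing import Any, Dict, Optional

Policy = Optional[Dict[str, Any]]


def effective_policy(global_p: Policy, site_p: Policy, client_p: Policy, agent_p: Policy) -> Dict[str, Any]:
    """Merge layers in chain order; later layers override earlier ones."""
    merged: Dict[str, Any] = {}
    for layer in (global_p, site_p, client_p, agent_p):
        if layer:
            # A None value means "not set at this level", so the inherited value stays.
            merged.update({key: value for key, value in layer.items() if value is not None})
    return merged


def auto_update_enabled(policy: Policy) -> bool:
    """Auto-update defaults to on when no policy (or no explicit setting) exists."""
    if policy is None:
        return True
    value = policy.get("auto_update")
    return True if value is None else bool(value)
```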
---
## History Highlights
| Date | Event |
|---|---|
| 2025-12-15 | Project genesis: Windows service + Linux installer + site code auth + build server. DB migrated from Jupiter Docker to local PostgreSQL. |
| 2026-04-19 | Full drill-down navigation, auto-install on first run, Pluto build VM setup started. |
| 2026-04-21 | MSI build fix (missing WiX extension flag). DESIGN.md created (holistic development mandate). BirthBiologic onboarded. |
| 2026-04-29 | UI_GAPS.md created. Holistic development principle formalized. |
| 2026-05-12 | macOS agent Phase 1 deployed from Mikes-MacBook-Air. Code signing issue on Apple Silicon noted. |
| 2026-05-15 | Dead WebSocket write-half bug fixed. Temperature struct field name mismatch fixed. |
| 2026-05-16 | Watchdog bugs fixed (sc.exe fallback, suppress_until, hypervisor detection). /feature-request skill created. |
| 2026-05-17 | Syncro PSA Integration added to roadmap (P1) after Howard /feature-request. Office power failure recovery — all VMs recovered. |
| 2026-05-18 | Multi-tenancy architecture (ADR-001) decided. 5 SPEC documents created (SPEC-001 through SPEC-006). |
| 2026-05-19 | 4-bug fix for AD2 crash loop. MSP360 backup integration completed (6 fixes). Clickable CPU/Memory gauge cards + process drill-down modal. |
| 2026-05-23 | /rmm-audit pass. Agent optimization Phases 1A-3. Auto-version bump mechanism. MSRV bumped to 1.85. Fleet at 0.6.29. |
| 2026-05-24 | Linux tray IPC + GTK (PR #13+#14) and peer-cred authz (PR #14) merged. PR #21 (ReadWritePaths fix) merged. Build pipeline split into per-platform scripts. Pluto known-hosts pinned. Fleet converged to 0.6.38. |
---
## Compilation Notes
- macOS build status: Phase 1 was deployed manually from Mikes-MacBook-Air (2026-05-12). `build-mac.sh` is a stub as of 2026-05-24 — unclear if automated pipeline includes macOS yet. [unverified]
- Tunnel subsystem: agent-side substantially complete; server-side is dead-code skeleton. Current live status unconfirmed. [unverified]
- Pre-commit hook on 172.16.3.30 lacks execute bit (noted 2026-05-23) — likely still unfixed. [unverified]
- Auto-update reliability fix for BB-SERVER and RECEPTIONIST-PC was incomplete at 2026-05-24 save. [unverified]
## Backlinks
- [[clients/cascades-tucson]] — RECEPTIONIST-PC enrolled (site CascadesTucson)
- [[systems/gururmm-build]] — Linux VM at 172.16.3.30 on Jupiter; GuruRMM API 3001, ClaudeTools API 8001, Coord API, MariaDB, PostgreSQL, build pipeline; originally a container on Jupiter, migrated to own VM
- [[systems/jupiter]] — Unraid host at 172.16.3.20; virsh host for all VMs (GuruRMM VM, Unifi, OwnCloud, Pluto/Claude-Builder); Docker: Gitea port 3000, NPM, Seafile; iptables PREROUTING routes :443 to NPM (NPM proxy `rmm-api -> 172.16.3.20:3001` in credentials.md is STALE — actual GuruRMM API is on 172.16.3.30)
- [[systems/pluto]] — Windows build server (MSI, WiX) at 172.16.3.36

0
wiki/systems/.gitkeep Normal file