feat: implement agent-os standards system and feature planning tools

- Split CODING_GUIDELINES.md into 19 indexed standards files under .claude/standards/
  - 9 from CODING_GUIDELINES (conventions, powershell, security, api, git, gururmm)
  - 10 from session log tribal knowledge (syncro, ssh, gitea, python, client, gururmm)
- Add .claude/standards/index.yml for cheap relevance-based lookup
- Add /inject-standards command: load targeted standards per task instead of full guidelines
- Add /shape-spec command: pre-implementation spec for GuruRMM features (plan.md,
  shape.md, references.md, standards.md) with mandatory out-of-scope gate
- Add docs/tech-stack.md and docs/mission.md for ClaudeTools API
- Add projects/msp-tools/guru-rmm/docs/tech-stack.md and mission.md for GuruRMM
- Update CLAUDE.md commands table with /inject-standards and /shape-spec

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-05-16 12:59:49 -07:00
parent 2fa8295ad0
commit dd0ef45645
26 changed files with 1757 additions and 1 deletions

View File

@@ -249,6 +249,7 @@ Vault structure: `infrastructure/`, `clients/`, `services/`, `projects/`, `msp-t
| `/frontend-design` | Modern frontend design (auto-invoke after UI changes) |
| `/remediation-tool` | M365 breach checks, tenant sweeps, gated remediation |
| `/feature-request` | Howard submits a GuruRMM feature request — Claude classifies it and messages Mike |
| `/shape-spec` | Pre-implementation spec for a GuruRMM feature — produces plan.md, shape.md, references.md, standards.md |
---

View File

@@ -0,0 +1,94 @@
# /inject-standards — Load relevant coding standards into context
Loads one or more standards files from `.claude/standards/` and displays their full content.
## Usage
```
/inject-standards — auto-select based on the current task
/inject-standards powershell/execution-pattern — load a specific standard by path
/inject-standards "syncro billing comment" — load standards relevant to a task description
/inject-standards syncro/comment-dedup syncro/time-entry-protocol — load multiple specific standards
```
## Procedure
Follow these steps exactly when /inject-standards is invoked:
### Step 1 — Parse $ARGUMENTS
- If $ARGUMENTS is empty: proceed to Step 2 (auto-select based on conversation context).
- If $ARGUMENTS contains one or more paths that match known standards slugs (e.g., `powershell/execution-pattern`, `syncro/comment-dedup`): skip Step 2, go directly to Step 3 with those paths.
- If $ARGUMENTS is a task description (plain English, not a path): use it as the query in Step 2.
### Step 2 — Auto-select relevant standards
1. Read `D:/claudetools/.claude/standards/index.yml`.
2. Review the descriptions for all entries.
3. Select the 25 standards most relevant to either:
- The task description in $ARGUMENTS, or
- The current conversation context (what has the user been working on?).
4. Prefer specificity: `syncro/comment-dedup` is more relevant than `conventions/no-emojis` for a Syncro billing task.
5. Always include `conventions/no-emojis` when writing any output that will go into scripts, logs, or client-facing text.
### Step 3 — Load and display the selected standards
For each selected standard (in order of relevance):
1. Read the file at `D:/claudetools/.claude/standards/<slug>.md`.
2. Display a header:
```
=== STANDARD: <slug> ===
```
3. Display the full file content (including frontmatter).
4. Add a blank line between standards.
### Step 4 — Report
After displaying all standards, print a one-line summary:
```
[INFO] Loaded N standards: <slug1>, <slug2>, ...
```
If auto-selection was used, briefly explain why each standard was chosen (one phrase per standard).
## Examples
**Specific standards:**
```
/inject-standards powershell/execution-pattern git/commit-style
```
Loads those two files directly and displays them.
**Task description:**
```
/inject-standards "writing a PowerShell script to check Windows service status"
```
Would select: `powershell/execution-pattern`, `conventions/no-emojis`, `conventions/output-markers`
**Syncro billing task:**
```
/inject-standards "billing Syncro ticket for emergency onsite"
```
Would select: `syncro/time-entry-protocol`, `syncro/comment-dedup`, `syncro/html-formatting`
**GuruRMM agent feature:**
```
/inject-standards "adding Linux temperature collection to the agent"
```
Would select: `gururmm/platform-parity`, `gururmm/build-pipeline`, `conventions/no-emojis`
**Empty (auto from context):**
```
/inject-standards
```
Reads the recent conversation, infers the task type, selects 25 most relevant standards.
## Standards index location
`D:/claudetools/.claude/standards/index.yml`
## Standards files location
`D:/claudetools/.claude/standards/<folder>/<name>.md`

View File

@@ -0,0 +1,218 @@
Pre-implementation planning command for a GuruRMM feature. Produces a structured spec folder in `specs/<slug>/` that persists across sessions, eliminating the need to re-explain context when resuming work.
---
## Input
`$ARGUMENTS` = feature name slug (e.g., `policy-wiring`, `asset-location-tracking`, `windows-update-check`).
If `$ARGUMENTS` is empty, ask the user: "What should the feature slug be? Use kebab-case (e.g., `windows-update-check`)." Wait for the answer before continuing.
Set `SLUG` = the feature slug (lowercase, kebab-case). Set `SPEC_DIR` = `specs/<SLUG>`.
---
## Phase 1 — Feature description
Ask the user in a single message (wait for response before continuing):
> What does this feature do? Give me a user-facing description — what a user sees or what the system does.
>
> What problem does it solve, or what prompted this? (ticket, gap report, field request, etc.)
Do NOT proceed to Phase 2 until the user responds.
---
## Phase 2 — Scope boundaries
Ask the user in a single message (wait for response before continuing):
> A few scoping questions:
>
> 1. What is explicitly OUT of scope for this implementation? What should we NOT build yet?
> 2. Are there any hard constraints? (e.g., must work offline, no new crates/dependencies, must not touch the agent binary)
> 3. Priority level: P1 (blocking other work), P2 (important, near-term), or P3 (nice-to-have)?
If the user does not provide out-of-scope items, ask specifically before writing any files:
> "I need at least one explicit non-goal for shape.md — what should this implementation deliberately NOT do?"
Do NOT write any files until both Phase 1 and Phase 2 are complete and out-of-scope items exist.
---
## Phase 3 — Codebase research (no user input needed)
Search the codebase for relevant context. Use Grep — do NOT open files just to scan them.
Search targets:
- Files or functions that will be touched or extended by this feature
- Similar existing implementations (e.g., if adding a new agent metric, find how other metrics are collected and reported)
- The feature roadmap: `projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md` — search for the slug keywords and adjacent section headers
Collect:
- File paths and a brief note on what each does and what will need to change
- Any existing pattern (with file:line) that this feature should follow
- The roadmap section/entry if one exists
---
## Phase 4 — Standards matching (no user input needed)
Read `.claude/standards/index.yml` if it exists. If it does not exist, Glob `.claude/standards/**/*.md` to discover available standards files, and read the frontmatter of each.
Select the standards that apply to this feature. Minimum defaults:
| Feature type | Standards to include |
|---|---|
| Rust agent feature | `conventions/no-emojis`, `conventions/output-markers` (for any shell scripts), plus any gururmm-specific standards found |
| Server/API feature | any api/ standards found |
| Dashboard/UI feature | `conventions/output-markers`, any frontend standards found |
| Git/build workflow | any git/ or gitea/ standards found |
| Any script output | `conventions/output-markers` |
When in doubt, include the standard — it costs nothing to list it.
---
## Phase 5 — Write spec files
Write all four files. Do not ask for confirmation before writing — the user approved the spec by completing Phases 1 and 2.
### `specs/<SLUG>/plan.md`
```markdown
# <Feature Name> — Implementation Plan
> Spec created: <date>
> Status: not started
## Task 0: Commit this spec
Commit the `specs/<SLUG>/` directory before writing any code:
```
git add specs/<SLUG>/
git commit -m "spec: add <feature-slug> shape spec"
```
Do not start Task 1 until this commit exists.
## Task 1: [first concrete implementation step]
Files touched: `path/to/file.rs`, `path/to/other.rs`
[Description of what to do, specific enough that a future session can start without re-reading the full conversation]
## Task 2: [next step]
Files touched: ...
[Description]
[... continue for all tasks ...]
## Task N: Verification
How to confirm the feature works end-to-end:
- [specific check 1 — e.g., run agent on Windows, confirm metric appears in dashboard]
- [specific check 2]
- [expected log output or API response]
```
Rules for plan.md:
- Task 0 is always "commit this spec" — never omit it
- Each task names the specific files it touches
- Tasks must be specific enough that a future Claude session reading only `specs/<SLUG>/` can continue without re-reading the conversation
- The final task is always verification with concrete, observable checks
### `specs/<SLUG>/shape.md`
```markdown
# <Feature Name> — Shape & Constraints
## What this is
[1-2 sentence description — what the user sees or what the system does]
## What this is NOT (out of scope)
- [explicit non-goal 1]
- [explicit non-goal 2]
[at least one item is required — do not write this file without it]
## Hard constraints
- [constraint 1 — or "None stated" if user provided none]
## Key decisions
- [decision + rationale — derived from Phase 1/2 answers and Phase 3 codebase research]
## Priority
P1 | P2 | P3 ← keep only the one that applies
## Roadmap reference
[Link to section in `projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md` if found in Phase 3, or "Not yet in roadmap"]
```
### `specs/<SLUG>/references.md`
```markdown
# <Feature Name> — Code References
## Files that will be touched
- `path/to/file.rs` — [what it does, what needs to change]
## Similar existing implementations
- `path/to/similar.rs:<line>` — [how this pattern is already done, what to follow]
[If no similar implementations were found, write: "No directly analogous implementation found in current codebase."]
## Database schema (if applicable)
[Relevant existing tables, or the migration pattern used by other features, or "No database changes required"]
```
### `specs/<SLUG>/standards.md`
```markdown
# <Feature Name> — Applicable Standards
The following standards from `.claude/standards/` apply to this feature:
## <standard name>
[The key rule from that standard most relevant to this feature]
Source: `.claude/standards/<relative-path>.md`
[Repeat for each applicable standard]
```
---
## After writing files
Tell the user:
- The spec was written to `specs/<SLUG>/` (list all four files)
- **Task 0 in plan.md is to commit the spec before writing any code** — do it now with `/checkpoint` or `git add specs/<SLUG>/ && git commit -m "spec: add <SLUG> shape spec"`
- To load this spec in a future session: read all four files in `specs/<SLUG>/` — that is the full context needed to resume
- `plan.md` is the source of truth for implementation order; check tasks off by adding `[DONE]` markers as they complete
---
## Quality rules
- Do NOT start writing files until Phase 1 AND Phase 2 are complete and at least one out-of-scope item exists
- Use Grep to find code references — never open files speculatively to scan for context
- Task 0 in plan.md is always "commit this spec" — non-negotiable
- shape.md must have at least one explicit out-of-scope item — ask before writing if the user didn't provide any
- Do not invent file paths for references.md — only include paths found via Grep in Phase 3
- Standards.md must name at least one standard — if index.yml does not exist, Glob the standards directory

View File

@@ -0,0 +1,62 @@
---
name: response-format
description: RESTful plural nouns, consistent error format, pagination, OpenAPI documentation
applies-to: claudetools-api, gururmm
---
# API Standards
## URL structure
- RESTful with plural nouns: `/api/users`, `/api/agents`, `/api/clients`
- Nested resources for sub-items: `/api/agents/:id/checks`, `/api/sites/:id/channel`
- Use kebab-case for multi-word segments: `/api/policy-assignments`, `/api/check-results`
## Error format
All error responses use a consistent envelope:
```json
{
"detail": "Human-readable error description",
"error_code": "MACHINE_READABLE_CODE",
"status_code": 404
}
```
- `detail` — for humans; may be shown in UI
- `error_code` — for client code to switch on; use `UPPER_SNAKE` format
- `status_code` — redundant with HTTP status but helps clients that lose the HTTP layer
## Pagination
Paginate all list endpoints that can return more than ~50 items. Use cursor-based or offset-based pagination:
```json
{
"items": [...],
"total": 148,
"page": 1,
"per_page": 25
}
```
Do not return unbounded arrays from production endpoints.
## Documentation
- Document with OpenAPI — FastAPI generates this automatically from type annotations and docstrings
- For Axum/Rust endpoints, add route comments with request/response shapes until an OpenAPI generator is wired
## sqlx query style (GuruRMM server)
Use `sqlx::query()` (runtime) not `sqlx::query!()` (compile-time macro) for new queries. The compile-time macro requires `cargo sqlx prepare` after every schema change and rebuilding the `.sqlx/` cache. Runtime queries avoid this overhead.
The offline cache (`server/.sqlx/`) only needs updating when `query!()` macros change. When adding a new query, default to `sqlx::query()` unless there is a specific reason to use the proc macro.
## Migration discipline (GuruRMM)
- Never manually pre-apply migrations via psql without also recording the checksum in `_sqlx_migrations` — sqlx will fail on startup if it finds a missing row for a migration it doesn't recognize
- Use `ADD COLUMN IF NOT EXISTS` in all migrations so they are idempotent
- Apply migrations by letting the server binary run them on startup (`sqlx::migrate!()`)
- The correct sequence for a new migration: add SQL file → apply to DB (server startup) → `cargo sqlx prepare` → commit `.sqlx/` → rebuild

View File

@@ -0,0 +1,51 @@
---
name: communication-tone
description: Expert partner posture; state findings and act; ask only the one specific unknown; never ask clients to explain their own infrastructure
applies-to: all
---
# Client Communication Tone
## Expert partner posture
When working on client issues, take the posture of an expert partner — not a questioner. State what you found, what you did, and what the outcome is. Ask only the one specific thing you cannot reasonably determine from available context.
**Wrong:** "What kind of server is this? What version of Windows are you running? Is this a domain-joined machine? What's your normal backup procedure?"
**Correct:** "Your machine is joined to the Cascades domain, running Windows 10 22H2. The wireless adapter is getting APIPA addresses on CSCNet — I'm checking if this is a DHCP lease issue or a driver roaming problem."
Research the answer before asking. If you don't know something, look it up in the vault, session logs, CONTEXT.md, or Syncro customer record. The client should never be the first source of information about their own infrastructure.
## When to ask
Ask the client only when:
1. The information does not exist in any accessible system (vault, Syncro, session logs, CONTEXT.md)
2. The decision is genuinely theirs (budget, timeline, whether to proceed with a change)
3. You need physical confirmation (e.g., "is the device powered on?", "do you see the prompt on screen?")
Never ask the client to explain how their own infrastructure works. That is your job.
## Findings format for client-facing comments
Structure client-facing Syncro comments as:
1. **What the problem was** — one sentence, plain language
2. **What was done** — what you changed or fixed
3. **Current status** — is the issue resolved, or is there follow-up?
4. **Next step (if any)** — one clear action item with an owner
Example:
```
Your Intel wireless driver was causing a BSOD (stop code 0xD1, checked crash dump). Updated
to the latest driver from Intel's site. Machine is stable — customer running overnight memtest
to confirm. No further action needed unless memtest flags errors.
```
## Internal comments vs. public comments
- Use `hidden: true` for notes containing passwords, internal troubleshooting details, or anything not client-appropriate
- Use `hidden: false` for the customer-visible summary
- When in doubt: write the internal hidden comment first with all details, then write a short public comment with only the customer-facing summary
## Note on Cascades of Tucson
Never set `contact_id` on Cascades tickets. Leaving it blank lets Syncro route to the correct email distribution. Setting it to any specific contact (Meredith Kuhn has been the incorrect default in past sessions) overrides the distribution and breaks notification routing for this client.

View File

@@ -0,0 +1,51 @@
---
name: grepai-first
description: Search with GrepAI or Grep before opening any file for context; Read only when full content is needed
applies-to: all
---
# Context Lookup — GrepAI First
Before reading any file for context, search with GrepAI or Grep. Only open a file when you need its full content for editing or line-by-line review.
## Lookup table
| Goal | Tool |
|------|------|
| Find where a function is defined | `grepai_search` or `Grep` |
| Understand how a feature works | `grepai_search` |
| Find all callers of a function | `grepai_trace_callers` |
| Find what a function calls | `grepai_trace_callees` |
| Full file content needed (edit, review) | `Read` |
| Recent changes to a file | `git log`, then `Read` specific file |
| "What did we do with X?" | `grepai_search` over session logs |
| "How is Y configured?" | `grepai_search` before checking any specific file |
## Token cost rationale
Reading a 500-line file to find one function costs approximately 3,000 tokens. A targeted GrepAI or Grep search costs approximately 100 tokens and returns only the relevant lines. At scale, reading files for context without searching first wastes context window and increases latency.
Never open a large file to scan for context. Search first; read only if the search result is insufficient or if you need to edit the file.
## GrepAI specifics
- CLI: `D:/claudetools/grepai.exe search "query" --json -c -n 5`
- MCP tools: `grepai_search` (primary), `grepai_trace_callers`, `grepai_trace_callees`
- The GrepAI index covers session logs, skill files, and project docs with boosted relevance for `.claude/` content
- The watcher (scheduled task "GrepAI Watcher - claudetools") keeps the index current automatically
## Common search queries
```bash
# Find where a setting is configured
grepai_search "vault_path"
# Understand how the Syncro skill handles billing
grepai_search "timer_entry charge_timer"
# Find what we did with a specific client or feature
grepai_search "Cascades DMARC"
# Locate a function definition
Grep --pattern "fn collect_temps" --type rust
```

View File

@@ -0,0 +1,38 @@
---
name: naming
description: Naming conventions for Python, PowerShell, Bash, database tables and columns
applies-to: all
---
# Naming Conventions
## Python
- Functions: `snake_case`
- Classes: `PascalCase`
- Constants: `UPPER_SNAKE_CASE`
- Follow PEP 8 for everything else
## PowerShell
- Variables: `$PascalCase` — e.g., `$TaskName`, `$ServiceStatus`
- Functions/cmdlets: use approved verbs — `Get-`, `Set-`, `New-`, `Remove-`, `Invoke-`
- Do not invent new verb forms; stick to the PowerShell approved verb list
## Bash
- Functions: `lowercase_underscore` — e.g., `get_token`, `check_status`
- All variables: quote them — `"$var"`, not `$var`
- Script names: `lowercase-with-hyphens.sh`
## Database
**Tables:**
- Lowercase plural nouns: `users`, `user_sessions`, `agent_updates`
- Junction/relation tables: `{table_a}_{table_b}` — e.g., `policy_assignments`
- Foreign keys: `{referenced_table_singular}_id` — e.g., `agent_id`, `client_id`, `site_id`
**Columns:**
- Timestamps: `created_at`, `updated_at` (not `created`, `modified`, `timestamp`)
- Booleans: `is_` prefix for state — `is_active`, `is_hidden`; `has_` prefix for possession — `has_alert`
- Status enums: use a `TEXT` column with a CHECK constraint listing valid values, not an integer code

View File

@@ -0,0 +1,37 @@
---
name: no-emojis
description: Never use emojis in code, scripts, config, or output; use ASCII markers instead
applies-to: all
---
# No Emojis — Ever
Never use emojis in code, scripts, config files, log messages, or output strings.
## Rationale
Causes PowerShell parsing errors, encoding issues, and terminal rendering problems. PowerShell profile scripts that interact with emoji characters can trigger codepage changes (`chcp 65001`) which alter the Claude Code CLI font and break rendering. Emoji bytes can also corrupt log files that are read by tools expecting ASCII or Latin-1.
## Use instead
```
[OK] [SUCCESS] [INFO] [WARNING] [ERROR] [CRITICAL]
```
These are the mandatory ASCII status markers for all scripts, tools, session logs, commit messages, and any output visible in a terminal or log file.
## Exception
User-facing web UI with proper UTF-8 handling. If a React component or HTML page is intentionally rendering emoji for end users, that is acceptable. The prohibition is on everything else: server logs, CLI output, scripts, config files, PowerShell, Bash, Python print statements, and any text that may flow through a terminal.
## Scope
This applies to:
- All scripts (PowerShell, Bash, Python)
- Config files (YAML, TOML, JSON comments)
- Log messages and print statements
- Git commit messages
- Markdown documentation in `.claude/`
- Syncro ticket comments
- Session logs
- Any output written to a terminal or log file

View File

@@ -0,0 +1,49 @@
---
name: output-markers
description: ASCII status markers [OK] [ERROR] [WARNING] [SUCCESS] [INFO] [CRITICAL] for all scripts and tools
applies-to: all
---
# Output Markers
All scripts and tools use ASCII status markers instead of emoji, colored text, or ambiguous symbols.
## Standard set
```
[INFO] — informational, no action required
[SUCCESS] — operation completed successfully
[WARNING] — something is off but not fatal; attention recommended
[ERROR] — operation failed; action required
[CRITICAL] — severe failure; immediate attention required
[OK] — shorter form of [SUCCESS], used in status tables and inline checks
[GAP] — used in parity matrices; feature not implemented on this platform
[WARN] — shorter form of [WARNING], used in parity matrices and inline checks
```
## Usage examples
```bash
echo "[INFO] Starting process"
echo "[SUCCESS] Task completed"
echo "[WARNING] Configuration file missing, using defaults"
echo "[ERROR] Failed to connect to 172.16.3.30:3001"
echo "[CRITICAL] Database unavailable — all writes are failing"
```
## In tables and parity matrices
| Feature | Windows | Linux | macOS |
|---------|---------|-------|-------|
| CPU metrics | [OK] | [OK] | [OK] |
| Temperature | [OK] primary | [WARN] partial | [WARN] partial |
| Idle time | [OK] | [GAP] | [GAP] |
## Applies to
- Bash scripts
- Python scripts and log messages
- PowerShell scripts
- Session log entries
- Commit messages (use plain text, not markers, but no emoji)
- Any output that may appear in a terminal or log file

View File

@@ -0,0 +1,70 @@
---
name: commit-style
description: Conventional commit types, Co-Authored-By for Claude commits, files never to commit
applies-to: all
---
# Git Commit Style
## Commit types
Use conventional commit prefixes:
| Prefix | When to use |
|--------|-------------|
| `feat` | New feature or capability |
| `fix` | Bug fix |
| `refactor` | Code change with no behavior change |
| `docs` | Documentation only |
| `test` | Tests only |
| `config` | Config/settings change |
| `build` | Build pipeline, Dockerfile, Cargo.toml |
| `chore` | Maintenance (submodule pointer updates, sync commits) |
| `sync` | Auto-sync commit from a machine |
## Co-Authored-By
All commits made by or with Claude include the co-author trailer:
```
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
```
Place it at the end of the commit body, after a blank line:
```
feat(agent): add temperature collection via /sys/class/thermal
Reads millidegrees from thermal_zone*/temp, classifies by type label.
Correct approach for Linux where sysinfo Components returns empty.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
```
## Files never to commit
- `.env` — environment files
- `credentials.json`, `*.pem`, `*.p12`, `*.pfx` — key material
- `venv/`, `.venv/` — Python virtual environments
- `__pycache__/`, `*.pyc` — Python bytecode
- `*.log` — log files
- `node_modules/` — npm dependencies
- `target/` — Rust build output
- Any SOPS plaintext file before encryption (`.sops.yaml` without encryption headers)
## Commit authorship
- `git config user.name` and `git config user.email` are set per-machine based on `.claude/identity.json`
- Mike: `Mike Swanson <mike@azcomputerguru.com>`
- Howard: `Howard Enos <howard@azcomputerguru.com>`
- Commits pushed via the shared Gitea push account `azcomputerguru`, but authorship in `git log` tracks the actual person
## Submodule commits
When the gururmm submodule pointer is advanced:
```
chore: update guru-rmm submodule pointer (v0.6.22 watchdog fix)
```
Always include a brief reason — "session log" or the version + feature that advanced it.

View File

@@ -0,0 +1,63 @@
---
name: internal-api
description: Use http://172.16.3.20:3000 for Gitea API; git.azcomputerguru.com is behind Cloudflare and blocks curl
applies-to: all
---
# Gitea API Access
## Internal IP for API calls
Always use the internal IP for Gitea API calls:
```
http://172.16.3.20:3000
```
The public URL `git.azcomputerguru.com` is proxied through Cloudflare. Cloudflare's bot protection blocks programmatic curl requests with HTTP 403 or challenge pages. The internal IP bypasses Cloudflare entirely and is accessible from all machines on the office network and Tailscale.
## Git remotes (push/pull)
For git push and pull, the public URL works fine — git's SSH/HTTPS protocols are not affected by Cloudflare's bot challenge:
```bash
git push https://azcomputerguru@git.azcomputerguru.com/azcomputerguru/gururmm.git main
git remote set-url origin https://azcomputerguru@git.azcomputerguru.com/azcomputerguru/claudetools.git
```
Only API calls (REST HTTP) need the internal IP.
## API base URLs
| Use case | URL |
|----------|-----|
| Gitea REST API | `http://172.16.3.20:3000/api/v1/` |
| Gitea web UI | `http://172.16.3.20:3000/` |
| GuruRMM repo | `http://172.16.3.20:3000/azcomputerguru/gururmm` |
| ClaudeTools repo | `http://172.16.3.20:3000/azcomputerguru/claudetools` |
## Authentication
Gitea API tokens are stored in the vault. The push account `azcomputerguru` is shared, but individual API tokens should be scoped appropriately.
## Example API call
```bash
# Correct — internal IP
curl -s "http://172.16.3.20:3000/api/v1/repos/azcomputerguru/gururmm/releases" \
-H "Authorization: token <GITEA_TOKEN>"
# Wrong — blocked by Cloudflare
curl -s "https://git.azcomputerguru.com/api/v1/repos/azcomputerguru/gururmm/releases" \
-H "Authorization: token <GITEA_TOKEN>"
```
## Webhook
The build webhook is separate — it runs on Saturn (172.16.3.30:9000), not Gitea:
```
POST http://172.16.3.30:9000/webhook/build
Header: X-Hub-Signature-256: sha256=<HMAC>
Secret: gururmm-build-secret
```

View File

@@ -0,0 +1,84 @@
---
name: build-pipeline
description: Never run build-agents.sh manually; all builds go through the Gitea webhook pipeline (push to main)
applies-to: gururmm
---
# GuruRMM Build Pipeline
## The rule
Never run `build-agents.sh` manually via SSH unless recovering from a specific failure (e.g., a stale zombie lock file). All agent and server builds go through the Gitea webhook pipeline: push to `main` on `azcomputerguru/gururmm` triggers the build automatically.
Running the build script manually can create version conflicts, bypass the Authenticode signing step for Windows binaries, and leave the build log in an inconsistent state that causes false "build complete" notifications on the next real build.
## Normal build flow
```
1. Edit code locally or on the server
2. Commit changes
3. Push to Gitea: git push origin main (or push to azcomputerguru/gururmm via Gitea remote)
4. Gitea webhook fires POST to http://172.16.3.30:9000/webhook/build (HMAC-SHA256 signed)
5. webhook-handler.py on Saturn spawns build-agents.sh (Linux) and triggers Pluto (Windows)
6. Monitor: tail -f /var/log/gururmm-build.log on Saturn
```
## Build lock
The build script uses `/var/run/gururmm-build.lock` to prevent concurrent builds. If a build crashes mid-run, the lock file is not cleaned up and the next webhook trigger will fail silently (webhook handler sees the lock and exits).
**Check for zombie lock before triggering any manual build:**
```bash
# Check lock exists and get PID
cat /var/run/gururmm-build.lock
# Check if PID is a zombie (os.kill returns 0 for zombies)
# If PID no longer exists or is zombie, remove the lock
sudo rm -f /var/run/gururmm-build.lock
```
The build script should add a defensive `rm -f /var/run/gururmm-build.lock` at startup (pending improvement). Until that is added, manual lock cleanup before triggered builds is required after any build failure.
## Server build
The server binary is built separately from agents:
```bash
# Trigger server build (as guru user on Saturn, or via SSH)
ssh guru@172.16.3.30 "sudo /opt/gururmm/build-server.sh 2>&1"
# Monitor
ssh guru@172.16.3.30 "tail -f /var/log/gururmm-build.log"
```
The server build uses `SQLX_OFFLINE=true` (set in `/home/guru/.cargo/env`) to avoid the sqlx proc macro querying the live database during compilation.
## Service binary path
The deployed server binary is at `/opt/gururmm/gururmm-server` — this is what systemd's ExecStart points to. Do not deploy to `/usr/local/bin/gururmm-server` (that path has no service backing and has caused "deployed but not running" confusion in past sessions).
```bash
# Correct deploy sequence
sudo systemctl stop gururmm-server
sudo cp /home/guru/gururmm/server/target/release/gururmm-server /opt/gururmm/gururmm-server
sudo systemctl start gururmm-server
```
## Windows agent build (Pluto)
Windows agent builds and Authenticode signing run on Pluto (172.16.3.36). The webhook handler triggers Pluto automatically. Signed binaries avoid the Windows SmartScreen warning that affected unsigned 0.6.2 builds.
The build generates all agent variants:
- `gururmm-agent-linux-x86_64`
- `gururmm-agent-linux-aarch64`
- `gururmm-agent-windows-x86_64.exe`
- `gururmm-agent-windows-x86.exe`
- `gururmm-agent-macos-x86_64`
- `gururmm-agent-macos-aarch64`
All are placed in `/var/www/gururmm/downloads/` on Saturn after build.
## Changelog generation
`build-agents.sh` calls `generate-changelog.sh` before the "Build complete" log line. This creates `changelogs/agent/v{VERSION}.md` and updates `changelogs/LATEST_AGENT.md` automatically on each build. Do not create changelog files manually.

View File

@@ -0,0 +1,89 @@
---
name: platform-parity
description: All agent features must ship on Windows, Linux, and macOS; silent no-ops on one platform are bugs
applies-to: gururmm
---
# GuruRMM Agent — Platform Parity
All agent features that are not inherently platform-specific must ship on Windows, Linux, and macOS. A feature that silently no-ops on one platform is a gap, not a cross-platform implementation.
## The rule
> "Add feature X to the agent" means Windows + Linux + macOS. All three, in the same change.
> No exceptions for convenience. If a real implementation is not feasible on a given platform,
> add a working stub and a `// TODO(platform): <os> — <reason>` comment in the same commit.
> A feature that silently no-ops on one platform without a stub and TODO is a bug, not a gap.
## cfg gating — choose the right target
| Condition | Attribute | When to use |
|-----------|-----------|-------------|
| Windows only | `#[cfg(windows)]` | Windows API (Win32, WMI, SCM, OpenSSH registry) |
| Linux + macOS | `#[cfg(unix)]` | POSIX: nix crate, signals, `/proc`, `/sys`, sockets |
| Linux only | `#[cfg(target_os = "linux")]` | `/sys/class/thermal`, systemd, procfs, D-Bus |
| macOS only | `#[cfg(target_os = "macos")]` | CoreFoundation, IOKit, launchd, NSStatusBar |
| Build flag | `#[cfg(feature = "native-service")]` | Service harness (Windows only in Cargo.toml) |
Never use `#[cfg(not(windows))]` as a proxy for "Linux + macOS works the same" without verifying the macOS codepath. Linux and macOS diverge on `/sys`, D-Bus, and GUI IPC.
## Current parity matrix (as of 2026-05-15)
| Feature | Windows | Linux | macOS |
|---------|---------|-------|-------|
| CPU / memory / disk / network metrics | [OK] | [OK] | [OK] |
| Temperature via sysinfo | [OK] fallback | [WARN] empty if no hwmon | [WARN] empty if no sensors |
| Temperature via LibreHardwareMonitor | [OK] primary | N/A | N/A |
| Temperature via /sys/class/thermal | N/A | [GAP] not implemented | N/A |
| User detection (logged-in user) | [OK] | [OK] nix crate | [OK] nix crate |
| User idle time | [OK] GetLastInputInfo | [GAP] returns None | [GAP] returns None |
| IPC / tray | [OK] named pipe + WinTray | [GAP] stub no-op | [GAP] stub no-op |
| Watchdog (process monitor) | [OK] native-service | [GAP] stub no-op | [GAP] stub no-op |
| Script execution | [OK] cmd / PowerShell | [OK] bash / sh | [OK] bash / sh |
| Hardware inventory | [OK] WMI | [OK] /proc + lshw | [OK] system_profiler |
| Auto-updater | [OK] full | [OK] simpler | [OK] simpler |
| Checks (AV, updates, firewall) | [OK] full | [WARN] partial stub | [WARN] partial stub |
| Network discovery | [OK] | [OK] | [OK] |
## Known gaps — priority order
**1. Linux temperature collection** (`agent/src/metrics/mod.rs`)
- sysinfo `Components` returns empty on most Linux systems (requires kernel hwmon driver exposure).
- Correct approach: read `/sys/class/thermal/thermal_zone*/temp` directly (always available on Linux).
- Pattern:
```rust
#[cfg(target_os = "linux")]
fn collect_temps_linux() -> (Option<f32>, Option<f32>, Vec<TemperatureReading>) {
// read /sys/class/thermal/thermal_zone*/temp
// parse millidegrees, classify by type label in /sys/class/thermal/thermal_zone*/type
}
```
**2. Linux / macOS user idle time** (`agent/src/metrics/mod.rs` — `get_user_idle_time()`)
- Linux: use X11 `XScreenSaverQueryInfo` (display sessions) or parse `/proc/interrupts` delta (headless).
- macOS: use `CGEventSourceSecondsSinceLastEventType` (IOKit, always available).
- Stub is acceptable short-term; mark with `// TODO(platform): linux/macos idle time`.
**3. Watchdog on Linux / macOS** (`agent/src/watchdog/`)
- Windows: Windows Service Control Manager restarts the agent.
- Linux: systemd `Restart=on-failure` in the unit file is the correct equivalent — no in-process watchdog needed.
- macOS: launchd `KeepAlive` key in the plist.
- Document the OS-native mechanism in `build-agents.sh` / installer rather than porting the Rust watchdog.
**4. Checks on Linux / macOS** (`agent/src/checks.rs`)
- Windows-specific checks (Windows Update pending, Windows Defender status, Windows Firewall) have no direct equivalents; that is expected.
- Cross-platform checks (disk SMART, certificate expiry, open ports) should run on all platforms.
- Add `// TODO(platform): linux/macos — <check name>` for each unimplemented cross-platform check.
## Cargo.toml dependency discipline
- Platform-specific crates go in `[target.'cfg(...)'.dependencies]`, never in `[dependencies]`.
- Keep `lhm` (LibreHardwareMonitor) and `windows-service` under `cfg(windows)`.
- Keep `nix` under `cfg(unix)`.
- When adding a new crate, verify it compiles on all three targets before merging. Use the build server for Windows; CI covers Linux. macOS cross-compile via `--target aarch64-apple-darwin` on Linux (requires `osxcross` toolchain — see build-agents.sh TODO-MACOS).
## Additional notes from past sessions
**service.rs must mirror main.rs AppState**: On Windows, the agent runs as a Windows Service via a separate entry point in `service.rs` that constructs `AppState` independently. Any field added to `AppState` in `main.rs` must also be added to the `AppState` struct literal in `service.rs`. This has caused Windows-only build failures in the past (missing `agent_id` field). There is no shared constructor — both sites must be updated manually.
**sc.exe over Get-Service**: `Get-Service` silently fails to enumerate `GuruRMMAgent` even with the exact service name in some session contexts. `sc.exe queryex "GuruRMMAgent"` is reliable. All PS1-based service checks in agent code use `sc.exe query` equivalents.

View File

@@ -0,0 +1,77 @@
---
name: sqlx-migrations
description: Never manually pre-apply migrations without tracking rows; use IF NOT EXISTS; let the server apply its own migrations
applies-to: gururmm
---
# GuruRMM sqlx Migration Discipline
## The core rule
Never manually pre-apply migrations via psql without also recording the corresponding row in `_sqlx_migrations`. If the row is missing, the server binary will attempt to re-run the migration at startup and fail when it finds the table or column already exists.
## The correct workflow
Let the server binary apply its own migrations on startup:
```
1. Write the SQL migration file (server/migrations/NNN_description.sql)
2. Use ADD COLUMN IF NOT EXISTS / CREATE TABLE IF NOT EXISTS for idempotence
3. Run cargo sqlx prepare (keeps .sqlx/ offline cache current)
4. Commit the migration file + .sqlx/ changes
5. Build the server binary (push to Gitea triggers build-server.sh)
6. Deploy: stop → copy binary → start
7. sqlx applies the migration on startup and records the checksum row
```
Do not pre-apply the SQL with psql. Do not insert rows into `_sqlx_migrations` manually unless recovering from a specific failure.
## Why: the proc macro excludes pre-applied rows
When `DATABASE_URL` is set at compile time, `sqlx::migrate!()` queries `_sqlx_migrations` during compilation and embeds only the migrations not yet present in the DB. If you pre-apply migration 026 via psql and its row is in `_sqlx_migrations` before the build, the compiled binary will not contain migration 026 — then at runtime, finding a row for version 26 with no matching embedded migration causes a fatal startup error.
The fix: delete the pre-applied `_sqlx_migrations` row(s), rebuild with `SQLX_OFFLINE=true`, let the server apply them naturally.
## Write idempotent SQL
All migrations use `IF NOT EXISTS` or `IF EXISTS` forms:
```sql
-- Tables
CREATE TABLE IF NOT EXISTS policy_checks (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
...
);
-- Columns
ALTER TABLE agents ADD COLUMN IF NOT EXISTS update_channel TEXT
CHECK (update_channel IN ('stable', 'beta'));
```
This protects against the "table already exists" error if a migration is somehow applied twice, and allows the migration to be run safely during development resets.
## SQLX_OFFLINE build environment
`SQLX_OFFLINE=true` is set permanently in `/home/guru/.cargo/env` on Saturn. All cargo builds by the `guru` user use the `.sqlx/` offline cache rather than querying the live DB at compile time. This eliminates the proc macro/DB interaction entirely.
After any schema change that adds or modifies a `query!()` macro, re-run:
```bash
cd /home/guru/gururmm/server && cargo sqlx prepare
git add server/.sqlx && git commit -m "build: update sqlx offline query cache"
```
## Recovery from _sqlx_migrations mismatch
If the server fails to start with "migration N was previously applied but is missing in the resolved migrations":
```bash
# Option 1: Delete the row (if the migration was manually applied and tables exist)
PGPASSWORD=<pw> psql -h localhost -U gururmm -d gururmm \
-c "DELETE FROM _sqlx_migrations WHERE version IN (N);"
# Then rebuild so the binary embeds the migration
# Option 2: If checksum mismatch (binary embedded wrong content)
# Fix the SQL file, rerun cargo sqlx prepare, rebuild, deploy
```
Never delete `_sqlx_migrations` rows for migrations that the current binary does NOT embed — those rows protect against re-running already-applied migrations.

View File

@@ -0,0 +1,24 @@
# Standards index — one entry per file, used by /inject-standards for relevance matching
# Format: "folder/name": "one-sentence description of what this standard covers"
# Every standards file must have an entry here.
standards:
"conventions/no-emojis": "Never use emojis in code, scripts, config, or output; use ASCII markers instead"
"conventions/naming": "Naming conventions for Python, PowerShell, Bash, database tables and columns"
"conventions/output-markers": "ASCII status markers [OK] [ERROR] [WARNING] [SUCCESS] [INFO] [CRITICAL] for all scripts"
"powershell/execution-pattern": "Always write PowerShell to .ps1 files and run with -NoProfile -File, never -Command inline"
"powershell/tmp-path-windows": "/tmp resolves to different directories in Git Bash vs Write tool; use .claude/tmp/ or heredocs"
"context-lookup/grepai-first": "Search with GrepAI or Grep before opening any file for context; Read only when full content is needed"
"security/credential-handling": "No hardcoded credentials; use SOPS vault or env vars; JWT auth; Argon2 hashing; log auth attempts"
"api/response-format": "RESTful plural nouns, consistent error format, pagination, idempotent sqlx migrations, OpenAPI"
"git/commit-style": "Conventional commit types, Co-Authored-By for Claude commits, files never to commit"
"gururmm/platform-parity": "All agent features must ship on Windows Linux and macOS; silent no-ops are bugs; cfg gating rules"
"gururmm/build-pipeline": "Never run build-agents.sh manually; all builds go through the Gitea webhook pipeline (push to main)"
"gururmm/sqlx-migrations": "Never manually pre-apply migrations; use IF NOT EXISTS; SQLX_OFFLINE build; let the server apply its own migrations"
"syncro/comment-dedup": "Never retry POST /comment without first GET /tickets/{id} to confirm it landed; use heredoc payloads"
"syncro/time-entry-protocol": "Use timer_entry flow for all billing; ask minutes and labor type before logging; never use bare add_line_item for labor"
"syncro/html-formatting": "Use <br> for line breaks in Syncro comments; ul/li collapses; no emojis; correct hidden/do_not_email flags"
"ssh/windows-openssh": "Use bare `ssh` command (system OpenSSH); never Git for Windows SSH; backslash hook blocks full Windows paths"
"gitea/internal-api": "Use http://172.16.3.20:3000 for Gitea API; git.azcomputerguru.com is behind Cloudflare and blocks curl"
"python/windows-runtime": "Use `py` on Windows (not python3/python); use jq for JSON; use Python scripts over heredocs with apostrophes"
"client/communication-tone": "Expert partner posture; state findings and act; ask only the one specific unknown you cannot look up"

View File

@@ -0,0 +1,58 @@
---
name: execution-pattern
description: Always write PowerShell to .ps1 files and run with -NoProfile -File, never -Command inline
applies-to: powershell, all
---
# PowerShell Execution Pattern (Windows)
## Rule: Always use -NoProfile -File
Never use inline PowerShell commands (`-Command` or `-c`). Always write scripts to `.ps1` files and execute with `-NoProfile -File`.
## Rationale
- **Prevents font/codepage changes**: PowerShell profile scripts often set `chcp 65001` or modify `[Console]::OutputEncoding`, which changes the Claude Code CLI font and breaks rendering. `-NoProfile` skips all profile scripts.
- **Avoids Git Bash quoting issues**: Inline commands have unpredictable quote escaping and variable expansion (`$_`, `$foo`) before PowerShell sees them. What you write is not what PowerShell receives.
- **Enforced by hooks**: `.claude/hooks/pre-bash-pwsh-script.sh` blocks inline execution and rejects the command before it runs.
## Correct pattern
```bash
# Write script to file using the Write tool
# (Write tool creates the file; Bash tool executes it)
# Execute with -NoProfile -File
pwsh -NoProfile -File /tmp/script.ps1
```
Or using a temp file in the claudetools tmp directory (safe from /tmp path mismatch):
```bash
# Write to .claude/tmp/ which both Write tool and Bash agree on
pwsh -NoProfile -File D:/claudetools/.claude/tmp/script.ps1
```
## Incorrect (BLOCKED BY HOOKS)
```bash
# These will be rejected by the pre-bash-pwsh-script.sh hook
powershell -Command "Get-Process"
pwsh -c "Get-Date"
powershell.exe -Command '$x = 5; Write-Host $x'
powershell.exe -Command "Get-Service GuruRMMAgent"
```
## Hook enforcement
The hook at `.claude/hooks/pre-bash-pwsh-script.sh` intercepts any `Bash` tool call and checks for the patterns `powershell.*-Command`, `powershell.*-c`, `pwsh.*-c`, `pwsh.*-Command`. If matched, the hook returns a non-zero exit code and the command is rejected.
The hook extracts only the `tool_input.command` field via `jq` before grepping — this prevents false positives from grep matching echo arguments or heredoc content.
## Note on /tmp path ambiguity on Windows
On Windows, `/tmp` resolves differently in Git Bash vs. the Write tool. Git Bash maps `/tmp` to `%LOCALAPPDATA%\Temp\`, while the Write tool may create files in `C:\tmp\`. This mismatch has caused wrong-content file reads in past sessions. Use `D:/claudetools/.claude/tmp/` as the temp directory for any file that needs to be read back by a Bash command.
## Reference
See `.claude/hooks/pre-bash-pwsh-script.sh` for enforcement details.

View File

@@ -0,0 +1,68 @@
---
name: tmp-path-windows
description: /tmp resolves to different directories in Git Bash vs Write tool on Windows; use .claude/tmp/ instead
applies-to: all
---
# /tmp Path Mismatch on Windows
## The problem
On Windows, `/tmp` resolves to two different real directories depending on which tool accesses it:
| Tool | /tmp resolves to |
|------|-----------------|
| Git Bash / curl | `%LOCALAPPDATA%\Temp\` (e.g., `C:\Users\Howard\AppData\Local\Temp\`) |
| Claude Code Write tool | `C:\tmp\` |
These are different directories. If you write a file with the Write tool to `/tmp/payload.json` and then read it back with curl in a Bash command, curl reads a stale file from a previous session — or a completely different file.
## Incident that established this rule
2026-05-01: The Write tool created `/tmp/comment_payload.json` in `C:\tmp\`. curl read `/tmp/comment_payload.json` from `%LOCALAPPDATA%\Temp\` — a stale payload from the previous day's Cascades session. The wrong comment (containing Karen Rossini / ALDOCS content) was posted to a Sombra ticket. The comment could not be deleted via the Syncro API.
## The fix: use .claude/tmp/ or heredocs
**Option 1 — Heredoc (preferred for JSON payloads):**
```bash
curl -s -X POST "${URL}" -H "Content-Type: application/json" --data-binary @- <<'JSON'
{"key": "value", "other": "data"}
JSON
```
No file involved — no path ambiguity. Use `<<'JSON'` for static content and `<<JSON` when you need shell variable interpolation.
**Option 2 — .claude/tmp/ directory:**
Both the Write tool and Git Bash agree on the absolute Windows path when you use the repo-relative temp directory:
```
D:/claudetools/.claude/tmp/
```
Write files there using the Write tool, then read them in Bash with `D:/claudetools/.claude/tmp/filename`. Forward slashes work in Git Bash.
```bash
# After writing with Write tool to D:/claudetools/.claude/tmp/payload.json
curl -s -X POST "${URL}" -H "Content-Type: application/json" \
-d @D:/claudetools/.claude/tmp/payload.json
```
**Option 3 — Python urllib (no file at all):**
```bash
py -c "
import urllib.request, json
data = json.dumps({'key': 'value'}).encode()
req = urllib.request.Request('$URL', data=data, headers={'Content-Type': 'application/json'})
print(urllib.request.urlopen(req).status)
"
```
## Applies to
Any workflow where the Write tool creates a file that is subsequently read by a Bash command on Windows. This includes:
- Syncro API payloads
- PowerShell script files (use .claude/tmp/ or the Write tool with an absolute path)
- Any temp file exchange between Write tool and Bash tool

View File

@@ -0,0 +1,84 @@
---
name: windows-runtime
description: Use `py` on Windows (not python3/python); use jq for JSON extraction; use Python scripts over heredocs with apostrophes
applies-to: all
---
# Python on Windows
## Interpreter command
On Windows (DESKTOP-0O8A1RL, GURU-BEAST-ROG), use `py` to invoke Python:
```bash
# Correct
py script.py
py -c "import hashlib; print(hashlib.sha384(b'test').hexdigest())"
py -m bot.main
# Wrong — not in PATH on this machine
python3 script.py
python script.py
```
`python3` and `python` are not in Git Bash's PATH on the Windows machines in this MSP. The Windows Launcher (`py.exe`) is the correct entry point and selects the installed Python version automatically. Store aliases are also disabled in this environment.
## JSON extraction — use jq, not Python
For JSON extraction in shell scripts, use `jq` rather than a Python one-liner. `jq` is installed at:
```
C:/Users/guru/AppData/Local/Microsoft/WinGet/Links/jq
```
```bash
# Correct — jq for JSON extraction
COMMAND=$(echo "$INPUT" | jq -r '.tool_input.command')
# Acceptable for complex extraction when jq is insufficient
py -c "import json,sys; d=json.load(sys.stdin); print(d['tool_input']['command'])"
```
## Heredocs with apostrophes — use Write + py script
If a heredoc payload contains apostrophes (single quotes) in the content, Git Bash `<<'EOF'` terminates early at the apostrophe. Workaround: write the content to a file using the Write tool, then read it with Python:
```bash
# Write the file with Write tool first, then:
py script.py
```
Or use Python to build and send the payload without a heredoc at all:
```bash
py -c "
import urllib.request, json
data = json.dumps({'body': \"It's a note about John's machine\"}).encode()
req = urllib.request.Request(url, data=data, headers={'Content-Type': 'application/json'})
urllib.request.urlopen(req)
"
```
## Sending HTTP requests via Python (avoids backslash hook)
The pre-bash-backslash hook blocks commands with Windows-style backslash paths. Python's `urllib.request` is a clean alternative to curl for sending POST requests that include backslash-containing paths in the payload:
```bash
py -c "
import urllib.request, json
url = 'http://172.16.3.30:8001/api/coord/messages'
data = json.dumps({'to_session': 'target', 'body': 'message'}).encode()
req = urllib.request.Request(url, data=data, headers={'Content-Type': 'application/json'})
resp = urllib.request.urlopen(req)
print(resp.status, resp.read())
"
```
## Unicode in log files
When reading log files that may contain non-UTF8 bytes (Windows logs, PowerShell output):
```python
sys.stdout.buffer.write(output.encode('utf-8', errors='replace'))
```
Do not use `open(file, 'r')` without specifying encoding on Windows — the default encoding varies by system locale and may not be UTF-8.

View File

@@ -0,0 +1,64 @@
---
name: credential-handling
description: No hardcoded credentials; use SOPS vault or env vars; JWT auth; Argon2 hashing; log auth attempts
applies-to: all
---
# Credential Handling
## Core rules
- **Never hardcode credentials** — no passwords, tokens, API keys, or connection strings in any source file, script, or config file that is committed to git
- **Use SOPS vault** for all secrets in this repo — access via the ClaudeTools vault wrapper:
```bash
VAULT="$CLAUDETOOLS_ROOT/.claude/scripts/vault.sh"
bash "$VAULT" get-field <path> <field>
bash "$VAULT" get <path>
bash "$VAULT" search "keyword"
```
- **Use environment variables** as an alternative when SOPS is impractical (e.g., Docker, CI). Never put the env var value in a committed file.
- **`.env` files are gitignored** and must never be committed. Verify before staging any `.env` file.
## Vault path structure
```
infrastructure/ — servers, services, SSH keys
clients/ — per-client credentials
services/ — third-party APIs (Syncro, etc.)
projects/ — project-specific secrets (GuruRMM DB, etc.)
msp-tools/ — MSP app suite tokens
```
The vault wrapper reads `vault_path` from `.claude/identity.json` (per-machine, gitignored). Every machine sets its own vault path there — no hardcoded vault paths in any shared file.
## API authentication
- **JWT tokens** for all ClaudeTools API authentication
- **Argon2** for password hashing (not bcrypt, not MD5, not SHA-256 plain)
- Log all authentication attempts and sensitive operations — failures must be logged with timestamp, IP, and identity attempted
## What not to commit
Never commit:
- `.env` files
- `credentials.json`, `*.pem`, `*.p12`, `*.pfx`
- Any file matching `*secret*`, `*password*`, `*token*` unless it is a SOPS-encrypted `.sops.yaml`
- Vault YAML files before encryption (plaintext SOPS files)
- SSH private keys
## Accessing secrets in scripts
```bash
# Correct — vault wrapper
PASSWORD=$(bash "$VAULT" get-field projects/claudetools/database.sops.yaml credentials.password)
# Correct — environment variable (set externally, not hardcoded)
psql "postgres://${DB_USER}:${DB_PASS}@localhost:5432/gururmm"
# WRONG — hardcoded inline (caught in code review)
psql "postgres://gururmm:43617ebf7eb242e814ca9988cc4df5ad@localhost:5432/gururmm"
```
## 1Password fallback
The 1Password service account token is in `infrastructure/1password-service-account.sops.yaml`. Use 1Password as a secondary vault for secrets that need to be accessed outside the SOPS workflow.

View File

@@ -0,0 +1,64 @@
---
name: windows-openssh
description: Use system OpenSSH (bare `ssh`); never Git for Windows SSH; the backslash hook blocks full Windows paths
applies-to: all
---
# SSH on Windows
## Use the system OpenSSH client
The correct SSH binary on Windows is the system-installed OpenSSH at:
```
C:\Windows\System32\OpenSSH\ssh.exe
```
Do not use Git for Windows SSH (`C:\Program Files\Git\usr\bin\ssh.exe`). The system OpenSSH has proper Windows integration, correct key handling, and is the version that works with the Windows OpenSSH registry for key agent.
## Use bare `ssh`, not the full path
The pre-bash-backslash hook at `.claude/hooks/pre-bash-backslash.sh` blocks any Bash command that contains Windows-style backslash paths (e.g., `C:\Windows\System32\OpenSSH\ssh.exe`). This hook exists to prevent accidental backslash path usage in Git Bash, which interprets `\` as escape sequences.
The system OpenSSH is on `PATH` in Git Bash, so use the bare command:
```bash
# Correct — uses system OpenSSH via PATH
ssh guru@172.16.3.30 "sudo /opt/gururmm/build-server.sh"
ssh -i C:/Users/guru/.ssh/id_ed25519 guru@172.16.3.30 "command"
# Wrong — full backslash path blocked by hook
C:\Windows\System32\OpenSSH\ssh.exe guru@172.16.3.30 "command"
```
Note: forward-slash paths are fine in Git Bash:
```bash
# This works (forward slashes)
"C:/Windows/System32/OpenSSH/ssh.exe" guru@172.16.3.30 "command"
# But bare ssh is simpler and preferred
```
## Key file paths
SSH keys are stored in `C:/Users/guru/.ssh/` — use forward slashes when specifying `-i`:
```bash
ssh -i C:/Users/guru/.ssh/id_ed25519 guru@172.16.3.30 "command"
```
## Known servers
| Host | IP | User | Auth |
|------|----|------|------|
| Saturn / GuruRMM server | 172.16.3.30 | guru | SSH key (id_ed25519) |
| Pluto / build server | 172.16.3.36 | Administrator | Password (vault) |
| Jupiter / Unraid | 172.16.3.20 | root | Password (vault) |
For password auth (Pluto), use paramiko in Python when interactive stdin is not available — `sshpass` is not installed in the Git Bash environment.
## PuTTY tools
For servers requiring interactive key acceptance or PuTTY-specific features (IX server):
- `pscp.exe` for file transfer
- `plink.exe` for commands
Both are in `C:/Program Files/PuTTY/` — use forward slashes or quote with double quotes when calling from Git Bash.

View File

@@ -0,0 +1,60 @@
---
name: comment-dedup
description: Never retry POST /comment without first GET /tickets/{id} to confirm it didn't land; always use heredoc payloads
applies-to: syncro, all
---
# Syncro Comment Deduplication
## The rule
Never retry a POST `/comment` call without first doing a `GET /tickets/{id}` to confirm the first attempt did not land.
Syncro has no DELETE endpoint for comments. A duplicate comment cannot be removed via the API — only manually through the Syncro GUI by an admin. Duplicate comments are visible to clients (public comments) and to all technicians (hidden comments), and the content cannot be redacted without deleting the entire comment.
## Incident that established this rule
2026-05-01: A comment with completely wrong content ("Karen Rossini" and the ALDOCS Cascades share) was posted to a Sombra Residential ticket. Root cause: a `/tmp` path mismatch on Windows caused curl to read a stale payload from a previous session rather than the newly written one. The wrong comment could not be deleted via API — Howard had to manually delete it through the Syncro GUI.
## Required pattern
**Before retrying any failed comment POST:**
```bash
# Check if the comment landed before retrying
curl -s "https://computerguru.syncromsp.com/api/v1/tickets/${TICKET_ID}?api_key=${API_KEY}" \
| jq '.ticket.comments[-1]'
```
If the comment is there, do not retry. If it is missing, diagnose why before retrying (network failure, wrong API key, wrong ticket ID).
## Payload pattern — use heredocs, not temp files
Do not write comment payloads to `/tmp/` files and read them back with curl. On Windows, `/tmp` resolves differently in Git Bash vs. the Claude Code Write tool, causing stale-file reads.
**Correct — heredoc piped directly to curl:**
```bash
curl -s -X POST "${BASE}/tickets/${TICKET_ID}/comment?api_key=${API_KEY}" \
-H "Content-Type: application/json" \
--data-binary @- <<'JSON'
{"subject": "Resolution", "body": "...", "hidden": false, "do_not_email": false}
JSON
```
Use `<<'JSON'` (single-quoted, literal) for static payloads. Use `<<JSON` (double-quoted, interpolating) when the payload includes shell variables like `${TIMER_ID}`.
**Wrong — temp file approach (prone to path mismatch):**
```bash
# Write tool writes to C:\tmp\, Git Bash reads from %LOCALAPPDATA%\Temp\
# These are different directories on Windows
cat > /tmp/comment.json << 'EOF'
{"subject": "Resolution", ...}
EOF
curl ... -d @/tmp/comment.json # May read wrong file
```
## Applies also to
All Syncro POST endpoints, not just comments. Timer entries, line items, and invoices all have the same no-retry-without-check rule since duplicates are visible and often cannot be deleted via API.

View File

@@ -0,0 +1,58 @@
---
name: html-formatting
description: Use <br> for line breaks in Syncro comments; <ul>/<li> collapses; no emojis
applies-to: syncro
---
# Syncro Comment HTML Formatting
## Line breaks
Use `<br>` for line breaks in Syncro ticket comment bodies. Do not use `\n` (literal newline) — Syncro renders comment bodies as HTML, and plain newlines are collapsed to a single space.
```json
{
"body": "Work performed:<br>- Replaced wireless adapter driver<br>- Verified network connectivity<br>- Customer running overnight memtest"
}
```
## Lists
`<ul>/<li>` does not render correctly in Syncro comments — it collapses into a single line with no visible separation. Use `<br>- ` as a bullet pattern instead:
**Wrong:**
```json
{"body": "<ul><li>Item one</li><li>Item two</li></ul>"}
```
**Correct:**
```json
{"body": "Items completed:<br>- Item one<br>- Item two"}
```
## No emojis
No emojis in Syncro comments. The same encoding rule that applies to all other output (see `conventions/no-emojis`) applies here. Syncro sends public comments as email; emoji characters can display incorrectly in various email clients.
Use `[OK]`, `[COMPLETE]`, `[PENDING]`, `[ACTION REQUIRED]` instead.
## Hidden vs. public comments
Set the visibility appropriately:
```json
{
"hidden": true,
"do_not_email": true
}
```
- `hidden: true` — internal only; client does not see it, no email sent
- `hidden: false` — client-visible; Syncro sends an email unless `do_not_email: true`
- `do_not_email: true` — suppress the email notification even for public comments
If a resolution comment contains passwords, internal notes, or credentials: always use `hidden: true, do_not_email: true`.
## Cascades-specific note
Never set `contact_id` on Cascades of Tucson tickets. Leaving it blank lets Syncro route the email to the correct distribution. Setting it (Meredith Kuhn has been the incorrect default in past sessions) overrides the distribution and breaks notification routing.

View File

@@ -0,0 +1,62 @@
---
name: time-entry-protocol
description: Always use timer_entry flow for billing; ask minutes and labor type before logging any time; never assume defaults
applies-to: syncro
---
# Syncro Time Entry Protocol
## Always ask before logging time
Before logging any time entry, ask the user:
1. How many minutes?
2. What labor type? (onsite, remote, emergency, warranty, project, etc.)
Never assume a default. Never round up or fill in a number. Billing errors are client-facing, hard to reverse, and affect prepaid block balances. An incorrect time entry requires Winter (billing) to manually reverse it.
## The required flow
All time-bearing work must use `timer_entry → charge_timer_entry`, not bare `add_line_item`. This is a hard rule.
```
1. POST /tickets/{id}/timer_entry — create the time record
2. POST /tickets/{id}/charge_timer_entry — generate the line item from the timer
3. Verify line item: GET /tickets/{id} → check price_retail on the new line item
4. If price_retail is wrong: PUT /tickets/{id}/line_items/{item_id} to patch it
5. POST /invoices — roll line item onto invoice
6. PUT /tickets/{id} — set status to Invoiced
```
The `add_line_item` endpoint bypasses Syncro's time-tracking table entirely. Using it for labor means hours appear in the invoice but not in time-tracking reports (hours per client, technician productivity, average resolution time, prepay burn rate). After the 2026-04-30 audit, 31 closed tickets had 00:00:00 in time tracking because bare `add_line_item` was used for all of them.
## When bare add_line_item is acceptable
Only for non-time items:
- Hardware/parts
- Flat-fee services with no labor component
- Software licenses
Even warranty or free labor must use `timer_entry` with `billable: false`. The only exception is cancelled tickets where no work was performed.
## Labor type reference
| Situation | Product | Note |
|-----------|---------|-------|
| Standard onsite | `26118` Onsite Business | At `hours × $175` |
| Emergency/after-hours | `26184` Emergency or After Hours | Full rate, no quantity multiplier |
| Prepaid project labor | `9269129` Prepaid Project Labor | At `$0/hr`; debits from prepay block |
| Warranty | Any labor product | `billable: false` on timer_entry |
## Prepaid customers
Before applying any rate, verify `prepay_hours` on the customer record:
```bash
curl -s "https://computerguru.syncromsp.com/api/v1/customers/${CUSTOMER_ID}?api_key=${API_KEY}" \
| jq '.customer.prepay_hours'
```
If `prepay_hours > 0`, use the prepaid product at `$0/hr` and verify the balance debits correctly after the invoice posts (Syncro may not debit until the invoice is paid in the GUI — flag for Winter if uncertain).
## Note on billable: false
The Syncro API ignores `billable: false` on `timer_entry` calls silently — the entry is created but the billing flag has no effect through the API. If a warranty/free entry is needed, create the timer entry, then verify through the GUI that the line item generated by `charge_timer_entry` is at $0. Patch with `update_line_item` if it came in at a non-zero rate.

93
docs/mission.md Normal file
View File

@@ -0,0 +1,93 @@
# ClaudeTools — Mission & Product Direction
## Mission
ClaudeTools is the internal operations platform for Arizona Computer Guru LLC. It tracks client work, billable time, infrastructure inventory, and encrypted credentials — and it provides a real-time coordination layer so that multiple Claude Code sessions (running on different machines or by different team members) can work in parallel without stepping on each other. It is built to support a 2-person MSP that uses AI-assisted workflows as a core part of how work gets done.
---
## Target User
**Primary:** Mike Swanson and Howard Enos — the two team members at Arizona Computer Guru LLC. They use Claude Code sessions throughout the day to handle client work, MSP tooling development, and infrastructure operations. ClaudeTools gives those sessions a shared source of truth.
**Claude Code sessions themselves** are also first-class consumers of the API — particularly the coordination subsystem, which sessions query at startup and before writing to any shared resource.
There is no external user base. This is internal infrastructure.
---
## Current Scope (what it does today)
**Work tracking:**
- Client management (`/api/clients`)
- Project tracking (`/api/projects`)
- Work session logging (`/api/sessions`)
- Billable time entries with rate and amount (`/api/billable-time`)
- Work items and task management (`/api/work-items`, `/api/tasks`)
- Tagging system across entities (`/api/tags`)
**Infrastructure inventory:**
- Machine inventory (`/api/machines`)
- Physical sites (`/api/sites`)
- IT assets/infrastructure (`/api/infrastructure`)
- Application services (`/api/services`)
- Network configurations (`/api/networks`)
- Firewall rule documentation (`/api/firewall-rules`)
- M365 tenant records (`/api/m365-tenants`)
**Credential management:**
- Encrypted credential storage (AES-256-GCM) for client and service credentials (`/api/credentials`)
- Immutable audit log of all credential access (`/api/credential-audit-logs`)
- Security incident tracking (`/api/security-incidents`)
**Authentication:**
- JWT-based auth with Argon2 password hashing (`/api/auth/token`)
**Coordination subsystem (`/api/coord`):**
- Component state tracking per project (GuruRMM, ClaudeTools, Dataforth, client work)
- Work locks: sessions claim a lock on a resource before writing; TTL-based auto-release
- Inter-session messaging: one Claude session can leave a note for another (e.g., "I left the server mid-deploy")
- No auth required — internal LAN only
**MCP integration:**
- `mcp-servers/feature-management/` — GuruRMM feature request tracking, accessible from Claude Code via MCP
---
## Near-Term Roadmap
- Auto-deploy via Gitea webhook (planned, not yet active)
- Optional Phase 7 extensions (all low-priority):
- File Changes API — track file modifications over time
- Command Runs API — command execution history
- Problem Solutions API — internal knowledge base
- Failure Patterns API — error pattern recognition
- Environmental Insights API — contextual learning across sessions
The API is considered feature-complete for current operational needs. New endpoints are added only when a specific workflow gap appears.
---
## Explicit Non-Goals
- **Not a PSA replacement** — ClaudeTools tracks work for internal record-keeping. Syncro PSA handles client-facing ticketing and invoicing; the two are separate.
- **Not a multi-tenant SaaS product** — single-tenant, self-hosted on ACG infrastructure. No plans to expose this to external users or clients.
- **Not a monitoring platform** — GuruRMM handles endpoint monitoring. ClaudeTools tracks the work done in response to what monitoring surfaces.
- **No external credential access UI** — credentials stored in ClaudeTools are accessed via API by Claude Code sessions. There is no web UI for browsing credentials.
- **No hardcoded credentials anywhere** — all secrets go through SOPS vault (primary) or 1Password (fallback). This is a non-negotiable constraint, not a goal to eventually achieve.
---
## Design Principles
**Coordination is first-class** — the coord API is not an afterthought. Multi-session, multi-machine Claude Code workflows are the normal operating mode, and the platform is built around making that safe.
**Claude sessions are API consumers** — the API is designed so that Claude Code can call it directly without human-in-the-loop for reads and non-destructive writes. The coord API in particular is designed for machine callers, not humans.
**Credentials never leave the vault unencrypted** — every credential stored via the API is AES-256-GCM encrypted at the service layer before hitting the database. Audit logs are immutable and automatic.
**Softfail over hard-fail** — if the coord API is unreachable, sessions queue their calls to `.claude/coord-queue.jsonl` and continue working. The platform degrades gracefully.
**Internal-only, LAN-scoped** — the coordination API has no authentication because it is network-scoped to 172.16.3.x. External exposure would require adding auth first.
**Two users, real workflows** — features are added when a real operational gap appears, not speculatively. The Phase 7 extensions are listed but not prioritized until a specific need arises.

137
docs/tech-stack.md Normal file
View File

@@ -0,0 +1,137 @@
# ClaudeTools API — Tech Stack
## Purpose
ClaudeTools is an MSP work-tracking and internal tooling platform built by Arizona Computer Guru LLC. It provides a REST API for tracking clients, projects, work sessions, billable time, infrastructure inventory, and encrypted credentials — plus a real-time coordination subsystem that multiple Claude Code sessions use to avoid clobbering each other. The API is production-stable with 95+ endpoints and 38 database tables.
---
## Components
### API Server
- **Framework:** Python, FastAPI
- **ASGI server:** Uvicorn
- **ORM/query layer:** SQLAlchemy (models in `api/models/`), Pydantic schemas (in `api/schemas/`)
- **Host:** 172.16.3.30, port 8001 (production)
- **Deployment:** <!-- TODO: verify — systemd unit or direct process? auto-deploy via Gitea webhook is planned but not confirmed active -->
- **OpenAPI docs:** http://172.16.3.30:8001/api/docs (also `http://localhost:8000/api/docs` in dev)
- **Repo path:** `api/` directory in `azcomputerguru/claudetools` on Gitea (http://172.16.3.20:3000)
**Repo layout:**
```
api/
├── main.py # Entry point
├── models/ # SQLAlchemy models (38 tables)
├── routers/ # Endpoint handlers (95+ endpoints across ~10 router files)
├── schemas/ # Pydantic request/response schemas
├── services/ # Business logic layer
├── middleware/ # Auth and error handling
└── utils/ # Crypto utilities (AES-256-GCM)
```
Key endpoint groups:
- `/api/machines`, `/api/clients`, `/api/projects`, `/api/sessions`, `/api/tags` — core entities
- `/api/work-items`, `/api/tasks`, `/api/billable-time` — MSP work tracking
- `/api/sites`, `/api/infrastructure`, `/api/services`, `/api/networks`, `/api/firewall-rules`, `/api/m365-tenants` — infrastructure inventory
- `/api/credentials`, `/api/credential-audit-logs`, `/api/security-incidents` — encrypted credential storage and audit
- `/api/auth/token` — JWT issuance
- `/api/coord/*` — coordination subsystem (no auth required; see below)
---
### Database
- **Engine:** MariaDB 10.6.22
- **Host:** 172.16.3.30:3306, database `claudetools`
- **Tables:** 38 (as of last audit)
- **Sensitive field encryption:** AES-256-GCM (Fernet) — applied at the service layer before write, decrypted on read. Credential passwords are the primary encrypted fields.
- **Audit logging:** All credential read/write operations logged to `credential_audit_logs` table automatically.
- **Migration approach:** Alembic. Migrations in `migrations/` directory. Commands: `alembic current`, `alembic upgrade head`.
- **Credentials:** Stored in SOPS vault at `projects/claudetools/database.sops.yaml`. Retrieve with:
```bash
bash $CLAUDETOOLS_ROOT/.claude/scripts/vault.sh get-field projects/claudetools/database.sops.yaml credentials.password
```
1Password fallback: `op read "op://Projects/ClaudeTools Database/password"`
---
### Authentication
- **Mechanism:** JWT tokens issued by `POST /api/auth/token` (email + password)
- **Password hashing:** Argon2
- **API key encryption:** AES-256-GCM (Fernet) applied to stored credential values
- **Coordination API:** No auth required — it is an internal-network-only subsystem
---
### Coordination Subsystem (`/api/coord`)
The coordination API is a first-class subsystem used by all Claude Code sessions (across machines and users) to prevent concurrent writes to the same resource. It requires no authentication and is intended for internal LAN use only.
Key endpoints:
- `GET /api/coord/status` — current component states for all projects
- `GET /api/coord/messages` — inter-session messages (filtered by `to_session`, `unread_only`)
- `PUT /api/coord/messages/<id>/read` — mark message read
- `POST /api/coord/locks` — claim a work lock on a resource (with TTL)
- `DELETE /api/coord/locks/<id>` — release a lock
- `PUT /api/coord/components/<project_key>/<component>` — update component state
Project keys: `gururmm`, `claudetools`, `dataforth-dos`, `clients/<name>`
Component states tracked per project:
- `gururmm`: `server`, `agents`, `dashboard`, `db_migrations` — states: `building`, `built`, `deploying`, `deployed`, `degraded`
- `claudetools`: `api`, `db_migrations`, `coord_api` — states: `deploying`, `deployed`, `degraded`
Softfail behavior: if the coord API is unreachable, sessions log failed calls to `.claude/coord-queue.jsonl` and drain on next `/sync`.
---
### Build / Deploy Pipeline
- **Trigger:** Gitea webhook on push to main — auto-deploy is planned but not confirmed active <!-- TODO: verify current deploy mechanism -->
- **Dev start:**
```bash
api\venv\Scripts\activate
uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
```
- **Environment:** `.env` file for local dev; production environment variables managed separately
---
### MCP Servers
Custom MCP servers live in `mcp-servers/`:
- `mcp-servers/feature-management/` — feature tracking MCP server (used by Claude Code to manage GuruRMM feature requests)
Config: `.mcp.json` at repo root.
---
## Key Architecture Decisions
- **FastAPI + SQLAlchemy** — standard Python async API stack. FastAPI chosen for automatic OpenAPI generation and Pydantic validation; SQLAlchemy for ORM flexibility across 38 tables.
- **MariaDB, not PostgreSQL** — chosen for the ClaudeTools API (GuruRMM uses PostgreSQL on the same host). Both run on 172.16.3.30.
- **AES-256-GCM at the service layer, not the DB layer** — credentials are encrypted before write in `api/utils/`, not via DB-native encryption. Allows key rotation and audit without DB-level access.
- **Coordination API with no auth** — deliberately unauthenticated; scoped to internal LAN. Simplifies cross-session coordination without requiring Claude sessions to manage API tokens.
- **Alembic for migrations** — standard SQLAlchemy migration tooling; migration history tracked in DB. All schema changes go through migration files.
- **SOPS vault for credentials** — no plaintext credentials in any checked-in file. Vault wrapper reads `vault_path` from per-machine `identity.json`; no hardcoded paths in shared files.
- **Agents do not run code** — Claude Code coordinator delegates all DB queries, code writes, and git operations to specialized sub-agents. The coordinator only reads 12 files directly.
---
## Development Workflow
1. Changes developed locally on developer machine (Windows: `D:\claudetools`; Mac: `~/claudetools`).
2. `api\venv\Scripts\activate` + `uvicorn api.main:app --reload` for local dev server on port 8000.
3. DB migrations: `alembic upgrade head` — must be applied before testing new schema.
4. Code reviewed via Code Review Agent (mandatory before merge).
5. Push to `azcomputerguru/claudetools` on Gitea (http://172.16.3.20:3000); production deploy to 172.16.3.30:8001.
6. Coordination API on prod (port 8001) is always-on; dev sessions can point to it directly for lock/message coordination.
---
## Current Version
- **API:** Production-stable, 95+ endpoints, 38 tables. No explicit version number tracked in the codebase. <!-- TODO: verify if there is a version field in main.py -->
- **Last context update:** 2026-04-14 (CONTEXT.md timestamp)