From 0c000109dc2bd37f7101b576e0b0a96d9b777c92 Mon Sep 17 00:00:00 2001
From: Mike Swanson
Date: Mon, 1 Jun 2026 16:25:45 -0700
Subject: [PATCH] chore(memory): consolidate scattered feedback/project/reference files

Compressed memory store 104 -> 71 files via four passes:

- Syncro: 19 scattered feedback_syncro_* files merged into 3 rule files (api/billing/workflow) + an on-demand feedback_syncro_history.md for incident detail, quotes, and tech/product ID tables.
- Four near-duplicate merges: Howard paste-safety, Pluto build server, Howard backend deferral, IX server access (ssh+tailscale).
- Per-cluster rule/state/history split applied to GuruConnect (2->1), Dataforth (3->2), Cascades (7->3), GuruRMM (13->3).
- New reference_resource_map.md: single auto-loaded cheatsheet for "do I have access to X and how do I connect from this machine?"
- MEMORY.md rewritten to match the new layout.

Health: broken backlinks 8->7, overlap clusters 12->5, orphans 17->0.
---
 .claude/memory/MEMORY.md | 191 ++++++-------
 .../memory/feedback-rmm-unc-path-encoding.md | 19 --
 .../memory/feedback_autonomous_infra_setup.md | 2 +-
 .claude/memory/feedback_cascades.md | 39 +++
 .../feedback_cascades_folder_redirect.md | 26 --
 .../feedback_cascades_user_security_group.md | 12 -
 .../feedback_check_patterns_before_asking.md | 34 +++
 .claude/memory/feedback_command_formatting.md | 24 +-
 .claude/memory/feedback_gururmm.md | 86 ++++++
 .../memory/feedback_gururmm_agent_parity.md | 16 --
 .claude/memory/feedback_gururmm_builds.md | 14 -
 .claude/memory/feedback_howard_delegation.md | 12 -
 .../feedback_no_botalerts_internal_rmm.md | 21 --
 .../feedback_no_indented_code_blocks.md | 12 -
 .claude/memory/feedback_psa_default_syncro.md | 2 +-
 .claude/memory/feedback_rmm_dev_is_mike.md | 15 -
 .claude/memory/feedback_rmm_identify_by_ip.md | 12 -
 .claude/memory/feedback_syncro_api.md | 82 ++++++
 .../feedback_syncro_appointment_date_check.md | 31 --
 .../feedback_syncro_appointment_owner.md | 40 ---
 .claude/memory/feedback_syncro_billing.md | 112 +++++++-
 .../memory/feedback_syncro_blank_contact.md | 19 --
 .../feedback_syncro_cascades_contact.md | 13 -
 .../memory/feedback_syncro_comment_dedup.md | 20 --
 .../memory/feedback_syncro_content_type.md | 12 -
 ...edback_syncro_corrections_preserve_tech.md | 18 --
 .../feedback_syncro_emergency_billing.md | 22 --
 .../feedback_syncro_estimate_hardware.md | 12 -
 .claude/memory/feedback_syncro_history.md | 140 ++++++++++
 .claude/memory/feedback_syncro_html.md | 17 --
 .claude/memory/feedback_syncro_labor_tax.md | 14 -
 .claude/memory/feedback_syncro_labor_type.md | 24 --
 .claude/memory/feedback_syncro_line_items.md | 24 --
 .claude/memory/feedback_syncro_live_rates.md | 18 --
 .../feedback_syncro_no_madeup_labor_items.md | 12 -
 .claude/memory/feedback_syncro_timer_first.md | 18 --
 .../feedback_syncro_timer_response_shape.md | 52 ----
 .../feedback_syncro_warranty_product.md | 22 --
 .claude/memory/feedback_syncro_workflow.md | 69 +++++
 .../memory/gururmm-development-principles.md | 108 -------
 .../memory/project-cascades-migration-plan.md | 20 --
 .claude/memory/project_cascades.md | 55 ++++
 .../memory/project_cascades_admin_accounts.md | 16 --
 .claude/memory/project_cascades_billing.md | 14 -
 .../project_cascades_ca_phased_rollout.md | 26 --
 .claude/memory/project_cascades_history.md | 52 ++++
 .../memory/project_cascades_pilot_cleanup.md | 15 -
 .claude/memory/project_dataforth.md | 31 ++
 .claude/memory/project_dataforth_email.md | 13 -
 .claude/memory/project_dataforth_history.md | 46 +++
 .../project_dataforth_incident_2026-03-27.md | 39 ---
 .claude/memory/project_guruconnect.md | 69 +++++
 .claude/memory/project_guruconnect_deploy.md | 54 ----
 .../project_guruconnect_v2_direction.md | 32 ---
 .claude/memory/project_gururmm.md | 77 +++++
 .../project_mac_gururmm_setup_pending.md | 28 --
 .claude/memory/project_pluto_build_server.md | 18 --
 .../memory/project_rmm_webhook_docs_guard.md | 22 --
 .../memory/project_versionable_products.md | 2 +-
 .claude/memory/reference_acg_msp_stack.md | 2 +-
 .claude/memory/reference_dataforth_contact.md | 7 -
 .../memory/reference_gitea_api_credential.md | 2 +-
 .../reference_guru5070_rust_toolchain.md | 2 +-
 .claude/memory/reference_gururmm.md | 141 ++++++++++
 .claude/memory/reference_gururmm_api.md | 92 ------
 .../reference_gururmm_pipeline_vendored.md | 29 --
 .claude/memory/reference_gururmm_server.md | 14 -
 .../reference_gururmm_user_session_context.md | 19 --
 .../memory/reference_ix_access_tailscale.md | 7 -
 .claude/memory/reference_ix_server_access.md | 25 ++
 .claude/memory/reference_ix_server_ssh.md | 20 --
 .../memory/reference_pluto_build_server.md | 58 ++--
 .claude/memory/reference_resource_map.md | 264 ++++++++++++++++++
 ...rence_rmm_agent_runs_in_systemd_sandbox.md | 36 ---
 .claude/memory/user_font_preference.md | 14 +
 75 files changed, 1473 insertions(+), 1324 deletions(-)
 delete mode 100644 .claude/memory/feedback-rmm-unc-path-encoding.md
 create mode 100644 .claude/memory/feedback_cascades.md
 delete mode 100644 .claude/memory/feedback_cascades_folder_redirect.md
 delete mode 100644 .claude/memory/feedback_cascades_user_security_group.md
 create mode 100644 .claude/memory/feedback_check_patterns_before_asking.md
 create mode 100644 .claude/memory/feedback_gururmm.md
 delete mode 100644 .claude/memory/feedback_gururmm_agent_parity.md
 delete mode 100644 .claude/memory/feedback_gururmm_builds.md
 delete mode 100644 .claude/memory/feedback_howard_delegation.md
 delete mode 100644 .claude/memory/feedback_no_botalerts_internal_rmm.md
 delete mode 100644 .claude/memory/feedback_no_indented_code_blocks.md
 delete mode 100644 .claude/memory/feedback_rmm_dev_is_mike.md
 delete mode 100644 .claude/memory/feedback_rmm_identify_by_ip.md
 create mode 100644 .claude/memory/feedback_syncro_api.md
 delete mode 100644 .claude/memory/feedback_syncro_appointment_date_check.md
 delete mode 100644 .claude/memory/feedback_syncro_appointment_owner.md
 delete mode 100644 .claude/memory/feedback_syncro_blank_contact.md
 delete mode 100644 .claude/memory/feedback_syncro_cascades_contact.md
 delete mode 100644 .claude/memory/feedback_syncro_comment_dedup.md
 delete mode 100644 .claude/memory/feedback_syncro_content_type.md
 delete mode 100644 .claude/memory/feedback_syncro_corrections_preserve_tech.md
 delete mode 100644 .claude/memory/feedback_syncro_emergency_billing.md
 delete mode 100644 .claude/memory/feedback_syncro_estimate_hardware.md
 create mode 100644 .claude/memory/feedback_syncro_history.md
 delete mode 100644 .claude/memory/feedback_syncro_html.md
 delete mode 100644 .claude/memory/feedback_syncro_labor_tax.md
 delete mode 100644 .claude/memory/feedback_syncro_labor_type.md
 delete mode 100644 .claude/memory/feedback_syncro_line_items.md
 delete mode 100644 .claude/memory/feedback_syncro_live_rates.md
 delete mode 100644 .claude/memory/feedback_syncro_no_madeup_labor_items.md
 delete mode 100644 .claude/memory/feedback_syncro_timer_first.md
 delete mode 100644 .claude/memory/feedback_syncro_timer_response_shape.md
 delete mode 100644 .claude/memory/feedback_syncro_warranty_product.md
 create mode 100644 .claude/memory/feedback_syncro_workflow.md
 delete mode 100644 .claude/memory/gururmm-development-principles.md
 delete mode 100644 .claude/memory/project-cascades-migration-plan.md
 create mode 100644 .claude/memory/project_cascades.md
 delete mode 100644 .claude/memory/project_cascades_admin_accounts.md
 delete mode 100644 .claude/memory/project_cascades_billing.md
 delete mode 100644 .claude/memory/project_cascades_ca_phased_rollout.md
 create mode 100644 .claude/memory/project_cascades_history.md
 delete mode 100644 .claude/memory/project_cascades_pilot_cleanup.md
 create mode 100644 .claude/memory/project_dataforth.md
 delete mode 100644 .claude/memory/project_dataforth_email.md
 create mode 100644 .claude/memory/project_dataforth_history.md
 delete mode 100644 .claude/memory/project_dataforth_incident_2026-03-27.md
 create mode 100644 .claude/memory/project_guruconnect.md
 delete mode 100644 .claude/memory/project_guruconnect_deploy.md
 delete mode 100644 .claude/memory/project_guruconnect_v2_direction.md
 create mode 100644 .claude/memory/project_gururmm.md
 delete mode 100644 .claude/memory/project_mac_gururmm_setup_pending.md
 delete mode 100644 .claude/memory/project_pluto_build_server.md
 delete mode 100644 .claude/memory/project_rmm_webhook_docs_guard.md
 delete mode 100644 .claude/memory/reference_dataforth_contact.md
 create mode 100644 .claude/memory/reference_gururmm.md
 delete mode 100644 .claude/memory/reference_gururmm_api.md
 delete mode 100644 .claude/memory/reference_gururmm_pipeline_vendored.md
 delete mode 100644 .claude/memory/reference_gururmm_server.md
 delete mode 100644 .claude/memory/reference_gururmm_user_session_context.md
 delete mode 100644 .claude/memory/reference_ix_access_tailscale.md
 create mode 100644 .claude/memory/reference_ix_server_access.md
 delete mode 100644 .claude/memory/reference_ix_server_ssh.md
 create mode 100644 .claude/memory/reference_resource_map.md
 delete mode 100644 .claude/memory/reference_rmm_agent_runs_in_systemd_sandbox.md
 create mode 100644 .claude/memory/user_font_preference.md

diff --git a/.claude/memory/MEMORY.md b/.claude/memory/MEMORY.md
index 735f693..309c0de 100644
--- a/.claude/memory/MEMORY.md
+++ b/.claude/memory/MEMORY.md
@@ -1,102 +1,89 @@
-# Memory Index
-
-## Reference
-- [RMM agent runs in systemd sandbox](reference_rmm_agent_runs_in_systemd_sandbox.md) — Commands dispatched via the GuruRMM agent run inside its ProtectSystem=strict namespace (/ is ro there); fs/mount probes show the agent's view NOT the host. SSH or read /proc//mountinfo for host truth. (lesson 2026-06-01, GURU-KALI ghost churn)
-- [GURU-5070 Rust toolchain](reference_guru5070_rust_toolchain.md) — GURU-5070 now has cargo + MSVC + protoc; build/clippy/test guru-connect LOCALLY (set PROTOC to the winget path) instead of the build host. CI only clippy-checks the Linux server, not the Windows agent.
-- [ACG Office Network Infrastructure](infra_office_network.md) — IPs/hosts/roles for pfSense/Jupiter/VMs/Docker. Check before assuming; .21 (Uranus) is storage.
-- [Power Failure Runbook](../POWER_FAILURE_RUNBOOK.md) — Recovery order after a power event: Tailscale routes, libvirt/VMs, Seafile, NPM/DNS.
-- [Syncro API — Invoice Verification Pattern](syncro_invoice_verification_pattern.md) — /invoices?customer_id=X returns no ticket linkage; query /invoices/{number} for ticket_id. Compare by ticket ID, not number.
-- [Approval Workflow: Tools vs Projects](approval-workflow-tools-vs-projects.md) — Tools (remediation, scripts): Howard/Claude with approval. Projects (GuruRMM): Mike approval; features→roadmap, bugs→bug list.
-- [Community Forum (Flarum)](reference_community_forum.md) - Flarum forum at community.azcomputerguru.com, API access, database, posting workflow
-- [Radio Show Website](reference_radio_website.md) - Astro static site at radio.azcomputerguru.com on IX server
-- [IX Server SSH Access](reference_ix_server_ssh.md) - SSH access notes for GURU-5070 — re-verify key auth (was CachyOS)
-- [IX Access via Tailscale](reference_ix_access_tailscale.md) - IX server accessible with Tailscale on, no VPN needed
-- [Matomo Analytics](reference_matomo_analytics.md) - Self-hosted analytics at analytics.azcomputerguru.com, site IDs, tracking for all 3 sites
-- [Dataforth Contact - AJ](reference_dataforth_contact.md) - AJ at Dataforth, dataforthgit@ email forwarding to him
-- [TickTick Integration](reference_ticktick_integration.md) - OAuth API integration, MCP server, SOPS vault creds, project/task CRUD
-- [Client Docs Structure](reference_client_docs_structure.md) — clients//docs/ layout (overview, network, servers, cloud, security, rmm). Template: clients/_client_template/.
-- [MSP Audit Scripts](reference_msp_audit_scripts.md) — server_audit.ps1 / workstation_audit.ps1 at projects/msp-tools/msp-audit-scripts/.
-- [GuruRMM Server Layout](reference_gururmm_server.md) - SSH as `guru`, repo at /home/guru/gururmm, deploy to /var/www/gururmm/dashboard/
-- [GuruConnect deploy](project_guruconnect_deploy.md) — Deploys MANUALLY by building on the server itself (172.16.3.30 has rust+node, use a login shell). Gotchas: installed systemd unit has NO watchdog (don't run setup-systemd.sh), set CONNECT_TRUSTED_PROXIES, migrations auto-run on boot, NULL-tags decode bug. v2 live 2026-05-30 at connect.azcomputerguru.com.
-- [GuruRMM API — run script on agent](reference_gururmm_api.md) — POST /api/agents/:id/command (command_type=powershell); poll /api/commands/:id for output. Beats ScreenConnect copy-paste.
-- [GuruRMM user_session command context](reference_gururmm_user_session_context.md) — command API `context=user_session` runs as the logged-on user (WTS); does interactive-only cmds that fail as SYSTEM. Needs an active (admin) user.
-- [Pluto Build Server](reference_pluto_build_server.md) — Windows build VM: hostname PLUTO = Unraid VM "Claude-Builder" = 172.16.3.36 (all the same box). MSVC + WiX. No `pluto` vault entry. Drive via /rmm (agent enrolls as PLUTO) when SSH key isn't authorized.
-- [Coord /messages API shape](reference_coord_messages_api_shape.md) — GET /api/coord/messages returns {total,skip,limit,messages[]} NOT a bare array; parse .messages[], strip control chars, read flag may be null.
-- [GuruRMM pipeline vendored](reference_gururmm_pipeline_vendored.md) — RMM build scripts version-controlled at gururmm `deploy/build-pipeline/` (2026-06-01); build-shared.sh auto-syncs them to /opt/gururmm each build. Edit-in-repo + push = live, EXCEPT build-shared.sh + webhook-handler.py (manual cp).
-- [Gitea API credential](reference_gitea_api_credential.md) — Gitea API (PRs/merges) as howard uses services/gitea-howard.sops.yaml password on internal http://172.16.3.20:3000; NOT the gururmm-server SSH password.
-
-## Users
-- [Howard Enos](user_howard.md) — Mike's brother, technician, full access. Machines: ACG-TECH03L, Howard-Home (authoritative in users.json).
-
-## Feedback
-- [Scheduling = coord todo, not schedulers](feedback_scheduling_via_coord_todo.md) — Defer future work as a coord todo (POST /api/coord/todos; needs text + created_by_user + created_by_machine) for a later session to pick up. NOT /schedule remote CCR agents (no vault/creds there) or local scheduled tasks.
-- [Identify RMM agent by IP](feedback_rmm_identify_by_ip.md) — When the target machine is known by external IP, match the IP to find the agent; don't recon every candidate. (GuruRMM doesn't store agent IPs yet — todo 7459428e.)
-- [Attribution is read, never inferred](feedback_attribution_from_identity.md) — Who-did-what (user+machine) comes ONLY from identity.json + users.json + git authorship. Never infer from hostname patterns, the userEmail hint, or memory. The "5070" box is Mike's. sync.sh reconciles git config to identity.json; /save renders the User block via whoami-block.sh.
-- [GuruRMM agent parity rule](feedback_gururmm_agent_parity.md) — "Add feature X to the agent" = Windows + Linux + macOS in the same change, no exceptions. Stub + TODO if real impl not feasible.
-- [D2TESTNAS SSH Access](feedback_d2testnas_ssh.md) - Use root@192.168.0.9 with Paper123!@#, not sysadmin
-- [Bypass Permissions Setting](feedback_bypass_permissions_setting.md) - Set permissions.defaultMode to bypassPermissions in settings.json on all machines
-- [No indented code blocks](feedback_no_indented_code_blocks.md) — Never indent code inside fences; Howard copy-pastes directly and leading spaces break PowerShell
-- [365 Remediation Tool](feedback_365_remediation_tool.md) — "remediation tool" = tiered ComputerGuru app suite via /remediation-tool; NOT CIPP, NOT the deprecated fabb3421
-- [CA managed programmatically (with discipline)](feedback_ca_programmatic_management.md) — Conditional Access CAN be written via Tenant Admin app; ALWAYS report-only first + exclude break-glass + confirm before enforcing. Overrides old "CA manual" rule.
-- [Ollama Tier-0 Routing](feedback_ollama_tier0_routing.md) - Route drafts/summaries/classifications through Ollama (qwen3:14b). Mike designed ClaudeTools this way — not optional.
-- [/save writes narrative directly](feedback_save_no_ollama.md) — No Ollama for /save; write all sections inline — too slow
-- [Syncro Emergency Billing](feedback_syncro_emergency_billing.md) — Emergency = time-and-a-half (×1.5), applied once, never additive. Branch by `customer.prepay_hours`: no-prepaid → `26184` at actual hrs; prepaid → `26184` at hrs×1.5 (premium in the QUANTITY). One line. Always set `price_retail`. (Updated 2026-05-27: prepaid now uses 26184, not 26118.)
-- [Identity precedence](feedback_identity_precedence.md) — Trust `.claude/identity.json` over the system-reminder `userEmail` hint when they disagree (shared-login machines).
-- [1Password — always use service token](feedback_1password_service_token.md) — Source OP_SERVICE_ACCOUNT_TOKEN from SOPS for every `op` call. Desktop-app integration prompts are unacceptable in agent flows.
-- [Point vault-access teammates at SOPS path](feedback_vault_pointer_for_teammates.md) — When relaying infra/credential info to Howard or other vault-access teammates, hand over the SOPS path + key anchors; don't transcribe the entry's fields into the message.
-- [/tmp path mismatch on Windows](feedback_tmp_path_windows.md) — Write tool and Git Bash resolve `/tmp` to DIFFERENT real dirs. Use heredoc or workspace path for JSON payloads handed to curl. Caused wrong-comment incident on Syncro #32225.
-- [Syncro — leave contact blank by default](feedback_syncro_blank_contact.md) — Default to blank contact ("Not Assigned") on tickets and billing for ALL customers. Blank lets Syncro use company-level email defaults; setting a contact may route to a secondary email and bypass distribution. Generalizes the prior Cascades-only rule per Winter 2026-05-04.
-- [Syncro — Cascades contact incident (Meredith Kuhn)](feedback_syncro_cascades_contact.md) — Meredith Kuhn is the recurring wrong Syncro default at Cascades. Incident context only; global rule is in feedback_syncro_blank_contact.md.
-- [Syncro — use a billable labor type, never "Prepaid project labor"](feedback_syncro_labor_type.md) — Billable line items must use in-shop / onsite / remote / web labor. "Prepaid project labor" is exempt and won't decrement prepay blocks. Default is Remote labor for typical support tickets. Winter caught this 2026-05-04.
-- [Syncro — bill with add_line_item, not timers](feedback_syncro_timer_first.md) — Bill tickets with `POST /tickets/{id}/add_line_item` directly; the timer workflow (`timer_entry → charge_timer_entry`) is NOT used. Set product_id, quantity (decimal hours), price_retail, name, description, taxable:false. Supersedes the old "timers required" rule (Mike confirmed 2026-05-21).
-- [Syncro — timer_entry response is FLAT (HISTORICAL)](feedback_syncro_timer_response_shape.md) — Reference only: timers are NO LONGER part of the billing workflow (superseded by add_line_item — see feedback_syncro_timer_first.md). Retained for the rare manual-timer case: response is flat (`{"id": N, ...}`), parse `.id` not `.timer.id`. Originally hit on #32253 2026-05-05.
-- [Syncro — warranty has its own product, never patch dollar amounts](feedback_syncro_warranty_product.md) — Warranty/no-charge work uses product `1049360` (Labor- Warranty work, $0). Don't fake a free line by patching `price_retail` or neutralizing a regular product — pick the correct product and re-run. Hit on #32225 2026-05-06.
-- [Syncro — never make up labor items](feedback_syncro_no_madeup_labor_items.md) — Labor lines MUST be an existing Syncro product used with its REAL name; never invent/rename a line. Description field is free text. Made-up items break the QuickBooks sync. Incident #32332.
-- [Syncro — preserve attribution (labor + ticket owner)](feedback_syncro_corrections_preserve_tech.md) — Corrections keep the original tech's labor user_id (commission); update_line_item preserves it, remove+add defaults to the API-key owner. Adding notes/labor never changes the ticket owner. Only reassign labor or ticket ownership when explicitly asked. (#32332)
-- [SQL instance role — verify by connections, not name](feedback_sql_instance_role_by_connection.md) — Standard installed under default `SQLEXPRESS` instance name is real. Prove role with `sys.dm_exec_sessions` + `Get-NetTCPConnection -OwningProcess` before recommending stop/uninstall. IMC1 2026-05-05/06 near-miss.
-- [Syncro — confirm appointment owner explicitly](feedback_syncro_appointment_owner.md) — When creating tickets with appointments, always ask "who is the appointment owner?" in the preview. Don't auto-default to ticket's assigned tech. Don't add additional attendees without explicit confirmation. Howard caught on Kittle ticket #32263 2026-05-08.
-- [Syncro — verify appointment date day-of-week](feedback_syncro_appointment_date_check.md) — Always compute and display the day name (e.g. "Saturday 2026-05-23") in the ticket preview — never just the numeric date. Verify with `py -c "import datetime; ..."` before posting. Wrong-day incident on #32312 2026-05-21 (Sunday booked instead of Saturday). Reported by Winter.
-- [Syncro estimate hardware product](feedback_syncro_estimate_hardware.md) — All hardware on estimates uses product_id 32252 ("Hardware", $0 base); set name/price_retail per item. Never look up individual hardware product IDs.
-- [Clear-RecycleBin fails silently as SYSTEM](feedback_clear_recyclebin_system_context.md) — RMM-dispatched cleanup scripts cannot use `Clear-RecycleBin -Force`; the cmdlet uses Shell COM and silently no-ops without an interactive desktop. Enumerate `C:\$Recycle.Bin\\*` directly. Hit on ASSISTMAN-PC 2026-05-08.
-- [Cascades — ask security group on user creation](feedback_cascades_user_security_group.md) — When creating any Cascades user, always ask which security group(s) they go in. Deliberate per-user decision; an OU→group auto-mirror was explicitly declined 2026-05-14. OU = sync scope; group = access/CA decision.
-- [Cascades folder redirect — fdeploy failure/recovery](feedback_cascades_folder_redirect.md) — Must pre-create subfolders before first logon. fdeploy caches failures silently. Recovery: fix-shell-redirect.ps1. Both GUID and legacy name keys required.
-- [Graph CA policy reads are eventually consistent](feedback_graph_ca_policy_eventual_consistency.md) — After PATCHing a CA policy (204), wait ~5s before GET-verifying; immediate reads can be stale.
-- [Graph password reset needs a privileged role](feedback_graph_password_reset_requires_role.md) — PATCH passwordProfile on an existing user 403s without a directory role; User.ReadWrite.All alone only sets a password at CREATE.
-- [Vault writes — do the full sequence yourself](feedback_complete_vault_operations_end_to_end.md) — A vault entry = write plaintext → sops -e -i → git add/commit/push, all of it; don't stop at "encrypted on disk."
-
-- [GuruRMM dev is Mike's, not Howard's](feedback_rmm_dev_is_mike.md) — Never route RMM dev/bug coord notes to Howard (0 RMM commits by him). Howard only submits RMM feature requests; GuruScan is his project, RMM is not.
-- [RMM user_session UNC path encoding](feedback-rmm-unc-path-encoding.md) — Never use `"\\server\..."` literals in user_session scripts; use `[char]92` to build UNC paths explicitly.
-- [Howard: defer backend/server follow-up to Mike](feedback_howard_delegation.md) — Howard doesn't want to touch server/agent code unless Mike asks
-- [Syncro is the default PSA; Autotask is opt-in](feedback_psa_default_syncro.md) — Ticketing/billing/customers default to Syncro (/syncro). Only use /autotask on an explicit "in Autotask" request. /autotask kept local/undistributed.
-- [Command Formatting](feedback_command_formatting.md) — Always multi-line scripts, never one-liners; one-liners wrap in chat and break on copy-paste
-- [No bot-alerts for internal RMM dev/infra](feedback_no_botalerts_internal_rmm.md) — Post #bot-alerts ONLY when an RMM command directly affects a client endpoint or ticket; skip for internal build/CI/dev/recon.
-- [Autonomous infra/build setup](feedback_autonomous_infra_setup.md) — During infra/build/CI/dev setup, just install prerequisites and push through routine steps; reserve check-ins for genuine decisions (forks, destructive/outward, client/prod).
-
-## Machine
-- [GURU-5070 Workstation Setup](reference_workstation_setup.md) - Mike's primary (owner confirmed 2026-05-26). Windows 11 Pro. Renamed from OC-5070 → ACG-5070/acg-guru-5070 → GURU-5070; all the same box, all Mike's.
-- [GURU-BEAST-ROG Setup Status](machine_windows_guru_setup_status.md) — Windows workstation fully configured except SSH key deployment to servers.
-
-## Pending Setup
-- [Mac gururmm setup pending](project_mac_gururmm_setup_pending.md) — ACTION REQUIRED: run `bash scripts/install-hooks.sh` in gururmm repo on Mikes-MacBook-Air before any RMM work
-
-## Project
-- [Automate memory consolidation/lint (phased)](project_memory_consolidation_automation.md) — Eventually auto-run /memory-dream; lint+additive fixes can automate early, merges/deletes stay human-approved. Engine: .claude/skills/memory-dream/ + .claude/scripts/sync-memory.sh.
-- [RMM webhook docs-only build guard](project_rmm_webhook_docs_guard.md) — RMM build webhook skips docs-only pushes (host guard in /opt/gururmm/webhook-handler.py, SPEC-020 Phase 0); repo copy is stale, don't redeploy it
-- [GuruConnect v2 direction](project_guruconnect_v2_direction.md) — v2 re-architecture (SPEC-002, 2026-05-29): greenfield-salvage-cores, NATIVE-first (full key fidelity Win+R/Ctrl+Alt+Del + bidirectional file cut/paste/drag are Mike's headline must-haves; WebRTC fallback only), standalone-first + RMM contract, hardened single-tenant but tenancy-ready schema. Willing to scrap v1 entirely.
-- [Apple MDM + Developer certs (GuruRMM mobile)](project_apple_mdm_certs.md) — ACG holds both Apple Developer+signing and Apple MDM Push certs (acquired 2026-05-29) for SPEC-017 mobile support. MDM push cert RENEWS ANNUALLY on the same Apple ID or all enrolled iOS devices break. Capture Apple ID + expiry.
-- [Only RMM & GC are versionable products](project_versionable_products.md) — GuruRMM + GuruConnect are the only products with own repos/submodules; everything else stays in the claudetools monorepo. Split only for independent pipeline OR versioned external consumer.
-- [Quantum GoDaddy M365 tenant](project_quantum_godaddy_m365_tenant.md) — quantumwms.com parked in a GoDaddy-provisioned M365 tenant (id ddf3d2c9-b76c-40d9-a216-9f11a1a26f97, netorg18235235.onmicrosoft.com); blocks Pax8 migration until GoDaddy removed. Managed = no DNS takeover; need GoDaddy/GA access.
-- [Cascades Migration Plan](project-cascades-migration-plan.md) — Active multi-day migration. Plan file: `C:\Users\Howard\.claude\plans\wise-discovering-panda.md`. Syncro ticket: #110680053. Resume: "resume the Cascades migration plan".
-- [GuruRMM Development Principles](gururmm-development-principles.md) - MANDATORY: every feature needs full stack (backend, API, UI, docs, scalability). Product must work without AI agents (AI features are enhancements). Documented in guru-rmm/docs/DESIGN.md.
-- [Sync script bug — untracked files (RESOLVED)](project_sync_script_bug.md) — FIXED 2026-05-21: sync.sh now uses `git status --porcelain` for change detection (repo + vault), so untracked-only changes are caught. Added .gitignore for the datto BSOD dumps so the fix doesn't sweep 54MB of binaries.
-- [MasterBooter Side Project](project_masterbooter.md) — Howard's Rust+Slint Windows deployment toolkit at C:\MasterBooter, separate from client work. Do not log to clients/.
-- [Audio Processor Architecture](project_audio_processor_architecture.md) - Segment-first pipeline: detect breaks before transcription for complete content capture
-- [Neptune SBR Email Routing Setup](project_neptune_sbr_email_routing.md) - Full SBR routing chain, config file locations, MailProtector integration, access methods. Treat routing breakage as systemic (devcon, Sorensen/rieussetcorp), not per-client.
-- [Dataforth Test Datasheet Pipeline](project_datasheet_pipeline.md) - Full pipeline rebuilt 2026-03-27. Server-side generation replaces DFWDS/Uploader. Website upload still broken.
-- [Dataforth Security Incident](project_dataforth_incident_2026-03-27.md) - DF-JOEL2 compromised, MFA deployed, IC3 filed. CA policies enforce April 4.
-- [Radio show co-host — Tara, not Tom](radio_show_no_cohost_named_tom.md) — Co-host in 2014-s6e19 and 2016-s8e43 is Tara. "Tom" was hallucinated; rename complete. Multiple co-hosts have rotated through the show.
-- [Cascades admin accounts](project_cascades_admin_accounts.md) — Howard uses sysadmin@cascadestucson.com, Mike uses admin@cascadestucson.com; daily admin, NOT break-glass.
-- [Cascades CA phased rollout](project_cascades_ca_phased_rollout.md) — Caregiver CA policies scoped to SG-Caregivers-Pilot, expand by dept; PATCH excludeGroups, never delete the all-users-MFA policy.
-- [Cascades caregiver pilot cleanup](project_cascades_pilot_cleanup.md) — Remove pilot accounts (pilot.test@, howard.enos@) at the end of the caregiver bypass pilot.
-- [Proposal: centralize config in identity.json](proposal_identity_centralization.md) — Rationale for the identity.json machine-config centralization (claudetools_root, ollama/python); now implemented.
-- [ACG MSP tool stack](reference_acg_msp_stack.md) — ScreenConnect/CW Control, Splashtop, Syncro, Datto RMM, Datto EDR/AV, GuruRMM are ACG's OWN tools; do not flag as foreign/threat on managed machines (Defender-off is expected when Datto AV is active).
+# Memory Index
+
+## Reference
+- [ACG resource map](reference_resource_map.md) — **READ THIS FIRST** when a task references a server/service/tenant/API. What we have access to, how to connect from this machine, per-machine exceptions, gotchas. Points at the detail files below.
+- [GURU-5070 Rust toolchain](reference_guru5070_rust_toolchain.md) — GURU-5070 now has cargo + MSVC + protoc; build/clippy/test guru-connect LOCALLY (set PROTOC to the winget path) instead of the build host. CI only clippy-checks the Linux server, not the Windows agent.
+- [ACG Office Network Infrastructure](infra_office_network.md) — IPs/hosts/roles for pfSense/Jupiter/VMs/Docker. Check before assuming; .21 (Uranus) is storage.
+- [Power Failure Runbook](../POWER_FAILURE_RUNBOOK.md) — Recovery order after a power event: Tailscale routes, libvirt/VMs, Seafile, NPM/DNS.
+- [Syncro API — Invoice Verification Pattern](syncro_invoice_verification_pattern.md) — /invoices?customer_id=X returns no ticket linkage; query /invoices/{number} for ticket_id. Compare by ticket ID, not number.
+- [Approval Workflow: Tools vs Projects](approval-workflow-tools-vs-projects.md) — Tools (remediation, scripts): Howard/Claude with approval. Projects (GuruRMM): Mike approval; features→roadmap, bugs→bug list.
+- [Community Forum (Flarum)](reference_community_forum.md) — Flarum forum at community.azcomputerguru.com, API access, database, posting workflow.
+- [Radio Show Website](reference_radio_website.md) — Astro static site at radio.azcomputerguru.com on IX server.
+- [IX Server Access](reference_ix_server_access.md) — `ix.azcomputerguru.com` / 172.16.3.10. Reachable when Tailscale is on (no VPN). SSH currently uses sshpass with root password; key auth from GURU-5070 not configured yet (was CachyOS, now Win11 — verify).
+- [Matomo Analytics](reference_matomo_analytics.md) — Self-hosted analytics at analytics.azcomputerguru.com, site IDs, tracking for all 3 sites.
+- [TickTick Integration](reference_ticktick_integration.md) — OAuth API integration, MCP server, SOPS vault creds, project/task CRUD.
+- [Client Docs Structure](reference_client_docs_structure.md) — clients//docs/ layout (overview, network, servers, cloud, security, rmm). Template: clients/_client_template/.
+- [MSP Audit Scripts](reference_msp_audit_scripts.md) — server_audit.ps1 / workstation_audit.ps1 at projects/msp-tools/msp-audit-scripts/.
+- [Pluto Build Server](reference_pluto_build_server.md) — Windows build VM: hostname PLUTO = Unraid VM "Claude-Builder" = 172.16.3.36 (all the same box). MSVC + WiX + Azure Trusted Signing. Drive via /rmm (agent enrolls as PLUTO) when SSH key isn't authorized.
+- [Coord /messages API shape](reference_coord_messages_api_shape.md) — GET /api/coord/messages returns {total,skip,limit,messages[]} NOT a bare array; parse .messages[], strip control chars, read flag may be null. +- [Gitea API credential](reference_gitea_api_credential.md) — Gitea API (PRs/merges) as howard uses services/gitea-howard.sops.yaml password on internal http://172.16.3.20:3000; NOT the gururmm-server SSH password. +- [Gitea Internal API Access](reference_gitea_internal.md) — git.azcomputerguru.com is NOT behind Cloudflare — it's the office Cox IP NAT'd to NPM (openresty) on Jupiter. Prefer internal 172.16.3.20:3000 for reliability (bypasses NPM SSL-renewal reload blips). +- [GuruRMM technical reference](reference_gururmm.md) — Server (172.16.3.30) layout + API + `context=user_session` (WTS impersonation) + build-pipeline vendoring at `deploy/build-pipeline/` (auto-syncs to /opt/gururmm) + Linux agent systemd sandbox trap (ProtectSystem=strict makes fs/mount observations sandbox-local). + +## Users +- [Howard Enos](user_howard.md) — Mike's brother, technician, full access. Machines: ACG-TECH03L, Howard-Home (authoritative in users.json). +- [Mike — font preference](user_font_preference.md) — Mike prefers Lucida Console for monospace UI. + +## Feedback +- [Scheduling = coord todo, not schedulers](feedback_scheduling_via_coord_todo.md) — Defer future work as a coord todo (POST /api/coord/todos; needs text + created_by_user + created_by_machine) for a later session to pick up. NOT /schedule remote CCR agents (no vault/creds there) or local scheduled tasks. +- [Attribution is read, never inferred](feedback_attribution_from_identity.md) — Who-did-what (user+machine) comes ONLY from identity.json + users.json + git authorship. Never infer from hostname patterns, the userEmail hint, or memory. The "5070" box is Mike's. sync.sh reconciles git config to identity.json; /save renders the User block via whoami-block.sh. 
+- [D2TESTNAS SSH Access](feedback_d2testnas_ssh.md) — Use root@192.168.0.9 with Paper123!@#, not sysadmin. +- [Bypass Permissions Setting](feedback_bypass_permissions_setting.md) — Set permissions.defaultMode to bypassPermissions in settings.json on all machines. +- [365 Remediation Tool](feedback_365_remediation_tool.md) — "remediation tool" = tiered ComputerGuru app suite via /remediation-tool; NOT CIPP, NOT the deprecated fabb3421. +- [CA managed programmatically (with discipline)](feedback_ca_programmatic_management.md) — Conditional Access CAN be written via Tenant Admin app; ALWAYS report-only first + exclude break-glass + confirm before enforcing. Overrides old "CA manual" rule. +- [Ollama Tier-0 Routing](feedback_ollama_tier0_routing.md) — Route drafts/summaries/classifications through Ollama (qwen3:14b). Mike designed ClaudeTools this way — not optional. +- [/save writes narrative directly](feedback_save_no_ollama.md) — No Ollama for /save; write all sections inline — too slow. +- [Identity precedence](feedback_identity_precedence.md) — Trust `.claude/identity.json` over the system-reminder `userEmail` hint when they disagree (shared-login machines). +- [1Password — always use service token](feedback_1password_service_token.md) — Source OP_SERVICE_ACCOUNT_TOKEN from SOPS for every `op` call. Desktop-app integration prompts are unacceptable in agent flows. +- [Point vault-access teammates at SOPS path](feedback_vault_pointer_for_teammates.md) — When relaying infra/credential info to Howard or other vault-access teammates, hand over the SOPS path + key anchors; don't transcribe the entry's fields into the message. +- [/tmp path mismatch on Windows](feedback_tmp_path_windows.md) — Write tool and Git Bash resolve `/tmp` to DIFFERENT real dirs. Use heredoc or workspace path for JSON payloads handed to curl. 
+- [SQL instance role — verify by connections, not name](feedback_sql_instance_role_by_connection.md) — Standard installed under default `SQLEXPRESS` instance name is real. Prove role with `sys.dm_exec_sessions` + `Get-NetTCPConnection -OwningProcess` before recommending stop/uninstall. +- [Clear-RecycleBin fails silently as SYSTEM](feedback_clear_recyclebin_system_context.md) — RMM-dispatched cleanup scripts cannot use `Clear-RecycleBin -Force`; the cmdlet uses Shell COM and silently no-ops without an interactive desktop. Enumerate `C:\$Recycle.Bin\\*` directly. +- [Graph CA policy reads are eventually consistent](feedback_graph_ca_policy_eventual_consistency.md) — After PATCHing a CA policy (204), wait ~5s before GET-verifying; immediate reads can be stale. +- [Graph password reset needs a privileged role](feedback_graph_password_reset_requires_role.md) — PATCH passwordProfile on an existing user 403s without a directory role; User.ReadWrite.All alone only sets a password at CREATE. +- [Vault writes — do the full sequence yourself](feedback_complete_vault_operations_end_to_end.md) — A vault entry = write plaintext → sops -e -i → git add/commit/push, all of it; don't stop at "encrypted on disk." +- [Syncro is the default PSA; Autotask is opt-in](feedback_psa_default_syncro.md) — Ticketing/billing/customers default to Syncro (/syncro). Only use /autotask on an explicit "in Autotask" request. /autotask kept local/undistributed. +- [Paste-safe command formatting (Howard)](feedback_command_formatting.md) — Two clauses, one root cause: (a) multi-line scripts not semicolon one-liners (wrap breaks paste), (b) all code at column 0 inside fences (indentation breaks PowerShell paste). +- [Autonomous infra/build setup](feedback_autonomous_infra_setup.md) — During infra/build/CI/dev setup, just install prerequisites and push through routine steps; reserve check-ins for genuine decisions (forks, destructive/outward, client/prod). 
+- [Check patterns before asking](feedback_check_patterns_before_asking.md) — Before asking how to do something repeat-style (sync, save, sweep, billing), study existing artifacts and workflow docs first; reach for similar past artifacts as the template. +- [Client communication tone](feedback_client_tone.md) — How to write client-facing Syncro comments — expert partner, not intake questionnaire. +- [Add Mike as owner on all Entra apps](feedback_entra_app_owner.md) — Apps created via management SP have no user owner — must add Mike manually or publisher verification fails. +- [No TOML/config file approach for endpoints](feedback_no_toml_config_endpoints.md) — User explicitly prohibits TOML or config-file-based endpoint configuration — this will never be approved. +- [Python on Windows — use py launcher](feedback_python_windows.md) — Windows Store python/python3 aliases disabled; always use py or jq on DESKTOP-0O8A1RL. + +### Syncro +- [Syncro API plumbing](feedback_syncro_api.md) — Content-Type required on all POST/PUT; NO idempotency anywhere — always GET before retrying; response wrappers (`.ticket.id`, `.comment.id`); add_line_item shape (internal ID, flat response, required fields); HTML uses `<br>` not `<ul>/<li>`; timer_entry response is FLAT but SUPERSEDED (use add_line_item). +- [Syncro billing rules](feedback_syncro_billing.md) — Bill with `add_line_item` directly (not timers); fetch rates LIVE; never invent labor names (real product names only); match labor type to delivery channel (never "Prepaid project labor"); labor `taxable:false` (AZ); warranty `1049360` (never patch price); emergency `26184` ×1.5 once, branch by `prepay_hours`; corrections preserve original tech's user_id; estimate hardware `32252`. +- [Syncro workflow rules](feedback_syncro_workflow.md) — ALWAYS preview comments before posting (no exceptions); verify appointment day-of-week ("Saturday 2026-05-23") before creating; ASK who the appointment owner is; leave `contact_id` BLANK by default for ALL customers (ignore Syncro's contact-picker auto-default). +- [Syncro lessons / incident archive](feedback_syncro_history.md) — Detail behind the three rule files: tickets (#32332, #32312, #32225, #32253, #32203, #32185, #32142, #32304, #32333), verbatim Mike/Howard/Winter quotes, dates, tech user_id table (Mike 1735 / Howard 1750 / Winter 1737 / Rob 1760), labor product table, and superseded-rule history. + +### GuruRMM +- [GuruRMM operational rules](feedback_gururmm.md) — Six rules: (1) RMM dev = Mike, never Howard (368/0 commits); GuruScan is Howard's. (2) Agent parity Win+Linux+macOS in same change. (3) Builds via Gitea webhook pipeline only, never SSH. (4) #bot-alerts only for client/ticket impact, skip internal infra/dev. (5) Identify agents by IP, not by reconning candidates. (6) UNC paths in user_session need [char]92 — literals get halved. + +### Cascades +- [Cascades operational rules](feedback_cascades.md) — Two active rules: (1) folder redirection (fdeploy) needs subfolders PRE-CREATED before first logon or it caches a failure forever; recovery via fix-shell-redirect.ps1. (2) ALWAYS ask which security group(s) a new user goes into — never auto-derive from OU.
+ +## Machine +- [GURU-5070 Workstation Setup](reference_workstation_setup.md) — Mike's primary (owner confirmed 2026-05-26). Windows 11 Pro. Renamed from OC-5070 → ACG-5070/acg-guru-5070 → GURU-5070; all the same box, all Mike's. +- [GURU-BEAST-ROG Setup Status](machine_windows_guru_setup_status.md) — Windows workstation fully configured except SSH key deployment to servers. + +## Project +- [Automate memory consolidation/lint (phased)](project_memory_consolidation_automation.md) — Eventually auto-run /memory-dream; lint+additive fixes can automate early, merges/deletes stay human-approved. Engine: .claude/skills/memory-dream/ + .claude/scripts/sync-memory.sh. +- [GuruRMM project state](project_gururmm.md) — Dev principles (every feature full-stack: backend+API+UI+docs+scalability; product works without AI; FEATURE_ROADMAP update is part of definition-of-done; mirrors guru-rmm/docs/DESIGN.md). Webhook docs-only build guard (SPEC-020 Phase 0; webhook-handler.py repo copy is STALE — don't redeploy). Mac install-hooks.sh setup STILL PENDING on Mikes-MacBook-Air. +- [GuruConnect](project_guruconnect.md) — v2 direction (native-first full key fidelity Win+R/Ctrl+Alt+Del + bidirectional file cut/paste/drag; WebRTC fallback only; standalone-first + RMM contract; tenancy-ready schema; Mike willing to scrap v1). Manual deploy procedure to 172.16.3.30 (build-on-server in login shell; sqlx runtime queries; NPM `CONNECT_TRUSTED_PROXIES=172.16.3.20` gotcha). v2 live since 2026-05-30. +- [Apple MDM + Developer certs (GuruRMM mobile)](project_apple_mdm_certs.md) — ACG holds Apple Developer+signing and Apple MDM Push certs (acquired 2026-05-29) for SPEC-017. MDM push cert RENEWS ANNUALLY on the same Apple ID or all enrolled iOS devices break. +- [Only RMM & GC are versionable products](project_versionable_products.md) — GuruRMM + GuruConnect are the only products with own repos/submodules; everything else stays in the claudetools monorepo. 
Split only for independent pipeline OR versioned external consumer. +- [Quantum GoDaddy M365 tenant](project_quantum_godaddy_m365_tenant.md) — quantumwms.com parked in a GoDaddy-provisioned M365 tenant (id ddf3d2c9-b76c-40d9-a216-9f11a1a26f97, netorg18235235.onmicrosoft.com); blocks Pax8 migration until GoDaddy removed. +- [Cascades](project_cascades.md) — Active state: Syncro ticket #110680053 + plan file (machine-specific path on Howard's box), admin accounts (sysadmin@=Howard, admin@=Mike — daily-driver, NOT break-glass), Phase-B caregiver CA pilot (SG-Caregivers-Pilot, group-scoped never tenant-wide), prepaid block ~37.5h (rate TBD), pilot cleanup checklist. +- [Cascades history](project_cascades_history.md) — fdeploy 502/ACL root cause (Flags=1211→187 fix), 2026-04-29 CA-rescoping decision (Howard pulled the brakes on tenant-wide), 2026-05-14 per-user-security-group decision rationale. +- [Sync script bug — untracked files (RESOLVED)](project_sync_script_bug.md) — FIXED 2026-05-21: sync.sh now uses `git status --porcelain` for change detection (repo + vault). +- [MasterBooter Side Project](project_masterbooter.md) — Howard's Rust+Slint Windows deployment toolkit at C:\MasterBooter, separate from client work. Do not log to clients/. +- [Audio Processor Architecture](project_audio_processor_architecture.md) — Segment-first pipeline: detect breaks before transcription for complete content capture. +- [Neptune SBR Email Routing Setup](project_neptune_sbr_email_routing.md) — Full SBR routing chain, config file locations, MailProtector integration, access methods. Treat routing breakage as systemic (devcon, Sorensen/rieussetcorp), not per-client. +- [Dataforth Test Datasheet Pipeline](project_datasheet_pipeline.md) — Full pipeline rebuilt 2026-03-27. Server-side generation replaces DFWDS/Uploader. Website upload still broken. 
+- [Dataforth](project_dataforth.md) — M365 email (Graph API; tenant in vault at clients/dataforth/m365.sops.yaml); neptune.acghosting.com is ACG's, NOT Dataforth's. MFA enforced 2026-04-04 (3 CA policies). AJ needs dataforthgit@ forwarding. +- [Dataforth history (2026-03-27 incident)](project_dataforth_history.md) — DF-JOEL2 compromise via ScreenConnect social-engineering, attacker C2 IPs + IC3 case + remediation log + MFA rollout origin story + Joel Lohr retirement. RESOLVED 2026-04-04. +- [Radio show co-host — Tara, not Tom](radio_show_no_cohost_named_tom.md) — Co-host in 2014-s6e19 and 2016-s8e43 is Tara. "Tom" was hallucinated; rename complete. +- [Proposal: centralize config in identity.json](proposal_identity_centralization.md) — Rationale for the identity.json machine-config centralization (claudetools_root, ollama/python); now implemented. +- [ACG MSP tool stack](reference_acg_msp_stack.md) — ScreenConnect/CW Control, Splashtop, Syncro, Datto RMM, Datto EDR/AV, GuruRMM are ACG's OWN tools; do not flag as foreign/threat on managed machines (Defender-off is expected when Datto AV is active). +- [ACG Website Hosting](project_azcomputerguru_hosting.md) — azcomputerguru.com is hosted on IX Web Hosting via cPanel. diff --git a/.claude/memory/feedback-rmm-unc-path-encoding.md b/.claude/memory/feedback-rmm-unc-path-encoding.md deleted file mode 100644 index 98bebf2..0000000 --- a/.claude/memory/feedback-rmm-unc-path-encoding.md +++ /dev/null @@ -1,19 +0,0 @@ ---- -name: feedback-rmm-unc-path-encoding -description: RMM PowerShell UNC paths via user_session context lose one backslash when using string literals — must build with [char]92 -metadata: - type: feedback ---- - -Never use `"\\CS-SERVER\..."` string literals in PowerShell scripts dispatched via GuruRMM `user_session` context. The backslash gets halved somewhere in the encoding pipeline, producing `\CS-SERVER\...` (a local path) instead of the UNC `\\CS-SERVER\...`. 
- -**Why:** The `user_session` execution wrapper appears to process escape sequences in the script text differently than `system` context, stripping one backslash from `\\`. - -**How to apply:** Always build UNC paths explicitly when using user_session: -```powershell -$bs = [char]92 -$base = "${bs}${bs}CS-SERVER${bs}homes${bs}Username" -``` -This constructs `\\CS-SERVER\homes\Username` correctly regardless of context. - -The `system` context (offline hive reg query) showed correct `\\CS-SERVER` output, so the issue is specific to `user_session`. diff --git a/.claude/memory/feedback_autonomous_infra_setup.md b/.claude/memory/feedback_autonomous_infra_setup.md index 14a05ce..5765504 100644 --- a/.claude/memory/feedback_autonomous_infra_setup.md +++ b/.claude/memory/feedback_autonomous_infra_setup.md @@ -14,4 +14,4 @@ outward-facing/irreversible actions, ambiguous requirements, or anything touchin state. Routine "X needs to be installed" is not a decision — install it. Stated 2026-05-29 during the PLUTO Gitea-Actions-runner setup (re: installing Node for host-mode JS -actions). Related: [[reference_pluto_build_server]], [[feedback_no_botalerts_internal_rmm]]. +actions). Related: [[reference_pluto_build_server]], [[feedback_gururmm]] (the #bot-alerts internal-vs-client rule lives there). diff --git a/.claude/memory/feedback_cascades.md b/.claude/memory/feedback_cascades.md new file mode 100644 index 0000000..0fa1e36 --- /dev/null +++ b/.claude/memory/feedback_cascades.md @@ -0,0 +1,39 @@ +--- +name: Cascades-specific operational rules (folder redirect, security groups) +description: Two active rules for Cascades work — (1) folder redirection (fdeploy) needs subfolders pre-created before first logon or it caches a failure forever; recovery via fix-shell-redirect.ps1; (2) always ASK which security group(s) a new user goes into — never auto-derive from OU. Root-cause / incident detail in project_cascades_history.md. 
+type: feedback +--- + +Current-state context: [[project_cascades]]. Root cause / incident detail: [[project_cascades_history]]. + +--- + +## 1. Folder redirection — pre-create subfolders BEFORE first logon + +fdeploy caches failures and never retries if subfolders don't exist at first logon. "No changes detected" = stuck forever without manual intervention. + +**Mandatory order for every new user:** +1. Create AD user. +2. Run `New-HomeFolder -Username ""` on **CS-SERVER** — creates root + Desktop / Documents / Downloads / Music / Pictures subfolders with correct ACL. +3. Add user to `SG-FolderRedirect`. +4. THEN first domain logon. + +**Recovery (fdeploy already cached a failure):** +- Run `clients/cascades-tucson/scripts/fix-shell-redirect.ps1` via GuruRMM on the client **while the user is logged in**. +- Script sets both GUID-based and legacy-name registry keys (`Personal`, `My Music`, `My Pictures`) in `HKU\`. +- Folders must already exist on server — script doesn't create them. +- User logs off and on to pick up changes. + +Why both GUID and legacy keys: Downloads has no legacy-name key (GUID alone suffices); Documents / Music / Pictures have both, and Windows reads the legacy key for the actual shell folder — GUID alone is insufficient. + +--- + +## 2. ASK which security group(s) a new user goes into — never auto-derive + +When creating or being asked to create any Cascades user account (AD or M365), always ask the user **which security group(s)** the new account should be a member of. Include it explicitly in the creation preview/confirmation alongside name, UPN, and OU — do not assume from OU, department, or job title. + +**Why:** Howard explicitly declined an `OU=Caregivers` → `SG-Caregivers` auto-mirror script (2026-05-14). Security-group membership controls access and CA-policy coverage; he wants that to stay a deliberate, reviewed decision per user, never automated. 
+ +OU placement is mechanical (controls Entra Connect sync scope); group membership is an access-control decision and must be made consciously. + +**Caregivers example:** account goes in `OU=Caregivers` (sync scope) AND must be deliberately added to `SG-Caregivers` (CA policy coverage) — two separate, intentional steps; neither auto-derived from the other. diff --git a/.claude/memory/feedback_cascades_folder_redirect.md b/.claude/memory/feedback_cascades_folder_redirect.md deleted file mode 100644 index 9b180d3..0000000 --- a/.claude/memory/feedback_cascades_folder_redirect.md +++ /dev/null @@ -1,26 +0,0 @@ ---- -name: feedback_cascades_folder_redirect -description: Cascades folder redirection — fdeploy failure/retry behavior, correct new-user procedure, recovery script location -metadata: - type: feedback ---- - -Folder redirection (fdeploy) caches failures and never retries if subfolders don't exist at first logon. "No changes detected" = stuck forever without manual intervention. - -**Root cause:** fdeploy1.ini had Flags=1211 which includes Grant Exclusive Rights (bit 0x400). The Homes share grants Domain Users=Change which excludes WRITE_DAC. fdeploy fails to set NTFS on new subfolders → logs 502 → caches the failure. Changed to Flags=187 in `{512B43A4-F049-4CE5-BFAC-860AD13E92BE}\User\Documents & Settings\fdeploy1.ini` on CS-SERVER. - -**Prevention — mandatory order for every new user:** -1. Create AD user -2. Run `New-HomeFolder -Username ""` on CS-SERVER — now creates root + Desktop/Documents/Downloads/Music/Pictures subfolders with correct ACL -3. Add user to SG-FolderRedirect -4. 
THEN first domain logon - -**Recovery (fdeploy already cached a failure):** -- Run `clients/cascades-tucson/scripts/fix-shell-redirect.ps1` via GuruRMM on the client while user is logged in -- Sets both GUID-based and legacy-name registry keys (Personal, My Music, My Pictures) in HKU\ -- Folders must already exist on server — script doesn't create them -- User logs off and on to pick up changes - -**Why both GUID and legacy keys matter:** Downloads has no legacy name key → only GUID needed. Documents/Music/Pictures have both `{GUID}` AND `Personal`/`My Music`/`My Pictures`. Windows reads the legacy key for the actual shell folder — GUID alone is insufficient. - -**How to apply:** Any time a new Cascades user gets folder redirection set up. diff --git a/.claude/memory/feedback_cascades_user_security_group.md b/.claude/memory/feedback_cascades_user_security_group.md deleted file mode 100644 index 510d69d..0000000 --- a/.claude/memory/feedback_cascades_user_security_group.md +++ /dev/null @@ -1,12 +0,0 @@ ---- -name: cascades-user-security-group -description: When creating or adding any Cascades user, always ask which security group(s) the account goes into — deliberate decision, never auto-derived from OU -metadata: - type: feedback ---- - -When creating, or being asked to create, any Cascades user account (AD or M365), always ask the user **which security group(s)** the new account should be a member of. Include it explicitly in the creation preview/confirmation alongside name, UPN, and OU — do not assume it from the OU, department, or job title. - -**Why:** Howard explicitly declined an `OU=Caregivers` -> `SG-Caregivers` auto-mirror script (2026-05-14). Security-group membership controls what access and Conditional Access policies apply to a user; he wants that to stay a deliberate, reviewed decision per user, not automated away. 
OU placement is mechanical (it controls Entra Connect sync scope); group membership is an access-control decision and must be made consciously. - -**How to apply:** During any Cascades user-creation flow, ask "which security group(s)?" and confirm it in the preview. For caregivers specifically: the account goes in `OU=Caregivers` (for sync scope) AND must be deliberately added to `SG-Caregivers` (for CA policy coverage) — two separate, intentional steps, neither auto-derived from the other. diff --git a/.claude/memory/feedback_check_patterns_before_asking.md b/.claude/memory/feedback_check_patterns_before_asking.md new file mode 100644 index 0000000..e243d60 --- /dev/null +++ b/.claude/memory/feedback_check_patterns_before_asking.md @@ -0,0 +1,34 @@ +--- +name: feedback-check-patterns-before-asking +description: "For recurring/repeated tasks, study existing artifacts to derive the pattern instead of asking the user how to do it" +metadata: + node_type: memory + type: feedback + originSessionId: 0f674028-fca4-4ab4-95c7-aaf47083b031 +--- + +For recurring tasks (radio show prep, session logs, post-show debriefs, +client audits, anything Mike has done multiple times before), do NOT ask +"how should I approach this?" Read the existing examples in the repo, +derive the pattern, and just do it. + +**Why:** Mike has done show prep many times. Asking him to re-explain the +workflow when the answer is sitting in `projects/radio-show/episodes/*/` +and `post-show-workflow.md` wastes his time and signals I didn't bother +to look. He pushed back hard the first time I did this. + +**How to apply:** +- Before asking *how* to do recurring work, search for prior examples and + workflow docs first. +- For radio show prep specifically: read `post-show-workflow.md`, scan + the most recent `episodes/*/show-prep.md` or `show-prep-fresh.html`, + scan related session logs (e.g., `*radio-show*prep*.md`). 
The pattern + is: 4 segments × 12-16 min, fresh news from past 7-14 days only, mix + inspiring/breakthrough/practical/reality-check, HTML with talking + points + sources + timing, opened in Firefox for review. +- Only ask when there's genuinely missing info (e.g., a specific topic + Mike wants featured) — not about the format or process. + +Related: [[user-font-preference]] is fine to confirm because it's a +genuine preference; [[feedback-check-patterns-before-asking]] is about +not asking for things the repo already documents. diff --git a/.claude/memory/feedback_command_formatting.md b/.claude/memory/feedback_command_formatting.md index 06c7a26..6172429 100644 --- a/.claude/memory/feedback_command_formatting.md +++ b/.claude/memory/feedback_command_formatting.md @@ -1,12 +1,26 @@ --- -name: feedback-command-formatting -description: Howard needs commands formatted as multi-line scripts, not one-liners — one-liners wrap in the chat window and break when copy-pasted +name: Paste-safe command formatting for Howard +description: Format all shell/PowerShell commands for safe copy-paste — multi-line scripts (never semicolon one-liners) AND all code starts at column 0 inside fences (no indentation). Both rules exist because Howard copy-pastes directly from chat and any visual artifact (wrapping or leading whitespace) becomes a real parse error. metadata: type: feedback --- -Always write shell/PowerShell commands as multi-line scripts, never as semicolon-separated one-liners. +Two clauses, same root cause. Howard copy-pastes commands directly from the Claude chat window into PowerShell (often via ScreenConnect). Any visual artifact in the rendered output becomes a real character in the paste and breaks the parse. -**Why:** When long one-liners are displayed in the Claude chat window, they wrap visually. When Howard copy-pastes them (e.g. 
into ScreenConnect), the line breaks become real newlines, breaking operators like `&` onto their own lines and causing parse errors (e.g. "AmpersandNotAllowed"). +--- -**How to apply:** Any time you're giving Howard a PowerShell or shell command longer than ~60 characters, write it as a multi-line script in a code block. Each statement on its own line. No semicolons to chain statements — use newlines instead. +## 1. Multi-line, never one-liners + +Write commands longer than ~60 characters as multi-line scripts in a fenced code block — one statement per line, newlines instead of semicolons. + +**Why:** long semicolon-chained one-liners wrap visually in the chat window. On paste, the visual wraps become real newlines, breaking operators (`&`, etc.) onto their own lines → `AmpersandNotAllowed` and similar parse errors. + +--- + +## 2. No leading whitespace inside code fences + +All code starts at **column 0** inside the fences. No leading spaces or tabs, even when the surrounding markdown is indented (e.g. inside a list item). + +**Why:** indented code blocks consistently fail when pasted into PowerShell — Howard had to manually strip the indentation every time. + +**Apply:** every PowerShell, bash, or other code block — column 0 for the first non-fence character on every line. diff --git a/.claude/memory/feedback_gururmm.md b/.claude/memory/feedback_gururmm.md new file mode 100644 index 0000000..38000b0 --- /dev/null +++ b/.claude/memory/feedback_gururmm.md @@ -0,0 +1,86 @@ +--- +name: GuruRMM operational rules — dev ownership, parity, builds, alerts, identify-by-IP, UNC +description: Six rules for working with GuruRMM. (1) RMM dev is Mike's domain — Howard does NOT code RMM (368/0 commits); GuruScan carve-out is Howard's. (2) Agent parity — Win+Linux+macOS in the same change. (3) Builds go through the Gitea webhook pipeline, never SSH. (4) Skip #bot-alerts for internal infra/build/dev; only alert on client/ticket impact. 
(5) Identify agents by IP, don't recon all candidates. (6) UNC paths in user_session need [char]92 — backslash literals get halved. +type: feedback +--- + +Technical reference (server / API / user_session / pipeline / agent sandbox): [[reference_gururmm]]. Project state + dev principles + pending setup: [[project_gururmm]]. + +--- + +## 1. RMM dev / bugs / roadmap are Mike's — never route them to Howard + +GuruRMM code, bugs, roadmap, and architecture are **Mike's** domain. Do NOT send RMM dev/bug coord messages to Howard. + +**Evidence (Mike, 2026-05-26):** GuruRMM repo has **368 commits by Mike, 0 by Howard.** I had escalated a stale roadmap bug (BUG-001) to Howard via a coord note; Mike corrected: *"Howard hasn't done ANY code work on RMM."* The `/feature-request` skill encodes the real model: Howard *submits* feature requests → Mike does the dev. + +**How to apply:** +- RMM bug / dev / roadmap item → it's Mike's. Since Mike is usually the user, surface to him directly; don't send a coord note (a note to yourself is pointless). +- The broader principle: Howard defers backend / server / agent / DB / infra to Mike by default ("I don't like messing with things"). When wrapping an implementation with follow-up server-side items, note them as "deferred to Mike", not Howard's pending tasks. Don't proactively suggest Howard implement server or agent changes. + +**GuruScan carve-out (`projects/msp-tools/guru-scan/`):** +- GuruScan IS Howard's project. Coord notes about GuruScan correctly go to Howard. +- Don't conflate GuruScan with GuruRMM because the names rhyme or GuruScan may integrate with RMM. +- **Leave GuruScan alone until Howard asks.** Don't proactively review, audit, or modify its code — even after a sync pulls in big GuruScan changes (Mike, 2026-05-27, after I offered to review Howard's `GuruScan.psm1` refactor unprompted). + +See [[user_howard]]. + +--- + +## 2. 
Agent parity — Win + Linux + macOS in the same change + +"Add feature X to the agent" means all three platforms in the same commit. Delivering Windows-only and leaving Linux/macOS for later is not acceptable — same as not finishing the task (Mike, 2026-05-15). + +**Apply:** +- If the implementation differs by platform, write all three variants. +- If a real implementation isn't feasible on a platform yet, ship a working **stub + `// TODO(platform): `** in the same commit. +- A silent no-op without a stub and TODO is treated as a bug. +- Full matrix in `.claude/CODING_GUIDELINES.md` "GuruRMM Agent — Platform Parity". + +--- + +## 3. Builds go through the Gitea webhook pipeline — never SSH manually + +Never run `build-agents.sh` directly via SSH. All builds happen through the normal pipeline (push to `main` triggers it). + +**Why:** manual runs execute as the SSH user (`guru`) instead of root, breaking log writes, artifact cleanup, and service restarts. + +**To trigger a build without a real change:** push an empty commit: +```bash +git commit --allow-empty -m "chore: trigger build" +``` + +--- + +## 4. Skip #bot-alerts for internal infra / build / dev / recon + +The `/rmm` skill says "post a one-line #bot-alert after every dispatch." Mike's clarification (2026-05-29): post a `#bot-alert` ONLY when the RMM command **directly affects a client endpoint or a ticket** (remediation, client machine change, ticket-linked work). + +For internal infra (Gitea runner install, PLUTO build VM setup), CI/build orchestration, dev tooling, recon/inventory — **SKIP the alert**. Keeps the channel signal-high; it's a client/ticket activity feed, not a build log. + +**Apply:** before dispatching via `/rmm` or the API, ask "does this touch a client or a ticket?" If no, do NOT call `post-bot-alert.sh`. Overrides the skill's blanket "alert after every dispatch" rule. See also [[reference_pluto_build_server]]. + +--- + +## 5. 
Identify the agent by IP, don't recon every candidate + +When a task names a machine by its external IP (e.g. an auth-failure source from a server log), identify the RMM endpoint by **matching that IP** — don't dispatch recon to every candidate agent. + +Mike pushed back twice 2026-05-30 for probing both Pavon machines to find which had a stray GuruConnect client when the offending external IP was already known. Matching IP is one lookup; reconning all candidates is noisy and slow. + +**Apply:** get the source IP from the relevant server's logs first. Until GuruRMM stores agent IPs natively (todo `7459428e`, 2026-05-30 — no `local_ip`/`external_ip` fields yet), narrow candidates by site/client first, then have only the candidates report their external IP (`Invoke-RestMethod ipify`) and match. Once the server stamps `external_ip` from `X-Forwarded-For`, query `/api/agents` directly. + +--- + +## 6. UNC paths in user_session need `[char]92` — backslash literals get halved + +Never use `"\\CS-SERVER\..."` string literals in PowerShell scripts dispatched via GuruRMM `user_session` context. The backslash gets halved somewhere in the encoding pipeline, producing `\CS-SERVER\...` (a local path) instead of the UNC `\\CS-SERVER\...`. + +The `user_session` execution wrapper processes escape sequences in the script text differently than `system` context, stripping one backslash from `\\`. `system` context (offline hive reg query) showed correct `\\CS-SERVER` output, so the issue is `user_session`-specific. + +**Apply:** always build UNC paths explicitly when using `user_session`: +```powershell +$bs = [char]92 +$base = "${bs}${bs}CS-SERVER${bs}homes${bs}Username" +``` +Constructs `\\CS-SERVER\homes\Username` correctly regardless of context. 
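The matching step in rule 5 (identify the agent by IP) can be sketched as a small shell helper. This is only an illustration under stated assumptions: the `match_agent_by_ip` function and the `hostname external_ip` report file are hypothetical, not something GuruRMM produces; the file stands in for whatever the narrowed-down candidates reported from their ipify lookup.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of GuruRMM): given the external IP taken from a
# server log, print the hostname of the candidate agent that reported that IP.
#
# Assumed input format: one "hostname external_ip" pair per line, whitespace
# separated, collected from the candidates' own external-IP reports.
match_agent_by_ip() {
  local target_ip="$1"
  local report_file="$2"
  # Exact string comparison on the second field; prints nothing if no match.
  awk -v ip="$target_ip" '$2 == ip { print $1 }' "$report_file"
}
```

Called as `match_agent_by_ip 198.51.100.7 candidates.txt`, it prints only the hostnames whose reported IP equals the one from the log, so the follow-up dispatch can target that single agent instead of every candidate. An empty result suggests the candidate list was narrowed too far or the agent's external IP has changed since the log entry.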
diff --git a/.claude/memory/feedback_gururmm_agent_parity.md b/.claude/memory/feedback_gururmm_agent_parity.md deleted file mode 100644 index a52040c..0000000 --- a/.claude/memory/feedback_gururmm_agent_parity.md +++ /dev/null @@ -1,16 +0,0 @@ ---- -name: feedback_gururmm_agent_parity -description: "Add feature X to the agent" means all three platforms (Windows + Linux + macOS) in the same change — no exceptions -metadata: - type: feedback ---- - -"Add feature X to the agent" means Windows + Linux + macOS. All three in the same change. - -**Why:** Mike stated this explicitly 2026-05-15. Delivering Windows-only and leaving Linux/macOS for later is not acceptable — it's the same as not finishing the task. - -**How to apply:** When implementing any agent feature: -- If the implementation differs by platform, write all three variants. -- If a real implementation is not feasible on a platform yet, add a working stub + `// TODO(platform): ` in the same commit. -- A silent no-op without a stub and TODO is treated as a bug. -- See `.claude/CODING_GUIDELINES.md` "GuruRMM Agent — Platform Parity" for the full matrix and known gaps. diff --git a/.claude/memory/feedback_gururmm_builds.md b/.claude/memory/feedback_gururmm_builds.md deleted file mode 100644 index 3490a5a..0000000 --- a/.claude/memory/feedback_gururmm_builds.md +++ /dev/null @@ -1,14 +0,0 @@ ---- -name: feedback-gururmm-builds -description: "GuruRMM builds must go through the Gitea webhook pipeline, never run manually via SSH" -metadata: - node_type: memory - type: feedback - originSessionId: 541d4004-8c45-4290-89f5-0ba9ee4e64a9 ---- - -Never run `build-agents.sh` directly via SSH. All builds go through the normal Gitea webhook pipeline (push to main triggers the build automatically). - -**Why:** Manual runs execute as the SSH user (`guru`) instead of root, breaking log writes, artifact cleanup, and service restarts. The pipeline exists precisely to handle this correctly. 
- -**How to apply:** To trigger a build, push a commit to the gururmm main branch on Gitea. If a test build is needed without a real change, use an empty commit: `git commit --allow-empty -m "chore: trigger build"`. diff --git a/.claude/memory/feedback_howard_delegation.md b/.claude/memory/feedback_howard_delegation.md deleted file mode 100644 index 4e4fe19..0000000 --- a/.claude/memory/feedback_howard_delegation.md +++ /dev/null @@ -1,12 +0,0 @@ ---- -name: feedback-howard-delegation -description: Howard prefers to leave backend/server-side follow-up and risky implementation work to Mike unless explicitly asked — don't assign those items to Howard or prompt him to do them. -metadata: - type: feedback ---- - -Howard defers backend follow-up tasks (server-side plumbing, DB schema changes, agent-side wiring, anything touching infrastructure) to Mike by default. He said "I don't like messing with things." - -**Why:** Howard is a tech/field role. He's comfortable with dashboard UI, specs, and client work but prefers not to touch server/agent code unless Mike specifically asks him to. - -**How to apply:** When wrapping up an implementation that has follow-up backend items (e.g. "push rules on agent connect", "policy tab plumbing"), note them as "deferred to Mike" rather than listing them as Howard's pending tasks. Don't proactively suggest Howard implement server or agent changes unless he asks or Mike assigns them. See [[feedback-testing]] for related conservative approach to changes. diff --git a/.claude/memory/feedback_no_botalerts_internal_rmm.md b/.claude/memory/feedback_no_botalerts_internal_rmm.md deleted file mode 100644 index 9bf0944..0000000 --- a/.claude/memory/feedback_no_botalerts_internal_rmm.md +++ /dev/null @@ -1,21 +0,0 @@ ---- -name: feedback_no_botalerts_internal_rmm -description: Post #bot-alerts ONLY when an RMM command directly affects a client endpoint or a ticket; skip for internal infra/build/dev/recon (e.g. 
PLUTO build-runner setup) -metadata: - type: feedback ---- - -The `/rmm` skill instructs "post a one-line #bot-alert after every dispatch." Mike does NOT want -#bot-alerts for **internal infrastructure / dev-tooling** commands — e.g. installing a Gitea Actions -runner on PLUTO, CI/build orchestration on build VMs, inventory/recon during setup. - -**The rule (Mike, 2026-05-29):** post a #bot-alert ONLY when the RMM command **directly affects a -client endpoint or a ticket** (remediation, a client machine change, ticket-linked work). For -everything else — internal infra, build/CI orchestration, dev-tooling, recon/inventory (e.g. the -PLUTO build-runner setup) — SKIP the alert. - -**Why:** keeps #bot-alerts signal-high — it's a client/ticket activity feed, not a build log. - -**How to apply:** When dispatching via `/rmm` or the GuruRMM command API, ask "does this touch a -client/ticket?" If no, do NOT call `post-bot-alert.sh`. Overrides the skill's blanket "alert after -every dispatch" rule. Related: [[reference_pluto_build_server]]. diff --git a/.claude/memory/feedback_no_indented_code_blocks.md b/.claude/memory/feedback_no_indented_code_blocks.md deleted file mode 100644 index 0b88fe7..0000000 --- a/.claude/memory/feedback_no_indented_code_blocks.md +++ /dev/null @@ -1,12 +0,0 @@ ---- -name: feedback_no_indented_code_blocks -description: Never indent code inside code blocks — Howard copy-pastes directly and leading spaces break PowerShell commands -metadata: - type: feedback ---- - -Never indent code inside markdown code blocks. Howard copy-pastes commands directly from the chat and leading spaces cause PowerShell parse errors. All code must start at column 0 inside the fences. - -**Why:** Howard reported that indented code blocks consistently fail when pasted into PowerShell and he has to manually strip the indentation every time. 
- -**How to apply:** Every PowerShell (and bash/other) code block — start all lines at column 0, no leading spaces or tabs inside the fences. diff --git a/.claude/memory/feedback_psa_default_syncro.md b/.claude/memory/feedback_psa_default_syncro.md index 70fb9bb..8f1a7a4 100644 --- a/.claude/memory/feedback_psa_default_syncro.md +++ b/.claude/memory/feedback_psa_default_syncro.md @@ -9,4 +9,4 @@ metadata: **Why:** ACG runs both PSAs, but Syncro is the primary day-to-day system. The Autotask integration exists (creds verified, `/autotask` skill built 2026-05-27) but is secondary. Mike confirmed: "Syncro is the default for ticketing and such; [Autotask] would only be called if there was a 'do a thing in autotask' request." -**How to apply:** When a request mentions tickets/billing/customers/appointments without naming the system, assume Syncro and use `/syncro`. Reach for `/autotask` only when the user names Autotask explicitly. The `/autotask` skill is intentionally kept local/undistributed for now (not synced to other machines), reinforcing that it's opt-in. See [[feedback-syncro-timer-first]] for Syncro billing specifics. +**How to apply:** When a request mentions tickets/billing/customers/appointments without naming the system, assume Syncro and use `/syncro`. Reach for `/autotask` only when the user names Autotask explicitly. The `/autotask` skill is intentionally kept local/undistributed for now (not synced to other machines), reinforcing that it's opt-in. See [[feedback_syncro_billing]] for Syncro billing specifics (and [[feedback_syncro_api]], [[feedback_syncro_workflow]] for the rest). 
diff --git a/.claude/memory/feedback_rmm_dev_is_mike.md b/.claude/memory/feedback_rmm_dev_is_mike.md deleted file mode 100644 index e54a3c9..0000000 --- a/.claude/memory/feedback_rmm_dev_is_mike.md +++ /dev/null @@ -1,15 +0,0 @@ ---- -name: GuruRMM development is Mike's, not Howard's -description: GuruRMM code/bugs/dev are Mike's domain — never route RMM dev or bug coord notes to Howard. Howard only SUBMITS RMM feature requests; GuruScan is Howard's project, not RMM -type: feedback ---- - -GuruRMM development — code, bugs, the roadmap, architecture — is **Mike's** domain. Do NOT route RMM dev/bug coord messages to Howard. Howard does **zero** RMM coding. - -**Why:** Mike, 2026-05-26. I escalated a stale GuruRMM roadmap bug (BUG-001) to Howard via a coord note; Mike corrected me — "Howard hasn't done ANY code work on RMM." Verified: `users.json` machine lists don't overlap (mike: GURU-5070/Mikes-MacBook-Air/GURU-BEAST-ROG/GURU-KALI; howard: ACG-TECH03L/Howard-Home), and the GuruRMM repo has 368 commits by Mike and **0 by Howard**. The `/feature-request` skill encodes the real model: Howard *submits* RMM feature requests → Mike does the dev. I had inverted it. - -**How to apply:** -- RMM bug/dev/roadmap item → it's Mike's. Since Mike is usually the user, just surface it to him directly; don't send a coord note to anyone (a note to yourself is pointless, and Howard isn't the owner). -- **GuruScan** (`projects/msp-tools/guru-scan/`) IS Howard's project — coord notes about GuruScan correctly go to Howard. Don't conflate GuruScan with GuruRMM just because the names rhyme or GuruScan may integrate with RMM. -- **Leave GuruScan alone until Howard asks.** Do NOT proactively review, audit, or modify its code — even after a sync pulls in big GuruScan changes. Wait for Howard to explicitly request a review. (Mike, 2026-05-27, after I offered to review Howard's GuruScan.psm1 refactor unprompted.) 
-- Before sending any coord note to a teammate, check whose domain the work actually sits in. See [[user_howard]]. diff --git a/.claude/memory/feedback_rmm_identify_by_ip.md b/.claude/memory/feedback_rmm_identify_by_ip.md deleted file mode 100644 index baae8a2..0000000 --- a/.claude/memory/feedback_rmm_identify_by_ip.md +++ /dev/null @@ -1,12 +0,0 @@ ---- -name: feedback_rmm_identify_by_ip -description: When the offending/target machine is known by external IP, identify the RMM agent by matching the IP — don't recon every candidate. -metadata: - type: feedback ---- - -When a task names a machine by its external IP (e.g. an auth-failure source from a server log), identify the RMM endpoint by **matching that IP**, not by dispatching recon to every candidate agent and inspecting them. - -**Why:** Mike pushed back twice (2026-05-30) for probing both Pavon machines (Curves + Raiders) to find which had a stray GuruConnect client, when the offending external IP was already known. Matching IP is one lookup; reconning all candidates is noisy and slow. - -**How to apply:** Get the source IP from the relevant server's logs first. To map IP -> agent: GuruRMM does NOT yet store agent IPs (no local_ip/external_ip fields — see GuruRMM todo 7459428e, 2026-05-30), so until that lands, have only the *candidate* endpoints report their external IP (`Invoke-RestMethod ipify`) and match — or narrow candidates by site/client first. Once the server stamps external_ip from X-Forwarded-For, query `/api/agents` directly. Related: [[reference_gitea_internal]]. 
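The identify-by-IP rule above comes down to one comparison once the candidate endpoints have reported their external IPs. Below is a minimal sketch of that matching step, assuming the reports have already been collected into a mapping. The function name, the agent names, and the addresses (documentation-range IPs) are all hypothetical, not GuruRMM identifiers.

```python
import ipaddress


def match_agents_by_external_ip(source_ip: str, reported_ips: dict[str, str]) -> list[str]:
    """Return the names of candidate agents whose reported external IP equals source_ip.

    reported_ips maps an agent name to the external IP that agent reported.
    A list is returned because several endpoints behind one NAT can share an
    external address.
    """
    target = ipaddress.ip_address(source_ip.strip())
    return [
        agent
        for agent, ip in reported_ips.items()
        if ipaddress.ip_address(ip.strip()) == target
    ]


# Hypothetical data: the source IP from a server log and two candidates' reports.
candidates = {
    "agent-a": "203.0.113.10",
    "agent-b": "203.0.113.25\n",  # trailing newline, as command output often has
}
matches = match_agents_by_external_ip("203.0.113.25", candidates)
```

Parsing both sides with `ipaddress` rather than comparing raw strings keeps stray whitespace from hiding a match.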
diff --git a/.claude/memory/feedback_syncro_api.md b/.claude/memory/feedback_syncro_api.md new file mode 100644 index 0000000..9132622 --- /dev/null +++ b/.claude/memory/feedback_syncro_api.md @@ -0,0 +1,82 @@ +--- +name: Syncro API plumbing — headers, endpoints, response shapes, idempotency +description: Technical mechanics for talking to the Syncro API — required Content-Type header, the no-idempotency rule (always GET before retry), response wrappers, the add_line_item endpoint shape, HTML rendering, and the (now historical) timer_entry response shape. +metadata: + type: feedback +--- + +Rules only. Incident detail, verbatim quotes, ticket numbers, and dates live in [[feedback_syncro_history]] — read on-demand when judging an edge case. Billing/product rules: [[feedback_syncro_billing]]. Workflow rules: [[feedback_syncro_workflow]]. + +--- + +## 1. Content-Type header is required on every POST/PUT + +Always include `-H "Content-Type: application/json"`. Without it, curl sends `application/x-www-form-urlencoded` and Syncro returns a **400 HTML page** (not JSON). Applies to comments, tickets, line items, estimates, updates. + +Ticket comment payloads also need the `subject` field: +```json +{"subject":"...","body":"...","hidden":true,"do_not_email":true} +``` + +--- + +## 2. No idempotency — ALWAYS GET before retrying any POST + +Syncro has **no idempotency on any endpoint**. One `POST` always creates one record, regardless of whether the client saw an error. A jq parse error, curl error, timeout, or weird-looking response does NOT mean the POST failed — verify first. + +**Verification before retry:** +- Comments: `GET /tickets/{id}` and search `.ticket.comments[] | select(.subject == "…")`. Check ALL comments, not just `[-3:]`. +- Tickets: `GET /customers/{id}/tickets` before retrying. +- Line items: `GET /tickets/{id}` → `.ticket.line_items[]`. + +**Response wrappers — CRITICAL for jq:** +- `POST /tickets` → `{"ticket": {...}}` → use `.ticket.id`. 
+- `POST /comment` → `{"comment": {...}}` → use `.comment.id`. + +**Hardening:** Write payload JSON to a temp file (e.g. `tmp/syncro_comment.json`) before posting. Avoids shell quoting/encoding failures that masquerade as POST failures on requests that actually succeeded. + +Comments cannot be deleted via API — duplicates require manual GUI removal. + +--- + +## 3. HTML formatting in comments + +Use `<br>` for line breaks. Do **NOT** use `<ul>` or `<li>` — Syncro's renderer collapses them into one line. For bulleted lists: +```html +- Item one<br> +- Item two<br> +- Item three
      +``` + +--- + +## 4. add_line_item endpoint + +`POST /api/v1/tickets/{internal_ticket_id}/add_line_item` + +- Path uses the **internal ticket ID** (e.g. `111387456`), NOT the ticket number (`32339`). Wrong-ID variants (`/line_item`, `/line_items`, `PUT line_items_attributes`) all 404. +- Required fields: `name`, `description`, `quantity`, `price`, `taxable` (and `product_id` for catalog items). Missing `name`/`description` → 422. +- Response is **flat** — parse `.id` directly (no wrapper). + +Example: +```bash +curl -X POST "$BASE/tickets/111387456/add_line_item?api_key=$KEY" \ + -H "Content-Type: application/json" \ + -d '{"product_id":1049360,"name":"Labor- Warranty work","description":"…","quantity":1,"price":0.0,"taxable":false}' +``` + +For ad-hoc API testing, use the internal ACG account only (customer ID `15353550`). + +--- + +## 5. Timer response — HISTORICAL / SUPERSEDED + +> Timers are no longer part of the ACG Syncro billing workflow (as of 2026-05-21 — see [[feedback_syncro_billing]] and [[feedback_syncro_history]]). Kept here for the rare manual-timer case. + +`POST /tickets/{id}/timer_entry` returns a **FLAT** object — parse `.id`, NOT `.timer.id // .timer_entry.id` (both resolve to `null` and break `charge_timer_entry`, which can trigger a retry → duplicate). `charge_timer_entry` response is also flat: use `.ticket_line_item_id`. + +Authoritative timer list for a ticket: `GET /tickets/{id}` → `.ticket.ticket_timers[]`. The standalone `/ticket_timers?ticket_id=N` returns global history, not filtered. + +Cleanup duplicates: `POST /tickets/{id}/delete_timer_entry` with `{"timer_entry_id": N}`. 
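The GET-before-retry rule in section 2 can be expressed as a small check over the parsed `GET /tickets/{id}` response. This is a sketch only: the function names are hypothetical, it assumes the response has already been parsed into a dict, and the sample body shows only the fields the check reads.

```python
from typing import Any


def find_comments_by_subject(ticket_response: dict[str, Any], subject: str) -> list[dict[str, Any]]:
    """Return every comment on the ticket whose subject matches exactly.

    Mirrors the jq filter `.ticket.comments[] | select(.subject == "...")`
    and scans ALL comments, not just the most recent few.
    """
    comments = ticket_response.get("ticket", {}).get("comments", [])
    return [c for c in comments if c.get("subject") == subject]


def should_retry_post(ticket_response: dict[str, Any], subject: str) -> bool:
    """Retry only when the GET shows no comment with this subject."""
    return not find_comments_by_subject(ticket_response, subject)


# Hypothetical GET /tickets/{id} body; the comment fields besides `subject`
# are placeholders.
sample = {
    "ticket": {
        "id": 111387456,
        "comments": [
            {"id": 1, "subject": "Initial report", "body": "..."},
            {"id": 2, "subject": "Billing note", "body": "..."},
            {"id": 3, "subject": "Follow-up", "body": "..."},
            {"id": 4, "subject": "Customer reply", "body": "..."},
        ],
    }
}
```

Here "Billing note" is not among the last three comments, which is why the check walks the whole list before deciding a POST is safe to retry.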
diff --git a/.claude/memory/feedback_syncro_appointment_date_check.md b/.claude/memory/feedback_syncro_appointment_date_check.md deleted file mode 100644 index c141e47..0000000 --- a/.claude/memory/feedback_syncro_appointment_date_check.md +++ /dev/null @@ -1,31 +0,0 @@ ---- -name: Syncro — verify appointment date day-of-week -description: Before creating any Syncro appointment, verify the computed date falls on the intended weekday (py datetime) and show the day name in the preview. Wrong-day incident #32312 2026-05-21. -type: feedback ---- - -# Syncro — Verify appointment date day-of-week before creating - -**Rule:** Before creating any Syncro appointment, always verify that the computed date -actually falls on the intended day of the week. - -**Why:** Day-of-week math is easy to get wrong. In the incident that prompted this rule -(2026-05-21, ticket #32312), "Saturday" was computed as May 24 — which is actually a Sunday. -The appointment landed on the wrong day and didn't appear where Winter expected it on the calendar. - -**How to verify:** - -Use Python or Bash to print the weekday before including it in the preview: - -```bash -py -c "import datetime; d = datetime.date(2026, 5, 24); print(d.strftime('%A %Y-%m-%d'))" -# Output: Sunday 2026-05-24 ← would have caught the error -``` - -Or include the day name in the TICKET PREVIEW and require explicit user confirmation -that the day-of-week matches their intent. - -**Catch:** Always show `Day YYYY-MM-DD` (e.g., "Saturday 2026-05-23") in the preview — -never just the numeric date — so the user can verify at a glance. - -Reported by Winter, 2026-05-21. 
diff --git a/.claude/memory/feedback_syncro_appointment_owner.md b/.claude/memory/feedback_syncro_appointment_owner.md deleted file mode 100644 index fbc47e6..0000000 --- a/.claude/memory/feedback_syncro_appointment_owner.md +++ /dev/null @@ -1,40 +0,0 @@ ---- -name: Syncro — confirm appointment owner explicitly when creating tickets with appointments -description: When creating Syncro tickets that include an appointment, always ask "who is the appointment owner?" before posting. Don't auto-default to the ticket's assigned tech, and distinguish owner from additional attendees. -type: feedback ---- - -**Rule:** When creating a Syncro ticket that includes an appointment (Onsite, Remote, Phone Call, etc.), explicitly **ask the user who the appointment owner is** in the preview phase. Do not assume the appointment owner equals the ticket's assigned tech, and do not silently add other techs as attendees. - -**Why:** The appointment owner is the person whose calendar the appointment lands on as the primary entry — they are the one accountable for being there. Additional `user_ids` in the appointment payload only add the entry to other techs' calendars as secondary/visible items, which clutters their schedule and creates ambiguity about who is actually on the hook for the visit. Howard caught this on 2026-05-08 after a ticket creation where I added the assigned tech to `user_ids` without confirming whether they should be the owner versus an attendee. - -**How to apply:** - -In the ticket creation preview (Step 3 of the ticket creation workflow), present the appointment block with the OWNER as a separate, explicit field — not buried as an inferred default. Example preview format: - -``` -APPOINTMENT ------------ -Type: Onsite -Owner: -Additional attendees: (optional, leave blank unless explicitly added) -Start: -End: -Location: -``` - -In the API payload, the appointment owner is the FIRST or PRIMARY entry in `user_ids`. 
Confirm: - -- The owner is the person actually attending the appointment (or the lead tech if multiple). -- If the user wants ONLY the owner with no co-attendees, `user_ids` should contain ONE id only. -- If the user wants additional attendees (e.g., "Mike will join remote, Howard onsite"), add them only after explicit confirmation in the preview. - -**What NOT to do:** - -- Do NOT auto-add the ticket's `user_id` (assigned tech) as the appointment owner without asking. -- Do NOT add additional attendees to `user_ids` without explicit user direction. -- Do NOT treat appointment owner as a passive inheritance from the ticket — surface it as an active confirmation field in the preview. - -**Trigger context:** - -Howard created the Kittle Design ticket (#32263) on 2026-05-08 for an 11:30 AM onsite to set up Joshua. I auto-added Howard's `user_id` to the appointment's `user_ids` array without confirming whether Howard was the owner or just an attendee. Howard flagged: "when setting up an appointment confirm the appointment owner — don't just add additional attendees." Save as a rule for syncro ticket creation. 
diff --git a/.claude/memory/feedback_syncro_billing.md b/.claude/memory/feedback_syncro_billing.md index d7991c4..48c2c3f 100644 --- a/.claude/memory/feedback_syncro_billing.md +++ b/.claude/memory/feedback_syncro_billing.md @@ -1,16 +1,104 @@ --- -name: Syncro - preview all comments before posting -description: Every Syncro comment must be previewed and confirmed before posting, no exceptions -type: feedback -originSessionId: 4ccedc24-2f39-497e-9a89-ca09aba03982 +name: Syncro billing rules — products, rates, taxes, attribution, emergency, warranty +description: How to bill a Syncro ticket correctly — fetch live rates, use real product names, pick the right labor type for the delivery channel, set taxable=false on labor (AZ), warranty product 1049360, emergency ×1.5 (branch by prepay_hours), preserve original tech's user_id on corrections, estimate hardware uses product 32252. +metadata: + type: feedback --- -**Rule:** ALWAYS show the full comment text to Mike and wait for explicit confirmation before posting ANY comment to a Syncro ticket. No exceptions — not for billing comments, not for resolution notes, not for client-facing messages, not for internal notes. -**Why:** Mike has called this out multiple times. Comments posted without preview have had wrong tone, missing context, or incorrect content. Once posted they can't be deleted via API and require manual GUI cleanup. +Rules only. Incident detail, verbatim Mike quotes, ticket numbers, dates, the tech user_id table, and the labor-product table all live in [[feedback_syncro_history]] — read on-demand when judging an edge case. API mechanics: [[feedback_syncro_api]]. Workflow: [[feedback_syncro_workflow]]. -**How to apply:** -- Draft the comment, show it in chat as a formatted block -- Say "Good to post?" 
or similar and wait for a yes -- Only then call POST /tickets/{id}/comment -- This applies to every single comment regardless of how routine it seems -- Also always ask for minutes + labor type before logging any time entry — never assume a default +`.claude/commands/syncro.md` is the authoritative live product table. + +--- + +## 1. Bill with `add_line_item` directly — never the timer workflow + +`POST /tickets/{id}/add_line_item` is the billing path for ALL work (labor, warranty, internal, hardware). The timer workflow (`timer_entry → charge_timer_entry`) is **not used**. + +**Payload:** `product_id`, `quantity` (decimal hours), `price_retail` (fetched live), `name` (the product's REAL name — see §3), `description` (free-text work narrative), `taxable: false` for labor. + +--- + +## 2. ALWAYS fetch the rate live + +Fetch `price_retail` from `GET /products/` → `.product.price_retail` before billing. The product-ID table in `.claude/commands/syncro.md` is valid for IDs but **not** dollar amounts — rates vary by contract and change. + +```bash +RATE=$(curl -s "$BASE/products/$PRODUCT_ID?api_key=$API_KEY" | jq -r '.product.price_retail') +``` + +Use `$RATE` for drafts, the user preview, and the `price_retail` field. + +--- + +## 3. NEVER invent or rename labor line items + +Every labor line MUST be an existing Syncro product, billed under its **REAL name** (from `GET /products/` → `.product.name`, verbatim). Work-specific narrative goes in `description`, never the `name`. + +**Why:** invented names break the Syncro → QuickBooks sync. QB maps each labor line to an existing item; a fabricated name has no QB match and messes up the accounting. If no existing product fits, **STOP and ask Mike** — never invent one. + +Product table lives in [[feedback_syncro_history]] (and `.claude/commands/syncro.md`). + +--- + +## 4. 
Labor type must match delivery channel — never "Prepaid project labor" + +Pick the labor product matching how work was delivered: **remote** (most common), **onsite**, **in-shop**, or **web**. Resolve `product_id` via `GET /products?search=remote+labor` etc. + +**Never default to "Prepaid project labor"** — it is **exempt** and does NOT consume hours from a customer's prepaid block. Block accounting silently drifts. + +**Verify:** after billing a prepay-block customer, confirm the block balance dropped by the expected hours. If it didn't, the labor type was wrong. + +--- + +## 5. Labor is NEVER taxable in Arizona + +Pass `"taxable": false` explicitly on every labor line. The product config has `taxable: false`, but `add_line_item` does **not** inherit it — posts as `taxable: true` regardless. Applies to remote, onsite, in-shop, emergency, warranty, prepaid. + +--- + +## 6. Warranty / no-charge → product `1049360`, never patch the price + +Warranty work uses `product_id: 1049360` ("Labor- Warranty work", $0/hr, non-taxable). The line generates at $0 from `price_retail × quantity` — no need to flag or patch anything. + +**Do NOT** pick a regular labor product and try to neutralize it with `billable: false` or by patching `price_retail` to `0`. **Prices are set by selecting the correct product.** If you reach for `update_line_item` to drop a price, that's the signal to back up and pick a different `product_id`. + +The only legitimate `update_line_item price_retail` use is the Syncro auto-gen-zero recovery case (auto-line came in at $0 instead of the product's rate). + +--- + +## 7. Emergency / after-hours — ×1.5 applied ONCE; branch by `prepay_hours` + +`GET /customers/` and read `prepay_hours` BEFORE adding any emergency line. Emergency = **time-and-a-half, applied ONCE**. Never bill a separate regular line + emergency line for the same hours. 
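The two branches of this emergency rule (no prepaid block vs. prepaid block) can be sketched as a small calculation. The function and constant names are hypothetical, and the dollar figures are the ones this section lists (175 × 1.5 and 150 × 1.5) rather than live-fetched rates. For the prepaid branch the section says the dollar rate is irrelevant, so using product `26184`'s default there is an assumption of this sketch.

```python
from decimal import Decimal

EMERGENCY_PRODUCT_ID = 26184
ONSITE_EMERGENCY_RATE = Decimal("262.50")  # 175 x 1.5, the product's default
REMOTE_EMERGENCY_RATE = Decimal("225")     # 150 x 1.5, also used for in-shop
PREMIUM = Decimal("1.5")


def emergency_line(actual_hours: float, prepay_hours: float, channel: str) -> dict:
    """Build the quantity/price pair for one emergency line, applying x1.5 once."""
    if channel not in ("onsite", "remote", "in-shop"):
        raise ValueError(f"unknown delivery channel: {channel}")
    hours = Decimal(str(actual_hours))

    if Decimal(str(prepay_hours)) > 0:
        # Prepaid block: the premium goes in the QUANTITY; the block debits hours x 1.5.
        quantity = hours * PREMIUM
        price = ONSITE_EMERGENCY_RATE  # assumption: product default, since the rate is irrelevant here
    else:
        # No prepaid block: quantity is actual hours; the premium lives in the dollars.
        quantity = hours
        price = ONSITE_EMERGENCY_RATE if channel == "onsite" else REMOTE_EMERGENCY_RATE

    return {
        "product_id": EMERGENCY_PRODUCT_ID,
        "quantity": quantity,
        "price_retail": price,  # always set explicitly
        "taxable": False,       # labor is never taxable
    }


prepaid = emergency_line(1.5, prepay_hours=10, channel="remote")
billed = emergency_line(2, prepay_hours=0, channel="remote")
```

In both branches the premium appears in exactly one place, either the quantity or the price, which is the "applied ONCE" constraint.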
+ +**No prepaid block (`prepay_hours == 0`):** +- product `26184`, quantity = **actual hours** (do NOT also ×1.5 the quantity) +- `price_retail` by delivery channel (the 1.5× lives in the dollars): + - Onsite emergency = `$262.50` (175 × 1.5; 26184's default). + - Remote / In-Shop emergency = `$225` (150 × 1.5) → override `price_retail` to `225`. + +**Prepaid block (`prepay_hours > 0`):** +- product `26184`, quantity = **actual hours × 1.5** (premium goes in the QUANTITY) +- Delivery channel / dollar rate is irrelevant; prepaid blocks debit by quantity. Invoice nets to $0; block debits hrs×1.5. +- e.g. 1.5 emergency hrs → `26184` @ `2.25`. + +Always set `price_retail` explicitly — the rate doesn't auto-populate and the line posts $0 if omitted. Verify after: `.invoice.total` (non-prepaid) or block decrement (prepaid). + +--- + +## 8. Corrections preserve the ORIGINAL tech's attribution; ticket ownership is sticky + +**Labor lines:** when fixing a wrong line, preserve the **original tech's** `user_id` so their commission isn't lost. +- **Prefer `update_line_item` in place** — it preserves `user_id`. +- **If you must remove + re-add:** the new line defaults to the **API-key owner's** `user_id`. Explicitly set `user_id` to the original tech on `add_line_item`, or PUT-fix it afterward. +- Determine the original tech from `.ticket.user_id` and the line's `.user_id` BEFORE correcting; verify after. + +**Ticket ownership:** adding notes or labor does **NOT** change `.ticket.user_id`. Multiple techs routinely work the same ticket. Only change ticket ownership when explicitly asked. Status PUTs send only `status`; line edits use `update_line_item`; neither touches `user_id`. + +Tech user_id table → [[feedback_syncro_history]]. + +--- + +## 9. Estimate hardware → product `32252` + +All hardware on estimates uses one generic product: `product_id: 32252` ("Hardware", `price_retail: 0.0`). Differentiate via `name` ("Dell OptiPlex 7010") and `price_retail` (actual cost). 
Hardware is typically `taxable: true`. Never look up individual hardware product IDs — there's only one. diff --git a/.claude/memory/feedback_syncro_blank_contact.md b/.claude/memory/feedback_syncro_blank_contact.md deleted file mode 100644 index d3670d7..0000000 --- a/.claude/memory/feedback_syncro_blank_contact.md +++ /dev/null @@ -1,19 +0,0 @@ ---- -name: Syncro — leave contact blank by default on tickets and billing -description: When creating Syncro tickets or billing them out, leave the contact field blank ("Not Assigned") in most cases. Blank contact lets Syncro use the company-level defaults for notifications and email routing. Setting a specific contact can route to a secondary email and bypass the customer's intended distribution. -type: feedback ---- - -**Rule:** When creating or billing Syncro tickets, leave `contact_id` / `contact_name` / `contact_email` blank ("Not Assigned") by default for any customer. Only set a contact when there's an explicit, deliberate reason to (e.g., user explicitly says "set the contact to X"). - -**Why:** Winter clarified on 2026-05-04: blank contact lets Syncro apply the **company-level email defaults** for the account — those defaults route notifications to the right people. Setting a specific contact overrides that and may push notifications to a secondary email address belonging to that contact, bypassing the customer's intended distribution. This was originally flagged for Cascades of Tucson (where Meredith was being incorrectly auto-selected), but Winter generalized it: the rule applies to most customers. - -**How to apply:** - -- **Creating a ticket** (POST `/tickets`): Omit `contact_id` from the body entirely. Do not pull contacts via `GET /customers/{id}` and pick one — let Syncro use the company defaults. -- **Editing a ticket** (PUT `/tickets/{id}`): Send only the fields you're changing (`status`, `priority`, etc.). 
Never include `contact_id`, `contact_name`, or `contact_email` in the body, even matching the existing value. PUT can re-apply the record; safest is to never reference contact in any write payload. -- **Billing / invoices**: Same rule on the invoice creation side. If `contact_id` shows up in any payload, drop it. -- **When to set a contact anyway:** Only if the user explicitly directs you to ("set Mike as the contact on this one") OR there's a documented per-customer instruction that overrides the default. Default is always blank. -- **Verify after any write:** `GET /tickets/{id}` and confirm `.ticket.contact_id` is `null`. If you find it set, blank it explicitly: `PUT /tickets/{id}` with `{"contact_id": null}`. - -**Generalizes from:** the prior Cascades-specific guidance (originally `feedback_syncro_cascades_contact.md`). Winter's 2026-05-04 message broadened the scope from "Cascades only" to "most customers." diff --git a/.claude/memory/feedback_syncro_cascades_contact.md b/.claude/memory/feedback_syncro_cascades_contact.md deleted file mode 100644 index 46fd23b..0000000 --- a/.claude/memory/feedback_syncro_cascades_contact.md +++ /dev/null @@ -1,13 +0,0 @@ ---- -name: Syncro — Cascades contact incident detail (Meredith Kuhn) -description: Incident context for why the blank-contact rule matters at Cascades — Meredith Kuhn is the recurring wrong default that Syncro pre-selects. See feedback_syncro_blank_contact.md for the global rule. -type: feedback ---- - -At Cascades of Tucson (customer_id 20149445), Syncro repeatedly pre-selects **Meredith Kuhn** (Assistant Manager, ASSISTMAN-PC) as the ticket contact default. She is the wrong contact — setting her overrides the customer's distribution emails and routes notifications only to her. - -**Why it keeps happening:** Syncro's contact picker defaults to the first-alphabetical or most-recently-used contact. 
Howard surfaced this pattern; Mike confirmed the global rule on 2026-05-24 (do not set contact on ANY ticket unless explicitly requested). - -**Global rule:** See [[feedback_syncro_blank_contact]] — blank contact is the default for all customers, not just Cascades. - -**Cascades-specific guard:** Even if you're tempted to assign a contact for routing purposes, Meredith Kuhn is specifically wrong. The correct routing happens automatically when contact is null. diff --git a/.claude/memory/feedback_syncro_comment_dedup.md b/.claude/memory/feedback_syncro_comment_dedup.md deleted file mode 100644 index f1d8fe6..0000000 --- a/.claude/memory/feedback_syncro_comment_dedup.md +++ /dev/null @@ -1,20 +0,0 @@ ---- -name: Syncro duplicate prevention — tickets AND comments -description: Never retry ANY Syncro POST (ticket create or comment) without first GETting to confirm the action didn't already succeed — Syncro has no idempotency on any endpoint -type: feedback -originSessionId: 7034be43-1464-4085-b765-dc1226b1f8e0 ---- -Never retry a POST /comment to Syncro without first doing GET /tickets/{id} to confirm the comment did not already post. The server has no idempotency — one POST always creates one comment, regardless of whether the client saw an error. - -**ALSO: Always show the full comment draft to the user and wait for explicit confirmation before posting ANY comment — including internal/hidden notes.** This rule has been violated twice. There are no exceptions. - -**ALSO: This applies to ticket CREATION too — not just comments.** When a POST /tickets response looks wrong (null fields, jq error, etc.), do GET /customers/{id}/tickets BEFORE retrying. The response wrapper is `{"ticket": {...}}` — always use `.ticket.id` not `.id`. Duplicate tickets were created twice by retrying a succeeded POST. Violated 2026-04-22. 
- -**Why:** A comment was duplicated on ticket #32185 because the first POST succeeded but jq threw a parse error on the response (em-dash in subject caused shell interpolation issue), making the request look failed. A retry posted a second copy. Comments cannot be deleted via API — duplicates require manual GUI removal. - -**How to apply:** -- Always write comment payloads to a temp file (`/tmp/syncro_comment.json`) before posting — avoids shell quoting/encoding failures that produce misleading errors -- If any POST /comment tool call returns an error or ambiguous result, immediately GET /tickets/{id} and check `.ticket.comments` for the subject/timestamp before retrying -- A jq parse error, curl error, or timeout on the response does NOT mean the POST failed — verify first -- **CRITICAL — jq path:** POST /comment response is `{"comment": {...}}` — ALWAYS use `.comment.id`, `.comment.created_at` etc. Using `.id` returns null and looks like failure even when the comment landed. This caused a duplicate on 2026-04-23 (#32142). When GETting to verify, check ALL comments not just `[-3:]` — the new comment may not be the most recent if other activity occurred. -- When GETting to verify after an ambiguous POST, search by subject: `.ticket.comments[] | select(.subject == "...")` diff --git a/.claude/memory/feedback_syncro_content_type.md b/.claude/memory/feedback_syncro_content_type.md deleted file mode 100644 index 2a87ae8..0000000 --- a/.claude/memory/feedback_syncro_content_type.md +++ /dev/null @@ -1,12 +0,0 @@ ---- -name: feedback-syncro-content-type -description: Syncro API POST calls require explicit Content-Type application/json header or they 400 with an HTML error page -metadata: - type: feedback ---- - -Always include `-H "Content-Type: application/json"` on every Syncro API POST/PUT call (comments, tickets, line items, estimates). 
- -**Why:** Without it, curl sends the JSON body as `application/x-www-form-urlencoded`, which Syncro rejects with an HTML 400 page instead of a JSON error. The HTML response looks like a hard failure but it's just a missing header. Discovered 2026-05-28 when posting a comment to ticket #32333 — two 400 HTML responses before the fix. - -**How to apply:** Every `curl -X POST` or `curl -X PUT` to the Syncro API needs the header. The subject field is also required on ticket comments (`{"subject":"...","body":"...","hidden":true,"do_not_email":true}`). diff --git a/.claude/memory/feedback_syncro_corrections_preserve_tech.md b/.claude/memory/feedback_syncro_corrections_preserve_tech.md deleted file mode 100644 index 7e6b8ab..0000000 --- a/.claude/memory/feedback_syncro_corrections_preserve_tech.md +++ /dev/null @@ -1,18 +0,0 @@ ---- -name: feedback-syncro-corrections-preserve-tech -description: Preserve Syncro attribution — corrections keep the original tech's labor user_id (commission); and adding notes/labor never changes the ticket owner. Only reassign labor or ticket ownership when explicitly asked. -metadata: - type: feedback ---- - -When fixing labor line items that were billed incorrectly (wrong product, quantity, name, or bad math — a **debug/correction action**), do NOT let the labor get reassigned to the correcting tech. **Preserve the ORIGINAL tech's attribution (`user_id`)** on each line so their commission isn't lost. - -- **Prefer `update_line_item` in place** — it preserves the line's existing `user_id`. (Verified on #32332: updating Howard's line kept `user_id=1750`; the dollar/product changed but the commission stayed with Howard.) -- **If you must REMOVE + re-ADD a line**, the new line defaults to the **API-key owner's** `user_id` (e.g. Mike `1735`) — so explicitly set `user_id` to the original tech on `add_line_item`, or PUT `update_line_item` to fix the new line's `user_id` afterward. 
-- Determine the original tech from the **ticket's `.ticket.user_id`** and the line's `.user_id` before correcting; verify it still matches after. - -**Tech user_ids:** Mike `1735`, Howard `1750`, Winter `1737`, Rob `1760`. - -**Ticket ownership (related rule, Mike 2026-05-27):** Simply adding notes or labor to a ticket does **NOT** change the ticket owner (`.ticket.user_id` / assigned tech). Multiple techs routinely work the same ticket. **Only change ticket ownership when explicitly asked** — never PUT a ticket's `user_id` as a side effect of commenting, billing, or status changes. (Status PUTs should send only `status`; line edits use `update_line_item`; neither should touch `user_id`.) - -**Why:** Mike — a billing correction is a debug action (e.g. Claude or someone billed it wrong); the **original tech still did the work and keeps the commission**. Don't take Howard's commission just because the math was fixed by Mike/Winter. Hit on #32332 (Cascades) 2026-05-27 — Howard's mis-billed labor was corrected via Mike's API key; update-in-place preserved `user_id=1750`, but a remove+add would have stolen the commission. Related: per-user-key attribution in [[365-remediation-tool-reference]] / `/syncro` Attribution rule. diff --git a/.claude/memory/feedback_syncro_emergency_billing.md b/.claude/memory/feedback_syncro_emergency_billing.md deleted file mode 100644 index aff6d10..0000000 --- a/.claude/memory/feedback_syncro_emergency_billing.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -name: Syncro emergency/after-hours billing — check prepay_hours first -description: Emergency labor is time-and-a-half (×1.5), applied once, never additive. Branch by customer.prepay_hours. Prepaid → emergency item 26184 at hours×1.5 (premium in quantity); non-prepaid → 26184 at actual hours (rate has 1.5×). -metadata: - type: feedback ---- - -**Rule:** Before adding any Emergency/after-hours labor line on a Syncro ticket, `GET /customers/` and read `prepay_hours`. 
Emergency = **time-and-a-half (×1.5), applied ONCE** — never bill a separate regular line + emergency line for the same hours. - -- **No prepaid block (`prepay_hours == 0`):** product `26184` (Labor - Emergency or After Hours) at quantity = **actual hours**, and set `price_retail` by the work's **delivery channel** (the 1.5× lives in the dollars — do NOT also ×1.5 the quantity): **Onsite emergency = $262.50** (175 × 1.5; this is 26184's default rate); **Remote / In-Shop emergency = $225** (150 × 1.5) → override `price_retail` to `225`. Fetch the base rate live and ×1.5 if unsure. -- **Prepaid block (`prepay_hours > 0`):** product `26184` at quantity = **actual hours × 1.5** (hours + 50%). Prepaid blocks debit by QUANTITY not dollars, so the 1.5× premium goes in the **quantity**; the invoice nets to $0 and the block debits hours×1.5. e.g. 1.5 emergency hrs → `26184` @ **2.25**. (Delivery channel / dollar rate is **irrelevant** for prepaid — only the quantity hrs×1.5 matters.) - -**(Updated 2026-05-27 — Mike):** prepaid emergency now uses the **emergency item `26184`** at ×1.5 quantity — this REPLACES the old "prepaid → onsite `26118` at ×1.5." Using 26184 labels the line correctly as emergency and maps right in QuickBooks; the dollar double-1.5 worry doesn't apply to prepaid since the invoice is $0. Reaffirmed on #32332 (Cascades, prepaid 27h): total 1.5 emergency hrs → `26184` @ 2.25 (Howard had split it into made-up onsite/emergency lines). - -**Why ×1.5-not-additive:** Learned on #32203 (Desert Auto Tech) 2026-04-23 — billing "1h onsite + 1h emergency" as two additive lines came out $437.50 when 1 actual hour of emergency should bill at time-and-a-half. Emergency IS time-and-a-half; one line. - -**How to apply:** -- Every emergency/after-hours bill: check `prepay_hours` BEFORE choosing the quantity. One emergency line on `26184`. 
-- Always set `price_retail` explicitly (fetch live via `GET /products/26184`); the rate doesn't auto-populate and the line posts $0 if omitted. -- Use the product's REAL name on the line (work detail goes in the description) — see [[feedback-syncro-no-madeup-labor-items]]. -- Verify after invoicing: `.invoice.total` (non-prepaid) or the prepay-block decrement (prepaid). -- Full rules: `.claude/commands/syncro.md`. diff --git a/.claude/memory/feedback_syncro_estimate_hardware.md b/.claude/memory/feedback_syncro_estimate_hardware.md deleted file mode 100644 index 3d3eb5b..0000000 --- a/.claude/memory/feedback_syncro_estimate_hardware.md +++ /dev/null @@ -1,12 +0,0 @@ ---- -name: feedback_syncro_estimate_hardware -description: Hardware line items on Syncro estimates always use product_id 32252 with varying name/price per item -metadata: - type: feedback ---- - -All hardware on estimates uses a single generic product: `product_id: 32252` ("Hardware", `price_retail: 0.0`). The specific item name and cost are set per-line-item via the `name` and `price_retail` fields. Never search for individual hardware product IDs on estimates. - -**Why:** There is only one hardware product in Syncro. All hardware items are differentiated by description and price, not by product ID. - -**How to apply:** When building an estimate with hardware, always use `32252` as the product_id and set `name` to the specific item (e.g. "Dell OptiPlex 7010") and `price_retail` to the actual cost. Hardware is typically `taxable: true`. diff --git a/.claude/memory/feedback_syncro_history.md b/.claude/memory/feedback_syncro_history.md new file mode 100644 index 0000000..190ae8b --- /dev/null +++ b/.claude/memory/feedback_syncro_history.md @@ -0,0 +1,140 @@ +--- +name: Syncro lessons — incidents, quotes, and the events behind the rules +description: Detail and incident archive backing the Syncro feedback rules. 
Read this when you need to judge an edge case, verify a rule is still right, or understand WHY a rule exists. Tickets, dates, verbatim quotes, and tech user_id table live here so the main rule files can stay terse. Companion to feedback_syncro_api / billing / workflow. +metadata: + type: feedback +--- + +This file is the **incident archive** behind [[feedback_syncro_api]], [[feedback_syncro_billing]], and [[feedback_syncro_workflow]]. Those three files state the rules; this one preserves the events, quotes, and specifics so you can judge edge cases and verify a rule still applies. Read on-demand, not at session start. + +--- + +## Tech user_id table + +Used by [[feedback_syncro_billing]] Section "Corrections preserve attribution". + +| Tech | user_id | +|---------|---------| +| Mike | `1735` | +| Howard | `1750` | +| Winter | `1737` | +| Rob | `1760` | + +--- + +## Labor / hardware product IDs (verify rates live) + +Used by [[feedback_syncro_billing]]. Rates change — always `GET /products/` for `.product.price_retail`. The product IDs are stable; the table in `.claude/commands/syncro.md` is authoritative for the live list. + +| product_id | name | role | typical rate | +|------------|-----------------------------------------------|--------------------------|--------------| +| `1190473` | Labor - Remote Business | remote (most common) | $150 | +| `26118` | Labor - Onsite Business | onsite | $175 | +| `573881` | Labor - In-Shop Business | in-shop | (check live) | +| `26184` | Labor - Emergency or After Hours Business | emergency/after-hours | $262.50 | +| `1049360` | Labor- Warranty work | warranty / no-charge | $0 | +| `32252` | Hardware | generic estimate hardware| $0 base | + +--- + +## Incidents — by ticket / date + +### #32333 (2026-05-28) — Content-Type missing +- **Posted a comment** without `-H "Content-Type: application/json"`. Syncro returned a **400 HTML error page** twice before the fix. 
The HTML response looks like a hard failure but it's just a missing header. +- **Lesson** → [[feedback_syncro_api]] Section "Content-Type required". + +### #32332 (Cascades — Chris Knight new-user setup, 2026-05-27) — multi-lesson +This one ticket informed several rules at once: + +1. **Fabricated labor names.** Product `26118` ("Labor - Onsite Business") was billed on two lines as `"Emergency Call Setup"` and `"Onsite Computer Setup"` — both invented. Breaks the Syncro → QuickBooks sync because QB maps each labor line to an existing item. + - Mike's exact words: *"You CANNOT make up labor items. You MUST use existing items only for all labor items… the labor item must use the ones that already exist in syncro (otherwise it messes things up in Quickbooks)."* + - **Lesson** → [[feedback_syncro_billing]] "Never invent labor names". + +2. **Prepaid emergency billed as a split made-up onsite/emergency.** 1.5 hours of emergency on a prepaid customer (Cascades has 27h block). Howard had split it into fabricated onsite + emergency lines. Correct shape: ONE line on `26184` at quantity `2.25` (hrs×1.5). + - This also confirmed the **2026-05-27 update**: prepaid emergency now uses `26184` at ×1.5 quantity, REPLACING the older "prepaid → onsite `26118` at ×1.5" rule. 26184 labels the line correctly as emergency and maps right in QuickBooks. + - **Lesson** → [[feedback_syncro_billing]] "Emergency labor". + +3. **Correction via Mike's API key would have stolen Howard's commission.** The original line was Howard's (`user_id=1750`). Correcting via `update_line_item` preserved `user_id=1750`. A remove + re-add would have defaulted the new line to Mike's `1735` (the API-key owner). + - **Lesson** → [[feedback_syncro_billing]] "Corrections preserve attribution". + +4. **Ticket ownership is sticky.** Mike confirmed (also 2026-05-27): adding notes or labor to a ticket does NOT change the ticket owner. Multiple techs routinely work the same ticket. 
Only change ticket ownership when explicitly asked. + - **Lesson** → [[feedback_syncro_billing]] same section. + +### #32312 (Winter, 2026-05-21) — wrong day of week +- "Saturday" was computed as **May 24**, which is actually a **Sunday**. The appointment landed on the wrong day and didn't appear where Winter expected it on the calendar. +- **Lesson** → [[feedback_syncro_workflow]] "Verify appointment day-of-week". Always display `Day YYYY-MM-DD` (e.g., "Saturday 2026-05-23") in the preview, never the numeric date alone. + +### 2026-05-21 (Mike) — timer workflow superseded +- Mike confirmed the timer workflow (`timer_entry → charge_timer_entry`) is not used. The previous rule requiring timers was wrong and caused repeated billing failures (wrong product on the timer, `product_id` silently ignored by `charge_timer_entry`). +- **Lesson** → [[feedback_syncro_billing]] "Bill with add_line_item directly". The timer-response-shape rule in [[feedback_syncro_api]] is now historical / reference-only. + +### #32304 (Cascades, 2026-05-20) — hardcoded rates wrong +- Hardcoded rate table in the skill had Labor - Remote Business at $150/hr. The correct rate was $175/hr. Rates vary by contract and change over time. +- **Lesson** → [[feedback_syncro_billing]] "Fetch rates live". + +### Kittle Design #32263 (Howard, 2026-05-08) — appointment owner +- Howard created an 11:30 AM onsite to set up Joshua. I auto-added Howard's `user_id` to the appointment's `user_ids` array without confirming whether Howard was the owner or just an attendee. +- Howard's direction: *"when setting up an appointment confirm the appointment owner — don't just add additional attendees."* +- **Lesson** → [[feedback_syncro_workflow]] "Ask the appointment owner explicitly". 
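The day-of-week check from #32312 above can be sketched in shell. This is a minimal illustration, assuming GNU `date` (the `-d` flag is not available in BSD/macOS `date`); the helper names `day_label` and `expect_day` are hypothetical and not part of the skill.

```bash
# day_label (hypothetical helper): print "Day YYYY-MM-DD" for an ISO date,
# so the preview always shows the weekday next to the numeric date.
# Assumes GNU date; LC_ALL=C keeps the weekday name in English.
day_label() {
  LC_ALL=C date -d "$1" '+%A %Y-%m-%d'
}

# expect_day (hypothetical helper): succeed only if the date falls on the
# weekday the requester named, e.g. expect_day Saturday 2026-05-23.
expect_day() {
  [ "$(LC_ALL=C date -d "$2" '+%A')" = "$1" ]
}
```

A caller might run `expect_day Saturday "$APPT_DATE" || echo "weekday mismatch: $(day_label "$APPT_DATE")"` before showing the appointment preview; `APPT_DATE` is likewise an illustrative variable name.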
+ +### #32225 (Sombra Residential, 2026-05-06) — warranty hack +- Picked product `1190473` (Labor - Remote Business, $150/hr) for a warranty cleanup, set `billable: false` on the (then-used) timer, and assumed the timer flag would zero the line. Syncro silently overrode `billable: false` and the line came in at **$75**. I patched `price_retail` to `$0` to "fix" it. +- Howard caught it: warranty has its own product in the dropdown, and **patching dollar amounts is never how this is solved**. Use `1049360`. +- The earlier guidance in `.claude/commands/syncro.md` ("Warranty → use closest labor product with `billable=false`") was wrong; warranty has its own product like Onsite, Remote, Emergency. +- **Lesson** → [[feedback_syncro_billing]] "Warranty → product 1049360, never patch the price". + +### #32253 (Cascades, 2026-05-05) — duplicate timers from wrong jq path (HISTORICAL) +- The skill doc used `TIMER_ID=$(jq -r '.timer.id // .timer_entry.id')` on a flat response → resolved to `null`. A `null` TIMER_ID broke `charge_timer_entry` ("Not found"). The script retried and created a duplicate timer. +- Created two 0.5hr duplicate timers; deleted one via `delete_timer_entry`. +- **Lesson** → [[feedback_syncro_api]] timer section. Timers are no longer the workflow (see 2026-05-21), so this is reference-only. + +### 2026-05-04 (Winter) — wrong labor type default, blank-contact rule generalized +Two adjacent lessons: +1. Tickets I created used "Prepaid project labor" as the auto-selected labor type. That product is **exempt** and does NOT consume hours from a customer's prepaid block. Block-hour accounting silently drifted. Winter is fixing them retroactively. + - **Lesson** → [[feedback_syncro_billing]] "Labor type matches delivery channel". +2. Winter generalized the prior Cascades-only blank-contact rule to "most customers" — blank `contact_id` lets Syncro apply company-level email defaults, which route to the right people. Setting a specific contact overrides that. 
+ - **Lesson** → [[feedback_syncro_workflow]] "Leave contact blank by default". + +### #32203 (Desert Auto Tech, 2026-04-23) — emergency was additive +- Billed "1h onsite + 1h emergency" as two additive lines = **$437.50**. The correct shape is ONE emergency line at time-and-a-half — 1 actual hour of emergency should bill at $262.50 (or $225 remote). +- **Lesson** → [[feedback_syncro_billing]] "Emergency = ×1.5 applied once". + +### #32142 (2026-04-23) — comment dup from wrapper-key error +- POST to `/comment` returned `{"comment": {...}}`. The script parsed `.id` (returning `null`), saw an error, and retried — creating a duplicate. +- **Lesson** → [[feedback_syncro_api]] "Response wrappers". POST `/comment` response is `{"comment": {...}}` — use `.comment.id`. Also: when GETting to verify, check ALL comments not just `[-3:]` — the new comment may not be the most recent if other activity occurred. + +### 2026-04-22 — ticket dup from retry +- A `POST /tickets` response looked wrong (null fields, jq error). The response wrapper is `{"ticket": {...}}` — `.ticket.id` not `.id`. Retried the POST, creating a duplicate ticket. +- **Lesson** → [[feedback_syncro_api]] "No idempotency — GET before retry" applies to ticket creation, not just comments. + +### #32185 — comment dup from shell-quoting error +- Subject contained an em-dash (`—`) → shell interpolation issue → the POST appeared to fail but actually succeeded. Retry created a duplicate comment. Comments cannot be deleted via API. +- **Lesson** → [[feedback_syncro_api]] "No idempotency". Hardening: write comment payloads to a temp file (e.g. `tmp/syncro_comment.json`) before posting — avoids shell quoting/encoding failures that produce misleading errors on requests that actually succeeded. + +### HTML formatting incident — `
<ul>/<li>` rendered as one line +- Posted a comment with `<ul><li>` items; Syncro's renderer collapsed them into a single line with no spacing. Had to post a corrected duplicate. +- **Lesson** → [[feedback_syncro_api]] "HTML formatting". Use `<br>` inside a `<p>
          ` wrapper for bulleted lists. + +### Cascades — Meredith Kuhn keeps being the wrong default +- At Cascades of Tucson (`customer_id 20149445`), Syncro's contact picker repeatedly pre-selects **Meredith Kuhn** (Assistant Manager, ASSISTMAN-PC). She's the wrong contact — assigning her overrides distribution emails and routes notifications only to her. +- Howard surfaced the pattern; Mike confirmed the global blank-contact rule on 2026-05-24. +- **Lesson** → [[feedback_syncro_workflow]] "Cascades-specific guard". + +--- + +## Superseded rules — kept for context + +| Rule (old) | Replaced by | Date | +|-------------------------------------------------------------|--------------------------------------------------------------------------|--------------| +| All time billing through `timer_entry → charge_timer_entry` | Bill with `add_line_item` directly | 2026-05-21 | +| Prepaid emergency → onsite `26118` at ×1.5 | Prepaid emergency → emergency `26184` at qty ×1.5 | 2026-05-27 | +| Blank contact only at Cascades | Blank contact for ALL customers by default | 2026-05-04 | +| Warranty → closest labor product with `billable=false` | Warranty → product `1049360` (its own product) | 2026-05-06 | + +--- + +## Related ACG-internal references + +- **Skill doc:** `.claude/commands/syncro.md` — current labor product table, billing workflow, examples. +- **API docs:** `api-docs.syncromsp.com` (Swagger spec). +- **Tenant attribution rule (cross-product):** per-user-key attribution; see `/syncro` Attribution rule and `[[feedback_psa_default_syncro]]`. diff --git a/.claude/memory/feedback_syncro_html.md b/.claude/memory/feedback_syncro_html.md deleted file mode 100644 index 5884666..0000000 --- a/.claude/memory/feedback_syncro_html.md +++ /dev/null @@ -1,17 +0,0 @@ ---- -name: Syncro comment HTML formatting -description: Use
<br> for line breaks in Syncro comments, not <ul>/<li> — list tags don't render -type: feedback -originSessionId: b39e319c-ac3e-49f5-afb6-755e08f1fd82 --- -Use `<br>` for line breaks in Syncro comment bodies. Do NOT use `<ul>`, `<li>`, or other block-level list tags — Syncro's renderer collapses them into a single line with no spacing. - -**Why:** Posted a comment with `<ul><li>` items and they all ran together on one line in the ticket view. Had to post a corrected duplicate. - -**How to apply:** For any bulleted list in a Syncro comment, use: -``` -- Item one<br> -- Item two<br> -- Item three -``` -wrapped in a `<p>` tag. Never use `<ul>/<li>`. diff --git a/.claude/memory/feedback_syncro_labor_tax.md b/.claude/memory/feedback_syncro_labor_tax.md deleted file mode 100644 index 20296dd..0000000 --- a/.claude/memory/feedback_syncro_labor_tax.md +++ /dev/null @@ -1,14 +0,0 @@ ---- -name: feedback-syncro-labor-tax -description: Labor is never taxable in Arizona — always set taxable=false on labor line items in Syncro -metadata: - node_type: memory - type: feedback - originSessionId: d91f202e-ddd5-46d7-b674-f848eb78aa8e ---- - -Always pass `"taxable": false` explicitly on labor line items via `add_line_item`. - -**Why:** Labor products are configured with `taxable: false` in Syncro, but the `add_line_item` API endpoint does not inherit the product's taxable setting — it posts the line item as `taxable: true` regardless of the product config. - -**How to apply:** Include `"taxable": false` in every `add_line_item` payload for labor products (remote, onsite, in-shop, emergency, prepaid). The product itself is correct; the API just doesn't carry it through. diff --git a/.claude/memory/feedback_syncro_labor_type.md b/.claude/memory/feedback_syncro_labor_type.md deleted file mode 100644 index 890acbb..0000000 --- a/.claude/memory/feedback_syncro_labor_type.md +++ /dev/null @@ -1,24 +0,0 @@ ---- -name: Syncro — use a billable labor type (in-shop / onsite / remote / web), never "Prepaid project labor" -description: When billing Syncro tickets, the labor product on the line item MUST be one of in-shop, onsite, remote, or web labor. "Prepaid project labor" is an exempt labor type and will NOT draw down a customer's prepay block — using it silently breaks block-hour accounting. -type: feedback ---- - -**Rule:** Line items on Syncro tickets must use a billable labor product matching the work delivery channel: **in-shop**, **onsite**, **remote**, or **web labor**. Do NOT use **"Prepaid project labor"** as the labor type for normal work.
- -**Why:** Winter caught me on 2026-05-04 using "Prepaid project labor" by default. That product is **exempt** — it does not consume hours from a customer's prepaid block. So even if the ticket is for a prepay customer and looks billed correctly on the invoice, the block balance never decrements. Block-hour accounting silently drifts. Only the four non-exempt labor types (in-shop / onsite / remote / web) burn block time as intended. - -**How to apply:** - -- **Picking labor type:** Match it to how the work was actually delivered: - - **Remote labor** — work done over remote tools (RDP, Splashtop, ScreenConnect, phone-only support, scripts). This will be the most common pick. - - **Onsite labor** — work done at the client's physical location. - - **In-shop labor** — hardware brought to ACG's office for repair/build. - - **Web labor** — purely cloud/portal work (Microsoft 365 admin center, Entra, Cloudflare, etc.) where there's no remote-into-a-machine component. (Confirm with Winter if this distinction matters in your situation — sometimes "remote" is the right pick even for cloud work.) -- **Resolving the product_id:** Use `GET /products?search=remote+labor` (etc.) to pull the right product_id for the labor type, then pass that as `product_id` on the `add_line_item` POST. -- **Never default to "Prepaid project labor"** unless explicitly directed. If you find an existing entry with that product on a normal billable ticket, flag it — Winter (or whoever) will need to retroactively switch the labor type so the block decrement actually posts. -- **Verifying:** After billing, check that the customer's prepay block balance dropped by the expected number of hours. If it didn't, the labor type was wrong. - -**Real-world incident — 2026-05-04:** Tickets I created on this date used "Prepaid project labor" as the auto-selected labor type. Winter is fixing them retroactively. 
Going forward, default to `Remote labor` for the typical remote-support ticket, then adjust per delivery channel. - -**Where this lands in skill code:** `.claude/commands/syncro.md` and the `syncro` skill workflow examples need to make labor-type selection an explicit step in the add_line_item billing workflow, not a silent default. diff --git a/.claude/memory/feedback_syncro_line_items.md b/.claude/memory/feedback_syncro_line_items.md deleted file mode 100644 index ed8d34f..0000000 --- a/.claude/memory/feedback_syncro_line_items.md +++ /dev/null @@ -1,24 +0,0 @@ ---- -name: feedback_syncro_line_items -description: Correct Syncro API endpoint for adding labor/product line items to tickets -metadata: - node_type: memory - type: feedback - originSessionId: 282e0176-1bdb-49b7-8c15-faf152774d7e ---- - -Use `POST /api/v1/tickets/{internal_ticket_id}/add_line_item` to add line items to tickets. Both `name` and `description` fields are required (422 if either missing). Never use timers. - -**Why:** `/line_item`, `/line_items`, and PUT `line_items_attributes` all 404. The correct endpoint was found via Syncro Swagger spec at api-docs.syncromsp.com. Mike has explicitly said never use timers. 
- -**How to apply:** -- Path uses internal ticket ID (e.g., 111387456), not ticket number (32339) -- Required fields: `name`, `description`, `quantity`, `price`, `taxable` (and `product_id` if catalog item) -- Response is a flat object — parse `.id` directly (not `.line_item.id`) -- For testing/practice, use internal ACG account only (customer ID 15353550) - -Example: -``` -POST /api/v1/tickets/111387456/add_line_item -{"product_id":1049360,"name":"Labor- Warranty work","description":"...","quantity":1,"price":0.0,"taxable":false} -``` diff --git a/.claude/memory/feedback_syncro_live_rates.md b/.claude/memory/feedback_syncro_live_rates.md deleted file mode 100644 index f23b274..0000000 --- a/.claude/memory/feedback_syncro_live_rates.md +++ /dev/null @@ -1,18 +0,0 @@ ---- -name: feedback-syncro-live-rates -description: Always fetch Syncro labor rates live from the API — never use hardcoded rate table -metadata: - node_type: memory - type: feedback - originSessionId: d91f202e-ddd5-46d7-b674-f848eb78aa8e ---- - -Always fetch `price_retail` live from `GET /products/` → `.product.price_retail` before billing any Syncro line item. Never use the rate table in the skill as a source of truth for dollar amounts. - -**Why:** The hardcoded rate table was proven wrong on 2026-05-20 (ticket #32304, Cascades) when Labor - Remote Business was listed at $150/hr but the correct rate was $175/hr. Rates vary by contract and change over time. - -**How to apply:** In any billing workflow, fetch the rate immediately after selecting the product_id: -```bash -RATE=$(curl -s "${BASE}/products/${PRODUCT_ID}?api_key=${API_KEY}" | jq -r '.product.price_retail') -``` -Use this `$RATE` value for the Ollama draft prompt, the preview shown to the user, and the `price_retail` field in all payloads. The product ID table in the skill is still valid — just not the rate column. 
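As a hedged sketch of how the live-rate fetch above might be guarded: `jq -r` prints the literal string `null` when `.product.price_retail` is missing, so it can help to validate `$RATE` before it reaches a payload. The helper name `require_rate` is hypothetical, and rejecting a rate of 0 is an assumption that fits billable labor only (the warranty product is legitimately $0).

```bash
# require_rate (hypothetical helper): succeed only if the argument looks like
# a positive decimal number. Rejects "", "null", and non-numeric text.
require_rate() {
  case "$1" in
    ''|null|*[!0-9.]*|*.*.*) return 1 ;;
  esac
  # awk does the numeric comparison; it exits 0 only when the value is > 0.
  awk -v r="$1" 'BEGIN { exit !(r + 0 > 0) }'
}
```

One possible use is `require_rate "$RATE" || { echo "rate lookup failed; not billing" >&2; exit 1; }` placed right after the fetch, so an empty or `null` rate stops the workflow instead of posting a $0 line.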
diff --git a/.claude/memory/feedback_syncro_no_madeup_labor_items.md b/.claude/memory/feedback_syncro_no_madeup_labor_items.md deleted file mode 100644 index 0aba93b..0000000 --- a/.claude/memory/feedback_syncro_no_madeup_labor_items.md +++ /dev/null @@ -1,12 +0,0 @@ ---- -name: feedback-syncro-no-madeup-labor-items -description: NEVER invent or rename Syncro labor line items — every labor line must use an existing product with its REAL name (from GET /products/); work detail goes in the description field, not the name -metadata: - type: feedback ---- - -Every labor line item on a Syncro ticket/invoice MUST be an **existing Syncro product, billed under its REAL name** (fetched from `GET /products/` → `.product.name`) with the live `price_retail`. **NEVER make up a custom line-item name** — even when the `product_id` is a real product. The line's `name` field = the product's actual name, verbatim. Put any work-specific narrative in the `description` field, never by renaming the line. - -**Why:** Mike flagged ticket #32332 (Cascades — Chris Knight new-user setup), where product `26118` (real name **"Labor - Onsite Business"**) was billed on two lines as **"Emergency Call Setup"** and **"Onsite Computer Setup"** — fabricated names. Invented/renamed labor items break the **Syncro -> QuickBooks sync** — QB maps each labor line to an existing item, so a fabricated name has no QB match and messes up the accounting (Mike's stated reason). The **`description` field is free text and can be whatever the work needs** — only the `name`/product must be an existing Syncro item. Mike: "You CANNOT make up labor items. You MUST use existing items only for all labor items... the labor item must use the ones that already exist in syncro (otherwise it messes things up in Quickbooks)." - -**How to apply:** When adding ANY labor line — `GET /products/`, copy `.product.name` verbatim into `name`, use `.product.price_retail` for `price_retail`, `taxable:false` for labor. 
Pick the correct EXISTING labor product (remote `1190473` "Labor - Remote Business" $150, onsite `26118` "Labor - Onsite Business" $175, emergency/after-hours `26184` "Labor - Emergency or After Hours Business" $262.50, in-shop `573881`, warranty `1049360`, etc. — full table in `/syncro`). Differentiate the work in `description`, not `name`. If no existing product fits the need, STOP and ask Mike — do not invent one. Related: [[feedback-syncro-live-rates]], [[feedback-syncro-warranty-product]]. diff --git a/.claude/memory/feedback_syncro_timer_first.md b/.claude/memory/feedback_syncro_timer_first.md deleted file mode 100644 index 3b5b0ca..0000000 --- a/.claude/memory/feedback_syncro_timer_first.md +++ /dev/null @@ -1,18 +0,0 @@ ---- -name: Syncro — use add_line_item for billing, not timers -description: Syncro billing uses add_line_item directly. Timer workflow (timer_entry → charge_timer_entry) is not used. Overrides previous rule about timers being required. -type: feedback ---- - -**Rule:** Bill Syncro tickets with `POST /tickets/{id}/add_line_item` directly. Do NOT use `timer_entry → charge_timer_entry`. - -**Why:** Mike confirmed 2026-05-21 that the timer workflow is not used. The previous rule requiring timers was wrong and caused repeated billing failures (wrong product on the timer, product_id silently ignored by charge_timer_entry, etc.). - -**How to apply:** - -- `add_line_item` is the billing path for all work: labor, warranty, internal, hardware. -- Set `product_id`, `quantity` (decimal hours), `price_retail` (fetched live), `name`, `description`, `taxable: false`. -- Do not create timer entries as part of billing. -- Timer endpoints still exist in Syncro but are not part of the ACG billing workflow. - -**Previous rule (SUPERSEDED):** "All work-time billing MUST go through timer_entry → charge_timer_entry." That rule is no longer in effect as of 2026-05-21. 
diff --git a/.claude/memory/feedback_syncro_timer_response_shape.md b/.claude/memory/feedback_syncro_timer_response_shape.md deleted file mode 100644 index 5823819..0000000 --- a/.claude/memory/feedback_syncro_timer_response_shape.md +++ /dev/null @@ -1,52 +0,0 @@ ---- -name: Syncro — timer_entry response is FLAT, not wrapped -description: POST /tickets/{id}/timer_entry returns a flat object {"id": N, "ticket_id": ..., "product_id": ..., ...}, NOT wrapped in {"timer": {...}} or {"timer_entry": {...}}. Parse as `.id`, never `.timer.id` — using the wrapped pattern silently returns null and creates duplicate timers when the script "retries". -type: feedback ---- - -> **SUPERSEDED / HISTORICAL — 2026-05-21.** Timers are no longer part of the ACG Syncro -> billing workflow; billing uses `add_line_item` directly. See [[Syncro — use add_line_item for billing, not timers]] (`feedback_syncro_timer_first.md`). Keep this note ONLY as reference -> for the rare case a timer is created manually — do not treat it as current workflow. - -**Rule:** When parsing the response from `POST /tickets/{id}/timer_entry`, use `.id` directly — the response is a FLAT object. Do NOT use `.timer.id // .timer_entry.id`. - -**Verified response shape (2026-05-05, ticket #32253):** -```json -{ - "id": 39031258, - "ticket_id": 109895882, - "user_id": 1750, - "start_time": "2026-05-05T09:00:00.000-07:00", - "end_time": "2026-05-05T09:30:00.000-07:00", - "recorded": false, - "billable": true, - "notes": "...", - "product_id": 26118, - "comment_id": null, - "ticket_line_item_id": null, - "active_duration": 1800, - "billable_time": 1800 - ... -} -``` - -**Why:** The skill doc at `.claude/commands/syncro.md` shows -```bash -TIMER_ID=$(echo "$TIMER_RESP" | jq -r '.timer.id // .timer_entry.id') -``` -That fallback resolves to `null` because neither key exists on the flat response. A `null` TIMER_ID then breaks `charge_timer_entry` ("Not found"). 
If the script retries the timer_entry POST after the perceived failure, it creates a duplicate — Syncro has no idempotency. Hit this on ticket #32253 (Cascades) on 2026-05-05; created two duplicate 0.5hr timers and had to delete one via `delete_timer_entry` before charging. - -**How to apply:** - -- **Parsing:** Always `jq -r '.id'` on the timer_entry response. -- **After ANY ambiguous timer_entry response** (null `.id`, jq error, network blip): GET the ticket and inspect `.ticket.ticket_timers[]` BEFORE retrying. Filter for `recorded: false` entries with the start/end times you just sent. -- **Cleanup if duplicates exist:** `POST /tickets/{id}/delete_timer_entry` with `{"timer_entry_id": N}` for the older duplicate(s). Returns `{"success": true}`. -- **Verifying the timer is on the ticket:** `GET /tickets/{id}` → `.ticket.ticket_timers` is the authoritative list. The standalone `/ticket_timers?ticket_id=N` query parameter does NOT filter by ticket — returns the entire global timer history. - -**Charge timer response is also flat:** -```json -{"id": 39031258, "recorded": true, "ticket_line_item_id": 42313052, ...} -``` -Parse as `.ticket_line_item_id` to get the auto-generated line. Do not look for a wrapper. - -**Where this lands in skill code:** `.claude/commands/syncro.md` example block needs `.id` not `.timer.id // .timer_entry.id`. Until the skill is patched, override the example pattern when running. diff --git a/.claude/memory/feedback_syncro_warranty_product.md b/.claude/memory/feedback_syncro_warranty_product.md deleted file mode 100644 index efe49fb..0000000 --- a/.claude/memory/feedback_syncro_warranty_product.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -name: Syncro — warranty work uses the "Labor- Warranty work" product, never patch a billable product to $0 -description: For warranty/no-charge labor on Syncro tickets, use product_id 1049360 (Labor- Warranty work, $0/hr). Do NOT use a regular labor product with billable=false or a patched price_retail=0. 
Prices are determined by the product selected; never override the dollar amount to make one product behave like another. -type: feedback ---- - -**Rule (two parts):** - -1. **Warranty / no-charge labor uses product `1049360` "Labor- Warranty work" ($0/hr, non-taxable).** Don't pick a regular Remote/Onsite/etc. labor product and try to neutralize it. - -2. **Prices are set by selecting the correct product. Never change `price_retail` on a line item to make a different labor product behave like a warranty (or any other) product.** If you find yourself reaching for `update_line_item` to drop a price, that's the signal to back up and pick a different `product_id` instead. - -**Why:** On 2026-05-06 (ticket #32225 Sombra Residential), I chose product `1190473` (Labor - Remote Business, $150/hr) for a follow-up warranty cleanup, set `billable: false` on the timer, and assumed the timer flag would zero the line. Syncro silently overrode `billable: false` and the resulting line came in at $75. I patched `price_retail` to $0 to "fix" it. Howard caught it: warranty work has a dedicated product in the dropdown, and patching dollar amounts is never how this is solved. The earlier guidance in `.claude/commands/syncro.md` (the "Warranty / no-charge → use closest labor product with billable=false" rule) was wrong; warranty has its own product just like Onsite, Remote, Emergency, etc., and that product is what should be used. - -**How to apply:** - -- **For any warranty / no-charge work:** `product_id = 1049360`, qty = actual hours, no need to patch the line — it generates at $0 because the product's `price_retail` is $0. -- **The warranty product is $0 by design — don't fake a free line with flags.** Its `price_retail` is $0, so the line generates at $0 from `price_retail` × `quantity`. Do NOT take a regular labor product and try to neutralize it with `billable: false`; that was the original mistake (see Why — and Syncro silently overrode the flag in the timer era anyway). 
Pick `1049360`. -- **Never reach for `update_line_item` to drop a price as a workaround.** If the dollar amount on a line is wrong, the wrong product was selected — undo, pick the correct product, redo. The only legitimate use of `update_line_item price_retail` is the Syncro auto-gen-zero recovery case (when the auto-line came in at $0 instead of the product's actual rate), and even that is a Syncro bug we're patching around, not a price-management tool. -- **For the dropdown of available labor products,** see the rate table in `.claude/commands/syncro.md`. If the situation doesn't match any of those, ask before improvising. - -**Where this lands in skill code:** `.claude/commands/syncro.md` — added `1049360` to the labor product table, fixed the warranty branch in the billing workflow, and added an explicit "never patch price_retail to convert products" rule. diff --git a/.claude/memory/feedback_syncro_workflow.md b/.claude/memory/feedback_syncro_workflow.md new file mode 100644 index 0000000..23908d9 --- /dev/null +++ b/.claude/memory/feedback_syncro_workflow.md @@ -0,0 +1,69 @@ +--- +name: Syncro workflow rules — preview comments, appointment ownership/dates, blank contact +description: Process and etiquette rules for Syncro work — always preview comments before posting, verify appointment day-of-week before creating, ask who the appointment owner is, leave the contact field blank by default for all customers. +metadata: + type: feedback +--- + +Rules only. Incident detail and ticket numbers live in [[feedback_syncro_history]] — read on-demand. API mechanics: [[feedback_syncro_api]]. Billing rules: [[feedback_syncro_billing]]. + +--- + +## 1. ALWAYS preview comments before posting — no exceptions + +Show the full comment text and wait for explicit confirmation before posting **any** comment to a Syncro ticket. No exceptions — not billing, not resolution notes, not client-facing, not internal/hidden notes. 
+ +**Apply:** draft → show as a formatted block → "Good to post?" → wait for yes → only then POST. Also ALWAYS ask for minutes + labor type before logging time — never assume a default. + +Once posted, comments can't be deleted via API (manual GUI cleanup required). See also [[feedback_syncro_api]] §2 (no idempotency). + +--- + +## 2. Verify appointment day-of-week before creating + +Print the weekday and confirm it matches intent before posting: +```bash +py -c "import datetime; d=datetime.date(2026,5,24); print(d.strftime('%A %Y-%m-%d'))" +# Sunday 2026-05-24 +``` + +**Catch:** always show `Day YYYY-MM-DD` (e.g. "Saturday 2026-05-23") in the preview — never just the numeric date. + +--- + +## 3. Ask explicitly who the appointment owner is + +When creating a ticket with an appointment (Onsite, Remote, Phone Call, etc.), explicitly ask **who the appointment owner is** in the preview. Do NOT default to the ticket's assigned tech; do NOT silently add other techs as attendees. + +The owner is whose calendar the appointment lands on as the **primary entry** — they are accountable for being there. Additional `user_ids` only add the entry as secondary items on other techs' calendars (clutter + ambiguity). + +**Preview format:** +``` +APPOINTMENT +----------- +Type: Onsite +Owner: +Additional attendees: (optional, blank unless explicitly added) +Start: +End: +``` + +API payload: owner is the FIRST entry in `user_ids`. If only the owner, `user_ids` has ONE id. Additional attendees only after explicit confirmation. + +**Don't:** auto-add the ticket's `user_id` as owner without asking; add attendees without direction; treat owner as passive inheritance. + +--- + +## 4. Leave the contact field BLANK by default + +When creating or billing tickets, leave `contact_id` / `contact_name` / `contact_email` blank ("Not Assigned") by default for any customer. Only set a contact when the user explicitly says to. 
+ +Blank contact lets Syncro apply **company-level email defaults**, routing notifications to the right people. Setting a specific contact overrides those and may push to a secondary email, bypassing the customer's intended distribution. + +**Apply:** +- `POST /tickets`: omit `contact_id` from the body entirely. Don't fetch contacts via `GET /customers/{id}` and pick one. +- `PUT /tickets/{id}`: send only fields you're changing. NEVER include `contact_id`/`contact_name`/`contact_email`, even matching the existing value — PUT can re-apply. +- Billing/invoices: same rule. Drop `contact_id` if it shows up in any payload. +- Verify after writes: `GET /tickets/{id}` → `.ticket.contact_id` should be `null`. If set, blank it: `PUT /tickets/{id}` with `{"contact_id": null}`. + +**Watch for Syncro's auto-default:** the contact picker often pre-selects the first-alphabetical or most-recently-used contact (Cascades' Meredith Kuhn is the canonical bad-default — see [[feedback_syncro_history]]). The rule applies the same way: ignore the pre-selection, leave null. diff --git a/.claude/memory/gururmm-development-principles.md b/.claude/memory/gururmm-development-principles.md deleted file mode 100644 index ff76578..0000000 --- a/.claude/memory/gururmm-development-principles.md +++ /dev/null @@ -1,108 +0,0 @@ ---- -name: GuruRMM Development Principles -description: Every GuruRMM feature is full-stack (backend+API+UI+docs+scalability); product works without AI; the FEATURE_ROADMAP entry update is part of definition-of-done. Mirrors guru-rmm/docs/DESIGN.md. -type: project ---- - -# GuruRMM Development Principles - -**Created:** 2026-04-29 -**Authority:** Mike Swanson (owner) -**Location:** Documented in `projects/msp-tools/guru-rmm/docs/DESIGN.md` - ---- - -## Holistic Feature Development (MANDATORY) - -When planning or implementing ANY GuruRMM feature, the complete stack must be considered and built: - -### Required Components for Every Feature: -1. 
**Backend/Agent Logic** — core capability implementation -2. **API Endpoints** — control and monitoring interfaces -3. **UI/UX** — dashboard configuration, status display, management interface -4. **Documentation** — user guides and operational docs -5. **Scalability Design** — architected for future expansion - -### Example: Network Discovery Node -A complete implementation includes: -- Agent-side scanning capability (ICMP, ARP, SNMP) -- Server-side data storage and API endpoints -- Dashboard UI for: - - Designating which agent is the discovery node - - Viewing discovered devices - - Configuring scan schedules - - Setting IP ranges and exclusions -- Status indicators (discovery progress, last scan time) -- Future-proof data model supporting multiple discovery methods - -### Why This Matters: -- **Completeness:** Features without UI are unusable by non-API-expert admins -- **User Experience:** Configuration should be intuitive, not require documentation diving -- **Consistency:** Every feature should feel native to the product -- **No Dead Ends:** Design decisions shouldn't block obvious next steps - -**Features shipped without their UI/configuration interfaces are incomplete and will be rejected.** - ---- - -## Living Roadmap (MANDATORY) - -`projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md` is the single living record of intent — where the product is going AND where it has been. It is a status-and-plan tracker, NOT a write-once backlog. Convention: `[ ]` = planned, `[x]` = shipped (annotate with date). - -**Consult it going in, update it coming out — the roadmap update is part of definition-of-done:** -- **Before building:** read the feature's roadmap entry for intent/scope. New work that isn't on the roadmap gets an entry first. -- **When shipping or modifying a feature:** update its roadmap entry in the SAME change — flip `[ ]`→`[x]` with a date, or revise/add the item. 
A code change that ships or alters a roadmap feature WITHOUT touching FEATURE_ROADMAP.md is incomplete (same standard as shipping without UI). -- **Don't over-claim:** an entry's text must match what's actually built. If only part is done, keep `[ ]` and annotate the scope (e.g. "TCP probing shipped; ICMP/ARP/SNMP pending") rather than flipping. - -`/rmm-audit`'s roadmap pass is the **backstop** that reconciles drift — it is not the primary maintainer. Dev work keeps the roadmap honest; the audit catches what slipped. See [[feedback_rmm_dev_is_mike]] (RMM dev is Mike's). - ---- - -## AI-Optional Operation - -GuruRMM must be fully functional without requiring AI agents (Claude, autonomous analysis tools) to operate. - -### Core Requirements: -- All functionality accessible via traditional dashboard/API -- Configuration and management through standard interfaces -- Usable by MSP techs with zero AI/ML knowledge -- Deterministic, reliable operation for production environments - -### AI Features Are Enhancements: -- **Agentic analysis** (AI-powered log analysis, anomaly detection, troubleshooting) — planned enhancement -- **Agentic command routing** (intelligent decision-making about command execution) — planned enhancement -- Users choose whether to enable AI features -- Product does not mandate AI usage - -### Why This Matters: -- Real MSPs need deterministic, reliable systems -- AI features can break, hallucinate, or be unavailable -- Core operations cannot depend on AI availability -- Production stability over experimental features - ---- - -## Application to Development - -### When Adding Features: -1. ✅ Design the complete stack before starting implementation -2. ✅ Include UI mockups in feature planning -3. ✅ Consider future expansion in data model design -4. ✅ Ensure feature works via dashboard without API knowledge -5. ✅ Never assume AI availability for core functionality - -### When Reviewing Features: -1. ❌ Reject backend-only implementations without UI -2. 
❌ Reject features that require API expertise to configure -3. ❌ Reject designs that paint into architectural corners -4. ❌ Reject features that require AI to function - -### Planning Questions: -- "How does an admin configure this in the dashboard?" -- "What does the status display look like?" -- "How do we expand this in v2/v3?" -- "Does this work if AI services are unavailable?" - ---- - -**These principles apply to ALL features — past, present, and future.** diff --git a/.claude/memory/project-cascades-migration-plan.md b/.claude/memory/project-cascades-migration-plan.md deleted file mode 100644 index 78764b4..0000000 --- a/.claude/memory/project-cascades-migration-plan.md +++ /dev/null @@ -1,20 +0,0 @@ ---- -name: project-cascades-migration-plan -description: Cascades of Tucson department migration plan — Syncro ticket, plan file location, resume command -metadata: - type: project ---- - -Active multi-day migration project for Cascades of Tucson. Department-by-department domain join, folder redirection, and Entra sync rollout. - -**Why:** Full migration from workgroup/cloud-only to domain-integrated environment with clean end state (everything works automatically on fresh machine domain join). - -**Syncro ticket:** https://computerguru.syncromsp.com/tickets/110680053 — update with notes after each session. - -**Plan file:** `C:\Users\Howard\.claude\plans\wise-discovering-panda.md` - -[VERIFY 2026-05-26 — plan-file path C:\Users\Howard\... is machine-specific (Howard's box); confirm it resolves on ACG-TECH03L/Howard-Home or relocate the plan into the synced repo. Cascades m365-rollout still active/blocked.] - -**Resume command:** Howard says "resume the Cascades migration plan" → read plan file, check CURRENT SAVE POINT section, pick up at next unchecked item. - -**How to apply:** At every Cascades session start, read the plan file CURRENT SAVE POINT before doing any work. Update the save point and run /save at end of session. 
diff --git a/.claude/memory/project_cascades.md b/.claude/memory/project_cascades.md new file mode 100644 index 0000000..08cbb87 --- /dev/null +++ b/.claude/memory/project_cascades.md @@ -0,0 +1,55 @@ +--- +name: Cascades of Tucson — current state (migration, admin, CA rollout, billing) +description: Active state of the Cascades migration — Syncro ticket #110680053, plan file (machine-specific path), admin accounts (sysadmin@ = Howard, admin@ = Mike, not break-glass), CA caregiver pilot (Phase B / SG-Caregivers-Pilot, scope group-only never tenant-wide), prepaid block ~37.5h (rate TBD). Active rules in feedback_cascades.md, incident detail in project_cascades_history.md. +type: project +--- + +Rules: [[feedback_cascades]]. Detail / decisions / pilot-cleanup checklist: [[project_cascades_history]]. + +## Migration + +Multi-day department-by-department migration from workgroup/cloud-only to domain-integrated environment. Clean end state: everything works automatically on a fresh-machine domain join. + +- **Syncro ticket:** https://computerguru.syncromsp.com/tickets/110680053 — update with notes after each session. +- **Plan file:** `C:\Users\Howard\.claude\plans\wise-discovering-panda.md` *(machine-specific path on Howard's box; confirm it resolves on ACG-TECH03L / Howard-Home or relocate into the synced repo)*. +- **Resume:** Howard says "resume the Cascades migration plan" → read plan file, check `CURRENT SAVE POINT`, pick up at next unchecked item. At session start, read the save point BEFORE doing any work; update + `/save` at session end. + +## Tenant + +Cascades Tucson tenant: `207fa277-e9d8-4eb7-ada1-1064d2221498`. + +## Admin accounts (daily-driver, NOT break-glass) + +- **`sysadmin@cascadestucson.com`** — Howard's working admin (used PIM portal click 2026-04-28 for CA Admin role). +- **`admin@cascadestucson.com`** — Mike's working admin. + +As of 2026-04-29, neither is confirmed cloud-only / FIDO2 / CA-excluded. 
**A break-glass admin still needs to be designed** before CA bypass policies go live. Don't assume sysadmin@ / admin@ meet break-glass criteria — verify against Graph (`onPremisesSyncEnabled`, authentication methods, CA exclusions) first. + +## CA caregiver pilot — phased, group-scoped + +The caregiver bypass CA work is a **phased rollout**, not a tenant-wide cutover. The original §5 design in `clients/cascades-tucson/docs/cloud/user-account-rollout-plan.md` and the 2026-04-29 resume-point implied tenant-wide; that was corrected. + +- New CA policies target `SG-Caregivers-Pilot` only (then `SG-Caregivers` after Entra Connect exits staging). Never `includeUsers: All`. +- The legacy `Require multifactor authentication for all users` policy **stays in place**. PATCH its `excludeGroups` to add the pilot group; existing office-staff behavior is unchanged. +- Expansion to other populations happens one group at a time post-pilot. Legacy all-users-MFA is deleted only at the very end when every population is governed by phased policies. + +**Caregiver policy set (current scope):** +- PATCH `Require multifactor authentication for all users`: add `SG-Caregivers-Pilot` to excludeGroups. +- CREATE `CSC - Block caregivers off Cascades network` (includeGroups: pilot, locations: not Cascades, grant: BLOCK). +- CREATE `CSC - Block caregivers on non-compliant device` (includeGroups: pilot, device filter `isCompliant -eq False`, grant: BLOCK). +- CREATE `CSC - Caregiver sign-in frequency 8h` (includeGroups: pilot, session control: 8h re-auth). + +For caregivers we use **Block** directly on non-compliant + off-network — caregivers can't satisfy MFA (no personal device), so block is the cleaner UX. Future non-caregiver populations will likely use MFA grants since office staff have MFA capability. + +## Billing + +Cascades is a **prepaid block** customer (Syncro `customer_id: 20149445`). Block had ~37.5h remaining as of 2026-05-20 (38.5h minus 1h for ticket #32304). 
+ +**Block rate:** NOT yet confirmed. $175/hr is the standard non-block remote rate, NOT necessarily the Cascades block rate. **Ask Mike before billing.** Invoices post at $0.00 with hours deducted by quantity. See [[feedback_syncro_billing]] §7 for emergency-on-prepaid mechanics. + +## Pilot cleanup checklist + +At pilot wrap (transition to production `SG-Caregivers`), the following MUST be cleaned up — surface this list when we get to "flip pilot CA policies to production": +- `pilot.test@cascadestucson.com` — delete (or disable + remove license; recovers a Business Premium seat). +- `howard.enos@cascadestucson.com` — if used during pilot validation, clean up (Howard's eventual synced identity won't exist as a cloud user until Entra Connect exits staging). +- `SG-Caregivers-Pilot` — remove from CA policy targets when superseded by synced `SG-Caregivers`; group itself can be deleted after. diff --git a/.claude/memory/project_cascades_admin_accounts.md b/.claude/memory/project_cascades_admin_accounts.md deleted file mode 100644 index 8d4821b..0000000 --- a/.claude/memory/project_cascades_admin_accounts.md +++ /dev/null @@ -1,16 +0,0 @@ ---- -name: Cascades admin account ownership -description: Howard uses sysadmin@cascadestucson.com, Mike uses admin@cascadestucson.com — used for daily admin work, not break-glass. -type: project ---- - -At Cascades Tucson tenant (`207fa277-e9d8-4eb7-ada1-1064d2221498`): - -- **`sysadmin@cascadestucson.com`** — Howard's working admin account (used the PIM portal click on 2026-04-28 for the CA Admin role assignment). -- **`admin@cascadestucson.com`** — Mike's working admin account. - -As of 2026-04-29, neither is confirmed as cloud-only / FIDO2 / CA-excluded — Howard "doesn't think they are cloud-only." A break-glass admin still needs to be designed before the CA bypass policies go live. - -**Why:** Avoid asking who owns which admin login again, and keep clear that these are *daily-driver* admin accounts, not the eventual break-glass. 
- -**How to apply:** When discussing Cascades admin work or break-glass design, attribute correctly. Don't assume sysadmin@ or admin@ already meet break-glass criteria — verify against Graph (onPremisesSyncEnabled, authentication methods, CA exclusions) before relying on either. diff --git a/.claude/memory/project_cascades_billing.md b/.claude/memory/project_cascades_billing.md deleted file mode 100644 index 83d1f6f..0000000 --- a/.claude/memory/project_cascades_billing.md +++ /dev/null @@ -1,14 +0,0 @@ ---- -name: project-cascades-billing -description: "Cascades of Tucson Syncro billing — prepaid block customer, rate TBD" -metadata: - node_type: memory - type: project - originSessionId: d91f202e-ddd5-46d7-b674-f848eb78aa8e ---- - -Cascades of Tucson (Syncro customer_id: 20149445) is a prepaid block customer. As of 2026-05-20 the block had ~37.5 hrs remaining (38.5 minus 1hr for ticket #32304). - -**Block rate:** Not yet confirmed — $175/hr is the standard non-block remote rate, NOT the Cascades block rate. Ask Mike before billing future Cascades tickets. - -**How to apply:** Always check prepay_hours before billing. Invoices post at $0.00 with hours deducted by quantity. Confirm block rate with Mike before setting price_retail. diff --git a/.claude/memory/project_cascades_ca_phased_rollout.md b/.claude/memory/project_cascades_ca_phased_rollout.md deleted file mode 100644 index 45a014f..0000000 --- a/.claude/memory/project_cascades_ca_phased_rollout.md +++ /dev/null @@ -1,26 +0,0 @@ ---- -name: Cascades CA bypass — phased per-group rollout, NOT tenant-wide -description: Caregiver bypass CA policies are scoped to SG-Caregivers-Pilot only at start, then expanded one department at a time. Legacy all-users-MFA stays in place; we PATCH excludeGroups, never delete it during rollout. -type: project ---- - -The Cascades caregiver bypass CA work is a **phased rollout**, not a tenant-wide policy swap. 
This corrects the original §5 design in `clients/cascades-tucson/docs/cloud/user-account-rollout-plan.md` and the resume-point in `2026-04-29-howard-cascades-bypass-pilot-phase-b-buildout.md`, which both implied a tenant-wide cutover. - -**What this means concretely:** - -- New CA policies target `SG-Caregivers-Pilot` only (then `SG-Caregivers` after Entra Connect exits staging). They do NOT use `includeUsers: All`. -- The legacy `Require multifactor authentication for all users` policy **stays in place**. We PATCH its `excludeGroups` to add the pilot group, so existing office-staff behavior is unchanged. -- Expansion to additional populations (front desk, clinical, admin staff) happens one group at a time post-pilot — each with its own scoped policy set, each by editing `excludeGroups` on the legacy policy and adding `includeGroups` to the relevant new policies. -- The legacy all-users-MFA policy is ONLY deleted at the very end, when every population is governed by a phased policy. - -**Why:** Howard pulled the brakes on 2026-04-29 after spotting that policies #1, #2, #3 in the original design hit all users — would have blocked any office user signing in off-site who wasn't in `SG-External-Signin-Allowed`. The btw replay he pasted contained the correct rescoping: "Re-scope the new policies so they only target the pilot group initially, and roll out to other groups one at a time later." Phased preserves today's behavior for everyone except the pilot group while we validate the bypass mechanics. - -**How to apply:** When building or modifying Cascades CA policies, default to group-scoped (`includeGroups`), never `includeUsers: All`. When expanding to a new department, the steps are: (1) create the department's group, (2) PATCH legacy all-users-MFA to add it to `excludeGroups`, (3) add it to `includeGroups` on the relevant new policies. Treat any "let's just push it tenant-wide now that the pilot worked" suggestion as a regression of this decision and flag it. 
- -**Caregiver set (the only set in scope today):** -- PATCH `Require multifactor authentication for all users`: add `SG-Caregivers-Pilot` to excludeGroups. -- CREATE `CSC - Block caregivers off Cascades network` (includeGroups: pilot, locations: not Cascades, grant: BLOCK). -- CREATE `CSC - Block caregivers on non-compliant device` (includeGroups: pilot, device filter isCompliant -eq False, grant: BLOCK). -- CREATE `CSC - Caregiver sign-in frequency 8h` (includeGroups: pilot, session control: 8h re-auth). - -Note: for caregivers we use **Block** directly on non-compliant + off-network, not "Require MFA" — caregivers can't satisfy MFA (no personal device), so block is the cleaner UX. For non-caregiver populations later, MFA grants will likely be appropriate since office staff have MFA capability. diff --git a/.claude/memory/project_cascades_history.md b/.claude/memory/project_cascades_history.md new file mode 100644 index 0000000..683dbd4 --- /dev/null +++ b/.claude/memory/project_cascades_history.md @@ -0,0 +1,52 @@ +--- +name: Cascades history — fdeploy root cause, CA rescoping decision, design rationale +description: Detail and rationale behind the active Cascades rules — fdeploy 502/ACL root cause and the Flags=1211→187 fix, the 2026-04-29 CA-policy rescoping decision (Howard pulled the brakes on tenant-wide rollout), and the per-user security-group decision. Read on-demand when judging an edge case or revisiting a design decision. +type: project +--- + +This file is the rationale archive for [[project_cascades]] and [[feedback_cascades]]. Read on-demand. + +--- + +## fdeploy folder-redirection root cause (the "stuck forever" failure) + +**Symptom:** new Cascades user logs in, folder redirection silently doesn't take effect. fdeploy logs "no changes detected" indefinitely. + +**Root cause:** `fdeploy1.ini` had `Flags=1211` which includes **Grant Exclusive Rights** (bit `0x400`). The Homes share grants `Domain Users = Change`, which excludes `WRITE_DAC`. 
fdeploy fails to set NTFS on new subfolders → logs 502 → **caches the failure** and never retries. + +**Fix:** changed to `Flags=187` in: +``` +{512B43A4-F049-4CE5-BFAC-860AD13E92BE}\User\Documents & Settings\fdeploy1.ini +``` +on CS-SERVER. + +**Why both GUID and legacy registry keys matter at the client side:** Downloads has no legacy-name key, so GUID alone works. Documents / Music / Pictures have BOTH `{GUID}` AND `Personal` / `My Music` / `My Pictures`. Windows reads the legacy key for the actual shell folder — GUID alone is insufficient. The recovery script `fix-shell-redirect.ps1` sets both. + +--- + +## CA policy rescoping decision (2026-04-29) + +The original §5 design in `clients/cascades-tucson/docs/cloud/user-account-rollout-plan.md` and the resume-point in `2026-04-29-howard-cascades-bypass-pilot-phase-b-buildout.md` both implied a **tenant-wide cutover**. Howard pulled the brakes on 2026-04-29 after spotting that policies #1, #2, #3 in the original design hit ALL users — would have blocked any office user signing in off-site who wasn't in `SG-External-Signin-Allowed`. + +The replay he pasted contained the correct rescoping: +> *"Re-scope the new policies so they only target the pilot group initially, and roll out to other groups one at a time later."* + +**Why phased:** preserves today's behavior for everyone except the pilot group while we validate the bypass mechanics. Tenant-wide cutover would have been a regression risk for office staff. + +**Operational application of this decision** is captured in [[project_cascades]] "CA caregiver pilot — phased, group-scoped". Treat any "let's just push it tenant-wide now that the pilot worked" suggestion as a regression of this decision and flag it. + +--- + +## Per-user security-group decision (2026-05-14) + +Howard explicitly **declined** an `OU=Caregivers` → `SG-Caregivers` auto-mirror script. 
Security-group membership controls access + CA-policy coverage; that decision should stay deliberate and reviewed per user, never automated away. + +OU placement is mechanical (controls Entra Connect sync scope). Group membership is an access-control decision and must be conscious. + +The active rule that comes from this is in [[feedback_cascades]] §2. + +--- + +## Pilot cleanup obligations (forward-looking) + +The Cascades caregiver shared-phone bypass pilot (Path B, cloud-only) uses temporary pilot artifacts. At pilot wrap, all must be cleaned up — checklist lives in [[project_cascades]] "Pilot cleanup checklist". Originally flagged by Howard 2026-04-29 with the explicit "all pilot artifacts must be cleaned up" direction (clean tenant hygiene + license recovery: Business Premium seat returned to the 34-spare pool). diff --git a/.claude/memory/project_cascades_pilot_cleanup.md b/.claude/memory/project_cascades_pilot_cleanup.md deleted file mode 100644 index c733616..0000000 --- a/.claude/memory/project_cascades_pilot_cleanup.md +++ /dev/null @@ -1,15 +0,0 @@ ---- -name: Cascades caregiver pilot — cleanup obligations -description: Pilot accounts (pilot.test@, howard.enos@ once synced) at Cascades must be removed at end of caregiver bypass pilot. -type: project ---- - -The Cascades caregiver shared-phone bypass pilot (Path B, cloud-only) is using a temporary pilot identity. Howard explicitly flagged on 2026-04-29 that **all pilot artifacts must be cleaned up** when the pilot wraps: - -- **`pilot.test@cascadestucson.com`** — cloud-only test user created for the pilot. Delete (or disable + remove license) post-pilot. -- **`howard.enos@cascadestucson.com`** — Howard's eventual synced identity (won't exist as a cloud user until Entra Connect exits staging). If used during pilot validation, also clean up after. -- `SG-Caregivers-Pilot` cloud Entra group — superseded by synced `SG-Caregivers` group post-staging-exit. 
Remove pilot group from CA policy targets at that point; group itself can be deleted after. - -**Why:** Howard explicitly flagged on 2026-04-29 that pilot accounts must not stick around — clean tenant hygiene + license recovery (Business Premium seat returned to the 34-spare pool). - -**How to apply:** When the pilot validates and we transition to production rollout (synced `SG-Caregivers`), the cleanup of pilot.test, howard.enos pilot usage, and SG-Caregivers-Pilot is part of the cutover, not a separate task to forget. Surface this checklist when we get to the "flip pilot CA policies to production" step. diff --git a/.claude/memory/project_dataforth.md b/.claude/memory/project_dataforth.md new file mode 100644 index 0000000..3eab615 --- /dev/null +++ b/.claude/memory/project_dataforth.md @@ -0,0 +1,31 @@ +--- +name: Dataforth — current state (email, contacts, MFA posture) +description: Dataforth runs on M365 (Graph API for mail send); the neptune.acghosting.com Exchange is ACG's, NOT Dataforth's. MFA enforced 2026-04-04 across the tenant (3 CA policies). AJ at Dataforth needs forwarding from dataforthgit@. Incident history lives in project_dataforth_history.md. +type: project +--- + +Incident detail (2026-03-27 DF-JOEL2 compromise, attacker IPs, IC3, etc.) lives in [[project_dataforth_history]] — read on-demand. + +## Email infrastructure + +Dataforth's email runs on **Microsoft 365** (`sysadmin@dataforth.com`, tenant in vault at `clients/dataforth/m365.sops.yaml`). + +**Don't confuse with `neptune.acghosting.com`** (`67.206.163.124`) — that Exchange entry in `clients/dataforth/neptune-exchange.sops.yaml` is **ACG-side infrastructure, not Dataforth's**. Do not use it for Dataforth email workflows. + +**Send via Graph (SMTP basic auth is disabled):** +- Preferred: Microsoft Graph `POST /v1.0/users/sysadmin@dataforth.com/sendMail` with a client_credentials token. +- Alt: XOAUTH2 over SMTP. 
+- Entra app in vault at `clients/dataforth/m365.sops.yaml` under `credentials.entra-app`. Verify `Mail.Send` application permission is granted before use. + +## Contacts + +- **AJ (Dataforth):** messages to `dataforthgit@` need to forward to AJ. (Forwarding setup TBD — verify status.) + +## MFA / CA posture + +3 Conditional Access policies enforced **2026-04-04** across the tenant (deployed report-only after the 2026-03-27 incident, then promoted): +- Require MFA (skip from office IP `67.206.163.122`) +- Block foreign sign-ins (US only; `MFA-Travel-Bypass` group for exceptions) +- Block legacy auth + +Status as of MFA rollout: 19/38 users were MFA-ready at enforcement; the rest registered before the deadline. diff --git a/.claude/memory/project_dataforth_email.md b/.claude/memory/project_dataforth_email.md deleted file mode 100644 index 1bb777f..0000000 --- a/.claude/memory/project_dataforth_email.md +++ /dev/null @@ -1,13 +0,0 @@ ---- -name: Dataforth email infrastructure -description: Dataforth uses M365 for email; the Exchange server on 172.16.x.x / neptune.acghosting.com is NOT Dataforth's — it belongs to ACG's own infrastructure -type: project -originSessionId: 7034be43-1464-4085-b765-dc1226b1f8e0 ---- -Dataforth's email runs on Microsoft 365 (sysadmin@dataforth.com, tenant in vault at `clients/dataforth/m365.sops.yaml`). - -The Exchange server at `neptune.acghosting.com` / `67.206.163.124` listed in the vault under `clients/dataforth/neptune-exchange.sops.yaml` is **not** part of Dataforth's infrastructure — do not use it for Dataforth email workflows. - -**Why:** Mike corrected this during pipeline notification work (2026-04-22). The Exchange entry is an ACG-side server, not Dataforth's. - -**How to apply:** For any Dataforth email sending, SMTP basic auth is disabled on the tenant. Must use OAuth2 — either XOAUTH2 over SMTP or (preferred) Microsoft Graph API `POST /v1.0/users/sysadmin@dataforth.com/sendMail` with a client_credentials token. 
Entra app is in vault at `clients/dataforth/m365.sops.yaml` under `credentials.entra-app`. Verify `Mail.Send` application permission is granted before use. diff --git a/.claude/memory/project_dataforth_history.md b/.claude/memory/project_dataforth_history.md new file mode 100644 index 0000000..3c27e00 --- /dev/null +++ b/.claude/memory/project_dataforth_history.md @@ -0,0 +1,46 @@ +--- +name: Dataforth incident history — 2026-03-27 DF-JOEL2 compromise +description: Detail and remediation log for the 2026-03-27 Dataforth security incident — DF-JOEL2 compromised via ScreenConnect social-engineering, attacker C2 IPs and case numbers, the MFA / CA rollout that came out of it, Joel Lohr retirement handling. RESOLVED 2026-04-04 when CA policies enforced. +type: project +--- + +Incident archive backing [[project_dataforth]]. Read on-demand when discussing post-incident posture, IPs, IC3 case, or the MFA rollout origin story. + +## Incident — 2026-03-27 (RESOLVED 2026-04-04) + +Joel Lohr's workstation (**DF-JOEL2**, 192.168.0.143) compromised via a phishing email to a personal Yahoo account. Attacker (alias "Angel Raya") deployed ScreenConnect C2 backdoors. M365 account also compromised — sign-ins from Turkey/UK/Germany. + +## Attacker + +- **C2 IPs:** `80.76.49.18`, `45.88.91.99` (AS399486, Virtuo, Montreal QC) — SUSPENDED by host. +- **Cloud relay:** `instance-wlb9ga-relay.screenconnect.com` +- **ConnectWise case:** `03464184` +- **IC3 complaint:** `1c32ade367084be9acd548f23705736f` + +## Remediation + +- C2 IPs blocked at UDM firewall via `iptables`. **Outstanding:** add permanent rules in the UniFi UI (still on iptables-only as of incident close). +- 3 rogue ScreenConnect clients uninstalled. +- `jlohr` AD password reset; M365 sessions revoked. +- 32 machines scanned clean, 28 unreachable (offline at scan time — check when available). +- No lateral movement detected. 
+ +## MFA rollout (born from this incident) + +- 3 CA policies deployed report-only first, then enforced 2026-04-04: + - Require MFA (skip from office IP `67.206.163.122`) + - Block foreign sign-ins (US only; `MFA-Travel-Bypass` group for exceptions) + - Block legacy auth +- Notice sent to all users with the 2026-04-04 deadline. +- 19/38 users were MFA-ready at policy go-live; 19 had pending registration. + +## Joel Lohr + +- Retired 2026-03-31. +- Auto-reply directs contacts to Dan Center (`dcenter@dataforth.com`). +- Account to be disabled after retirement (verify status). + +## Open items + +- Permanent UDM block rules for C2 IPs (currently only iptables, not in UniFi UI). +- 28 machines that were offline at the post-incident scan — re-scan when reachable. diff --git a/.claude/memory/project_dataforth_incident_2026-03-27.md b/.claude/memory/project_dataforth_incident_2026-03-27.md deleted file mode 100644 index cbd4aac..0000000 --- a/.claude/memory/project_dataforth_incident_2026-03-27.md +++ /dev/null @@ -1,39 +0,0 @@ ---- -name: Dataforth Security Incident 2026-03-27 -description: DF-JOEL2 compromised via ScreenConnect social engineering. MFA deployed. IC3 filed. C2 IPs blocked. Full remediation completed. -type: project ---- - -[RESOLVED] CA policies enforced 2026-04-04; incident closed. - -## Incident -Joel Lohr's workstation (DF-JOEL2, 192.168.0.143) compromised via phishing email to personal Yahoo account. Attacker "Angel Raya" deployed ScreenConnect C2 backdoors. M365 account also compromised from Turkey/UK/Germany. 
- -## Attacker -- C2: 80.76.49.18 and 45.88.91.99 (AS399486, Virtuo, Montreal QC) - SUSPENDED by host -- Cloud relay: instance-wlb9ga-relay.screenconnect.com -- ConnectWise case: 03464184 -- IC3 complaint: 1c32ade367084be9acd548f23705736f - -## Remediation -- C2 IPs blocked at UDM firewall (iptables - need permanent rules in UniFi UI) -- 3 rogue ScreenConnect clients uninstalled -- jlohr AD password reset, M365 sessions revoked -- 32 machines scanned clean, 28 unreachable (offline) -- No lateral movement detected - -## MFA Rollout -- 3 CA policies deployed (report-only until April 4, 2026): - - Require MFA (skip from office IP 67.206.163.122) - - Block foreign sign-ins (US only, MFA-Travel-Bypass group for exceptions) - - Block legacy auth -- 19/38 users MFA-ready, 19 need to register -- MFA notice sent to all users, deadline April 4 - -## Joel Lohr -- Retiring March 31, 2026 -- Auto-reply directs contacts to Dan Center (dcenter@dataforth.com) -- Account should be disabled after retirement - -**Why:** Active security incident requiring immediate response. -**How to apply:** Monitor CA policies in report-only mode, enforce April 4. Check 28 offline machines when available. Add C2 IPs to permanent UDM block list. diff --git a/.claude/memory/project_guruconnect.md b/.claude/memory/project_guruconnect.md new file mode 100644 index 0000000..c91dfec --- /dev/null +++ b/.claude/memory/project_guruconnect.md @@ -0,0 +1,69 @@ +--- +name: GuruConnect — v2 direction and deploy procedure +description: GuruConnect v2 architecture direction (native-first full key fidelity, bidirectional file cut/paste/drag; WebRTC fallback only) plus the manual deploy procedure to 172.16.3.30 (build-on-server, login shell, sqlx runtime queries, NPM trusted-proxy gotcha). v2 live since 2026-05-30 at connect.azcomputerguru.com. +type: project +--- + +## Direction (v2 architecture) + +Re-architecture set 2026-05-29 after an audit found 3 CRITICAL relay-plane auth holes. 
Spec: `projects/msp-tools/guru-connect/docs/specs/SPEC-002-v2-modernization-architecture.md`. Mike is the product owner; willing to scrap v1 entirely for a considerably better product. + +- **Greenfield, salvage cores:** keep the proven Rust (DXGI/GDI capture, input injection, SAS helper, prost codec, proto, Gitea-Actions CI) — rebuild relay/auth, session, viewer, dashboard, deploy. Clean reset in-place (keep repo/history/issues), not a new repo. +- **Native-first, NOT WebRTC.** Mike's headline must-haves: + 1. **Keyboard hooks / full key fidelity** — Win+R, Ctrl+C/V, **Ctrl+Alt+Del** must work. Browsers structurally can't do these — WebRTC is fallback/secondary only. + 2. **Bidirectional file transfer via clipboard cut/paste AND drag-and-drop** from/to either guest or host. Core differentiator, not deferred. Needs delayed-render clipboard + chunked engine; drag-out (remote→local) is the hard case and ships after drag-in. +- Transport stays custom protobuf-over-WSS. +- **Standalone-first + versioned `/api/integration/v1/` contract** with GuruRMM (ADR-001). +- **Hardened single-tenant now, multi-tenancy-READY schema** (nullable `tenant_id` everywhere) so the partner/client model switches on later with no migration rewrite. +- Adopt GuruRMM principles: per-agent keys (kill shared AGENT_API_KEY), no-TOML-for-endpoints, living-roadmap = definition-of-done, full-stack features, true-integration / anti-Datto. +- Ship each capability **full-stack** (proto + agent + server + viewer + dashboard + docs). See [[project_versionable_products]]. + +**Open questions still pending Mike's answer:** repo reset, H.264-vs-HEVC default, web transport, support-code format, v1 cutover. + +--- + +## Deploy procedure (manual, to 172.16.3.30) + +Live in prod since 2026-05-30 at `connect.azcomputerguru.com` (NPM → localhost:3002). The `.gitea/workflows/deploy.yml` "deploy to server" step is a STUB (builds an artifact only) — deploy is manual. 
+ +**Repo on the box:** `/home/guru/guru-connect` (separate repo `azcomputerguru/guru-connect`, NOT a submodule). + +**Build host = the server itself.** 172.16.3.30 has rust (rustup, cargo 1.94, `x86_64-unknown-linux-gnu` target), node 20 + npm 10, and protoc (`~/.local/bin`, libprotoc 28.3) — but **only on PATH in a login shell**: `ssh guru@172.16.3.30 'bash -lc "..."'`. A non-interactive shell doesn't source `~/.profile`, so cargo/protoc look "missing". GURU-5070 builds the Windows agent + a Windows-target server, NOT the Linux release — build Linux on the box. See [[reference_guru5070_rust_toolchain]]. + +### Sequence (build while v1 runs, quick cutover restart) + +1. **Backup first:** + ```bash + pg_dump "$DATABASE_URL" | gzip > ~/backups/guruconnect/pre-deploy-$(date +%F-%H%M).sql.gz + ``` + Save current commit + copy running binary to `~/guruconnect-server.vN.bak`. + +2. **Get the code.** The server's local `main` may have **diverged** from origin (the v2 greenfield respec rewrote history — `git pull --ff-only` will refuse). Tree is clean, so: + ```bash + git fetch origin && git reset --hard origin/main + ``` + `.env` is gitignored, untouched. Save the rollback SHA before resetting. + +3. **SPA:** `cd dashboard && npm ci && npm run build` → emits to `../server/static/app/` (gitignored). + +4. **Binary** (from repo root, login shell, `PROTOC` set): + ```bash + cargo build --release -p guruconnect-server --target x86_64-unknown-linux-gnu + ``` + `-p` scopes to the server so the Windows-only agent crate isn't compiled. Explicit `--target` overrides `.cargo/config.toml`'s windows-msvc default. Output: `target/x86_64-unknown-linux-gnu/release/guruconnect-server` = the unit's ExecStart. ~3 min. sqlx uses RUNTIME queries (no `query!` macros, no `.sqlx` cache) — build needs no DB. + +5. **Cutover:** `sudo systemctl restart guruconnect`. Migrations are sqlx-embedded and **auto-run on startup** (`db.migrate()`) — no manual `psql`. 
Watch `journalctl -u guruconnect` for "Migrations complete" + "Server listening". + +### Gotchas (all hit on the 2026-05-30 deploy) + +- **systemd unit:** the INSTALLED `/etc/systemd/system/guruconnect.service` has **no `WatchdogSec`** (correct for v2, which sends no `sd_notify`). The repo's `server/guruconnect.service` DOES set `WatchdogSec=30s` — so do NOT run `setup-systemd.sh` / copy the repo unit, or v2 restart-loops every 30s. Unit: `User=guru`, `EnvironmentFile=server/.env`, `WorkingDirectory=server/`, `ProtectSystem=strict`. + +- **`CONNECT_TRUSTED_PROXIES`** is a v2 env var (comma-separated IPs; defaults to loopback fail-closed). Public `connect.azcomputerguru.com` ingresses through **NPM on Jupiter (172.16.3.20)** → relay on `172.16.3.30:3002`. Set `CONNECT_TRUSTED_PROXIES=127.0.0.1,::1,172.16.3.20` in `server/.env` (the Jupiter NPM hop, NOT the relay host `.30` — that was the wrong first guess). Without trusting `172.16.3.20`, the relay logs every public agent as `172.16.3.20` instead of reading `X-Forwarded-For`. With it, the real client IP shows (verified: a Pavon agent logged its true public IP `98.172.64.243`). Only `JWT_SECRET` is hard-required. + +- **NULL tags bug:** `connect_machines.tags` is `text[]` nullable with no default; v2 decodes as non-`Option`, so NULL rows throw "unexpected null" at reconcile (and likely the Machines list). Mitigate: `UPDATE connect_machines SET tags='{}' WHERE tags IS NULL`. Real fix is a TODO (decode `Option<...>` + migration default). + +- **DB:** Postgres 14 `guruconnect` on localhost. Existing users (admin, howard, both role admin) survive migration. + +### Rollback + +`git reset --hard `, rebuild, restart, `psql < backup`. 
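The `CONNECT_TRUSTED_PROXIES` gotcha above can be illustrated with a rough sketch. This is not the relay's code (the server is Rust); `parse_trusted` and `client_ip` are hypothetical names, and the right-to-left walk of `X-Forwarded-For` is an assumed, common way to resolve the client address. It only shows why the Jupiter NPM hop has to be in the trusted list before the header is honored.

```python
import ipaddress

# Assumed to mirror the documented loopback-only default (fail-closed).
DEFAULT_TRUSTED = "127.0.0.1,::1"


def parse_trusted(value):
    """Parse a comma-separated list of proxy IPs into a set of address objects."""
    return {ipaddress.ip_address(p.strip()) for p in value.split(",") if p.strip()}


def client_ip(peer, forwarded_for, trusted):
    """Return the address to log for a connection.

    X-Forwarded-For is only honored when the direct peer is a trusted proxy;
    otherwise the peer address itself is returned.
    """
    peer_addr = ipaddress.ip_address(peer)
    if peer_addr not in trusted or not forwarded_for:
        return str(peer_addr)
    # Walk the header right to left, skipping hops that are trusted proxies.
    for hop in reversed([h.strip() for h in forwarded_for.split(",")]):
        try:
            addr = ipaddress.ip_address(hop)
        except ValueError:
            return str(peer_addr)  # malformed header: fall back to the peer
        if addr not in trusted:
            return str(addr)
    return str(peer_addr)


loopback_only = parse_trusted(DEFAULT_TRUSTED)
with_npm = parse_trusted("127.0.0.1,::1,172.16.3.20")

# NPM hop not trusted: every public agent appears as the proxy address.
print(client_ip("172.16.3.20", "98.172.64.243", loopback_only))
# NPM hop trusted: the forwarded client address is used instead.
print(client_ip("172.16.3.20", "98.172.64.243", with_npm))
```

Under these assumptions, trusting the relay host `.30` instead of `.20` changes nothing, because the direct peer the relay sees is still the NPM hop.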
diff --git a/.claude/memory/project_guruconnect_deploy.md b/.claude/memory/project_guruconnect_deploy.md deleted file mode 100644 index f766039..0000000 --- a/.claude/memory/project_guruconnect_deploy.md +++ /dev/null @@ -1,54 +0,0 @@ ---- -name: project_guruconnect_deploy -description: How to deploy GuruConnect (v2+) to production — the server (172.16.3.30) builds its own Linux binary; gotchas with the systemd watchdog, trusted-proxy env, and auto-run migrations -metadata: - type: project ---- - -GuruConnect v2 went live in production on 2026-05-30 (server + dashboard at v0.2.0, -public at connect.azcomputerguru.com via NPM -> localhost:3002). The deploy is **manual** -(the `.gitea/workflows/deploy.yml` "deploy to server" step is a stub that only builds a -package artifact). Repo on the box: `/home/guru/guru-connect` (separate repo -`azcomputerguru/guru-connect`, NOT a submodule there). - -**Build host = the server itself.** 172.16.3.30 has rust (rustup, cargo 1.94, the -`x86_64-unknown-linux-gnu` target), node 20 + npm 10, and protoc (~/.local/bin, libprotoc 28.3) -— all on PATH only in a **login shell** (`ssh guru@172.16.3.30 'bash -lc "..."'`; a -non-interactive shell does NOT source ~/.profile so cargo/protoc look "missing"). GURU-5070 -builds the *Windows* agent + a windows-target server, NOT the Linux release — so build the -Linux server ON the box. See [[reference_guru5070_rust_toolchain]]. - -Deploy sequence (build while v1 runs, then a quick cutover restart): -1. **Backup first:** `pg_dump "$DATABASE_URL" | gzip > ~/backups/guruconnect/pre-deploy-*.sql.gz`; - save the current commit + copy the running binary to `~/guruconnect-server.vN.bak`. -2. Get the code: the server's local `main` may have **diverged** from origin (the v2 greenfield - respec rewrote history — `git pull --ff-only` will refuse). Tree is clean, so - `git fetch origin && git reset --hard origin/main` (rollback SHA is saved). `.env` is - gitignored, untouched. -3. 
SPA: `cd dashboard && npm ci && npm run build` -> emits to `../server/static/app/` (gitignored). -4. Binary (from repo ROOT, login shell, PROTOC set): `cargo build --release -p guruconnect-server - --target x86_64-unknown-linux-gnu`. `-p` scopes to the server so the Windows-only agent crate - isn't compiled; explicit `--target` overrides `.cargo/config.toml`'s windows-msvc default. - Output lands at `target/x86_64-unknown-linux-gnu/release/guruconnect-server` = the unit's ExecStart. - ~3 min. sqlx uses RUNTIME queries (no `query!` macros, no `.sqlx` cache) so the build needs no DB. -5. **Cutover:** `sudo systemctl restart guruconnect`. Migrations are sqlx-embedded in the binary and - **auto-run on startup** (`db.migrate()`), so no manual `psql`. Watch - `journalctl -u guruconnect` for "Migrations complete" + "Server listening". - -GOTCHAS (all hit on the 2026-05-30 deploy): -- **systemd unit:** the INSTALLED `/etc/systemd/system/guruconnect.service` has **no `WatchdogSec`** - (correct for v2, which sends no `sd_notify`). The repo's `server/guruconnect.service` DOES set - `WatchdogSec=30s` — so do NOT run `setup-systemd.sh` / copy the repo unit, or v2 restart-loops - every 30s. Unit: User=guru, EnvironmentFile=server/.env, WorkingDirectory=server/, ProtectSystem=strict. -- **`CONNECT_TRUSTED_PROXIES`** is a v2 env var (comma-separated IPs; defaults to loopback fail-closed). - Public `connect.azcomputerguru.com` ingresses through **NPM on Jupiter (172.16.3.20)**, which forwards to - the relay on 172.16.3.30:3002. So set `CONNECT_TRUSTED_PROXIES=127.0.0.1,::1,172.16.3.20` in `server/.env` - (the Jupiter NPM hop, NOT the relay host .30 — that was a wrong first guess). Without trusting 172.16.3.20 - the relay logs every public agent as 172.16.3.20 instead of reading X-Forwarded-For; with it, the real client - IP shows (verified: a Pavon agent logged its true public IP 98.172.64.243). Only `JWT_SECRET` is hard-required. 
-- **NULL tags bug:** `connect_machines.tags` is `text[]` nullable with no default; v2 decodes it as - non-`Option`, so rows with NULL tags throw "unexpected null" at reconcile (and likely the Machines - list). Mitigated with `UPDATE connect_machines SET tags='{}' WHERE tags IS NULL`. Real fix is a - todo (decode Option + migration default). -- DB is Postgres 14 `guruconnect` on localhost; existing users (admin, howard, both role admin) - survive migration. Rollback: `git reset --hard `, rebuild, restart, `psql < backup`. diff --git a/.claude/memory/project_guruconnect_v2_direction.md b/.claude/memory/project_guruconnect_v2_direction.md deleted file mode 100644 index f61f06d..0000000 --- a/.claude/memory/project_guruconnect_v2_direction.md +++ /dev/null @@ -1,32 +0,0 @@ ---- -name: project_guruconnect_v2_direction -description: GuruConnect v2 modernization direction (Mike, 2026-05-29) — native-first full key fidelity + bidirectional file cut/paste/drag are the headline must-haves; WebRTC is fallback only -metadata: - type: project ---- - -GuruConnect is being re-architected (v2) after the 2026-05-29 audit found 3 CRITICAL relay-plane -auth holes. Direction set by Mike (product owner), captured in -`projects/msp-tools/guru-connect/docs/specs/SPEC-002-v2-modernization-architecture.md`: - -- **Greenfield, salvage cores:** keep the proven Rust (DXGI/GDI capture, input injection, SAS - helper, prost codec, proto, Gitea-Actions CI) — rebuild relay/auth, session, viewer, dashboard, - deploy. Clean reset in-place (keep repo/history/issues), not a new repo. -- **Native-first, NOT WebRTC.** Mike's favorite ScreenConnect features and explicit priorities: - (1) **keyboard hooks / full key fidelity** — Win+R, Ctrl+C/V, **Ctrl+Alt+Del** must work (browsers - structurally can't do these, which is why WebRTC is fallback/secondary only); (2) **bidirectional - file transfer via clipboard cut/paste AND drag-and-drop** from/to either guest or host. 
Both are - core differentiators, not deferred. Transport stays custom protobuf-over-WSS. -- **Standalone-first + versioned `/api/integration/v1/` contract** with GuruRMM (ADR-001; the - `specs/native-remote-control/` work is the integration prior art). -- **Hardened single-tenant now, multi-tenancy-READY schema** (nullable `tenant_id` everywhere) so - the RMM partner/client model switches on later with no migration rewrite. -- Adopt GuruRMM principles: per-agent keys (Option 3, kill shared AGENT_API_KEY), no-TOML-for- - endpoints, living-roadmap = definition-of-done, full-stack features, true-integration/anti-Datto. - -**Why:** initial GC was built with a much older model; lots of debt. Mike is willing to scrap v1 -entirely for a considerably better product. **How to apply:** when building GC features, default to -native full-fidelity behavior and ship each capability full-stack (proto+agent+server+viewer+ -dashboard+docs). File transfer needs delayed-render clipboard + a chunked engine; drag-out -(remote→local) is the hard case, ships after drag-in. Re-spec keystone: [[project_versionable_products]]. -Open questions still pending Mike's answer: repo reset, H.264-vs-HEVC default, web transport, support-code format, v1 cutover. diff --git a/.claude/memory/project_gururmm.md b/.claude/memory/project_gururmm.md new file mode 100644 index 0000000..3d92f40 --- /dev/null +++ b/.claude/memory/project_gururmm.md @@ -0,0 +1,77 @@ +--- +name: GuruRMM project state — dev principles, webhook docs guard, pending setup +description: GuruRMM project state — dev principles (every feature full-stack: backend+API+UI+docs+scalability; product works without AI; FEATURE_ROADMAP update is part of definition-of-done), the webhook docs-only build guard (SPEC-020 Phase 0; repo copy of webhook-handler.py is STALE — don't redeploy), and the still-pending Mac install-hooks.sh setup on Mikes-MacBook-Air. +type: project +--- + +Rules: [[feedback_gururmm]]. 
Technical reference: [[reference_gururmm]]. Canonical principles doc: `projects/msp-tools/guru-rmm/docs/DESIGN.md` (this file is the in-context summary). + +--- + +## Dev principles (created 2026-04-29; authority: Mike Swanson) + +### Holistic feature development (MANDATORY) + +Every GuruRMM feature ships with the complete stack: +1. **Backend / agent logic** — core capability +2. **API endpoints** — control + monitoring +3. **UI / UX** — dashboard configuration, status display, management +4. **Documentation** — user guides and operational docs +5. **Scalability** — architected for future expansion + +**Example — Network Discovery Node** is complete when: agent-side scanning (ICMP/ARP/SNMP), server-side data + API, dashboard for designating the discovery node + viewing discovered devices + configuring scan schedules + IP ranges/exclusions, status indicators, future-proof data model supporting multiple discovery methods. + +**Features shipped without their UI / configuration interfaces are incomplete and will be rejected.** Same standard for backend-only implementations, features requiring API expertise to configure, and designs that paint into architectural corners. + +### Living roadmap (MANDATORY) + +`projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md` is the single living record. `[ ]` = planned, `[x]` = shipped (annotate with date). + +- **Before building:** read the feature's roadmap entry for intent/scope. New work that isn't on the roadmap gets an entry first. +- **When shipping or modifying a feature:** update its roadmap entry in the SAME change — flip `[ ]` → `[x]` with a date, or revise/add the item. **Code change without the matching roadmap update is incomplete** (same standard as shipping without UI). +- **Don't over-claim:** if only part is done, keep `[ ]` and annotate the partial scope (e.g. "TCP probing shipped; ICMP/ARP/SNMP pending") rather than flipping. 
+- `/rmm-audit`'s roadmap pass is the **backstop** that reconciles drift — not the primary maintainer. Dev work keeps the roadmap honest; the audit catches what slipped. + +### AI-optional operation + +GuruRMM must be fully functional **without** requiring AI agents to operate. All functionality accessible via the traditional dashboard / API; configuration via standard interfaces; usable by MSP techs with zero AI/ML knowledge; deterministic, reliable operation for production. + +AI features (agentic analysis, agentic command routing) are **enhancements** — users choose whether to enable them. Production stability over experimental features. + +### Planning questions for every feature + +- How does an admin configure this in the dashboard? +- What does the status display look like? +- How do we expand this in v2/v3? +- Does this work if AI services are unavailable? + +--- + +## Webhook docs-only build guard (SPEC-020 Phase 0) + +The GuruRMM build webhook (`gururmm-webhook.service` → `/opt/gururmm/webhook-handler.py` on 172.16.3.30) has a **docs-only build guard** as of 2026-05-30: a push whose every changed file matches `docs/`, `*.md`, `.claude/`, `session-logs/`, `LICENSE`, or `.gitignore` returns `Docs-only change -- build skipped` and triggers no build. **Fail-safe toward building** — no file list or any buildable file → build runs. + +Detection uses the Gitea push payload's per-commit `added`/`removed`/`modified` lists (`is_docs_only` / `NON_BUILDABLE`). Verified live (docs push skipped, no build locks, `last-built-commit` unchanged). Backup: `/opt/gururmm/webhook-handler.py.bak-20260530-guard`. + +This is **Phase 0** (interim). The full fix migrates RMM CI to Gitea Actions with native `paths-ignore`, matching GuruConnect (ADR-002) — see [[reference_gitea_internal]]. 
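A minimal sketch of the guard logic described above, for orientation only. It is not the deployed `webhook-handler.py`: the pattern constants, the `is_non_buildable` helper, and the reliance on a top-level `commits` key in the Gitea push payload are assumptions. Only the `is_docs_only` name, the pattern list, and the fail-safe behavior come from the notes.

```python
from fnmatch import fnmatchcase

# Hypothetical split of the documented patterns; the deployed NON_BUILDABLE
# structure may be shaped differently.
NON_BUILDABLE_PREFIXES = ("docs/", ".claude/", "session-logs/")
NON_BUILDABLE_GLOBS = ("*.md",)
NON_BUILDABLE_EXACT = ("LICENSE", ".gitignore")


def is_non_buildable(path):
    """True when a changed path can never affect a build (assumed matching rules)."""
    if path.startswith(NON_BUILDABLE_PREFIXES):
        return True
    if path in NON_BUILDABLE_EXACT:
        return True
    return any(fnmatchcase(path, pattern) for pattern in NON_BUILDABLE_GLOBS)


def is_docs_only(payload):
    """True only when every changed file in the push is non-buildable.

    Fail-safe toward building: a payload with no file list returns False.
    """
    changed = []
    for commit in payload.get("commits") or []:
        for key in ("added", "removed", "modified"):
            changed.extend(commit.get(key) or [])
    if not changed:
        return False
    return all(is_non_buildable(path) for path in changed)
```

With this shape, a push touching `README.md` and `docs/` alone is skipped, while a push that also modifies any source file, or a payload with no per-commit file lists, still triggers a build.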
+ +**Caveat about `webhook-handler.py` specifically:** the repo copy `scripts/webhook-handler.py` is STALE (109 lines vs the deployed 206 — predates the split-build refactor) and does NOT contain the guard. Do NOT redeploy it over the host copy; the host is the source of truth until SPEC-020 lands. (The other 6 build scripts auto-sync from `deploy/build-pipeline/` per [[reference_gururmm]] §pipeline-vendoring — this caveat is `webhook-handler.py`-only.) + +--- + +## Pending setup — Mikes-MacBook-Air `install-hooks.sh` + +**STATUS:** Genuinely still pending as of 2026-05-27. Verified: gururmm submodule is initialized on the Mac but only default `.sample` hooks are present. + +**Action (do this before any gururmm dev on the Mac):** +```bash +cd /Users/azcomputerguru/ClaudeTools/projects/msp-tools/guru-rmm +git pull +bash scripts/install-hooks.sh +``` + +**What it does:** sets `core.hooksPath = scripts/hooks/` (activates the pre-commit CRLF check), `core.autocrlf=false`, `core.eol=lf` (locally and globally) — prevents sqlx migration checksum drift (root cause: CRLF vs LF sha384 mismatch from Windows commits). + +**Why:** the gururmm build server once refused to start after a rebuild because migration file hashes differed between what was stored in `_sqlx_migrations` and the current files. Fixed with `.gitattributes` + per-machine git config. This step applies the git config side. macOS defaults to LF — low-risk; mainly activates the pre-commit guard. + +**When complete:** delete this section from this memory file (or remove the file entirely if no other RMM project state has been added). 
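The CRLF-vs-LF checksum drift mentioned under "What it does" can be reproduced with a short sketch. The SQL text and the helper names `migration_checksum` and `normalize_eol` are hypothetical; the only detail taken from the notes is that the migration checksum is a SHA-384 over the file bytes.

```python
import hashlib


def migration_checksum(data):
    """SHA-384 hex digest of raw migration file bytes."""
    return hashlib.sha384(data).hexdigest()


def normalize_eol(data):
    """Convert CRLF line endings to LF."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical migration content, as committed with LF endings.
lf_bytes = b"CREATE TABLE example (id UUID PRIMARY KEY);\n"
# The same file after a checkout that rewrote line endings to CRLF.
crlf_bytes = lf_bytes.replace(b"\n", b"\r\n")

# Same SQL, different bytes, so the stored and current checksums disagree.
print(migration_checksum(lf_bytes) == migration_checksum(crlf_bytes))
# Normalizing line endings restores the match.
print(migration_checksum(normalize_eol(crlf_bytes)) == migration_checksum(lf_bytes))
```

This is why the fix pins line endings in git config and `.gitattributes` rather than touching the migrations themselves.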
diff --git a/.claude/memory/project_mac_gururmm_setup_pending.md b/.claude/memory/project_mac_gururmm_setup_pending.md deleted file mode 100644 index efeba55..0000000 --- a/.claude/memory/project_mac_gururmm_setup_pending.md +++ /dev/null @@ -1,28 +0,0 @@ ---- -name: Mac gururmm hook setup pending -description: Mikes-MacBook-Air needs install-hooks.sh run in gururmm repo — one-time setup to prevent sqlx migration drift -type: project ---- - -[CONFIRMED PENDING 2026-05-27 — Mac session verified install-hooks.sh has NOT been run on Mikes-MacBook-Air: the gururmm submodule is initialized but only default .sample hooks are present. Action genuinely still open — keep this file; run install-hooks.sh on the Mac before gururmm dev there.] - -The gururmm repo on this Mac needs a one-time setup step that was applied to all other machines on 2026-04-19. - -**Action required (do this now, before any gururmm work):** - -```bash -cd /Users/azcomputerguru/ClaudeTools/projects/msp-tools/guru-rmm -git pull -bash scripts/install-hooks.sh -``` - -**What this does:** -- Sets `core.hooksPath = scripts/hooks/` so the pre-commit CRLF check is active -- Sets `core.autocrlf=false` and `core.eol=lf` locally and globally -- Prevents sqlx migration checksum drift (root cause: CRLF vs LF sha384 mismatch) - -**Why:** The gururmm build server refused to start after a rebuild because migration file hashes differed between what was stored in `_sqlx_migrations` and the current files. Root cause was CRLF line endings from Windows commits. Fixed with `.gitattributes` + per-machine git config. This command applies the git config side. - -macOS defaults to LF, so this is low-risk — mainly sets the hooksPath so the pre-commit guard is active. - -**After running:** Delete this memory file or mark it resolved. 
diff --git a/.claude/memory/project_pluto_build_server.md b/.claude/memory/project_pluto_build_server.md deleted file mode 100644 index 115a7f8..0000000 --- a/.claude/memory/project_pluto_build_server.md +++ /dev/null @@ -1,18 +0,0 @@ ---- -name: project-pluto-build-server -description: "Pluto Windows build server — location, role, and access details" -metadata: - node_type: memory - type: project - originSessionId: 541d4004-8c45-4290-89f5-0ba9ee4e64a9 ---- - -Pluto (`PLUTO`, 172.16.3.36) is a Windows Server 2019 VM hosted on Jupiter (Unraid primary). - -**Why:** It is the primary Windows build server for GuruRMM — builds all Windows agent variants (amd64, x86, legacy, debug), runs WiX 4 MSI builds, and signs binaries via Azure Trusted Signing. - -**Credentials:** Administrator / `Paper123!@#` (set 2026-05-15). SSH key: `guru@gururmm-build` (ed25519, `Q+ivqd/...`) must be in `C:\ProgramData\ssh\administrators_authorized_keys` with icacls `/inheritance:r` and ASCII encoding (not UTF-16). - -**How to apply:** When Pluto is unreachable or SSH auth fails, check Jupiter's VM console first (not physical machine). SSH key file must be ASCII-encoded — PowerShell `>` writes UTF-16 and breaks auth silently. Use `[System.IO.File]::WriteAllText(..., [System.Text.Encoding]::ASCII)` to write the key. - -**GuruRMM agent:** Installed but historically runs old versions (was on 0.6.3 as of 2026-05-15). Update it after any Pluto maintenance. diff --git a/.claude/memory/project_rmm_webhook_docs_guard.md b/.claude/memory/project_rmm_webhook_docs_guard.md deleted file mode 100644 index 4110383..0000000 --- a/.claude/memory/project_rmm_webhook_docs_guard.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -name: project_rmm_webhook_docs_guard -description: RMM build webhook now skips docs-only pushes (host guard in /opt/gururmm/webhook-handler.py). The repo copy is stale — don't redeploy it. 
-metadata: - type: project ---- - -The GuruRMM build webhook (`gururmm-webhook.service` → `/opt/gururmm/webhook-handler.py` -on 172.16.3.30) has a **docs-only build guard** as of 2026-05-30: a push whose every -changed file matches `docs/`, `*.md`, `.claude/`, `session-logs/`, `LICENSE`, or -`.gitignore` returns `Docs-only change -- build skipped` and triggers no build. -Fail-safe toward building — no file list or any buildable file → build runs. Detection -uses the Gitea push payload's per-commit `added`/`removed`/`modified` lists -(`is_docs_only` / `NON_BUILDABLE`). Verified live (docs push skipped, no build locks, -`last-built-commit` unchanged). Backup: `/opt/gururmm/webhook-handler.py.bak-20260530-guard`. - -This is **SPEC-020 Phase 0** (interim). The full fix migrates RMM CI to Gitea Actions -with native `paths-ignore`, matching GuruConnect (ADR-002) — see [[reference_gitea_internal]]. - -**Caveat:** the repo copy `scripts/webhook-handler.py` is STALE (109 lines vs the deployed -206 — predates the split-build refactor) and does NOT contain the guard. Do not redeploy -it over the host copy; the host is the source of truth until SPEC-020 lands. diff --git a/.claude/memory/project_versionable_products.md b/.claude/memory/project_versionable_products.md index b97723b..8b4f368 100644 --- a/.claude/memory/project_versionable_products.md +++ b/.claude/memory/project_versionable_products.md @@ -20,4 +20,4 @@ stale until 2026-05-29 because edits stayed in the monorepo and were never pushe **How to apply:** When asked "should X be its own repo?", default to NO unless X ships/deploys on its own pipeline or is consumed elsewhere through a versioned interface. For the two products, never edit -the submodule's files without committing + pushing inside the submodule. Relates to [[feedback_gururmm_builds]]. +the submodule's files without committing + pushing inside the submodule. Relates to [[feedback_gururmm]] (Gitea-webhook-only build rule). 
diff --git a/.claude/memory/reference_acg_msp_stack.md b/.claude/memory/reference_acg_msp_stack.md index 911e276..b9bd713 100644 --- a/.claude/memory/reference_acg_msp_stack.md +++ b/.claude/memory/reference_acg_msp_stack.md @@ -17,4 +17,4 @@ Also part of the stack (seen on ACG-managed machines incl. Birth Biologic + Redn - **Datto EDR / Datto AV** — the managed AV. Note: when Datto AV is the active AV, **Windows Defender real-time protection is OFF by design** (Windows disables Defender when a 3rd-party AV registers) — that is expected, not a gap. - **GuruRMM** — ACG's own RMM (the agent doing the monitoring) -Relevance: the onboarding diagnostic ([[reference_gururmm_api]] / `.claude/scripts/onboarding-diagnostic.ps1`) currently flags these as CRITICAL "foreign management/remote-access agent" — a known false positive being tuned (allowlist them as INFO; downgrade Defender-off when a managed AV is present). The genuine prior-MSP-leftover scenario still matters for *non-ACG* remote tools (Ninja, Atera, Kaseya, TeamViewer, LogMeIn, AnyDesk, etc.). +Relevance: the onboarding diagnostic ([[reference_gururmm]] / `.claude/scripts/onboarding-diagnostic.ps1`) currently flags these as CRITICAL "foreign management/remote-access agent" — a known false positive being tuned (allowlist them as INFO; downgrade Defender-off when a managed AV is present). The genuine prior-MSP-leftover scenario still matters for *non-ACG* remote tools (Ninja, Atera, Kaseya, TeamViewer, LogMeIn, AnyDesk, etc.). diff --git a/.claude/memory/reference_dataforth_contact.md b/.claude/memory/reference_dataforth_contact.md deleted file mode 100644 index fdd8d27..0000000 --- a/.claude/memory/reference_dataforth_contact.md +++ /dev/null @@ -1,7 +0,0 @@ ---- -name: Dataforth Contact - AJ -description: AJ at Dataforth - email forwarding setup needed for dataforthgit@ address -type: reference ---- - -AJ at Dataforth needs messages sent to the dataforthgit@ email address to forward to him. 
diff --git a/.claude/memory/reference_gitea_api_credential.md b/.claude/memory/reference_gitea_api_credential.md index 83648d4..667b92c 100644 --- a/.claude/memory/reference_gitea_api_credential.md +++ b/.claude/memory/reference_gitea_api_credential.md @@ -9,4 +9,4 @@ For Gitea API operations as Howard (create/merge PRs, delete branches, etc.), au Do NOT use `infrastructure/gururmm-server.sops.yaml credentials.password` for Gitea — that is the `guru` server SSH/sudo password for 172.16.3.30. It was rejected by the Gitea API for a write (PR merge) on 2026-05-31; the gitea-howard credential is the correct one. The two are separate accounts; don't assume they're interchangeable. -Related: [[reference_gururmm_server]] (the guru SSH/server password), [[reference_coord_messages_api_shape]]. +Related: [[reference_gururmm]] (the guru SSH/server password), [[reference_coord_messages_api_shape]]. diff --git a/.claude/memory/reference_guru5070_rust_toolchain.md b/.claude/memory/reference_guru5070_rust_toolchain.md index 187979a..8cf8e01 100644 --- a/.claude/memory/reference_guru5070_rust_toolchain.md +++ b/.claude/memory/reference_guru5070_rust_toolchain.md @@ -23,7 +23,7 @@ built/linted/tested locally — **no more build-host (172.16.3.30) round-trips j **How to apply:** when a Coding Agent works on GuruConnect Rust, have it self-verify with the local toolchain (set PROTOC, run the four gates, iterate to green) and commit CI-green code — don't delegate fmt/clippy to the -build host. See [[project_guruconnect_v2_direction]]. +build host. See [[project_guruconnect]]. **CI fmt gate — don't omit it from agent briefs (incident 2026-05-31):** the CI `Build Server (Linux)` job runs `cargo fmt --check` as a hard gate FIRST (before build/test). 
SPEC-004 Task 2 + Task 4 (commits ffca7f0, 4e80573) diff --git a/.claude/memory/reference_gururmm.md b/.claude/memory/reference_gururmm.md new file mode 100644 index 0000000..08aeff2 --- /dev/null +++ b/.claude/memory/reference_gururmm.md @@ -0,0 +1,141 @@ +--- +name: GuruRMM technical reference — server, API, user_session, pipeline, agent sandbox +description: Operational reference for GuruRMM — server layout (SSH user, paths on 172.16.3.30), API auth + command execution + polling, user_session context (WTS impersonation, when SYSTEM fails), build-pipeline vendoring at deploy/build-pipeline/ (auto-sync to /opt/gururmm), Linux agent systemd sandbox trap (ProtectSystem=strict makes fs/mount observations sandbox-local). +type: reference +--- + +Rules: [[feedback_gururmm]]. Project state + principles + pending setup: [[project_gururmm]]. + +--- + +## Server layout (172.16.3.30) + +SSH user is **`guru`**, not `mike`. Home is `/home/guru/`. Other users with home dirs: `gitea-runner` only. + +- **Repo:** `/home/guru/gururmm` +- **Dashboard build:** `cd /home/guru/gururmm/dashboard && npm run build` +- **Deploy:** `sudo cp -r dist/* /var/www/gururmm/dashboard/` +- **Other dirs under `/home/guru/`:** `guru-connect`, `guruconnect-server`, `backups` + +--- + +## API — execute a script on any agent + +**Base:** `http://172.16.3.30:3001` (reachable from HOWARD-HOME and similar dev machines via Tailscale). + +**Auth:** `infrastructure/gururmm-server.sops.yaml` → `credentials.gururmm-api.admin-email` + `admin-password`. Login returns a JWT valid for ~24h (86400s from iat). 
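Because the login JWT lasts roughly 24 hours, a script can reuse a cached token instead of logging in on every call. A minimal sketch of that freshness check, assuming the token is a standard three-part JWT whose payload carries the `iat` claim mentioned above. The helper names and the 60-second safety margin are illustrative, not part of the GuruRMM API, and the signature is not verified here; only the token's age is read.

```python
import base64
import json
import time


def jwt_payload(token):
    """Decode the (unverified) payload segment of a JWT."""
    segment = token.split(".")[1]
    segment += "=" * (-len(segment) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(segment))


def jwt_is_fresh(token, now=None, lifetime=86400, margin=60):
    """True while the token is younger than lifetime minus a safety margin."""
    now = time.time() if now is None else now
    issued_at = jwt_payload(token)["iat"]
    return now < issued_at + lifetime - margin
```

If `jwt_is_fresh` returns `False`, log in again and replace the cached token.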
+ +### Flow + +```bash +VAULT="$PWD/.claude/scripts/vault.sh" +EMAIL=$(bash "$VAULT" get-field infrastructure/gururmm-server.sops.yaml credentials.gururmm-api.admin-email) +PASS=$(bash "$VAULT" get-field infrastructure/gururmm-server.sops.yaml credentials.gururmm-api.admin-password) + +JWT=$(curl -s -X POST http://172.16.3.30:3001/api/auth/login \ + -H "Content-Type: application/json" \ + -d "{\"email\":\"$EMAIL\",\"password\":\"$PASS\"}" \ + | python -c "import json,sys; print(json.load(sys.stdin)['token'])") + +# Find agent +curl -s http://172.16.3.30:3001/api/agents -H "Authorization: Bearer $JWT" + +# Submit (json-encode to preserve quotes/newlines) +AGENT="" +PAYLOAD=$(python -c " +import json +with open('path/to/script.ps1','r',encoding='utf-8') as f: s=f.read() +print(json.dumps({'command_type':'powershell','command':s})) +") +RESP=$(curl -s -X POST http://172.16.3.30:3001/api/agents/$AGENT/command \ + -H "Authorization: Bearer $JWT" -H "Content-Type: application/json" -d "$PAYLOAD") +CMD_ID=$(echo "$RESP" | python -c "import json,sys; print(json.load(sys.stdin)['command_id'])") + +# Poll +while true; do + STATUS=$(curl -s http://172.16.3.30:3001/api/commands/$CMD_ID -H "Authorization: Bearer $JWT" \ + | python -c "import json,sys; print(json.load(sys.stdin)['status'])") + [ "$STATUS" != "running" ] && break + sleep 5 +done + +# Fetch result +curl -s http://172.16.3.30:3001/api/commands/$CMD_ID -H "Authorization: Bearer $JWT" +``` + +### Required fields & response + +`POST /api/agents/:id/command` requires `command_type` (use `powershell` for Windows agents — the API accepts any string but Windows agent only runs powershell-compatible) and `command` (script text, JSON-encoded). 
+ +Response from `/api/commands/:cmd_id`: +```json +{ + "id": "uuid", "agent_id": "uuid", "command_type": "powershell", + "command_text": "...", "status": "completed", // running | completed | failed | timeout + "exit_code": 0, "stdout": "...", "stderr": "...", + "created_at": "ISO-8601", "started_at": "ISO-8601", "completed_at": "ISO-8601" +} +``` + +### When to use / not to use + +**Use** for diagnostic checks on any enrolled agent, one-off remediation without ScreenConnect, anywhere you'd ask a user to paste a script. + +**Don't** when the agent isn't enrolled (`GET /api/agents` first), for interactive sessions (no stdin), for scripts >1 MB (untested — keep modular). + +**Notes:** `command_type: "powershell"` runs in SYSTEM context on Windows (agent runs as LocalSystem). Idempotent commands only — no rollback. If output is large, have the script write to a file on the agent and fetch via a separate command. Tunnel API (`/api/v1/tunnel/...`) is a planned interactive feature per `.claude/gururmm-tunnel-plan.md`, not deployed. + +--- + +## `context=user_session` — run as the active logged-on user + +`POST /api/agents/:id/command` accepts an optional **`context`** field (migration `041`): + +- `"system"` (default) — Session 0 / SYSTEM. Original behavior. +- `"user_session"` — runs in the active logged-on user's desktop session via WTS token impersonation (`WTSQueryUserToken` + `DuplicateTokenEx` + `CreateProcessAsUserW`, in `agent/src/watchdog/wts.rs`). **Requires an active logged-on user on the endpoint.** + +**Why it matters:** some Windows cmdlets fail as SYSTEM with "NonInteractive mode" / interactive-session errors and historically had to be done on-site. `user_session` runs them remotely instead. 
Verified 2026-05-27 on the Peaceful Spirit **BridgetteHome** L2TP VPN deploy: `Set-VpnConnection -L2tpPsk -AllUserConnection` — previously documented as "cannot be done remotely" — was set successfully via `user_session`, completing a VPN rollout entirely through RMM with no on-site visit. + +**Elevation:** the WTS-impersonated token of a logged-on **admin** user comes back effectively elevated (`WindowsPrincipal.IsInRole(Administrator)=True`) — enough to write the all-user phonebook / HKLM. A **standard** logged-on user is NOT elevated, so admin-requiring commands still fail. Agent launches `powershell.exe -NonInteractive`; don't rely on real interactive prompts. + +**Invoke:** `{"command_type":"powershell","command":"...","context":"user_session"}`. To dodge shell-quoting on multi-line scripts, base64-encode the script as UTF-16LE and send `powershell -NoProfile -NonInteractive -EncodedCommand ` (`iconv` is absent in Git Bash — encode with `py`). + +--- + +## Build-pipeline vendoring (`/opt/gururmm/` ⇄ repo `deploy/build-pipeline/`) + +Pipeline runs at **`/opt/gururmm/`** on the gururmm server (root-owned, hand-maintained). The scripts had silently diverged from the repo (caused BUG-015 Windows build-gate gap). Reconciled 2026-06-01: + +- **Source of truth:** scripts vendored in the gururmm repo at **`deploy/build-pipeline/`** — `build-{windows,linux,mac,agents,server,shared}.sh`, `sign-windows.sh`, `webhook-handler.py`, `README` (commit `2bf539e`). +- **Drift-stop (commit `24b5daf`):** `build-shared.sh` (runs first every build, after `git reset --hard origin/main`) `install -m 0755`-syncs the 6 build scripts from `deploy/build-pipeline/` → `/opt/gururmm/` each build. **Edit in repo + push to main → next build runs it.** No manual copy, no restart. +- **Two exceptions — manual `sudo cp` required** (can't self-overwrite mid-run): + - `build-shared.sh` (the running puller). 
+ - `webhook-handler.py` (persistent HTTP server; also `sudo systemctl restart gururmm-webhook` to reload). + They change rarely. See `deploy/build-pipeline/README.md`. +- Webhook still INVOKES the `/opt/gururmm` copies (not repo copies directly) — the sync keeps them current. +- Repo's older `scripts/webhook-handler.py` + `scripts/build-agents.sh` are a prior generation, superseded. +- `build-windows.sh`'s change-gate watches `agent/ installer/` (BUG-015 fix — installer-only `.wxs`/`.ico` changes now rebuild the MSI). + +--- + +## Linux agent runs in a systemd sandbox — `findmnt` lies + +The Linux agent (`gururmm-agent.service`) is hardened with **`ProtectSystem=strict`** → private mount namespace where `/` is read-only, only `ReadWritePaths=` entries are writable. **Every command dispatched through the agent runs inside that namespace** — so `findmnt /`, `touch`, `/proc/mounts` etc. report the **agent's sandboxed view, not the host's actual state**. + +**Trap (hit 2026-06-01 on GURU-KALI):** I diagnosed "host root filesystem is read-only" because an RMM-dispatched `touch /var/lib/gururmm` returned EROFS (os error 30) and `findmnt /` showed `ro`. **The host root was rw the entire time** (SMART PASSED, ext4 clean, no kernel remount-ro). Real cause: the unit's `ReadWritePaths=` omitted `/var/lib/gururmm` → agent couldn't persist `/var/lib/gururmm/.device-id` → re-minted a `device_id` on each daily identity refresh → server (no `machine_uid` dedup) filed a new agent row each time (~11 ghosts). + +**How to get host truth instead of sandbox view:** +- SSH to the host directly (commands run in the host namespace), OR +- Read the agent PID's namespace explicitly: `cat /proc//mountinfo` — the process-scoped `ro` on `/` is the tell that it's sandbox, not host. Compare against the host's `findmnt`. +- `errors=remount-ro` in a mount line is the stock default mount option — NOT evidence an error fired. 
Confirm an actual remount-ro with kernel `EXT4-fs error` logs + `dumpe2fs -h` error count. + +**Fix pattern (additive):** drop-in `/etc/systemd/system/gururmm-agent.service.d/override.conf` with +```ini +[Service] +ReadWritePaths=/var/lib/gururmm +``` +(systemd merges `ReadWritePaths` additively across drop-ins), then `daemon-reload` + `restart`. + +**Better upstream fix:** `StateDirectory=gururmm` (handles dir creation + perms + RW bind in one directive). + +**Fleet implication:** every systemd-installed GuruRMM Linux agent with this unit shape has the same latent bug until the installer is fixed. See filed todos (agent `ReadWritePaths` / `StateDirectory` + server `machine_uid` dedup). diff --git a/.claude/memory/reference_gururmm_api.md b/.claude/memory/reference_gururmm_api.md deleted file mode 100644 index 692ae02..0000000 --- a/.claude/memory/reference_gururmm_api.md +++ /dev/null @@ -1,92 +0,0 @@ ---- -name: GuruRMM API — run PowerShell on any agent -description: API endpoints, auth flow, and curl recipe to execute a script on any GuruRMM agent and retrieve output. Use this instead of asking user to paste script into ScreenConnect. -type: reference ---- - -# GuruRMM API — Execute Script on an Agent - -**API base:** `http://172.16.3.30:3001` (reachable from HOWARD-HOME and similar dev machines via Tailscale — not reachable from cascades internal-network-only boxes, but that doesn't matter since the API talks to the agent, not the target machine). - -**Auth creds:** `infrastructure/gururmm-server.sops.yaml` → `credentials.gururmm-api.admin-email` + `admin-password`. Login returns a JWT valid for ~24h (expires 86400s from iat). 
- -## Flow - -```bash -VAULT="$PWD/.claude/scripts/vault.sh" -EMAIL=$(bash "$VAULT" get-field infrastructure/gururmm-server.sops.yaml credentials.gururmm-api.admin-email) -PASS=$(bash "$VAULT" get-field infrastructure/gururmm-server.sops.yaml credentials.gururmm-api.admin-password) - -JWT=$(curl -s -X POST http://172.16.3.30:3001/api/auth/login \ - -H "Content-Type: application/json" \ - -d "{\"email\":\"$EMAIL\",\"password\":\"$PASS\"}" \ - | python -c "import json,sys; print(json.load(sys.stdin)['token'])") - -# List agents (find the agent_id for the host you want) -curl -s http://172.16.3.30:3001/api/agents -H "Authorization: Bearer $JWT" - -# Submit a PowerShell command — works with any file, json-encode to preserve quotes/newlines -AGENT="" -PAYLOAD=$(python -c " -import json -with open('path/to/script.ps1','r',encoding='utf-8') as f: s=f.read() -print(json.dumps({'command_type':'powershell','command':s})) -") -RESP=$(curl -s -X POST http://172.16.3.30:3001/api/agents/$AGENT/command \ - -H "Authorization: Bearer $JWT" -H "Content-Type: application/json" -d "$PAYLOAD") -CMD_ID=$(echo "$RESP" | python -c "import json,sys; print(json.load(sys.stdin)['command_id'])") - -# Poll until completed (status values: running, completed, failed, timeout) -while true; do - STATUS=$(curl -s http://172.16.3.30:3001/api/commands/$CMD_ID -H "Authorization: Bearer $JWT" \ - | python -c "import json,sys; print(json.load(sys.stdin)['status'])") - [ "$STATUS" != "running" ] && break - sleep 5 -done - -# Fetch result (stdout / stderr / exit_code) -curl -s http://172.16.3.30:3001/api/commands/$CMD_ID -H "Authorization: Bearer $JWT" -``` - -## Required request fields - -`POST /api/agents/:id/command` requires: -- `command_type` — the interpreter. Valid values include `powershell`, `shell`, `script`, `exec` — any string is accepted by the API but the Windows agent only runs powershell-compatible content. Use `powershell` for Windows agents. -- `command` — the script text. 
JSON-encode to preserve newlines, quotes, and dollar-sign escapes. - -## Response shape (from `/api/commands/:cmd_id`) - -```json -{ - "id": "uuid", - "agent_id": "uuid", - "command_type": "powershell", - "command_text": "...", - "status": "completed", // or running | failed | timeout - "exit_code": 0, - "stdout": "...", - "stderr": "...", - "created_at": "ISO-8601", - "started_at": "ISO-8601", - "completed_at": "ISO-8601" -} -``` - -## When to use this - -- Readiness / diagnostic checks on any client server where GuruRMM is installed -- One-off remediation without needing ScreenConnect copy-paste -- Anywhere you'd otherwise ask the user to paste a script manually - -## When NOT to use this - -- When the agent isn't enrolled in GuruRMM (check `GET /api/agents` first) -- For interactive sessions (no stdin; single-shot execution) -- For >1 MB of script (untested — keep scripts modular) - -## Notes -- Script output is limited; if you need large output, have the script write to a file on the agent and fetch via a separate command -- `command_type: "powershell"` runs in the SYSTEM context on Windows (agent runs as LocalSystem) -- Idempotent commands only — there is no transactional rollback -- The tunnel API (`/api/v1/tunnel/...`) is a planned interactive feature per `.claude/gururmm-tunnel-plan.md`, not yet deployed as of 2026-04-22. Stick to `/api/agents/:id/command` for now. -- Agents enrolled as of 2026-04-22 include CS-SERVER (`6766e973-e703-47c1-be56-76950290f87c`) for Cascades, DESKTOP-DLTAGOI for Cascades LE, AD2 for AZ Computer Guru. Use `GET /api/agents` for the live list. 
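The polling loop in the flow above runs until the status leaves `running`, with no upper bound. A sketch of a bounded variant, assuming only the documented response shape (`status`, `exit_code`, `stdout`, `stderr`). `fetch_command` is a hypothetical callable standing in for `GET /api/commands/:cmd_id`, and the default interval and limit are arbitrary.

```python
import time


def wait_for_command(fetch_command, cmd_id, interval=5.0, max_wait=600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until the command leaves 'running' or max_wait seconds elapse.

    fetch_command(cmd_id) must return the decoded JSON body of
    GET /api/commands/:cmd_id as a dict.
    """
    deadline = clock() + max_wait
    while True:
        result = fetch_command(cmd_id)
        if result.get("status") != "running":
            return result  # completed | failed | timeout
        if clock() >= deadline:
            raise TimeoutError(f"command {cmd_id} still running after {max_wait}s")
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays; in normal use the defaults apply.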
diff --git a/.claude/memory/reference_gururmm_pipeline_vendored.md b/.claude/memory/reference_gururmm_pipeline_vendored.md deleted file mode 100644 index f5d29ac..0000000 --- a/.claude/memory/reference_gururmm_pipeline_vendored.md +++ /dev/null @@ -1,29 +0,0 @@ ---- -name: reference_gururmm_pipeline_vendored -description: GuruRMM build-pipeline scripts are now version-controlled at deploy/build-pipeline/ in the gururmm repo (2026-06-01); build-shared.sh auto-syncs them to /opt/gururmm each build, so edit-in-repo + push = live — EXCEPT build-shared.sh + webhook-handler.py, which need a manual cp. -metadata: - type: reference ---- - -The GuruRMM build/CI pipeline runs at **`/opt/gururmm/`** on the gururmm server (172.16.3.30, -root-owned, hand-maintained). Those scripts had silently diverged from the repo's older `scripts/` -generation (that drift caused the BUG-015 Windows build-gate gap). Reconciled 2026-06-01: - -- **Source of truth:** the live scripts are vendored into the gururmm repo at - **`deploy/build-pipeline/`** (build-{windows,linux,mac,agents,server,shared}.sh, sign-windows.sh, - webhook-handler.py + README). Commit `2bf539e`. -- **Drift-stop (commit `24b5daf`):** `build-shared.sh` (runs first every build, after - `git reset --hard origin/main`) now `install -m 0755`-syncs the 6 build scripts from - `deploy/build-pipeline/` → `/opt/gururmm/` each build. So to change a GuruRMM build script: - **edit it in `deploy/build-pipeline/`, push to gururmm main — the next build runs it.** No manual - copy, no restart. -- **Two exceptions — need a manual `sudo cp` on change** (they can't self-overwrite mid-run): - `build-shared.sh` (the running puller) and `webhook-handler.py` (the persistent HTTP server; - also needs `sudo systemctl restart gururmm-webhook` to reload). They change rarely. See - `deploy/build-pipeline/README.md`. - -Webhook still INVOKES the `/opt/gururmm` copies (not the repo copies directly) — the sync keeps -them current. 
The repo's older `scripts/webhook-handler.py` + `scripts/build-agents.sh` are a prior -generation, superseded. Build-windows.sh's change-gate watches `agent/ installer/` (BUG-015 fix — -installer-only `.wxs`/`.ico` changes rebuild the MSI). Supersedes the "repo copy is stale, don't -redeploy" caveat in [[project_rmm_webhook_docs_guard]] for the build scripts (not webhook-handler.py). diff --git a/.claude/memory/reference_gururmm_server.md b/.claude/memory/reference_gururmm_server.md deleted file mode 100644 index f0dcf4a..0000000 --- a/.claude/memory/reference_gururmm_server.md +++ /dev/null @@ -1,14 +0,0 @@ ---- -name: GuruRMM Server Layout -description: SSH user, home directory, and deploy paths on 172.16.3.30 -type: reference ---- - -SSH user is `guru`, NOT `mike`. Home directory is `/home/guru/`. - -- Repo: `/home/guru/gururmm` -- Dashboard build: `cd /home/guru/gururmm/dashboard && npm run build` -- Deploy: `sudo cp -r dist/* /var/www/gururmm/dashboard/` -- Other dirs under `/home/guru/`: `guru-connect`, `guruconnect-server`, `backups` - -**Why:** First SSH session assumed `/home/mike/` — does not exist. Only users with home dirs are `guru` and `gitea-runner`. diff --git a/.claude/memory/reference_gururmm_user_session_context.md b/.claude/memory/reference_gururmm_user_session_context.md deleted file mode 100644 index 24ab1e7..0000000 --- a/.claude/memory/reference_gururmm_user_session_context.md +++ /dev/null @@ -1,19 +0,0 @@ ---- -name: gururmm-user-session-context -description: GuruRMM commands accept context=user_session (migration 041) to run as the active logged-on user via WTS impersonation — executes previously-interactive-only commands that fail as SYSTEM with "NonInteractive mode" -metadata: - type: reference ---- - -GuruRMM's command API (`POST /api/agents/:id/command`, see [[reference_gururmm_api]]) accepts an optional **`context`** field: - -- `"system"` (default) — Session 0 / SYSTEM, the original behavior of every existing command. 
-- `"user_session"` — runs in the **active logged-on user's** desktop session via WTS token impersonation (`WTSQueryUserToken` + `DuplicateTokenEx` + `CreateProcessAsUserW`, in `agent/src/watchdog/wts.rs`). **Requires an active logged-on user** on the endpoint — no user logged in = no session to run in. - -Added by migration `041_add_command_context.sql`; server enum `CommandContext` serializes `snake_case`. - -**Why it matters:** some Windows cmdlets fail as SYSTEM with a "NonInteractive mode" / interactive-session error and historically had to be done by hand on-site. `user_session` runs them remotely instead. Verified 2026-05-27 on the Peaceful Spirit **BridgetteHome** L2TP VPN deploy: `Set-VpnConnection -L2tpPsk -AllUserConnection` — previously documented as "cannot be done remotely" — was set successfully via `user_session`, completing a VPN rollout entirely through RMM with no on-site visit. - -**Elevation:** the WTS-impersonated token of a logged-on **admin** user comes back effectively elevated (`WindowsPrincipal.IsInRole(Administrator)=True`) — enough to write the all-user phonebook / HKLM. A **standard** logged-on user would NOT be elevated, so admin-requiring commands would still fail. The agent still launches `powershell.exe -NonInteractive`, so don't rely on real interactive prompts. - -**Invoke:** body `{"command_type":"powershell","command":"...","context":"user_session"}`. To dodge shell-quoting on multi-line scripts, base64-encode the script as UTF-16LE and send `powershell -NoProfile -NonInteractive -EncodedCommand ` (`iconv` is absent in this Git Bash — encode with `py`). 
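The encoding step described above can be sketched in Python, since `iconv` is unavailable in that Git Bash. PowerShell's `-EncodedCommand` expects base64 over UTF-16LE text. The request fields match the documented body; the function names are illustrative.

```python
import base64
import json


def encode_powershell(script):
    """Base64 of the UTF-16LE bytes, the form -EncodedCommand expects."""
    return base64.b64encode(script.encode("utf-16-le")).decode("ascii")


def user_session_payload(script):
    """JSON body for POST /api/agents/:id/command in the user's session."""
    encoded = encode_powershell(script)
    return json.dumps({
        "command_type": "powershell",
        "command": f"powershell -NoProfile -NonInteractive -EncodedCommand {encoded}",
        "context": "user_session",
    })
```

Because the script travels as base64, quotes, dollar signs, and newlines need no shell escaping on the way to the agent.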
diff --git a/.claude/memory/reference_ix_access_tailscale.md b/.claude/memory/reference_ix_access_tailscale.md deleted file mode 100644 index 82d43ef..0000000 --- a/.claude/memory/reference_ix_access_tailscale.md +++ /dev/null @@ -1,7 +0,0 @@ ---- -name: IX Server Access via Tailscale -description: IX server (ix.azcomputerguru.com) is accessible with Tailscale on, no VPN needed -type: reference ---- - -IX server (ix.azcomputerguru.com / 172.16.3.10) can be accessed directly when Tailscale is on. No separate VPN connection required. diff --git a/.claude/memory/reference_ix_server_access.md b/.claude/memory/reference_ix_server_access.md new file mode 100644 index 0000000..e9d23c6 --- /dev/null +++ b/.claude/memory/reference_ix_server_access.md @@ -0,0 +1,25 @@ +--- +name: IX server access — network + SSH +description: How to reach ix.azcomputerguru.com (172.16.3.10) — Tailscale-on means it's directly reachable, no separate VPN. SSH currently uses sshpass with the root password (key auth was never set up after GURU-5070 was reinstalled to Windows 11). Setting up key auth would simplify this. +type: reference +--- + +## Network reachability + +- **Host:** `ix.azcomputerguru.com` / `172.16.3.10` +- **Access:** directly reachable when Tailscale is on. No separate VPN connection required. + +## SSH + +> **VERIFY 2026-05-26** — the no-key-auth note was written under the old CachyOS install on GURU-5070; the machine is now Windows 11. Re-confirm whether key auth got set up before relying on the sshpass fallback below. + +- **User:** `root` +- **Password:** vault — see `credentials.md` or SOPS. +- **SSH key auth:** NOT configured from GURU-5070 (the old `guru@wsl` key was authorized but the workstation was reinstalled; new pubkey hasn't been added to IX's `authorized_keys` yet). 
+- **Current workflow (sshpass):** + ```bash + sshpass -p "$PASSWORD" ssh -o StrictHostKeyChecking=no -o PubkeyAuthentication=no root@172.16.3.10 + ``` +- **Suppress sshpass warnings:** pipe through `grep -v WARNING | grep -v 'not using'` or `tail`. + +**Recommended:** add GURU-5070's pubkey to IX's `~/.ssh/authorized_keys` to drop the sshpass dance. diff --git a/.claude/memory/reference_ix_server_ssh.md b/.claude/memory/reference_ix_server_ssh.md deleted file mode 100644 index a483487..0000000 --- a/.claude/memory/reference_ix_server_ssh.md +++ /dev/null @@ -1,20 +0,0 @@ ---- -name: IX Server SSH Access -description: SSH access notes for IX server - key auth not set up on GURU-5070 (was CachyOS), must use sshpass with password -type: reference ---- - -[VERIFY 2026-05-26 — written under the old CachyOS install; GURU-5070 is now Windows 11. Re-confirm whether key auth is set up before relying on the no-key-auth/sshpass note.] - -## IX Server SSH from GURU-5070 - -- **Host:** 172.16.3.10 (ix.azcomputerguru.com) -- **User:** root -- **Password:** See credentials.md -- **SSH Key Auth:** NOT configured on GURU-5070 (formerly acg-guru-5070; now Windows 11) -- **Must use:** `sshpass -p 'PASSWORD' ssh -o StrictHostKeyChecking=no -o PubkeyAuthentication=no root@172.16.3.10` -- **Suppress warnings:** Pipe through `grep -v WARNING | grep -v 'not using'` or `tail` - -**Why:** The SSH key from this machine hasn't been added to IX server's authorized_keys yet. The old WSL key (guru@wsl) was authorized but this was a new install (originally CachyOS; GURU-5070 has since been reinstalled to Windows 11). - -**How to apply:** When running commands on IX server, use sshpass approach. Consider setting up SSH key auth to simplify future access. 
diff --git a/.claude/memory/reference_pluto_build_server.md b/.claude/memory/reference_pluto_build_server.md index 016e9e1..ecbe24f 100644 --- a/.claude/memory/reference_pluto_build_server.md +++ b/.claude/memory/reference_pluto_build_server.md @@ -1,58 +1,70 @@ --- name: Pluto Build Server -description: General-purpose Windows build VM — hostname PLUTO / Unraid VM name "Claude-Builder" / 172.16.3.36. For any EXE needing native Windows MSVC compilation (utilities, Howard's tools, GuruRMM + GuruConnect agents). Drive via /rmm (agent enrolls as PLUTO) when SSH key isn't authorized. +description: General-purpose Windows build VM — hostname PLUTO / Unraid VM "Claude-Builder" / 172.16.3.36. For any EXE needing native Windows MSVC compilation (utilities, Howard's tools, GuruRMM + GuruConnect agents). Drive via /rmm (agent enrolls as PLUTO) when SSH key isn't authorized. type: reference --- -Pluto is a Windows Server VM on Jupiter. It is the **general-purpose Windows build machine** for any project needing a native Windows executable — not just GuruRMM. +Pluto is a **Windows Server 2019** VM on Jupiter (Unraid). The **general-purpose Windows build machine** for any project needing a native Windows executable — not just GuruRMM. -- **Hostname:** PLUTO (VM on Jupiter) -- **Unraid VM name:** **Claude-Builder** — the VM is listed as "Claude-Builder" in Unraid; it is the SAME machine as PLUTO / 172.16.3.36. There is **no dedicated `pluto` vault entry** — don't go searching for one. -- **Drive it remotely without SSH:** PLUTO runs a GuruRMM agent (client "AZ Computer Guru"). Use `/rmm` — resolve hostname **PLUTO** → agent id at runtime (IDs change on re-enroll; do not hardcode) — to run PowerShell on it. This is the path to use when a workstation's SSH key isn't authorized (e.g. GURU-5070, see below). +## Identity + +- **Hostname:** PLUTO +- **Unraid VM name:** `Claude-Builder` — same machine as PLUTO / 172.16.3.36. 
There is **no dedicated `pluto` vault entry** — credentials for Administrator are in SOPS (search the vault for "pluto" if needed). - **Static IP:** 172.16.3.36 (confirmed static 2026-04-19) -- **SSH:** `ssh -i ~/.ssh/id_ed25519 Administrator@172.16.3.36` (key auth) -- **Authorized keys (verified via RMM 2026-05-26):** `gururmm-build@gururmm-server` and `guru@gururmm-build` (the build server's keys), present in both `C:\ProgramData\ssh\administrators_authorized_keys` and `Administrator\.ssh\authorized_keys`. The old `guru@DESKTOP-0O8A1RL` key (retired machine) has already been rotated out. NOTE: no personal-workstation key (e.g. GURU-5070) is currently authorized — the `ssh -i ~/.ssh/id_ed25519 Administrator@172.16.3.36` workflow below works only from a host whose pubkey is in the file; add GURU-5070's pubkey to `administrators_authorized_keys` if you need direct workstation SSH. + +## Access + +- **SSH (preferred):** `ssh -i ~/.ssh/id_ed25519 Administrator@172.16.3.36` — key auth only. +- **Authorized keys (verified via RMM 2026-05-26):** `gururmm-build@gururmm-server` and `guru@gururmm-build` (the build server's keys), in both `C:\ProgramData\ssh\administrators_authorized_keys` and `Administrator\.ssh\authorized_keys`. The retired `guru@DESKTOP-0O8A1RL` key has been rotated out. NO personal-workstation key (e.g. GURU-5070) is currently authorized — add the workstation's pubkey to `administrators_authorized_keys` if you need direct SSH. +- **Fallback when SSH key isn't authorized:** PLUTO runs a GuruRMM agent (client "AZ Computer Guru"). Use `/rmm` — resolve hostname **PLUTO** → agent id at runtime (IDs change on re-enroll; do not hardcode) — to run PowerShell on it. +- **When SSH fails:** check Jupiter's VM console first (the VM, not the physical host). + +### `administrators_authorized_keys` gotcha + +The file must be **ASCII-encoded** with `icacls /inheritance:r` set. PowerShell `>` writes UTF-16 BOM and silently breaks SSH key auth. 
Write keys with: +```powershell +[System.IO.File]::WriteAllText('C:\ProgramData\ssh\administrators_authorized_keys', $keyContent, [System.Text.Encoding]::ASCII) +``` ## Installed Toolchain - **Rust:** stable-x86_64-pc-windows-msvc (rustup at `C:\Users\Administrator\.cargo\bin`) -- **VS Build Tools:** Installed with `Microsoft.VisualStudio.Workload.VCTools` (MSVC linker, CRT, Windows SDK) +- **VS Build Tools:** `Microsoft.VisualStudio.Workload.VCTools` (MSVC linker, CRT, Windows SDK) +- **WiX 4:** for MSI builds +- **Code signing:** Azure Trusted Signing (via `signtool`) - **Git:** v2.47.1.windows.2 - **OpenSSH:** Win32-OpenSSH, sshd set to Automatic startup ## Use Cases -Use Pluto when you need a **native Windows MSVC build** — produces proper `.exe` files with no MinGW runtime dependency. Examples: -- Utilities (internal tooling, one-off scripts compiled to EXE) -- Howard's tech tools (MasterBooter, Slint GUI apps, etc.) -- GuruRMM agent MSVC builds (when MSVC target is preferred over the automated MinGW build on the Linux server) -- Anything using Windows-only APIs or needing code signing via signtool +Native Windows MSVC builds — produces `.exe` with no MinGW runtime dependency. Examples: +- Internal utilities and one-off compiled scripts +- Howard's tech tools (MasterBooter, Slint GUI apps) +- GuruRMM Windows agent variants (amd64, x86, legacy, debug) and MSI packaging +- Anything using Windows-only APIs or needing `signtool` signing **Note:** Routine GuruRMM agent builds are automated on the Linux server (172.16.3.30) via MinGW + jsign. Use Pluto for MSVC-specific builds or one-off tooling. ## Directory Layout -- `C:\builds\` — general project builds (create a subdirectory per project) +- `C:\builds\` — general project builds (subdirectory per project) - `C:\gururmm\` — GuruRMM repo clone -## Typical Build Workflow +## Typical Workflow ```bash -# 1. SSH in ssh -i ~/.ssh/id_ed25519 Administrator@172.16.3.36 - -# 2. 
Clone or pull project
 git clone https://azcomputerguru:@git.azcomputerguru.com/azcomputerguru/.git C:\builds\
-
-# 3. Build
 cd C:\builds\
 cargo build --release
-
-# 4. SCP output back
 # From workstation:
 scp -i ~/.ssh/id_ed25519 Administrator@172.16.3.36:"C:/builds//target/release/.exe" ./
 ```
+## GuruRMM agent on PLUTO
+
+Historically runs old versions (0.6.3 as of 2026-05-15). Update it after any Pluto maintenance.
+
 ## Not Neptune
-Neptune is a separate existing server (email/web hosting). Pluto is only for builds.
+Neptune is a separate server (email/web hosting). Pluto is for builds only.
diff --git a/.claude/memory/reference_resource_map.md b/.claude/memory/reference_resource_map.md
new file mode 100644
index 0000000..d938b6e
--- /dev/null
+++ b/.claude/memory/reference_resource_map.md
@@ -0,0 +1,264 @@
+---
+name: ACG resource map — what I have access to and how to connect
+description: Cheatsheet for every resource ACG has access to (servers, services, APIs, M365 tenants, MSP tools). For each: what it is, default access method, per-machine exceptions (if any), gotchas, and pointer to the existing detail file. Use this FIRST when a task says "connect to X" / "check Y" — don't search; look here.
+type: reference
+---
+
+**Use this first.** When a task references a resource ("ssh into Jupiter", "check Syncro", "look at the Cascades tenant"), look here BEFORE searching for credentials or trying random connection methods. This is the lookup table; the detail lives in the linked `reference_*` / `project_*` files.
+
+## First principles (apply to ~everything)
+
+- **Vault wrapper** (NEVER hardcode the vault path):
+  ```bash
+  VAULT="$CLAUDETOOLS_ROOT/.claude/scripts/vault.sh"
+  bash "$VAULT" get-field # e.g.
infrastructure/gururmm-server.sops.yaml credentials.password
+  bash "$VAULT" search # search without decrypting
+  bash "$VAULT" list # full inventory
+  ```
+  Reads `vault_path` from `.claude/identity.json` per-machine (Windows `c:/Users/guru/vault`, Mac `~/vault`, etc.).
+
+- **Tailscale must be on** to reach anything on `172.16.x.x` from outside the office. Office LAN is `172.16.0.0/22`.
+
+- **SSH on Windows:** always use **system OpenSSH** (`C:\Windows\System32\OpenSSH\ssh.exe`), **NEVER Git for Windows SSH**. Git for Windows ssh has subtle key handling differences that break auth silently.
+
+- **Git Bash on Windows:** never redirect to Windows paths with backslashes (`echo X > D:\path`) — Git Bash strips backslashes and substitutes the colon with a Unicode PUA char, creating a garbled junk file. Use forward slashes (`/d/path`) or workspace-relative paths.
+
+- **1Password fallback:** service-account token in vault at `infrastructure/1password-service-account.sops.yaml`. Set `OP_SERVICE_ACCOUNT_TOKEN`, then `op read "op://Vault/Item/field"`. Each workstation's age key backup lives at `op://Infrastructure/age Key - `.
+
+---
+
+## Office servers & VMs (all on Tailscale + 172.16.0.0/22)
+
+### Jupiter — Unraid primary (172.16.3.20)
+- **What:** Unraid host. Runs ALL ACG VMs (GuruRMM server, OwnCloud, UniFi, Pluto, etc.) and the Docker stack (NPM, Gitea, Seafile).
+- **Default:** `ssh root@172.16.3.20`. Password `infrastructure/jupiter-unraid-primary.sops.yaml` `credentials.password`. iDRAC out-of-band at 172.16.1.73.
+- **Notes:** `guru@wsl` + `guru@gururmm-build` + Mac keys all authorized. Unraid web UI on port 80 — use VM console when a VM's SSH fails.
+- Detail: [[infra_office_network]].
+
+### gururmm-server (172.16.3.30, hostname `gururmm`)
+- **What:** Linux VM on Jupiter. THE workhorse — runs MariaDB, PostgreSQL, ClaudeTools API (`:8001`), GuruRMM API (`:3001`), GuruConnect server (`:3002`), coord API, Gitea Actions runner, build pipeline, webhook.
+- **Default:** `ssh guru@172.16.3.30`. Password `infrastructure/gururmm-server.sops.yaml` `credentials.password`. User is **`guru`** NOT `mike`. Home `/home/guru/`.
+- **Gotcha:** for cargo/protoc/PATH, use a **login shell**: `ssh guru@172.16.3.30 'bash -lc "..."'`. Non-interactive shell doesn't source `~/.profile` and these look "missing".
+- **Layout:** repo at `/home/guru/gururmm`, build pipeline at `/opt/gururmm/` (auto-synced from repo `deploy/build-pipeline/` by `build-shared.sh`).
+- Detail: [[reference_gururmm]], [[project_gururmm]], [[project_guruconnect]].
+
+### Pluto — Windows build VM (172.16.3.36, Unraid VM "Claude-Builder")
+- **What:** Windows Server 2019 VM. Native MSVC builds — Rust, WiX MSI, Azure Trusted Signing.
+- **Default:** `ssh -i ~/.ssh/id_ed25519 Administrator@172.16.3.36` (key auth, no password).
+- **Per-machine:** Only `gururmm-build@gururmm-server` and `guru@gururmm-build` keys are authorized. From **GURU-5070** (Mike's main) the pubkey is NOT authorized → use `/rmm` (PLUTO agent) instead of trying SSH.
+- **Gotcha:** if adding a key, `administrators_authorized_keys` MUST be ASCII. PowerShell `>` writes UTF-16 BOM and silently breaks SSH. Use `[System.IO.File]::WriteAllText(..., $key, [System.Text.Encoding]::ASCII)`.
+- Detail: [[reference_pluto_build_server]].
+
+### IX server (172.16.3.10 / ix.azcomputerguru.com)
+- **What:** Rocky Linux cPanel/WHM. 40+ client WordPress sites + Matomo + Flarum forum + radio show site.
+- **Default:** `ssh root@172.16.3.10`. Password `infrastructure/ix-server.sops.yaml` `credentials.password`. Tailscale-reachable directly (no separate VPN). WHM at `:2087`, cPanel at `:2083`.
+- **Per-machine:** **GURU-5070's pubkey is NOT authorized** (was CachyOS, reinstalled to Win11, key never re-added) → use `sshpass -p "$PASSWORD" ssh -o StrictHostKeyChecking=no -o PubkeyAuthentication=no root@172.16.3.10`. Suppress warnings with `| grep -v WARNING`. Other machines: re-verify per machine.
+- Detail: [[reference_ix_server_access]].
+
+### Uranus — Unraid secondary (172.16.3.21)
+- **What:** Unraid secondary. Pavon archive storage, planned future Windows build VM. Low RAM (7.7GB).
+- **Default:** `ssh root@172.16.3.21`. Password `infrastructure/uranus-unraid.sops.yaml`.
+- **Note:** NOT the Seafile proxy. Mounted as OwnCloud external storage (SMB → `/Archive`).
+
+### OwnCloud VM (172.16.3.22 / cloud.acghosting.com)
+- **What:** Rocky Linux 9.6 VM on Jupiter. OwnCloud file sync.
+- **Default:** SSH per `infrastructure/owncloud-vm.sops.yaml`.
+- **Note:** distinct from Seafile (`sync.azcomputerguru.com` is Seafile on Jupiter Docker).
+
+### Neptune (67.206.163.124 / neptune.acghosting.com)
+- **What:** Exchange Server 2016. **Physically at Dataforth's D2 facility**, NOT the ACG office (despite the `acghosting.com` name). Email for ACG-hosted clients.
+- **Default:** RDP/admin via `clients/dataforth/neptune-exchange.sops.yaml`. OWA at `https://neptune.acghosting.com/owa/`.
+- **Note:** to reach from the ACG office, route via D2TESTNAS (192.168.0.9) — Dataforth UDM subnet overlaps 172.16.x.x. **It is NOT Dataforth's mail system** — Dataforth uses M365 (see below).
+
+### WebSvr (162.248.93.81 / websvr.acghosting.com)
+- **What:** Legacy CentOS 7 cPanel. DNS for ACG Hosting domains + some legacy sites.
+- **Default:** `ssh root@websvr.acghosting.com`. `infrastructure/websvr-legacy-hosting.sops.yaml`.
+
+### pfSense firewall (172.16.0.1)
+- **What:** FreeBSD pfSense 2.8.1. Firewall + OpenVPN + Tailscale subnet router for 172.16.0.0/22.
+- **Default:** SSH on **port 2248** (not 22), user `admin`. Creds `infrastructure/pfsense-firewall.sops.yaml`. Web UI `https://172.16.0.1`.
+- **Gotcha:** Tailscale gateway — losing pfSense = no remote access to anything in office. Don't drop SSH/Tailscale config without an alternative path verified.
+
+---
+
+## Office network services (Docker on Jupiter)
+
+### Gitea — internal (`http://172.16.3.20:3000` / `https://git.azcomputerguru.com`)
+- **What:** Self-hosted git. ALL ACG repos (`claudetools`, `gururmm`, `guru-connect`, `vault`, projects).
+- **Default:** for API/automation use **internal** `http://172.16.3.20:3000` (bypasses NPM SSL-renewal blips). For Howard-attributed PR merges: `services/gitea-howard.sops.yaml` `credentials.password`. For admin API: `services/gitea.sops.yaml` `credentials.api.api-token`. Git over SSH: `ssh://git@172.16.3.20:2222`.
+- **Gotcha:** public `git.azcomputerguru.com` is **NOT** behind Cloudflare — it's the office Cox IP NAT'd to NPM. Internal `:3000` is more reliable.
+- Detail: [[reference_gitea_internal]], [[reference_gitea_api_credential]].
+
+### NPM (Nginx Proxy Manager)
+- **What:** openresty reverse proxy for all `*.azcomputerguru.com` services.
+- **Default:** admin UI `http://172.16.3.20:7818`. `services/npm.sops.yaml`.
+- **Note:** proxy configs at `/data/nginx/proxy_host/*.conf` on Jupiter. Cert renewals briefly drop external `:443`.
+
+### Seafile Pro (`sync.azcomputerguru.com`)
+- 11.8TB file sync. `services/seafile-pro.sops.yaml`.
+
+### Cloudflare (DNS for `azcomputerguru.com`)
+- API tokens in `services/cloudflare.sops.yaml`. Analytics record is proxied; git is NOT.
+
+### GoDaddy API
+- Domain registrar API. `services/godaddy-api.sops.yaml`.
+
+---
+
+## PSA / ticketing
+
+### Syncro — primary (`computerguru.syncromsp.com`)
+- **What:** Primary PSA / RMM (Kabuto agent). ACG's tickets, invoices, customers, time entries.
+- **Default:** API key `msp-tools/syncro.sops.yaml` `credentials.api_key`; Howard's own key `msp-tools/syncro-howard.sops.yaml`. Base `https://computerguru.syncromsp.com/api/v1`. Skill: `/syncro`.
+- **Gotchas:** **NO idempotency on any endpoint — ALWAYS GET before retrying any POST.** Content-Type header required. Comments need `subject`.
`add_line_item` uses internal ticket ID, not ticket number. Timers no longer used for billing.
+- Detail: [[feedback_syncro_api]], [[feedback_syncro_billing]], [[feedback_syncro_workflow]], [[feedback_syncro_history]].
+
+### Autotask — secondary
+- **What:** Legacy/secondary PSA. **Default to Syncro** unless task explicitly says "Autotask".
+- **Default:** `msp-tools/autotask.sops.yaml` (API username, password, integration code; zone `webservices5.autotask.net`).
+- Detail: [[feedback_psa_default_syncro]].
+
+---
+
+## RMM / remote control
+
+### GuruRMM — ACG's own (`rmm.azcomputerguru.com`)
+- **What:** Rust/Axum server @ `172.16.3.30:3001`. Agents on all ACG-managed endpoints. Drives `/rmm` skill.
+- **Default:** JWT login `POST /api/auth/login`. Creds `infrastructure/gururmm-server.sops.yaml` fields `credentials.gururmm-api.admin-email` / `admin-password`. External `https://rmm-api.azcomputerguru.com`. Dashboard `https://rmm.azcomputerguru.com`.
+- **Gotchas:** use `context: "user_session"` for cmdlets that fail as SYSTEM with "NonInteractive mode" (see [[reference_gururmm]]). Linux agent runs in a **systemd sandbox** — `findmnt`/`/proc/mounts` from the agent lie (sandbox view, not host). SSH the host directly for ground truth.
+- Detail: [[reference_gururmm]], [[project_gururmm]], [[feedback_gururmm]].
+
+### ScreenConnect / CW Control
+- Primary remote-access tool. `msp-tools/screenconnect.sops.yaml`.
+- **Gotcha:** Toolbox scripts truncate lines >80 chars silently; no inline comments mid-script. See [[reference_msp_audit_scripts]].
+
+### Splashtop (SOS / Streamer)
+- Secondary remote-access in the stack. Portal — verify vault entry if needed.
+
+### Datto RMM (CagService / Aemagent)
+- Part of ACG stack on managed endpoints. **Expected, not a threat.** Portal creds — verify in vault.
+
+### GuruConnect — ACG's own (`connect.azcomputerguru.com`)
+- **What:** ACG's own remote-access product. v2 live since 2026-05-30.
Native-first, full key fidelity, bidirectional file transfer.
+- **Default:** server `172.16.3.30:3002` behind NPM. Portal creds `projects/guruconnect/portal.sops.yaml`. DB `projects/guruconnect/database.sops.yaml`.
+- Detail: [[project_guruconnect]].
+
+---
+
+## Security / EDR / AV
+
+### Bitdefender GravityZone (Cloud MSP partner tenant)
+- **What:** ACG partner tenant. Endpoint AV/EDR.
+- **Default:** API creds `msp-tools/gravityzone.sops.yaml`. Skill: `/bitdefender`.
+- **Gotcha:** skill talks to **live production** partner tenant — destructive ops gated.
+
+### Datto EDR / Datto AV
+- **What:** Managed AV on ACG endpoints. When active, **Windows Defender real-time is OFF by design** — that's expected, not a gap.
+- Detail: [[reference_acg_msp_stack]].
+
+---
+
+## Cloud storage
+
+### Backblaze B2
+- **What:** Per-client MSP360/CloudBerry backup destinations. Account ID `46f69bc61163`, region `us-west-001`.
+- **Default:** API key `projects/claudetools/backblaze-b2.sops.yaml`. Skill: `/b2`.
+
+### MSP360 API (backup orchestration)
+- `msp-tools/msp360-api.sops.yaml`.
+
+---
+
+## M365 / Google Workspace tenants
+
+ACG manages multiple M365 tenants via the **ComputerGuru tiered MSP app suite** (Security Investigator / Exchange Operator / User Manager / Tenant Admin / Defender Add-on / Intune Manager). Per-tenant tokens in `msp-tools/computerguru-*.sops.yaml`. Use the **`/remediation-tool`** skill — NOT CIPP (CIPP creds exist at `msp-tools/cipp.sops.yaml` but the ComputerGuru suite is the primary path).
+
+| Tenant | Vault path |
+|--------|------------|
+| ACG own (computerguru) | `msp-tools/computerguru-*.sops.yaml` (partner tenant) |
+| Dataforth | `clients/dataforth/m365.sops.yaml` |
+| Cascades Tucson | `clients/cascades-tucson/m365-admin.sops.yaml`, `m365-sysadmin.sops.yaml` |
+| QuantumWMS | `clients/quantumwms/m365-breakglass.sops.yaml` |
+| BG Builders | `clients/bg-builders/m365.sops.yaml` |
+| MVAN | `clients/mvan/m365.sops.yaml` |
+| Heieck.org | `clients/heieck-org/m365.sops.yaml` |
+| CW Concrete | `clients/cw-concrete/m365.sops.yaml` |
+| Kittle (M. Sanchez) | `clients/kittle/m365-michael-sanchez.sops.yaml` |
+
+Also: multi-tenant Graph API service principal at `msp-tools/claude-msp-access-graph-api.sops.yaml`.
+
+**Google Workspace:** ACG service account `msp-tools/acg-msp-access-google-workspace.sops.yaml`. Client-specific: `clients/lonestar-electrical/google-workspace.sops.yaml`.
+
+Detail: [[project_cascades]], [[project_dataforth]], [[project_quantum_godaddy_m365_tenant]].
+
+---
+
+## Internal APIs (all on `172.16.3.30`)
+
+### ClaudeTools main API (`:8001`)
+- 95+ endpoints, JWT auth, MariaDB. Docs `/api/docs`. Auth creds `projects/claudetools/api-auth.sops.yaml`.
+
+### ClaudeTools coord API (`:8001/api/coord`)
+- Inter-session coordination (locks, messages, todos, component state). **NO AUTH.** Direct curl. Spec in `CLAUDE.md` + [[reference_coord_messages_api_shape]].
+
+### GuruRMM API (`:3001`) / GuruConnect API (`:3002`)
+- See respective sections above.
+
+---
+
+## Other services
+
+### Matomo Analytics (`analytics.azcomputerguru.com`)
+- PHP analytics on IX server. Tracks 3 sites. Creds `services/matomo-analytics.sops.yaml` (verify; older docs hardcoded the password — should now be vault-only).
+- Detail: [[reference_matomo_analytics]].
+
+### Flarum forum (`community.azcomputerguru.com`)
+- Flarum 1.8.14 on IX server cPanel `azcomputerguru`. Skill: `/forum-post`.
+- **Gotcha:** **Cloudflare blocks external Flarum API calls.** Must SSH to IX and run PHP/DB script — the `/forum-post` skill handles this via paramiko SSH.
+- Detail: [[reference_community_forum]].
+
+### Radio show (`radio.azcomputerguru.com`)
+- Astro static site, source at `projects/radio-show/website/`. Build `npm run build` → rsync `dist/` to IX server cPanel.
+- Detail: [[reference_radio_website]].
+
+### TickTick
+- OAuth creds `services/ticktick.sops.yaml`. MCP server + token cache at `mcp-servers/ticktick/.tokens.json`. Detail: [[reference_ticktick_integration]].
+
+### Ollama (local, per-machine)
+- **Tier-0 LLM** (drafts, summaries, classification). Endpoint per-machine in `.claude/identity.json` `.ollama.endpoint`. Models: `qwen3:14b` / `qwen3.6` (structured) / `codestral:22b` (code). See `.claude/OLLAMA.md`.
+
+### GrepAI (local watcher + MCP server)
+- Semantic code search over `claudetools/` + `session-logs/`. MCP tools `grepai_search`, `grepai_trace_callers/callees`. CLI `$CLAUDETOOLS_ROOT/grepai search`. Watcher runs as scheduled task per machine.
+
+### Discord bot
+- `projects/discord-bot/anthropic-api.sops.yaml` + `bot-token.sops.yaml`. Runs as `.venv/Scripts/python.exe -m bot.main` from `projects/discord-bot/`.
+
+### Azure Trusted Signing
+- Windows code signing (Pluto `signtool`). `services/azure-trusted-signing.sops.yaml`.
+
+### Apple Developer Program
+- macOS code signing + MDM Push cert. `infrastructure/apple-developer-program.sops.yaml`. **MDM Push cert renews annually on the same Apple ID** or enrolled iOS devices break. See [[project_apple_mdm_certs]].
+
+---
+
+## Client systems (per-client vault pattern)
+
+Every managed client has access entries at `clients//.sops.yaml`.
Examples by frequency: **Cascades Tucson** (pfSense / Synology / CS-SERVER / accountant PC / multiple admin accounts), **Dataforth** (AD1, AD2, ESXi 122/124, D2TESTNAS, PBX, UDM, Neptune, M365, OAuth), **VWP** (UDM / DC1 / XenServer / iLO / etc.), **Peaceful Spirit** (server + L2TP VPN), plus: Anaise, BG Builders, Birth Biologic, CryoWeave, CW Concrete, Grabb & Durando, Heieck, IMC, Khalsa, Kittle, Lens Auto Brokerage, Lonestar Electrical, MVAN, QuantumWMS, Rednour, Scileppi, Sif-Oidak, Sombra Residential, Stamback Septic, Tucson Golden Corral, Key Paul, Glaztech (GuruRMM site key only). Sweep `bash $VAULT search ` first.
+
+Doc layout (`overview/network/servers/cloud/security/rmm`) and wiki articles at `wiki/clients/.md`. Detail: [[reference_client_docs_structure]].
+
+**Notable gotcha — D2TESTNAS:** `root@192.168.0.9` with `Paper123!@#` (NOT `sysadmin`). See [[feedback_d2testnas_ssh]].
+
+---
+
+## Per-machine access gotchas (consolidated)
+
+| Machine | Gotchas |
+|---------|---------|
+| **GURU-5070** (Mike's Win11 primary) | IX pubkey not authorized → use `sshpass`. Pluto pubkey not authorized → use `/rmm` agent PLUTO instead. Has full local Rust toolchain (`cargo` + MSVC + `protoc`) — build GuruConnect locally; set `$env:PROTOC` to the winget path. See [[reference_guru5070_rust_toolchain]]. |
+| **GURU-BEAST-ROG** (Win11 secondary) | Verify SSH key deployment per resource. See [[machine_windows_guru_setup_status]]. |
+| **GURU-KALI** (Linux) | Subject to GuruRMM agent sandbox issue ([[reference_gururmm]] §sandbox) for Linux-agent dispatched commands. |
+| **Mikes-MacBook-Air** | gururmm `install-hooks.sh` still pending — see [[project_gururmm]]. Vault path is `~/vault`. |
+| **Howard-Home / ACG-TECH03L** | Vault path varies — read from `.claude/identity.json` `vault_path`. |
+| **All Windows machines** | Use **system OpenSSH** (`C:\Windows\System32\OpenSSH\ssh.exe`) NEVER Git for Windows SSH.
NEVER redirect to backslashed Windows paths from Git Bash (`echo X > D:\path` corrupts to junk file). |
+| **All machines** | Tailscale must be on for any `172.16.x.x` from outside office. |
diff --git a/.claude/memory/reference_rmm_agent_runs_in_systemd_sandbox.md b/.claude/memory/reference_rmm_agent_runs_in_systemd_sandbox.md
deleted file mode 100644
index 6e52cb0..0000000
--- a/.claude/memory/reference_rmm_agent_runs_in_systemd_sandbox.md
+++ /dev/null
@@ -1,36 +0,0 @@
----
-name: reference_rmm_agent_runs_in_systemd_sandbox
-description: Commands dispatched via the GuruRMM agent execute INSIDE the agent's systemd sandbox (ProtectSystem=strict) — fs/mount observations reflect the agent's private namespace, NOT the host. For host truth, SSH directly or read /proc//mountinfo.
-metadata:
-  type: reference
----
-
-The GuruRMM Linux agent runs as a systemd service (`gururmm-agent.service`) hardened with
-**`ProtectSystem=strict`**, which gives the agent process a **private mount namespace where `/`
-is mounted read-only**, with only `ReadWritePaths=` entries writable. **Any command you dispatch
-through the RMM agent (`/rmm shell`, probes) runs inside that namespace** — so `findmnt /`,
-`touch`, `/proc/mounts` etc. report the **agent's sandboxed view, not the host's actual state**.
-
-**Trap (hit 2026-06-01, GURU-KALI):** I diagnosed "host root filesystem is read-only" because
-RMM-dispatched `touch /var/lib/gururmm` returned EROFS (os error 30) and `findmnt /` showed `ro`.
-The host root was **rw the entire time** (SMART PASSED, ext4 clean, no kernel remount-ro — all
-consistent with the host being fine). The real cause: the unit's
-`ReadWritePaths=/var/log /usr/local/bin /etc/gururmm` **omitted `/var/lib/gururmm`**, so the agent
-couldn't persist `/var/lib/gururmm/.device-id` → it re-minted a device_id on each daily
-identity refresh → the server (no machine_uid dedup) filed a new agent row each time (~11 ghosts).
-
-**How to get host truth instead of the sandbox view:**
-- SSH to the host directly (commands there run in the host namespace), OR
-- Read the agent PID's namespace explicitly: `cat /proc//mountinfo` — the process-scoped
-  `ro` on `/` is the tell that it's sandbox, not host. Compare against the host's `findmnt`.
-- `errors=remount-ro` in a mount line is just the stock default mount option — NOT evidence an
-  error fired. Confirm an actual remount-ro with kernel `EXT4-fs error` logs + `dumpe2fs -h` error
-  count, not the mount option alone.
-
-**The fix pattern** (durable, additive): drop-in
-`/etc/systemd/system/gururmm-agent.service.d/override.conf` with `[Service]\nReadWritePaths=/var/lib/gururmm`
-(systemd merges ReadWritePaths additively across drop-ins), then `daemon-reload` + `restart`.
-Better upstream fix: `StateDirectory=gururmm` (handles dir creation + perms + RW bind in one
-directive). **Fleet implication:** every systemd-installed GuruRMM Linux agent with this unit shape
-has the same latent bug until the installer is fixed. See filed todos (agent ReadWritePaths/
-StateDirectory + server machine_uid dedup).
diff --git a/.claude/memory/user_font_preference.md b/.claude/memory/user_font_preference.md
new file mode 100644
index 0000000..a53ac40
--- /dev/null
+++ b/.claude/memory/user_font_preference.md
@@ -0,0 +1,14 @@
+---
+name: user-font-preference
+description: Mike prefers Lucida Console as his terminal/editor font
+metadata:
+  node_type: memory
+  type: user
+  originSessionId: 0f674028-fca4-4ab4-95c7-aaf47083b031
+---
+
+Mike's preferred monospace font is **Lucida Console**.
+
+Apply when configuring terminal fonts, editor fonts, or any UI font choice
+where a monospace face is needed and Mike hasn't specified a different one
+for that context.