Files

Mike Swanson 8e35986765 sync: auto-sync from GURU-5070 at 2026-05-28 14:27:08

Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-05-28 14:27:08

2026-05-28 14:27:12 -07:00

24 KiB

Raw Blame History

description

description
Run commands, investigate agents, and execute remote scripts via the GuruRMM agent fleet.

/rmm — GuruRMM remote command execution

Interact with the GuruRMM agent fleet: list agents, run remote commands (PowerShell, shell, Python), poll for output, and cancel in-flight tasks.

Default posture: read before write. Always look up the agent by hostname before dispatching a command. Never guess a UUID.

Usage

/rmm                                    List all agents with connection status
/rmm agents [<client>]                  List agents, optionally filtered by client/site name
/rmm agent <hostname|uuid>              Show agent detail + last 10 commands
/rmm run <hostname|uuid>                Interactively compose and dispatch a script
/rmm shell <hostname|uuid> <cmd>        One-liner bash/sh command (Linux/macOS agents)
/rmm ps <hostname|uuid> <cmd>           One-liner PowerShell command (Windows agents)
/rmm status <command_id>                Check command status
/rmm output <command_id>                Fetch full stdout + stderr for a completed command
/rmm cancel <command_id>                Cancel a pending or running command
/rmm history <hostname|uuid> [N]        Recent command history (default 10, max 500)

API configuration

Base URL: http://172.16.3.30:3001 Auth: JWT (24-hour expiry) — obtain via POST /api/auth/login, pass as Authorization: Bearer <token> Credentials vault path: infrastructure/gururmm-server.sops.yaml

credentials.gururmm-api.admin-email
credentials.gururmm-api.admin-password

Hard rules (incidents have occurred — no exceptions)

Never hardcode a UUID. Agent UUIDs change when an agent re-enrolls. Always resolve hostname → UUID via GET /api/agents at the start of every workflow. The UUID from memory or a prior session may be stale.

Never guess the shell from the hostname. Use os_type from the agent record: "windows" → powershell, "linux" → shell, "macos" → shell. A machine named "WEST-MEADOW-9025" could be macOS. Check os_type.

Initial status "pending" is not a failure. It means the agent is offline and the command is queued. The agent will execute it when it reconnects. Poll patiently; do not cancel and retry.

Status "failed" with stderr = "Command timed out (server-side reaper)" means the command exceeded its timeout. The server reaper transitions running → failed rather than introducing a separate timeout status. Read stderr before concluding the script had a bug.

Status "interrupted" means the agent restarted mid-execution. The command was orphaned. The server marks it interrupted when the agent reconnects. Re-run if needed.

context: "user_session" requires an active logged-on user. If no interactive user is signed in, the WTS impersonation call will fail. Check whether a user is logged in before choosing this context. Falls back silently to no output if no desktop session exists.

No interactive input. Agent shells are non-interactive. Scripts must not call Read-Host, pause, read -p, or any prompt. Pass all values as variables or parameters in the script body.

Output is streamed in full but stored as a string. For very large output (log files, directory trees), write the output to a file on the endpoint and fetch it with a second command.

elevated: true sends the flag to the agent but does not guarantee elevation. On Windows the agent is already running as SYSTEM; elevated is a hint for future agent behaviour. Do not rely on it for privilege escalation on endpoints where the agent is not already elevated.

Heredoc payloads — always use --data-binary @- and <<'JSON' for static payloads. On Windows, the Write tool and Git Bash resolve /tmp/foo.json to different directories. Heredoc avoids the file handoff entirely. Use <<'JSON' (single-quoted) for static payloads that contain no shell variables; use <<JSON (unquoted) when interpolating ${VARS}.

After every dispatch: post a one-line alert to #bot-alerts. Same rule as Syncro — write operations (dispatch + cancel) get a bot alert. Read operations (list, status, output) do not.

Phase 0 — Bootstrap (run once per session)

IDENTITY_PATH="${HOME}/.claude/identity.json"
if [ ! -f "$IDENTITY_PATH" ]; then
    IDENTITY_PATH=$(git rev-parse --show-toplevel 2>/dev/null)/.claude/identity.json
fi
REPO_ROOT=$(jq -r '.claudetools_root // empty' "$IDENTITY_PATH" 2>/dev/null)
if [ -z "$REPO_ROOT" ]; then
    REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
fi
VAULT="$REPO_ROOT/.claude/scripts/vault.sh"
RMM="http://172.16.3.30:3001"

RMM_EMAIL=$(bash "$VAULT" get-field infrastructure/gururmm-server.sops.yaml credentials.gururmm-api.admin-email)
RMM_PASS=$(bash "$VAULT" get-field infrastructure/gururmm-server.sops.yaml credentials.gururmm-api.admin-password)

JWT=$(curl -s -X POST "$RMM/api/auth/login" \
  -H "Content-Type: application/json" \
  --data-binary @- <<JSON
{"email": "$RMM_EMAIL", "password": "$RMM_PASS"}
JSON
)
TOKEN=$(echo "$JWT" | jq -r '.token // empty')
if [ -z "$TOKEN" ]; then
    echo "[ERROR] RMM login failed: $JWT"
    exit 1
fi
echo "[OK] Authenticated to GuruRMM"

Reuse $TOKEN, $RMM, and $REPO_ROOT for all subsequent calls in the session. Do not re-authenticate unless you get a 401.

Resolve agent by hostname

Always resolve before dispatching. Partial hostname matches are accepted (grep client-side).

AGENTS=$(curl -s "$RMM/api/agents" -H "Authorization: Bearer $TOKEN")

# Find by exact or partial hostname (case-insensitive)
AGENT=$(echo "$AGENTS" | jq --arg h "HOSTNAME" '[.[] | select(.hostname | ascii_downcase | contains($h | ascii_downcase))] | .[0]')

AGENT_ID=$(echo "$AGENT" | jq -r '.id // empty')
AGENT_HOST=$(echo "$AGENT" | jq -r '.hostname // empty')
AGENT_OS=$(echo "$AGENT" | jq -r '.os_type // empty')
AGENT_STATUS=$(echo "$AGENT" | jq -r '.status // empty')
AGENT_LAST=$(echo "$AGENT" | jq -r '.last_seen // "never"')
IS_CONNECTED=$(echo "$AGENTS" | jq -r --arg id "$AGENT_ID" '.[] | select(.id == $id) | .is_connected // false')

if [ -z "$AGENT_ID" ]; then
    echo "[ERROR] No agent found matching hostname. Run /rmm agents to list enrolled agents."
    exit 1
fi
echo "[OK] Found agent: $AGENT_HOST ($AGENT_OS) — id=$AGENT_ID, connected=$IS_CONNECTED, last_seen=$AGENT_LAST"

If multiple agents match: list the matches and ask the user to disambiguate. Never pick the first silently when there are multiple.

If is_connected = false: warn the user that the command will queue and execute when the agent comes back online. Still dispatch unless the user says to abort.

Agent list display

WEST-MEADOW-9025   macos    online    Scileppi Law          last: 2026-05-28T18:00:00Z  v0.3.4
CS-SERVER          windows  online    Cascades of Tucson    last: 2026-05-28T17:55:00Z  v0.3.4
AD2                windows  offline   ACG Internal          last: 2026-05-27T09:10:00Z  v0.3.2

Show: hostname, os_type, online/offline, client_name (from site_name/client_name), last_seen, agent_version.

Send a command

Determine command_type from os_type

`os_type`	Default `command_type`	Notes
`windows`	`powershell`	Runs via `powershell.exe -NonInteractive` as SYSTEM
`linux`	`shell`	Runs via `/bin/bash`
`macos`	`shell`	Runs via `/bin/bash` (or `/bin/zsh` on newer installs)

Use python only when explicitly writing a Python script. Use script for saved scripts (not covered in this skill).

Basic dispatch

# For PowerShell (Windows)
CMD_RESP=$(curl -s -X POST "$RMM/api/agents/$AGENT_ID/command" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data-binary @- <<'JSON'
{
  "command_type": "powershell",
  "command": "Get-ComputerInfo | Select-Object CsName, OsName, OsVersion, CsProcessors",
  "timeout_seconds": 60
}
JSON
)

# For shell (Linux/macOS)
CMD_RESP=$(curl -s -X POST "$RMM/api/agents/$AGENT_ID/command" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data-binary @- <<'JSON'
{
  "command_type": "shell",
  "command": "df -h / && uptime",
  "timeout_seconds": 60
}
JSON
)

CMD_ID=$(echo "$CMD_RESP" | jq -r '.command_id // empty')
CMD_STATUS=$(echo "$CMD_RESP" | jq -r '.status // empty')

if [ -z "$CMD_ID" ]; then
    echo "[ERROR] Dispatch failed: $CMD_RESP"
    exit 1
fi
echo "[OK] Command dispatched — id=$CMD_ID, initial_status=$CMD_STATUS"

Initial status values:

"running" — agent is online and received the command
"pending" — agent is offline; command queued

Multi-line scripts (heredoc with variable interpolation)

For scripts that need shell variable values substituted at the point of the curl call:

# Use <<JSON (unquoted) so shell expands ${VARS} in the payload
# Use jq -n --arg to safely JSON-encode the script text before embedding
SCRIPT='
$path = "C:\Users\$env:USERNAME\Downloads"
Write-Host "Size: $((Get-ChildItem $path -Recurse | Measure-Object Length -Sum).Sum / 1GB) GB"
'
PAYLOAD=$(jq -n \
  --arg ct "powershell" \
  --arg cmd "$SCRIPT" \
  --argjson to 120 \
  '{command_type: $ct, command: $cmd, timeout_seconds: $to}')

CMD_RESP=$(curl -s -X POST "$RMM/api/agents/$AGENT_ID/command" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD")

jq -n --arg cmd "$SCRIPT" correctly JSON-encodes the script including backslashes, dollar signs, and newlines. Always use this pattern for multi-line scripts — never embed raw script text into JSON by hand.

Run as logged-on user (`context: user_session`)

Use when the command requires a desktop session: GUI actions, Clear-RecycleBin, VPN cmdlets that need a user token, interactive-only software.

CMD_RESP=$(curl -s -X POST "$RMM/api/agents/$AGENT_ID/command" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data-binary @- <<'JSON'
{
  "command_type": "powershell",
  "command": "Get-WMIObject Win32_UserProfile | Where-Object { $_.Special -eq $false } | Select-Object LocalPath",
  "timeout_seconds": 30,
  "context": "user_session"
}
JSON
)

user_session requirements and limits:

Requires an active (non-locked) desktop session — no user logged in = command runs but produces empty output or silently no-ops
On Windows: runs under the WTS-impersonated token of the currently active user. If an admin is logged in, the command runs elevated; if a standard user, it does not.
On Linux/macOS: the agent uses sudo -u <active_user> or su — verify agent implementation supports this on the target platform before relying on it
Clear-RecycleBin, Set-VpnConnection -L2tpPsk, and similar COM/interactive-shell cmdlets require user_session
Clear-RecycleBin -Force silently no-ops as SYSTEM even with user_session if no desktop is present — enumerate C:\$Recycle.Bin\<SID>\* directly when running as SYSTEM

Preview before dispatch (for `/rmm run`)

When the user requests an interactive run without a pre-written script, show a preview before dispatching:

COMMAND PREVIEW
---------------
Agent:    WEST-MEADOW-9025 (macos) — online
Type:     shell
Context:  system
Timeout:  120s
Script:
  find /Users/sylvia/Library -name "*.plist" -maxdepth 3

Dispatch? (yes/no)

Wait for explicit confirmation before dispatching any command.

Poll for completion

poll_command() {
    local cmd_id="$1"
    local max_polls="${2:-60}"
    local interval=5
    local count=0

    while [ $count -lt $max_polls ]; do
        RESULT=$(curl -s "$RMM/api/commands/$cmd_id" \
          -H "Authorization: Bearer $TOKEN")
        STATUS=$(echo "$RESULT" | jq -r '.status // empty')

        case "$STATUS" in
            completed|failed|cancelled|interrupted)
                echo "$RESULT"
                return 0
                ;;
            running|pending)
                count=$((count + 1))
                sleep $interval
                ;;
            "")
                echo "[ERROR] Empty status — response: $RESULT"
                return 1
                ;;
        esac
    done

    echo "[WARNING] Polling timeout after $((max_polls * interval))s — last status: $STATUS"
    echo "[INFO] Command $cmd_id may still be running. Use /rmm status $cmd_id to check later."
}

RESULT=$(poll_command "$CMD_ID")

Polling cadence:

Default: check every 5s for up to 5 minutes (60 polls)
For long-running scripts (software installs, disk scans): pass a higher max_polls or remind the user to check with /rmm status later
Never spam polls at < 3s intervals

Display command output

display_output() {
    local result="$1"
    local status=$(echo "$result" | jq -r '.status')
    local exit_code=$(echo "$result" | jq -r '.exit_code // "—"')
    local stdout=$(echo "$result" | jq -r '.stdout // ""')
    local stderr=$(echo "$result" | jq -r '.stderr // ""')

    echo "Status:    $status (exit $exit_code)"
    if [ -n "$stdout" ]; then
        echo "--- stdout ---"
        echo "$stdout"
    fi
    if [ -n "$stderr" ]; then
        echo "--- stderr ---"
        echo "$stderr"
    fi
}

Always show stderr on non-zero exit — PowerShell writes errors to stderr even when the command partially succeeds. Never suppress stderr output.

exit_code = null on status = "pending" or status = "interrupted" — the command never ran or was orphaned. Do not treat null exit code as exit 0.

status = "failed" does NOT always mean a script bug:

Check stderr for "Command timed out (server-side reaper)" → timeout, increase timeout_seconds
Check stderr for "Agent restarted during execution" → interrupted, re-run
Only if stderr has actual script errors is it a script issue

Cancel a command

CANCEL_RESP=$(curl -s -X POST "$RMM/api/commands/$CMD_ID/cancel" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json")
# Response: {"status": "cancelled", "message": "Command cancelled"}

Only cancels pending or running commands
If already completed, failed, or cancelled: server returns 400 "Command already finished"
Cancellation of a running command sends a WebSocket cancel message to the agent; the agent may not honour it immediately

List command history

# All recent commands (admin only)
curl -s "$RMM/api/commands?limit=20" -H "Authorization: Bearer $TOKEN" | \
  jq '[.[] | {id, agent_id, command_type, status, exit_code, created_at, command_text: (.command_text[:60])}]'

# Agent-scoped (works for non-admin)
curl -s "$RMM/api/commands?agent_id=$AGENT_ID&limit=10" -H "Authorization: Bearer $TOKEN" | \
  jq '[.[] | {id, status, exit_code, command_type, created_at, preview: (.command_text[:80])}]'

Verified response shapes (from source)

POST /api/auth/login

{"token": "eyJ..."}

POST /api/agents/{id}/command

{
  "command_id": "uuid",
  "status": "running",
  "message": "Command sent to agent"
}

or if agent offline:

{
  "command_id": "uuid",
  "status": "pending",
  "message": "Agent is offline. Command queued for execution when agent reconnects."
}

GET /api/commands/{id}

{
  "id": "uuid",
  "agent_id": "uuid",
  "command_type": "powershell",
  "command_text": "...",
  "status": "completed",
  "exit_code": 0,
  "stdout": "...",
  "stderr": "",
  "created_at": "ISO-8601",
  "started_at": "ISO-8601",
  "completed_at": "ISO-8601",
  "created_by": "uuid",
  "timeout_seconds": 300,
  "context": "system"
}

Note: the response field is command_text, not command. Parsing .command will return null.

GET /api/agents (array)

[{
  "id": "uuid",
  "hostname": "WEST-MEADOW-9025",
  "os_type": "macos",
  "os_version": "14.5",
  "os_name": "macOS",
  "agent_version": "0.3.4",
  "last_seen": "ISO-8601",
  "status": "online",
  "is_connected": true,
  "device_id": "...",
  "site_id": "uuid",
  "site_name": "Main Office",
  "client_name": "Scileppi Law",
  "maintenance_mode": false,
  "maintenance_mode_note": null,
  "update_channel": null
}]

POST /api/commands/{id}/cancel

{"status": "cancelled", "message": "Command cancelled"}

All command status values

Status	Meaning
`pending`	Queued; agent offline
`running`	Agent executing
`completed`	Finished; `exit_code` set
`failed`	Non-zero exit OR timed out (check stderr)
`cancelled`	Cancelled via API
`interrupted`	Agent restarted mid-run

Platform-specific patterns

Windows — PowerShell as SYSTEM

# Disk usage
Get-PSDrive -PSProvider FileSystem | Select-Object Name, @{n='Used(GB)';e={[math]::Round($_.Used/1GB,2)}}, @{n='Free(GB)';e={[math]::Round($_.Free/1GB,2)}}

# Services (a specific one)
Get-Service -Name "servicename" | Select-Object Name, Status, StartType

# Running processes (top by CPU)
Get-Process | Sort-Object CPU -Descending | Select-Object -First 10 Name, Id, CPU, WorkingSet

# Event log errors (last 24h)
Get-WinEvent -LogName System -MaxEvents 50 | Where-Object { $_.LevelDisplayName -eq "Error" -and $_.TimeCreated -gt (Get-Date).AddHours(-24) } | Select-Object TimeCreated, Id, Message

# Recycle bin contents (correct way — Clear-RecycleBin fails as SYSTEM)
Get-ChildItem "C:\`$Recycle.Bin" -Recurse -Force -ErrorAction SilentlyContinue | Select-Object FullName, Length

# User sessions (who is logged on)
query user

# Installed software
Get-ItemProperty HKLM:\Software\Microsoft\Windows\CurrentVersion\Uninstall\* | Select-Object DisplayName, DisplayVersion, Publisher | Sort-Object DisplayName

Common Windows gotchas:

Clear-RecycleBin -Force silently no-ops in SYSTEM context — use direct C:\$Recycle.Bin\<SID>\* enumeration
$env:USERNAME as SYSTEM = "SYSTEM" (not the logged-on user) — use query user or WMI to find the actual logged-on username
Get-Date is UTC-aware; use -Format "yyyy-MM-dd HH:mm:ss" for readable output
PowerShell Write-Host outputs to stdout; Write-Error outputs to stderr
Paths with spaces need quoting: "C:\Program Files\..."
$ErrorActionPreference = 'Stop' makes the script fail on first error (useful for idempotent scripts); without it, PowerShell continues past non-terminating errors

macOS — shell as root

# Disk usage
df -h /

# Find large files
find /Users -maxdepth 4 -size +100M -type f 2>/dev/null | sort -k5 -rn | head -20

# Logged-on user (who is using the GUI session)
stat -f "%Su" /dev/console

# LaunchDaemons check
ls /Library/LaunchDaemons/

# LaunchAgents for a user
ls /Users/sylvia/Library/LaunchAgents/ 2>/dev/null

# Remove a plist and unload (system scope)
launchctl bootout system /Library/LaunchDaemons/com.example.plist 2>/dev/null
rm /Library/LaunchDaemons/com.example.plist

# Mount AFP share (uses Keychain credentials)
osascript -e 'mount volume "afp://SL-SERVER._afpovertcp._tcp.local/Data"'

# File write via base64 (python3 is an Xcode stub on macOS without dev tools)
echo "BASE64ENCODED==" | /usr/bin/base64 -D > /path/to/file
chmod +x /path/to/file

# Strip macOS ACL that prevents directory deletion
chmod -a "group:everyone deny delete" /path/to/dir

Common macOS gotchas:

/usr/bin/python3 on macOS without Xcode CLI tools is a stub that triggers an installer dialog — use /usr/bin/base64 -D (capital D — BSD flag, not GNU) for file writing
nohup cmd & in agent shell context fails: nohup: can't detach from console: Inappropriate ioctl for device — use LaunchDaemon (launchctl bootstrap system <plist>) for truly detached background processes
Home directory subdirectories may have ACL 0: group:everyone deny delete — strip with chmod -a "group:everyone deny delete" before rmdir
pgrep rsync matches "colorsyncd" as a substring — use pgrep -f "rsync.*specific-path" for precision
launchctl bootstrap gui/501 <plist> targets the user session for UID 501 (first user); find UID with id -u sylvia
AFP automount via osascript uses Keychain; works only in user session context

Linux — shell as root

# Disk usage
df -h

# Memory
free -h

# Service status (systemd)
systemctl status servicename --no-pager -l

# Journal (last 50 lines of a service)
journalctl -u servicename -n 50 --no-pager

# Find large files
find / -xdev -size +100M -type f 2>/dev/null | sort -k5 -rn | head -20

# Open ports
ss -tlnp

Investigate → script → dispatch workflow

When the user asks for something that requires investigation before writing a script:

List agents — confirm target, check online status
Characterize the endpoint — run a quick recon command (OS version, disk, services) if the context is unknown
Write the script — draft it inline, show to user
Preview — show agent + script before dispatch
Dispatch — send with appropriate command_type and context
Poll — wait for completion, display output
Bot alert — post one-line summary

For complex multi-step remediation (move files, install software, configure services), dispatch one logical step at a time. Confirm output before proceeding to the next step.

Error handling

HTTP / Condition	Meaning	Recovery
401 Unauthorized	JWT expired or invalid	Re-authenticate (Phase 0)
404 Not Found	Agent or command UUID not found	Re-run `GET /api/agents` to refresh list
400 "Command already finished"	Tried to cancel a done command	No action needed
403 Forbidden	Non-admin calling fleet-wide endpoint	Add `?agent_id=<uuid>` filter
`status = "failed"`, stderr = "Command timed out"	Timeout	Increase `timeout_seconds` and re-run
`status = "interrupted"`	Agent restarted	Re-run the command
`status = "pending"` (stuck)	Agent offline for extended period	Notify user; do not cancel unless user confirms
`command_id = null` or empty	Dispatch failed	Check response body for error message
`jq` parse error on agent list	Control characters in field	Use `grep -o '"hostname":"[^"]*"'` for simple extraction

Post to #bot-alerts (after every write)

Post after every successful dispatch or cancel. Never post for read-only operations.

ALERT_OUT=$(bash "$REPO_ROOT/.claude/scripts/post-bot-alert.sh" "<message>")
echo "$ALERT_OUT"

Message format:

[RMM] <Tech> dispatched to <hostname> (<os_type>) - <brief description> -> cmd:<command_id_short>
[RMM] <Tech> cancelled cmd <command_id_short> on <hostname>

<command_id_short> = first 8 chars of the UUID.

Examples:

bash "$REPO_ROOT/.claude/scripts/post-bot-alert.sh" \
  "[RMM] Mike dispatched to WEST-MEADOW-9025 (macos) - disk cleanup AFP symlink -> cmd:3f8a1c2e"

bash "$REPO_ROOT/.claude/scripts/post-bot-alert.sh" \
  "[RMM] Howard ran disk scan on CS-SERVER (windows) - exit 0, 14 GB freed -> cmd:9b4d7e01"

ASCII only — no Unicode dashes or arrows. Use - and ->.

Known enrolled agents (verify with GET /api/agents — UUIDs change on re-enroll)

Do not use this table as authoritative — always resolve live. Treat as a starting hint only.

Hostname	Client	OS
WEST-MEADOW-9025	Scileppi Law	macOS
CS-SERVER	Cascades of Tucson	Windows
DESKTOP-DLTAGOI	Cascades LE	Windows
AD2	ACG Internal	Windows

What NOT to use RMM for

Permanent file deletion without user confirmation — always show what will be deleted, get a yes
Credential harvesting — do not dump password stores, SAM hive, or credential manager contents
Software installation without stating what will be installed — always preview the installer command
Bulk destructive operations (rm -rf /, format, Remove-Item -Recurse C:\Windows) — require explicit written confirmation from the user in chat, not just "yes" to a vague preview
RMM product development — use the Coding Agent and GuruRMM project context instead

24 KiB Raw Blame History