grok: fix xsearch (multi-agent web_search), pin grok-build, RTFM doc sweep

Root-caused the long-standing `ask-grok.sh xsearch` "no result (stopReason=)"
failure by reading Grok's bundled docs (~/.grok/docs/user-guide + README) instead
of probing:
- web_search runs a SEPARATE multi-agent model (grok-4.20-multi-agent), so the
  wrapper's blanket --no-subagents strangled it -> indefinite hang, 0 bytes. Scoped
  --no-subagents OFF xsearch; use --yolo (documented headless tool-run posture).
- xsearch prompt mandated X/Twitter search on every call (slow multi-agent) and the
  budget was 240s -> still timed out. Now web-primary (X only when relevant), 300s.
  Validated end-to-end through the wrapper: 23s, correct answer + 3 sources.

Model: pin -m grok-build (xAI flagship, 512k, the documented default) for the
reasoning modes (text/verify/review*) so quality is deterministic and not at the
mercy of the runtime default (this machine drifted to grok-composer-2.5-fast, a fast
Cursor coding model). xsearch + image/video keep the runtime default. Validated text
mode on grok-build (13s).

Doc accuracy (SKILL.md): corrected the model facts (default, the separate web_search
model, --effort unsupported on grok-build per supports_reasoning_effort:false);
documented the xsearch subagent exception. Fixed a stale in-script comment claiming
--rules/--disallowed-tools "tripped the CLI" (both are valid headless flags).

memory: add feedback_interview_ai_read_docs (read bundled docs / interview the model
before probing) + index; errorlog correction.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-16 19:14:59 -07:00
parent 58ecc5ad40
commit a242797908
5 changed files with 85 additions and 9 deletions

View File

@@ -55,6 +55,7 @@
- [Point vault-access teammates at SOPS path](feedback_vault_pointer_for_teammates.md) — When relaying infra/credential info to Howard or other vault-access teammates, hand over the SOPS path + key anchors; don't transcribe the entry's fields into the message. - [Point vault-access teammates at SOPS path](feedback_vault_pointer_for_teammates.md) — When relaying infra/credential info to Howard or other vault-access teammates, hand over the SOPS path + key anchors; don't transcribe the entry's fields into the message.
- [/tmp path mismatch on Windows](feedback_tmp_path_windows.md) — Write tool and Git Bash resolve `/tmp` to DIFFERENT real dirs. Use heredoc or workspace path for JSON payloads handed to curl. - [/tmp path mismatch on Windows](feedback_tmp_path_windows.md) — Write tool and Git Bash resolve `/tmp` to DIFFERENT real dirs. Use heredoc or workspace path for JSON payloads handed to curl.
- [Windows strips embedded double-quotes](feedback_windows_quote_stripping.md) — Embedded `"` in an arg gets eaten twice over: PowerShell->curl.exe (CommandLineToArgvW) AND RMM->cmd.exe. Use single-quoted heredoc `<<'JSON'` + `--data-binary @-` for bodies; build `"` from `[char]34`; or drop the quoted part (e.g. `shutdown /c`). - [Windows strips embedded double-quotes](feedback_windows_quote_stripping.md) — Embedded `"` in an arg gets eaten twice over: PowerShell->curl.exe (CommandLineToArgvW) AND RMM->cmd.exe. Use single-quoted heredoc `<<'JSON'` + `--data-binary @-` for bodies; build `"` from `[char]34`; or drop the quoted part (e.g. `shutdown /c`).
- [Interview the AI / read its docs before probing](feedback_interview_ai_read_docs.md) — To learn an external AI/CLI's syntax or capabilities, READ its bundled docs (Grok: `~/.grok/docs/user-guide/`, `README.md`, `grok inspect`/`models`/`--help`) or interview the model; don't guess flags or run slow trial-and-error. One run to confirm a doc-derived hypothesis, not a dozen to discover.
- [Windows bash command mapping](feedback_windows_bash_mapping.md) — `bash` often resolves to WSL stub instead of Git/MSYS bash required by the harness. Fix by prepending `C:\Program Files\Git\bin` (and usr\bin) to PATH, or source `.claude/scripts/ensure-git-bash.ps1`. Profile has the logic; use plain `bash .claude/scripts/...` after remap. See the helper and this memory file for details. - [Windows bash command mapping](feedback_windows_bash_mapping.md) — `bash` often resolves to WSL stub instead of Git/MSYS bash required by the harness. Fix by prepending `C:\Program Files\Git\bin` (and usr\bin) to PATH, or source `.claude/scripts/ensure-git-bash.ps1`. Profile has the logic; use plain `bash .claude/scripts/...` after remap. See the helper and this memory file for details.
- [Git must authenticate non-interactively](feedback_git_noninteractive_auth.md) — Mike's gripe with Git for Windows is the constant password prompts (GCM) that hang automation, NOT the tool itself. D:\ClaudeTools is set to `credential.helper=store` primed with the azcomputerguru Gitea API token (host 172.16.3.20:3000); always set `GIT_TERMINAL_PROMPT=0`. Any never-prompts solution is acceptable. - [Git must authenticate non-interactively](feedback_git_noninteractive_auth.md) — Mike's gripe with Git for Windows is the constant password prompts (GCM) that hang automation, NOT the tool itself. D:\ClaudeTools is set to `credential.helper=store` primed with the azcomputerguru Gitea API token (host 172.16.3.20:3000); always set `GIT_TERMINAL_PROMPT=0`. Any never-prompts solution is acceptable.
- [Vault git auth — GCM shadows store token](feedback_vault_gcm_shadow_auth.md) — vault sync "Failed to authenticate user" on git.azcomputerguru.com: GCM is first in the helper chain and shadows the valid store token. Fix (machine-local): store-only credential.helper reset + pin `azcomputerguru@` in the vault remote URL so store returns the durable PAT (not the volatile OAUTH_USER JWT). Applied GURU-5070 2026-06-07. - [Vault git auth — GCM shadows store token](feedback_vault_gcm_shadow_auth.md) — vault sync "Failed to authenticate user" on git.azcomputerguru.com: GCM is first in the helper chain and shadows the valid store token. Fix (machine-local): store-only credential.helper reset + pin `azcomputerguru@` in the vault remote URL so store returns the durable PAT (not the volatile OAUTH_USER JWT). Applied GURU-5070 2026-06-07.

View File

@@ -0,0 +1,35 @@
---
name: feedback_interview_ai_read_docs
description: Before guessing or probing an external AI/CLI's command syntax or capabilities, READ its bundled docs and/or interview the model itself — probing wastes tokens and misleads.
metadata:
type: feedback
---
When you need to understand an external AI's or CLI tool's **command syntax or
capabilities**, do NOT blindly guess flags or run slow trial-and-error probes.
First **read its own bundled documentation**, and/or **interview the model
itself** (ask it to read its own docs and explain). The authoritative source is
almost always already on disk.
**Why:** repeated timed probing is expensive (each Grok run is 80-300s), gives
ambiguous signals, and is the exact "blindly guessing or probing" pattern Mike
has flagged. The docs answer the question directly and for free.
**Concrete example (the lesson):** the long-standing `ask-grok.sh xsearch`
"no result (stopReason=)" failure was root-caused not by probing but by reading
**`~/.grok/docs/user-guide/` (esp. `14-headless-mode.md`, `11-custom-models.md`,
`05-configuration.md`) and `~/.grok/README.md`**. They revealed: `web_search`
runs a SEPARATE multi-agent model (`grok-4.20-multi-agent`), so the wrapper's
blanket `--no-subagents` strangled it; the documented headless JSON schema is
`{text,stopReason,sessionId,requestId}`; and `--yolo` is the documented
tool-run posture. One confirmatory run, not a dozen.
**How to apply:**
- For the Grok CLI: read `~/.grok/docs/user-guide/*.md` and `~/.grok/README.md`
(and `grok inspect` / `grok models` / `grok <cmd> --help` for live truth)
before changing [[feedback_windows_quote_stripping]]-style wrapper internals.
- For any vendor CLI/API: locate its shipped docs/`--help`/OpenAPI first; treat
one targeted run as *confirmation* of a doc-derived hypothesis, not as the
discovery method.
- Interviewing the model (its text path) is valid even when a tool path is
broken — asking Grok doesn't require its web_search to work.

View File

@@ -91,10 +91,10 @@ Grok is **per-machine** — the skill syncs fleet-wide but the binary does not.
## Safety / operational notes ## Safety / operational notes
- `~/.grok/config.toml` defaults to `permission_mode = "always-approve"` (auto-runs tools). The wrapper overrides with `--permission-mode dontAsk` and `--no-subagents`; do not bypass that. - `~/.grok/config.toml` defaults to `permission_mode = "always-approve"` (auto-runs tools). The wrapper's single-context modes (`text`/`verify`/`review*`) override with `--permission-mode dontAsk --no-subagents` (cheap, bounded, timeout-predictable). **`xsearch` is the deliberate exception:** it uses `--yolo` and leaves subagents ENABLED, because `web_search` runs the `grok-4.20-multi-agent` model — passing `--no-subagents` made it hang with zero output (the long-standing xsearch-empty bug; root-caused + fixed 2026-06-16). Don't re-add `--no-subagents` to xsearch.
- Prompts are passed via `--prompt-file` only (inline args break on shell quoting). - Prompts are passed via `--prompt-file` only (inline args break on shell quoting).
- `grok-build` rejects `--effort`/`--reasoning-effort` (400) — don't pass them. - `grok-build` does NOT support `--effort`/`--reasoning-effort` (`supports_reasoning_effort:false` in the model info) — passing them 400s. Don't.
- Models available: `grok-build` (default), `grok-composer-2.5-fast`. The agent self-reports as Grok 4.3. - Models (verify live with `grok models`): `grok-build` ("Grok Build", xAI's latest coding model, 512k ctx — the **documented** default per `docs/user-guide/11-custom-models.md`) and `grok-composer-2.5-fast` ("Cursor's latest coding model", fast). **Gotcha:** on this machine the runtime default that `grok models` reports is `grok-composer-2.5-fast`, and the wrapper pins no `-m`, so it currently rides whatever the runtime default is (not necessarily grok-build). `web_search` uses a SEPARATE model, `grok-4.20-multi-agent` (set via `[tools] web_search` / `GROK_WEB_SEARCH_MODEL`).
- After media gen, Claude should **view the image** (Read tool) to confirm correctness; videos can be confirmed by header/ffprobe. - After media gen, Claude should **view the image** (Read tool) to confirm correctness; videos can be confirmed by header/ffprobe.
## Reference ## Reference

View File

@@ -94,12 +94,27 @@ if [[ "$OSTYPE" == "darwin"* ]]; then
TIMEOUT_CMD="$(command -v gtimeout 2>/dev/null || echo timeout)" TIMEOUT_CMD="$(command -v gtimeout 2>/dev/null || echo timeout)"
fi fi
# Policy flags, overridable per mode. Defaults suit the single-context subordinate
# calls (text/verify/review): a fixed permission posture and NO subagent fan-out --
# that keeps a one-shot second-opinion call cheap, bounded, and timeout-predictable.
# xsearch OVERRIDES both: web_search runs the grok-4.20-multi-agent model, so
# --no-subagents made it hang with zero output (root cause, 2026-06-16). See xsearch.
GROK_PERM_FLAGS=(--permission-mode dontAsk)
GROK_SUBAGENT_FLAGS=(--no-subagents)
# Pin xAI's flagship grok-build for the reasoning modes (text/verify/review*) so the
# second-opinion quality is deterministic and not at the mercy of the runtime default
# (which drifts -- this machine's `grok models` reports grok-composer-2.5-fast, a fast
# Cursor coding model). xsearch + image/video clear this to "" to use the runtime
# default (xsearch's searcher is a separate model; media gen was verified on default).
GROK_MODEL="${GROK_MODEL:-grok-build}"
run_grok() { run_grok() {
local to="$1"; shift local to="$1"; shift
local mflag=(); [ -n "$GROK_MODEL" ] && mflag=(-m "$GROK_MODEL")
# Hand grok native-Windows paths (cygpath); MSYS leaves already-Windows paths alone, # Hand grok native-Windows paths (cygpath); MSYS leaves already-Windows paths alone,
# so conversion is deterministic and space-safe. # so conversion is deterministic and space-safe.
"$TIMEOUT_CMD" "$to" "$GROK" --prompt-file "$(winpath "$PF")" --output-format json \ "$TIMEOUT_CMD" "$to" "$GROK" --prompt-file "$(winpath "$PF")" --output-format json \
--permission-mode dontAsk --no-subagents --no-plan --cwd "$(winpath "$RUN_CWD")" "$@" \ "${mflag[@]}" "${GROK_PERM_FLAGS[@]}" "${GROK_SUBAGENT_FLAGS[@]}" --no-plan --cwd "$(winpath "$RUN_CWD")" "$@" \
>"$OUT" 2>"$TMP/err.txt" || true >"$OUT" 2>"$TMP/err.txt" || true
} }
@@ -161,9 +176,12 @@ case "$MODE" in
SRC="${1:-}" SRC="${1:-}"
fi fi
[ -z "$SRC" ] && { echo "usage: $SELF $MODE \"<prompt>\" | $SELF $MODE --prompt-file <path>" >&2; exit 2; } [ -z "$SRC" ] && { echo "usage: $SELF $MODE \"<prompt>\" | $SELF $MODE --prompt-file <path>" >&2; exit 2; }
# Prompt-level steering keeps it text-only and (for verify) adversarial. # Prompt steering + --disable-web-search (below) keeps these text-only; no tools
# (The --disallowed-tools/--rules flags tripped the CLI; --check adds a slow # are needed. NOTE for future hardening: --rules and --disallowed-tools ARE valid
# multi-turn self-check loop — both avoided in favor of prompt steering.) # headless flags (per docs/user-guide/14-headless-mode.md) and could hard-restrict
# tools if prompt steering ever proves insufficient. --check/--self-verify also
# exists, but it makes grok verify its OWN output -- not adversarially critique an
# EXTERNAL claim -- so prompt steering is the correct semantic for verify mode.
if [ "$MODE" = "verify" ]; then if [ "$MODE" = "verify" ]; then
printf 'Adversarially evaluate the following claim/finding/document: try hard to find any reason it is WRONG, incomplete, or overstated. Then give a verdict plus specific justification. Answer in text only; do not use tools. Content:\n%s' "$SRC" > "$PF" printf 'Adversarially evaluate the following claim/finding/document: try hard to find any reason it is WRONG, incomplete, or overstated. Then give a verdict plus specific justification. Answer in text only; do not use tools. Content:\n%s' "$SRC" > "$PF"
else else
@@ -177,6 +195,7 @@ case "$MODE" in
image) image)
[ -z "${1:-}" ] && { echo "usage: $SELF image \"<prompt>\" [out.png]" >&2; exit 2; } [ -z "${1:-}" ] && { echo "usage: $SELF image \"<prompt>\" [out.png]" >&2; exit 2; }
out="${2:-grok-image.png}" out="${2:-grok-image.png}"
GROK_MODEL="" # media gen verified on the runtime default; don't force grok-build here
printf 'Use your image_gen tool to generate exactly this image, save it, then stop. Image: %s' "$1" > "$PF" printf 'Use your image_gen tool to generate exactly this image, save it, then stop. Image: %s' "$1" > "$PF"
run_grok 240 --disable-web-search --max-turns 12 run_grok 240 --disable-web-search --max-turns 12
sid="$(jfield sessionId)"; art="$(find_artifact "$sid" images)" sid="$(jfield sessionId)"; art="$(find_artifact "$sid" images)"
@@ -189,6 +208,7 @@ case "$MODE" in
input="$2"; out="${3:-grok-video.mp4}" input="$2"; out="${3:-grok-video.mp4}"
[ -f "$input" ] || { echo "[$SELF] input image not found: $input" >&2; exit 2; } [ -f "$input" ] || { echo "[$SELF] input image not found: $input" >&2; exit 2; }
cp -f "$input" "$WORK/input.jpg" cp -f "$input" "$WORK/input.jpg"
GROK_MODEL="" # media gen verified on the runtime default; don't force grok-build here
printf 'Use your image_to_video tool to animate input.jpg (in the current directory) into a short clip, save it, then stop. Motion: %s' "$1" > "$PF" printf 'Use your image_to_video tool to animate input.jpg (in the current directory) into a short clip, save it, then stop. Motion: %s' "$1" > "$PF"
run_grok 360 --disable-web-search --max-turns 20 run_grok 360 --disable-web-search --max-turns 20
sid="$(jfield sessionId)"; art="$(find_artifact "$sid" videos)" sid="$(jfield sessionId)"; art="$(find_artifact "$sid" videos)"
@@ -198,8 +218,18 @@ case "$MODE" in
;; ;;
xsearch) xsearch)
[ -z "${1:-}" ] && { echo "usage: $SELF xsearch \"<query>\"" >&2; exit 2; } [ -z "${1:-}" ] && { echo "usage: $SELF xsearch \"<query>\"" >&2; exit 2; }
printf 'Use your web_search and X/Twitter search tools to answer this, cite sources, then stop: %s' "$1" > "$PF" # web_search runs the grok-4.20-multi-agent model -> subagents MUST stay enabled
run_grok 150 --max-turns 6 # (--no-subagents made it hang with ZERO output, the long-standing xsearch bug).
# --yolo is the documented headless tool-run posture (README/14-headless-mode.md).
# Budget is generous: a live web_search measured ~83s end-to-end.
GROK_SUBAGENT_FLAGS=()
GROK_PERM_FLAGS=(--yolo)
GROK_MODEL="" # search runs the separate grok-4.20-multi-agent model; use runtime default orchestrator
# Lead with web_search (fast); pull in X/Twitter ONLY when the query is actually
# about social/news/sentiment. Mandating X search on every call ran the multi-agent
# searcher long enough to blow a 240s budget. Generous 300s budget for headroom.
printf 'Use your web_search tool to answer this question and cite sources. Use your X/Twitter search tools ONLY if the question is specifically about social media, breaking news, or public sentiment. Then stop.\n\nQuestion: %s' "$1" > "$PF"
run_grok 300 --max-turns 14
txt="$(jfield text)" txt="$(jfield text)"
if [ -n "$txt" ]; then printf '%s\n' "$txt"; else if [ -n "$txt" ]; then printf '%s\n' "$txt"; else
echo "[$SELF] no result (stopReason=$(jfield stopReason))" >&2; _logerr "grok xsearch returned no result" --context "mode=xsearch stopReason=$(jfield stopReason)"; exit 1; fi echo "[$SELF] no result (stopReason=$(jfield stopReason))" >&2; _logerr "grok xsearch returned no result" --context "mode=xsearch stopReason=$(jfield stopReason)"; exit 1; fi

View File

@@ -17,6 +17,16 @@ Categories (the `[type]` tag): _(none)_ = skill/command execution failure ·
<!-- Append entries below this line --> <!-- Append entries below this line -->
2026-06-17 | GURU-5070 | grok/ask-grok.sh | [correction] blindly probed grok with slow timed runs to find the xsearch syntax instead of reading its bundled docs (~/.grok/docs/user-guide/ + README.md) / interviewing the model first; the docs gave the root cause directly (web_search=grok-4.20-multi-agent multi-agent model, killed by --no-subagents) [ctx: ref=feedback_interview_ai_read_docs]
2026-06-17 | GURU-5070 | grok | grok xsearch returned no result [ctx: mode=xsearch stopReason=]
2026-06-17 | GURU-5070 | grok | grok xsearch returned no result [ctx: mode=xsearch stopReason=]
2026-06-17 | GURU-5070 | grok/xsearch | xsearch returned empty (stopReason finalization quirk) on Grok-co-work research; fell back to text-mode design + Claude synthesis [ctx: topic=grok-as-claude-cowork]
2026-06-17 | GURU-5070 | grok | grok xsearch returned no result [ctx: mode=xsearch stopReason=]
2026-06-17 | GURU-5070 | syncro/customer-comment | [friction] posted a customer-facing emailed comment to #32333 WITHOUT previewing for human review first; preview-before-send is mandatory for ALL outgoing comms [ctx: ref=CLAUDE.md syncro 'Before any POST: Always show the full payload and wait for confirmation'] 2026-06-17 | GURU-5070 | syncro/customer-comment | [friction] posted a customer-facing emailed comment to #32333 WITHOUT previewing for human review first; preview-before-send is mandatory for ALL outgoing comms [ctx: ref=CLAUDE.md syncro 'Before any POST: Always show the full payload and wait for confirmation']
2026-06-17 | GURU-5070 | syncro/customer-comment | [correction] conflated Arizona Medical Transit (AMT-PC, Windows: Dell bloatware + misbehaving Syncro agent) cleanup into the Scileppi Mac (#32333) customer comment; Scileppi was a full-disk/Trash + Mail + Downloads-redesign job, no Dell/agent work [ctx: ticket=32333 mac=scileppi] 2026-06-17 | GURU-5070 | syncro/customer-comment | [correction] conflated Arizona Medical Transit (AMT-PC, Windows: Dell bloatware + misbehaving Syncro agent) cleanup into the Scileppi Mac (#32333) customer comment; Scileppi was a full-disk/Trash + Mail + Downloads-redesign job, no Dell/agent work [ctx: ticket=32333 mac=scileppi]