grok: fix xsearch (multi-agent web_search), pin grok-build, RTFM doc sweep
Root-caused the long-standing `ask-grok.sh xsearch` "no result (stopReason=)" failure by reading Grok's bundled docs (~/.grok/docs/user-guide + README) instead of probing: - web_search runs a SEPARATE multi-agent model (grok-4.20-multi-agent), so the wrapper's blanket --no-subagents strangled it -> indefinite hang, 0 bytes. Scoped --no-subagents OFF xsearch; use --yolo (documented headless tool-run posture). - xsearch prompt mandated X/Twitter search on every call (slow multi-agent) and the budget was 240s -> still timed out. Now web-primary (X only when relevant), 300s. Validated end-to-end through the wrapper: 23s, correct answer + 3 sources. Model: pin -m grok-build (xAI flagship, 512k, the documented default) for the reasoning modes (text/verify/review*) so quality is deterministic and not at the mercy of the runtime default (this machine drifted to grok-composer-2.5-fast, a fast Cursor coding model). xsearch + image/video keep the runtime default. Validated text mode on grok-build (13s). Doc accuracy (SKILL.md): corrected the model facts (default, the separate web_search model, --effort unsupported on grok-build per supports_reasoning_effort:false); documented the xsearch subagent exception. Fixed a stale in-script comment claiming --rules/--disallowed-tools "tripped the CLI" (both are valid headless flags). memory: add feedback_interview_ai_read_docs (read bundled docs / interview the model before probing) + index; errorlog correction. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -55,6 +55,7 @@
|
||||
- [Point vault-access teammates at SOPS path](feedback_vault_pointer_for_teammates.md) — When relaying infra/credential info to Howard or other vault-access teammates, hand over the SOPS path + key anchors; don't transcribe the entry's fields into the message.
|
||||
- [/tmp path mismatch on Windows](feedback_tmp_path_windows.md) — Write tool and Git Bash resolve `/tmp` to DIFFERENT real dirs. Use heredoc or workspace path for JSON payloads handed to curl.
|
||||
- [Windows strips embedded double-quotes](feedback_windows_quote_stripping.md) — Embedded `"` in an arg gets eaten twice over: PowerShell->curl.exe (CommandLineToArgvW) AND RMM->cmd.exe. Use single-quoted heredoc `<<'JSON'` + `--data-binary @-` for bodies; build `"` from `[char]34`; or drop the quoted part (e.g. `shutdown /c`).
|
||||
- [Interview the AI / read its docs before probing](feedback_interview_ai_read_docs.md) — To learn an external AI/CLI's syntax or capabilities, READ its bundled docs (Grok: `~/.grok/docs/user-guide/`, `README.md`, `grok inspect`/`models`/`--help`) or interview the model; don't guess flags or run slow trial-and-error. One run to confirm a doc-derived hypothesis, not a dozen to discover.
|
||||
- [Windows bash command mapping](feedback_windows_bash_mapping.md) — `bash` often resolves to WSL stub instead of Git/MSYS bash required by the harness. Fix by prepending `C:\Program Files\Git\bin` (and usr\bin) to PATH, or source `.claude/scripts/ensure-git-bash.ps1`. Profile has the logic; use plain `bash .claude/scripts/...` after remap. See the helper and this memory file for details.
|
||||
- [Git must authenticate non-interactively](feedback_git_noninteractive_auth.md) — Mike's gripe with Git for Windows is the constant password prompts (GCM) that hang automation, NOT the tool itself. D:\ClaudeTools is set to `credential.helper=store` primed with the azcomputerguru Gitea API token (host 172.16.3.20:3000); always set `GIT_TERMINAL_PROMPT=0`. Any never-prompts solution is acceptable.
|
||||
- [Vault git auth — GCM shadows store token](feedback_vault_gcm_shadow_auth.md) — vault sync "Failed to authenticate user" on git.azcomputerguru.com: GCM is first in the helper chain and shadows the valid store token. Fix (machine-local): store-only credential.helper reset + pin `azcomputerguru@` in the vault remote URL so store returns the durable PAT (not the volatile OAUTH_USER JWT). Applied GURU-5070 2026-06-07.
|
||||
|
||||
35
.claude/memory/feedback_interview_ai_read_docs.md
Normal file
35
.claude/memory/feedback_interview_ai_read_docs.md
Normal file
@@ -0,0 +1,35 @@
|
||||
---
|
||||
name: feedback_interview_ai_read_docs
|
||||
description: Before guessing or probing an external AI/CLI's command syntax or capabilities, READ its bundled docs and/or interview the model itself — probing wastes tokens and misleads.
|
||||
metadata:
|
||||
type: feedback
|
||||
---
|
||||
|
||||
When you need to understand an external AI's or CLI tool's **command syntax or
|
||||
capabilities**, do NOT blindly guess flags or run slow trial-and-error probes.
|
||||
First **read its own bundled documentation**, and/or **interview the model
|
||||
itself** (ask it to read its own docs and explain). The authoritative source is
|
||||
almost always already on disk.
|
||||
|
||||
**Why:** repeated timed probing is expensive (each Grok run is 80-300s), gives
|
||||
ambiguous signals, and is the exact "blindly guessing or probing" pattern Mike
|
||||
has flagged. The docs answer the question directly and for free.
|
||||
|
||||
**Concrete example (the lesson):** the long-standing `ask-grok.sh xsearch`
|
||||
"no result (stopReason=)" failure was root-caused not by probing but by reading
|
||||
**`~/.grok/docs/user-guide/` (esp. `14-headless-mode.md`, `11-custom-models.md`,
|
||||
`05-configuration.md`) and `~/.grok/README.md`**. They revealed: `web_search`
|
||||
runs a SEPARATE multi-agent model (`grok-4.20-multi-agent`), so the wrapper's
|
||||
blanket `--no-subagents` strangled it; the documented headless JSON schema is
|
||||
`{text,stopReason,sessionId,requestId}`; and `--yolo` is the documented
|
||||
tool-run posture. One confirmatory run, not a dozen.
|
||||
|
||||
**How to apply:**
|
||||
- For the Grok CLI: read `~/.grok/docs/user-guide/*.md` and `~/.grok/README.md`
|
||||
(and `grok inspect` / `grok models` / `grok <cmd> --help` for live truth)
|
||||
before changing [[feedback_windows_quote_stripping]]-style wrapper internals.
|
||||
- For any vendor CLI/API: locate its shipped docs/`--help`/OpenAPI first; treat
|
||||
one targeted run as *confirmation* of a doc-derived hypothesis, not as the
|
||||
discovery method.
|
||||
- Interviewing the model (its text path) is valid even when a tool path is
|
||||
broken — asking Grok doesn't require its web_search to work.
|
||||
@@ -91,10 +91,10 @@ Grok is **per-machine** — the skill syncs fleet-wide but the binary does not.
|
||||
|
||||
## Safety / operational notes
|
||||
|
||||
- `~/.grok/config.toml` defaults to `permission_mode = "always-approve"` (auto-runs tools). The wrapper overrides with `--permission-mode dontAsk` and `--no-subagents`; do not bypass that.
|
||||
- `~/.grok/config.toml` defaults to `permission_mode = "always-approve"` (auto-runs tools). The wrapper's single-context modes (`text`/`verify`/`review*`) override with `--permission-mode dontAsk --no-subagents` (cheap, bounded, timeout-predictable). **`xsearch` is the deliberate exception:** it uses `--yolo` and leaves subagents ENABLED, because `web_search` runs the `grok-4.20-multi-agent` model — passing `--no-subagents` made it hang with zero output (the long-standing xsearch-empty bug; root-caused + fixed 2026-06-16). Don't re-add `--no-subagents` to xsearch.
|
||||
- Prompts are passed via `--prompt-file` only (inline args break on shell quoting).
|
||||
- `grok-build` rejects `--effort`/`--reasoning-effort` (400) — don't pass them.
|
||||
- Models available: `grok-build` (default), `grok-composer-2.5-fast`. The agent self-reports as Grok 4.3.
|
||||
- `grok-build` does NOT support `--effort`/`--reasoning-effort` (`supports_reasoning_effort:false` in the model info) — passing them 400s. Don't.
|
||||
- Models (verify live with `grok models`): `grok-build` ("Grok Build", xAI's latest coding model, 512k ctx — the **documented** default per `docs/user-guide/11-custom-models.md`) and `grok-composer-2.5-fast` ("Cursor's latest coding model", fast). **Gotcha:** on this machine the runtime default that `grok models` reports is `grok-composer-2.5-fast`, and the wrapper pins no `-m`, so it currently rides whatever the runtime default is (not necessarily grok-build). `web_search` uses a SEPARATE model, `grok-4.20-multi-agent` (set via `[tools] web_search` / `GROK_WEB_SEARCH_MODEL`).
|
||||
- After media gen, Claude should **view the image** (Read tool) to confirm correctness; videos can be confirmed by header/ffprobe.
|
||||
|
||||
## Reference
|
||||
|
||||
@@ -94,12 +94,27 @@ if [[ "$OSTYPE" == "darwin"* ]]; then
|
||||
TIMEOUT_CMD="$(command -v gtimeout 2>/dev/null || echo timeout)"
|
||||
fi
|
||||
|
||||
# Policy flags, overridable per mode. Defaults suit the single-context subordinate
|
||||
# calls (text/verify/review): a fixed permission posture and NO subagent fan-out --
|
||||
# that keeps a one-shot second-opinion call cheap, bounded, and timeout-predictable.
|
||||
# xsearch OVERRIDES both: web_search runs the grok-4.20-multi-agent model, so
|
||||
# --no-subagents made it hang with zero output (root cause, 2026-06-16). See xsearch.
|
||||
GROK_PERM_FLAGS=(--permission-mode dontAsk)
|
||||
GROK_SUBAGENT_FLAGS=(--no-subagents)
|
||||
# Pin xAI's flagship grok-build for the reasoning modes (text/verify/review*) so the
|
||||
# second-opinion quality is deterministic and not at the mercy of the runtime default
|
||||
# (which drifts -- this machine's `grok models` reports grok-composer-2.5-fast, a fast
|
||||
# Cursor coding model). xsearch + image/video clear this to "" to use the runtime
|
||||
# default (xsearch's searcher is a separate model; media gen was verified on default).
|
||||
GROK_MODEL="${GROK_MODEL:-grok-build}"
|
||||
|
||||
run_grok() {
|
||||
local to="$1"; shift
|
||||
local mflag=(); [ -n "$GROK_MODEL" ] && mflag=(-m "$GROK_MODEL")
|
||||
# Hand grok native-Windows paths (cygpath); MSYS leaves already-Windows paths alone,
|
||||
# so conversion is deterministic and space-safe.
|
||||
"$TIMEOUT_CMD" "$to" "$GROK" --prompt-file "$(winpath "$PF")" --output-format json \
|
||||
--permission-mode dontAsk --no-subagents --no-plan --cwd "$(winpath "$RUN_CWD")" "$@" \
|
||||
"${mflag[@]}" "${GROK_PERM_FLAGS[@]}" "${GROK_SUBAGENT_FLAGS[@]}" --no-plan --cwd "$(winpath "$RUN_CWD")" "$@" \
|
||||
>"$OUT" 2>"$TMP/err.txt" || true
|
||||
}
|
||||
|
||||
@@ -161,9 +176,12 @@ case "$MODE" in
|
||||
SRC="${1:-}"
|
||||
fi
|
||||
[ -z "$SRC" ] && { echo "usage: $SELF $MODE \"<prompt>\" | $SELF $MODE --prompt-file <path>" >&2; exit 2; }
|
||||
# Prompt-level steering keeps it text-only and (for verify) adversarial.
|
||||
# (The --disallowed-tools/--rules flags tripped the CLI; --check adds a slow
|
||||
# multi-turn self-check loop — both avoided in favor of prompt steering.)
|
||||
# Prompt steering + --disable-web-search (below) keeps these text-only; no tools
|
||||
# are needed. NOTE for future hardening: --rules and --disallowed-tools ARE valid
|
||||
# headless flags (per docs/user-guide/14-headless-mode.md) and could hard-restrict
|
||||
# tools if prompt steering ever proves insufficient. --check/--self-verify also
|
||||
# exists, but it makes grok verify its OWN output -- not adversarially critique an
|
||||
# EXTERNAL claim -- so prompt steering is the correct semantic for verify mode.
|
||||
if [ "$MODE" = "verify" ]; then
|
||||
printf 'Adversarially evaluate the following claim/finding/document: try hard to find any reason it is WRONG, incomplete, or overstated. Then give a verdict plus specific justification. Answer in text only; do not use tools. Content:\n%s' "$SRC" > "$PF"
|
||||
else
|
||||
@@ -177,6 +195,7 @@ case "$MODE" in
|
||||
image)
|
||||
[ -z "${1:-}" ] && { echo "usage: $SELF image \"<prompt>\" [out.png]" >&2; exit 2; }
|
||||
out="${2:-grok-image.png}"
|
||||
GROK_MODEL="" # media gen verified on the runtime default; don't force grok-build here
|
||||
printf 'Use your image_gen tool to generate exactly this image, save it, then stop. Image: %s' "$1" > "$PF"
|
||||
run_grok 240 --disable-web-search --max-turns 12
|
||||
sid="$(jfield sessionId)"; art="$(find_artifact "$sid" images)"
|
||||
@@ -189,6 +208,7 @@ case "$MODE" in
|
||||
input="$2"; out="${3:-grok-video.mp4}"
|
||||
[ -f "$input" ] || { echo "[$SELF] input image not found: $input" >&2; exit 2; }
|
||||
cp -f "$input" "$WORK/input.jpg"
|
||||
GROK_MODEL="" # media gen verified on the runtime default; don't force grok-build here
|
||||
printf 'Use your image_to_video tool to animate input.jpg (in the current directory) into a short clip, save it, then stop. Motion: %s' "$1" > "$PF"
|
||||
run_grok 360 --disable-web-search --max-turns 20
|
||||
sid="$(jfield sessionId)"; art="$(find_artifact "$sid" videos)"
|
||||
@@ -198,8 +218,18 @@ case "$MODE" in
|
||||
;;
|
||||
xsearch)
|
||||
[ -z "${1:-}" ] && { echo "usage: $SELF xsearch \"<query>\"" >&2; exit 2; }
|
||||
printf 'Use your web_search and X/Twitter search tools to answer this, cite sources, then stop: %s' "$1" > "$PF"
|
||||
run_grok 150 --max-turns 6
|
||||
# web_search runs the grok-4.20-multi-agent model -> subagents MUST stay enabled
|
||||
# (--no-subagents made it hang with ZERO output, the long-standing xsearch bug).
|
||||
# --yolo is the documented headless tool-run posture (README/14-headless-mode.md).
|
||||
# Budget is generous: a live web_search measured ~83s end-to-end.
|
||||
GROK_SUBAGENT_FLAGS=()
|
||||
GROK_PERM_FLAGS=(--yolo)
|
||||
GROK_MODEL="" # search runs the separate grok-4.20-multi-agent model; use runtime default orchestrator
|
||||
# Lead with web_search (fast); pull in X/Twitter ONLY when the query is actually
|
||||
# about social/news/sentiment. Mandating X search on every call ran the multi-agent
|
||||
# searcher long enough to blow a 240s budget. Generous 300s budget for headroom.
|
||||
printf 'Use your web_search tool to answer this question and cite sources. Use your X/Twitter search tools ONLY if the question is specifically about social media, breaking news, or public sentiment. Then stop.\n\nQuestion: %s' "$1" > "$PF"
|
||||
run_grok 300 --max-turns 14
|
||||
txt="$(jfield text)"
|
||||
if [ -n "$txt" ]; then printf '%s\n' "$txt"; else
|
||||
echo "[$SELF] no result (stopReason=$(jfield stopReason))" >&2; _logerr "grok xsearch returned no result" --context "mode=xsearch stopReason=$(jfield stopReason)"; exit 1; fi
|
||||
|
||||
Reference in New Issue
Block a user