Commit Graph

7 Commits

Author SHA1 Message Date
2b792ee5d1 agy(gemini): fix false auth-abort in retry loop + add quota fallback to default model
While using the new 3-retry gemini path for live VPN research, two bugs surfaced:
- emit_or_fail checked auth_failed INSIDE the retry loop; a benign mid-run token-refresh line
  matched the over-broad auth regex (bare login|credential|authenticat|oauth|401) and aborted the
  retries with a false "auth error" - even though `gemini -p` auth tested fine. Moved auth-classify
  to AFTER the retries (it only picks the final error message now) and tightened auth_failed to real
  signatures (invalid_grant, not authenticated, login with google, token expired, ...).
- Added quota_exhausted() + a QUOTA FALLBACK: the pinned strong model (gemini-3.1-pro-preview) hit
  "exhausted your capacity on this model" mid-session; emit_or_fail now retries once on the default
  (lighter) model by stripping -m (separate quota). Validated: capped pro run -> fell back -> 2.9KB answer.

CT_THOUGHTS Thought 2 Resolution updated with both. (Search-bot reliability hardening continues.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 12:09:58 -07:00
315f45bf7c search-bots: fix reliability (diagnosed) - gemini 3-retry + grok xsearch auto-fallback to gemini
Mike's must-fix. Diagnosed from RAW output of failing queries (not guessed):
- grok xsearch = TIMEOUT: grok-4.20-multi-agent web_search runs past budget on multi-part queries
  (286s/280s, rc=124, still searching - 183 thoughts, only progress-noise text); buffered json => total loss.
- gemini search = INTERMITTENT empty turn (a clean re-run gave a real 2.6KB answer in 122s); the wrapper
  retried only once, so two empties in a row failed spuriously.

Fixes:
- ask-gemini.sh emit_or_fail: retry up to 3x with 3s/6s backoff (was 1).
- ask-grok.sh xsearch: --output-format streaming-json (salvage partials) + AUTO-FALLBACK to
  ask-gemini.sh search when grok doesn't finish (rc!=0 or empty). Validated e2e: grok timed out
  (rc=124) -> fell back -> gemini returned a real sourced answer (UniFi Teleport invite-link API).

grok's own multi-agent timeout is an xAI-side limitation; the fallback makes xsearch reliable regardless.
Docs: grok SKILL.md xsearch row + CT_THOUGHTS Thought 2 Resolution.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 10:38:44 -07:00
f031ef9137 agy(gemini): RTFM audit — confirmed healthy, version + verified-date refresh
Audited the Gemini wrapper against the CLI's bundled help/README (gemini 0.45.2),
same pass as the grok skill. Unlike grok, found NO functional bug:
- All flags correct and real: -p, --skip-trust, -o json, --approval-mode plan|yolo,
  --include-directories, -m (verified against `gemini --help`).
- JSON schema {session_id, response, stats} -> .response confirmed via live probe.
- Pinned model gemini-3.1-pro-preview STILL VALID (live PONG); the GA-looking
  gemini-3.1-pro and gemini-3-pro both ModelNotFoundError -> keep the -preview suffix.
- Default text model is gemini-3.1-flash-lite (by design; verify/review/search/image
  pin pro). No thought-suppression flag exists in the CLI, so the gresponse() reasoning
  -leak scrub stays (justified, signature-gated, byte-exact otherwise).
- Live `search` re-validated end-to-end through the wrapper (58s, grounded sources).

Only change: version 0.45.1 -> 0.45.2 in SKILL.md + wrapper header, and refreshed the
verified-date notes with the 2026-06-17 re-validation findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 19:25:07 -07:00
da87d314c5 harness: fleet-wide functional-error + correction + friction logging
Add .claude/scripts/log-skill-error.sh — the canonical agent error log helper
(writes errorlog.md in DATE | MACHINE | skill | [type] error format, soft-fails).
Three categories: execution failures (default), user corrections (--correction),
and preventable self-inflicted friction (--friction; cite ref= when it repeats a
documented gotcha). Goal: stop paying tokens twice for the same avoidable mistake.

- CLAUDE.md: make logging mandatory for all skills + corrections + friction.
- skill-creator: new skills must wire in the helper (guidance + checklist).
- Retrofit every skill script's genuine failure branches to call the helper
  (b2/bitdefender/mailprotector/packetdial/coord python CLIs; remediation-tool
  + onboard365 bash; vault, rmm-auth, post-bot-alert, agy, grok, 1password,
  run-onboarding-diagnostic). Handled conditions + self-tests left alone.
- errorlog.md: broaden header to cover skills + harness + corrections; seed this
  session's corrections (INKY, Mail.Send token-audience, omnibox-strictness) and
  friction (git-bash /tmp, env-persistence, argv-limit, PowerShell var-case).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 11:40:25 -07:00
2795fab3c7 docs(skills): document review path-resolution gotcha in agy + grok
review/review-files resolve relative paths only against CWD or
$CLAUDETOOLS_ROOT, never a submodule/subdir — so submodule-relative
paths fail with "file not found". Add a [!WARNING] callout to both
SKILL.md files, fix the misleading "absolute or repo-relative" table
wording, and add inline GOTCHA comments at each resolution site in
both scripts. Bitten us repeatedly (latest: GuruConnect review).
2026-06-05 16:55:56 -07:00
8b99990884 feat(agy): add keyless image-analyze + search modes
image-analyze: independent second-model vision over OAuth (pins the
gemini-3.1-pro-preview vision model; the default flash-lite router
hallucinates image content) — reads an image via read_file and describes it.
search: Google-grounded live web results with citation URLs (google_web_search).
Both verified working on the keyless Google OAuth. Image GENERATION
(nano-banana) still needs an AI Studio key + extension and stays Grok's lane.
Includes a scoped best-effort output sanitizer for image-analyze (preview
model occasionally leaks reasoning tokens); text/verify/review/search
unchanged. migrate-identity.sh now upgrades the gemini capabilities array.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 09:03:21 -07:00
bd3624e35f feat(skills): add AGY — Google Gemini CLI second-opinion router
Sibling of the grok skill: routes text/verify/review (+ review-files,
review-diff, raw) to the official Google Gemini CLI (gemini, npm global,
v0.45.1) for an independent second model. ask-gemini.sh mirrors ask-grok.sh
(identity-aware gating, binary auto-locate, cygpath hardening, prompt-file
inputs, clean stdout/stderr separation, JSON .response extraction). review
modes copy targets into a temp dir + --include-directories to bypass
Gemini's gitignore/workspace sandbox. verify/review pinned to
gemini-3.1-pro-preview (GEMINI_MODEL overridable). migrate-identity.sh
auto-detects gemini and writes a per-machine identity.json gemini block.
Auth: Google OAuth (no key). Fleet Gemini host: GURU-5070.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 06:45:00 -07:00