- .claude/scripts/cdp.py: drive Chrome via DevTools Protocol; screenshots to disk (so Gemini/Grok can see the live site). Fixes invisible-window + no-disk-screenshot.
- reference_cdp_chrome_driver.md (+ MEMORY index)
- gururmm submodule pointer -> dashboard redesign docs (local 3cef6ba)
- session log

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
39 lines
2.4 KiB
Markdown
---
name: reference_cdp_chrome_driver
description: Drive Chrome via CDP (debugger) with on-disk screenshots; how Gemini/Grok "see" the live site
metadata:
  type: reference
---

`.claude/scripts/cdp.py` drives Chrome over the **Chrome DevTools Protocol** (the same approach
Antigravity uses). It fixes two problems the claude-in-chrome MCP extension had: invisible windows,
and screenshots that never landed on disk.

**Why it matters:** CDP `Page.captureScreenshot` returns the image as base64-encoded PNG data, which
cdp.py writes out as a **real PNG file**. That file can be fed to `agy image-analyze` (Gemini) or
Grok. That is how Gemini/Grok "look at the live site" (verified 2026-06-05: Gemini correctly read a
CDP screenshot of the GuruRMM login). The MCP extension's `save_to_disk` never produced a findable
file.

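The screenshot-to-disk step can be sketched as follows. This is a minimal illustration, not the actual contents of cdp.py: the function names `capture_request` and `save_screenshot_reply` are hypothetical, and the only protocol facts assumed are that `Page.captureScreenshot` accepts a `format` parameter and returns its image in a base64-encoded `data` field.

```python
import base64
import json
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def capture_request(message_id: int = 1) -> str:
    """Build the JSON text for a Page.captureScreenshot call (hypothetical helper)."""
    return json.dumps({
        "id": message_id,
        "method": "Page.captureScreenshot",
        "params": {"format": "png"},
    })


def save_screenshot_reply(raw_reply: str, out_path: str) -> int:
    """Decode a Page.captureScreenshot reply and write the PNG to disk.

    Returns the number of bytes written (hypothetical helper).
    """
    reply = json.loads(raw_reply)
    if "error" in reply:
        raise RuntimeError(f"CDP error: {reply['error']}")
    # CDP sends the image as base64 text inside result.data.
    png_bytes = base64.b64decode(reply["result"]["data"])
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("reply did not contain PNG data")
    Path(out_path).write_bytes(png_bytes)
    return len(png_bytes)
```

Checking the PNG signature before writing is a cheap guard against saving an error payload under a `.png` name, which would then confuse whatever model is asked to analyze it.
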
**Setup (one-time per session):**
- `py -m pip install websocket-client` (cdp.py uses stdlib `urllib` + `websocket-client`; no Playwright/Node).
- `py .claude/scripts/cdp.py launch [url]` opens a **visible** Chrome on a **dedicated profile**
  (`~/.claude/cdp-chrome-profile`) with `--remote-debugging-port=9222`. The dedicated profile starts
  NOT logged in; the user signs into authenticated apps once (Claude still must NOT type passwords;
  that rule holds regardless of CDP).

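For orientation, the launch step above amounts to starting Chrome with a small set of flags. The sketch below only assembles that argument list; `build_launch_args` is a hypothetical name, the Chrome executable path is left to the caller, and how cdp.py actually locates and starts Chrome is not shown in this note.

```python
from pathlib import Path

DEBUG_PORT = 9222
# Dedicated profile directory named in the setup notes above.
PROFILE_DIR = Path.home() / ".claude" / "cdp-chrome-profile"


def build_launch_args(chrome_path: str, url: str = "about:blank",
                      port: int = DEBUG_PORT,
                      profile_dir: Path = PROFILE_DIR) -> list[str]:
    """Return the argv for a visible debug Chrome (hypothetical helper)."""
    return [
        chrome_path,
        f"--remote-debugging-port={port}",   # exposes the CDP endpoint
        f"--user-data-dir={profile_dir}",    # forces a separate Chrome instance
        "--remote-allow-origins=*",          # lets the WebSocket client connect
        url,
    ]
```

The resulting list can be handed to `subprocess.Popen`. No headless flag is passed, which is what keeps the window visible.
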
**Gotchas:**
- Chrome's DNS-rebinding guard rejects `Host: 127.0.0.1` on the debug endpoint, so **use `localhost`**
  (cdp.py BASE is `http://localhost:9222`). Launch also passes `--remote-allow-origins=*`.
- Launching `chrome.exe` while Chrome runs on the SAME profile just opens a tab in the existing
  instance (flags ignored). The dedicated `--user-data-dir` forces a real new instance with the port.

**Commands:** `launch [url]` · `status` · `nav <url> [tabid]` · `shot <out.png> [tabid]` ·
`click <x> <y>` · `type <text>` · `key <Key>` · `eval <js>`. Stateless (new WS per command).

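One plausible way the stateless commands map onto CDP calls is sketched below. The mapping is an assumption for illustration, not a description of cdp.py's internals; `build_messages` and `send_once` are hypothetical names. The CDP methods used (`Page.navigate`, `Input.dispatchMouseEvent`, `Input.insertText`, `Runtime.evaluate`) are standard protocol methods, and `send_once` needs the `websocket-client` package.

```python
import json
from itertools import count


def build_messages(command: str, *args: str) -> list[str]:
    """Translate one CLI command into the CDP JSON messages it would send."""
    ids = count(1)

    def msg(method: str, **params) -> str:
        return json.dumps({"id": next(ids), "method": method, "params": params})

    if command == "nav":
        return [msg("Page.navigate", url=args[0])]
    if command == "click":
        # A click is a press followed by a release at the same coordinates.
        common = {"x": float(args[0]), "y": float(args[1]),
                  "button": "left", "clickCount": 1}
        return [msg("Input.dispatchMouseEvent", type="mousePressed", **common),
                msg("Input.dispatchMouseEvent", type="mouseReleased", **common)]
    if command == "type":
        return [msg("Input.insertText", text=args[0])]
    if command == "eval":
        return [msg("Runtime.evaluate", expression=args[0], returnByValue=True)]
    raise ValueError(f"unsupported command: {command}")


def send_once(ws_url: str, messages: list[str]) -> list[dict]:
    """Open a fresh WebSocket, send the messages, collect replies, then close."""
    import websocket  # provided by the websocket-client package

    ws = websocket.create_connection(ws_url, timeout=10)
    try:
        replies = []
        for text in messages:
            ws.send(text)
            wanted = json.loads(text)["id"]
            while True:  # skip unrelated events until the matching reply arrives
                reply = json.loads(ws.recv())
                if reply.get("id") == wanted:
                    replies.append(reply)
                    break
        return replies
    finally:
        ws.close()
```

Opening and closing the socket inside `send_once` is what makes each command stateless: nothing persists between invocations except the browser itself.
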
**Letting Gemini/Grok DRIVE (not just see):** cdp.py is a plain CLI, so Grok's `run_terminal_command`
(or any agent with shell access) could call it to navigate/click. **Security caveat:** a debug Chrome
on :9222 is controllable by any local process, and if it holds authenticated sessions (M365, Syncro,
RMM), those sessions can be driven by whatever drives the browser, including external-vendor CLIs.
Safer model: **Claude drives cdp.py; Gemini/Grok receive the on-disk screenshots.** Only expose
direct driving to an external CLI deliberately. See [[reference_gururmm]].