- .claude/scripts/cdp.py: drive Chrome via DevTools Protocol; screenshots to disk (so Gemini/Grok can see the live site). Fixes invisible-window + no-disk-screenshot. - reference_cdp_chrome_driver.md (+ MEMORY index) - gururmm submodule pointer -> dashboard redesign docs (local 3cef6ba) - session log Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| name | description | metadata |
|---|---|---|
| reference_cdp_chrome_driver | Drive Chrome via CDP (debugger) with on-disk screenshots; how Gemini/Grok "see" the live site | |
.claude/scripts/cdp.py drives Chrome over the Chrome DevTools Protocol (same approach
Antigravity uses) — fixing two problems the claude-in-chrome MCP extension had: invisible windows,
and screenshots that never landed on disk.
Why it matters: CDP Page.captureScreenshot returns the image as base64-encoded data (PNG by default), so cdp.py can write a real
PNG file, which can then be fed to agy image-analyze (Gemini) or Grok. That is how Gemini/Grok
"look at the live site" (verified 2026-06-05: Gemini correctly read a CDP screenshot of the GuruRMM
login). The MCP extension's save_to_disk never produced a findable file.
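The screenshot-to-disk step described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of cdp.py: the helper names `screenshot_request` and `save_screenshot` are hypothetical, and the only CDP facts it relies on are that `Page.captureScreenshot` accepts a `format` parameter and replies with base64 text in `result.data`.

```python
import base64
import json
from pathlib import Path

# Every PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def screenshot_request(msg_id: int = 1) -> str:
    """Build the JSON text of a Page.captureScreenshot command (hypothetical helper)."""
    return json.dumps(
        {"id": msg_id, "method": "Page.captureScreenshot", "params": {"format": "png"}}
    )


def save_screenshot(reply_text: str, out_path: str) -> int:
    """Decode a Page.captureScreenshot reply and write the PNG to disk.

    Returns the number of bytes written (hypothetical helper).
    """
    reply = json.loads(reply_text)
    if "error" in reply:
        raise RuntimeError(f"CDP error: {reply['error']}")
    # The image arrives as base64 text in result.data, not as raw bytes.
    png = base64.b64decode(reply["result"]["data"])
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("reply did not contain PNG data")
    Path(out_path).write_bytes(png)
    return len(png)
```

Checking the signature before writing is a cheap guard against saving an error payload under a .png name, which would otherwise only surface when the image analyzer rejects the file.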
Setup (one-time per session):
- py -m pip install websocket-client (uses stdlib urllib + websocket-client; no Playwright/Node).
- py .claude/scripts/cdp.py launch [url]: opens a visible Chrome on a dedicated profile (~/.claude/cdp-chrome-profile) with --remote-debugging-port=9222. Dedicated profile = NOT logged in; the user signs into authenticated apps once (Claude still must NOT type passwords; that rule holds regardless of CDP).
Gotchas:
- Chrome's DNS-rebinding guard rejects Host: 127.0.0.1 on the debug endpoint → use localhost (cdp.py BASE is http://localhost:9222). Launch also passes --remote-allow-origins=*.
- Launching chrome.exe while Chrome runs on the SAME profile just opens a tab in the existing instance (flags ignored). The dedicated --user-data-dir forces a real new instance with the port.
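The launch flags and the localhost gotcha can be illustrated with a short sketch. It is an assumption about how such a launcher might look, not a copy of cdp.py: `chrome_args` and `list_tabs` are hypothetical names, the Chrome executable path is left to the caller, and `/json/list` is Chrome's standard HTTP endpoint for listing debuggable targets.

```python
import json
import os
import urllib.request
from typing import List

DEBUG_PORT = 9222
# Use localhost rather than 127.0.0.1, per the DNS-rebinding gotcha above.
BASE = f"http://localhost:{DEBUG_PORT}"


def chrome_args(chrome_path: str, profile_dir: str, url: str = "about:blank") -> List[str]:
    """Build an argv list for a debuggable Chrome on a dedicated profile (hypothetical helper)."""
    return [
        chrome_path,
        f"--remote-debugging-port={DEBUG_PORT}",
        # A separate profile directory forces a new instance that honors the flags.
        f"--user-data-dir={os.path.expanduser(profile_dir)}",
        "--remote-allow-origins=*",
        url,
    ]


def list_tabs(base: str = BASE, timeout: float = 3.0) -> List[dict]:
    """Return the debuggable targets reported by a running debug Chrome."""
    with urllib.request.urlopen(f"{base}/json/list", timeout=timeout) as resp:
        return json.load(resp)
```

With a real Chrome path, something like `subprocess.Popen(chrome_args(path, "~/.claude/cdp-chrome-profile"))` followed by `list_tabs()` would show each tab's `id` and `webSocketDebuggerUrl`; the latter is the URL a per-command WebSocket connects to.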
Commands: launch [url] · status · nav <url> [tabid] · shot <out.png> [tabid] ·
click <x> <y> · type <text> · key <Key> · eval <js>. Stateless (new WS per command).
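"Stateless (new WS per command)" can be sketched as one function that opens a WebSocket, sends a single CDP message, waits for the matching reply, and closes. This is an illustrative sketch under assumptions, not cdp.py itself: `cdp_call`, `default_connect`, and `click_messages` are hypothetical names, while `websocket.create_connection` (from websocket-client) and the CDP method `Input.dispatchMouseEvent` are real.

```python
import json
from typing import Any, Callable, Dict, List, Optional, Tuple


def default_connect(ws_url: str):
    """Open a WebSocket using the websocket-client package."""
    import websocket  # imported lazily so the module loads without the package

    return websocket.create_connection(ws_url, timeout=10)


def cdp_call(
    ws_url: str,
    method: str,
    params: Optional[Dict[str, Any]] = None,
    connect: Callable[[str], Any] = default_connect,
) -> Dict[str, Any]:
    """Send one CDP command over a fresh connection and return its result."""
    ws = connect(ws_url)
    try:
        ws.send(json.dumps({"id": 1, "method": method, "params": params or {}}))
        while True:
            msg = json.loads(ws.recv())
            # Events carry no "id"; skip them until the reply to our command arrives.
            if msg.get("id") == 1:
                if "error" in msg:
                    raise RuntimeError(f"CDP error: {msg['error']}")
                return msg.get("result", {})
    finally:
        ws.close()


def click_messages(x: float, y: float) -> List[Tuple[str, Dict[str, Any]]]:
    """Return the press/release pair that makes up a left click at (x, y)."""
    common = {"x": x, "y": y, "button": "left", "clickCount": 1}
    return [
        ("Input.dispatchMouseEvent", {"type": "mousePressed", **common}),
        ("Input.dispatchMouseEvent", {"type": "mouseReleased", **common}),
    ]
```

The trade-off of this design is a connection handshake on every command in exchange for no daemon or session file to manage, which suits a CLI that other agents call one command at a time.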
Letting Gemini/Grok DRIVE (not just see): cdp.py is a plain CLI, so Grok's run_terminal_command
(or any agent with shell access) could call it to navigate/click. Security caveat: a debug Chrome
on :9222 is controllable by any local process, and if it holds authenticated sessions (M365, Syncro,
RMM) those are driveable by whatever drives it — including external-vendor CLIs. Safer model: Claude
drives cdp.py; Gemini/Grok receive the on-disk screenshots. Only expose direct driving to an
external CLI deliberately. See reference_gururmm.