claudetools/.claude/memory/reference_cdp_chrome_driver.md
Mike Swanson 47b71b7b3a rmm dashboard redesign (Gemini live review) + CDP Chrome driver
- .claude/scripts/cdp.py: drive Chrome via DevTools Protocol; screenshots to disk
  (so Gemini/Grok can see the live site). Fixes invisible-window + no-disk-screenshot.
- reference_cdp_chrome_driver.md (+ MEMORY index)
- gururmm submodule pointer -> dashboard redesign docs (local 3cef6ba)
- session log

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 13:10:37 -07:00


name: reference_cdp_chrome_driver
description: Drive Chrome via CDP (debugger) with on-disk screenshots; how Gemini/Grok "see" the live site
metadata:
  type: reference

.claude/scripts/cdp.py drives Chrome over the Chrome DevTools Protocol (the same approach Antigravity uses). It fixes two problems the claude-in-chrome MCP extension had: invisible windows, and screenshots that never landed on disk.

Why it matters: CDP Page.captureScreenshot returns the image data in its reply (base64-encoded, PNG by default), so cdp.py can decode it and write a real PNG file. That file can then be fed to agy image-analyze (Gemini) or to Grok, which is how Gemini/Grok "look at the live site" (verified 2026-06-05: Gemini correctly read a CDP screenshot of the GuruRMM login). The MCP extension's save_to_disk never produced a findable file.
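A minimal sketch of the decode-and-write step described above. It is not taken from cdp.py: the function name save_screenshot and the PNG signature check are assumptions added for illustration. It relies only on the documented shape of a Page.captureScreenshot reply, where result.data holds the base64-encoded image.

```python
import base64
import json
from pathlib import Path

# First eight bytes of every PNG file.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def save_screenshot(reply_text: str, out_path: str) -> int:
    """Decode a Page.captureScreenshot reply and write the image to out_path.

    reply_text is the raw JSON text received over the DevTools WebSocket.
    Returns the number of bytes written.
    (Hypothetical helper; cdp.py may structure this differently.)
    """
    reply = json.loads(reply_text)
    if "error" in reply:
        raise RuntimeError(f"CDP error: {reply['error']}")

    # CDP delivers the image as a base64 string, not raw bytes.
    png = base64.b64decode(reply["result"]["data"])
    if not png.startswith(PNG_MAGIC):
        raise ValueError("reply did not contain PNG data")

    Path(out_path).write_bytes(png)
    return len(png)
```

The signature check is optional; it guards against writing a JPEG or WebP to a .png path if the screenshot was requested with a different format parameter.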

Setup (one-time per session):

  • py -m pip install websocket-client (uses stdlib urllib + websocket-client; no Playwright/Node).
  • py .claude/scripts/cdp.py launch [url]: opens a visible Chrome on a dedicated profile (~/.claude/cdp-chrome-profile) with --remote-debugging-port=9222. The dedicated profile starts with no logged-in sessions, so the user signs into authenticated apps once (Claude still must NOT type passwords; that rule holds regardless of CDP).
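The launch step can be sketched as follows. This is an illustration under stated assumptions, not the contents of cdp.py: the function names and the chrome_path parameter are hypothetical, and locating the Chrome executable is left to the caller. The three flags are the ones this note mentions.

```python
import subprocess
from pathlib import Path


def build_launch_args(chrome_path, url="about:blank", port=9222, profile_dir=None):
    """Return the argv list for a debuggable Chrome on a dedicated profile.

    chrome_path is an assumption: the caller supplies the path to the
    Chrome executable for their platform.
    """
    if profile_dir is None:
        profile_dir = Path.home() / ".claude" / "cdp-chrome-profile"
    return [
        str(chrome_path),
        f"--remote-debugging-port={port}",  # exposes the DevTools endpoint
        f"--user-data-dir={profile_dir}",   # dedicated profile, separate instance
        "--remote-allow-origins=*",         # allow WebSocket clients to connect
        url,
    ]


def launch(chrome_path, url="about:blank"):
    """Start Chrome and return immediately; the browser keeps running."""
    return subprocess.Popen(build_launch_args(chrome_path, url))
```

Keeping argument construction separate from the Popen call makes the flag set easy to inspect or log before anything is started.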

Gotchas:

  • Chrome's DNS-rebinding guard rejects Host: 127.0.0.1 on the debug endpoint → use localhost (cdp.py BASE is http://localhost:9222). Launch also passes --remote-allow-origins=*.
  • Launching chrome.exe while Chrome runs on the SAME profile just opens a tab in the existing instance (flags ignored). The dedicated --user-data-dir forces a real new instance with the port.
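To illustrate the localhost point, here is a hedged sketch of tab discovery over the debug port's HTTP endpoint using only stdlib urllib. The helper names list_targets and pick_ws_url are assumptions, not names from cdp.py; the /json/list endpoint and the type, id, and webSocketDebuggerUrl fields are part of Chrome's DevTools HTTP interface.

```python
import json
import urllib.request

# Use the hostname localhost rather than 127.0.0.1, per the gotcha above.
BASE = "http://localhost:9222"


def list_targets(base=BASE):
    """Fetch the list of debuggable targets (tabs, workers, ...) from Chrome."""
    with urllib.request.urlopen(f"{base}/json/list", timeout=5) as resp:
        return json.load(resp)


def pick_ws_url(targets, tabid=None):
    """Return the WebSocket URL of the requested tab, or of the first page tab.

    targets is the decoded JSON list returned by list_targets().
    """
    for target in targets:
        if target.get("type") != "page":
            continue  # skip service workers, extensions, and similar targets
        if tabid is None or target.get("id") == tabid:
            return target["webSocketDebuggerUrl"]
    raise LookupError(f"no page target found (tabid={tabid!r})")
```

A typical call would be pick_ws_url(list_targets()), which requires a Chrome instance already listening on the debug port.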

Commands: launch [url] · status · nav <url> [tabid] · shot <out.png> [tabid] · click <x> <y> · type <text> · key <Key> · eval <js>. Each command is stateless: it opens a new WebSocket connection, runs, and disconnects.
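A sketch of how a stateless click command might look, assuming the websocket-client package from Setup. The names click_messages and run_once are hypothetical and the actual cdp.py may differ. Input.dispatchMouseEvent is the CDP method for synthetic mouse input, and a click is a press followed by a release at the same coordinates.

```python
import json


def click_messages(x, y):
    """Build the two CDP messages that make up a left click at (x, y)."""
    common = {"x": x, "y": y, "button": "left", "clickCount": 1}
    return [
        {"id": 1, "method": "Input.dispatchMouseEvent",
         "params": {"type": "mousePressed", **common}},
        {"id": 2, "method": "Input.dispatchMouseEvent",
         "params": {"type": "mouseReleased", **common}},
    ]


def run_once(ws_url, messages):
    """Open a fresh WebSocket, send each message, collect replies, then close.

    ws_url is a tab's webSocketDebuggerUrl. Nothing is kept between calls,
    which matches the one-connection-per-command model.
    """
    import websocket  # provided by the websocket-client package

    ws = websocket.create_connection(ws_url, timeout=10)
    try:
        replies = []
        for msg in messages:
            ws.send(json.dumps(msg))
            # Chrome may interleave event notifications; wait for our id.
            while True:
                reply = json.loads(ws.recv())
                if reply.get("id") == msg["id"]:
                    replies.append(reply)
                    break
        return replies
    finally:
        ws.close()
```

Opening a connection per command costs a little latency but leaves no long-lived process or session to manage, which suits a CLI invoked from a shell.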

Letting Gemini/Grok DRIVE (not just see): cdp.py is a plain CLI, so Grok's run_terminal_command (or any agent with shell access) could call it to navigate/click. Security caveat: a debug Chrome on :9222 is controllable by any local process, and if it holds authenticated sessions (M365, Syncro, RMM), those are driveable by whatever drives it, including external-vendor CLIs. Safer model: Claude drives cdp.py; Gemini/Grok receive the on-disk screenshots. Only expose direct driving to an external CLI deliberately. See reference_gururmm.