claudetools/projects/graphifyy-eval/FINDINGS.md

# Graphifyy vs GrepAI — findings (GURU-5070, 2026-06-15)

## TL;DR
On this machine, **Graphifyy does not clear the bar for day-to-day adoption.** Its code graph
is fast and free but largely redundant with GrepAI (already wired in). Its one real differentiator
(a graph over docs/PDFs) requires LLM semantic extraction that is **impractically slow on the
apples-to-apples local-ollama config**. Making it usable would require a cloud LLM backend, i.e.
ongoing API cost, which negates the local/free premise. Recommendation: **do not adopt**; keep GrepAI.

## Arm A — GrepAI (baseline) [complete]
10/10 answered, **18/20** on the rubric. Medians per query: ~3,025 context tokens, 3 calls, ~55s.
Already indexed (no build step). Two notable retrieval misses: C1 (the phrasing surfaced the old
timeout reaper, not the comms-durability fix) and D2 (returned the SUPERSEDED kittle-design copy and
missed the canonical June BEC report). In other words, it trips on stale duplicates that are still
indexed. Cost: ~514k agent tokens for the 10-query arm.

## Arm B — Graphifyy (local ollama) [BLOCKED at indexing]
Setup done: `pip install graphifyy` (v0.8.39) + `openai` dep; GrepAI disabled per-machine
(backed up). Backend = local ollama (apples-to-apples: GrepAI also uses ollama).

Architecture confirmed:
- **Code extraction = AST (tree-sitter), no LLM** — instant, free, local. Strong.
- **Query/path/explain = local BFS over graph.json with a token budget** — cheap at query time.
- **Doc/PDF/image extraction = generative LLM (JSON) per chunk via the chosen backend** — heavy.
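
The query-side behaviour (a breadth-first walk over the graph that stops when a token budget is spent) can be sketched roughly as below. This is a minimal illustration, not Graphifyy's actual code: the `nodes`/`edges` layout, the `budgeted_bfs` helper, and the 4-characters-per-token estimate are all assumptions made for the example.

```python
from collections import deque


def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly 4 characters per token.
    return max(1, len(text) // 4)


def budgeted_bfs(graph: dict, seeds: list, token_budget: int):
    """Walk outward from the seed nodes, level by level, until the budget is spent."""
    labels = {node["id"]: node["label"] for node in graph["nodes"]}

    # Treat edges as undirected so related concepts are reachable either way.
    adjacency = {}
    for edge in graph["edges"]:
        adjacency.setdefault(edge["source"], []).append(edge["target"])
        adjacency.setdefault(edge["target"], []).append(edge["source"])

    queue = deque(seed for seed in seeds if seed in labels)
    visited, selected, spent = set(), [], 0
    while queue:
        node = queue.popleft()
        if node in visited:
            continue
        visited.add(node)
        cost = estimate_tokens(labels[node])
        if spent + cost > token_budget:
            break  # the next node would exceed the budget, so stop here
        spent += cost
        selected.append(node)
        for neighbour in adjacency.get(node, []):
            if neighbour not in visited and neighbour in labels:
                queue.append(neighbour)
    return selected, spent


# Hypothetical toy graph; each label is 8 characters, i.e. 2 estimated tokens.
toy_graph = {
    "nodes": [
        {"id": "a", "label": "Plan Doc"},
        {"id": "b", "label": "Tier One"},
        {"id": "c", "label": "Tier Two"},
        {"id": "d", "label": "Add-Ons!"},
    ],
    "edges": [
        {"source": "a", "target": "b"},
        {"source": "a", "target": "c"},
        {"source": "b", "target": "d"},
    ],
}
selected, spent = budgeted_bfs(toy_graph, ["a"], token_budget=6)
print(selected, spent)
```

Because the walk only touches an in-memory graph, its cost is independent of the LLM backend, which is why the query side stays cheap even when indexing is not.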

Indexing measurements (the blocker):
- `msp-pricing` (2 code + 18 docs + 2 PDFs), `qwen3:8b`, `--mode deep`: **did NOT finish in 600s**;
  graphify warned the 8B model is "too small for JSON instruction following" + VRAM/truncation.
- `msp-pricing/docs` (3 markdown files), `codestral:22b`, `--token-budget 4096`: got through
  **chunk 1 of 2 in 360s** and did not finish, i.e. on the order of minutes per chunk.
- AST-only code extraction ran instantly in both runs.

### Why local ollama is the wrong workload for Graphifyy (key insight)
"Both use ollama" is true but the workloads differ fundamentally:
- **GrepAI/ollama = embeddings** (nomic-embed-text): one fast forward pass per chunk. Cheap.
- **Graphifyy/ollama = generative structured (JSON) extraction** per chunk: slow, needs a
  large instruction-following model; small models fail JSON, large models are minutes/chunk.

So the doc-graph, which is Graphifyy's only edge over GrepAI, is gated behind an indexing cost that
is impractical locally on this hardware. A cloud backend (gemini/claude/openai) would add real
per-ingest API cost and break both the local/free premise and ollama parity.
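
To make the workload difference concrete, here is a hedged sketch of what the two kinds of ollama request look like at the HTTP level. It only builds the payloads and sends nothing. The endpoints and the `format: "json"` option are part of Ollama's REST API, but neither GrepAI nor Graphifyy is confirmed to call ollama this way; the helper names and the extraction prompt are hypothetical.

```python
OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def embedding_request(chunk: str, model: str = "nomic-embed-text"):
    """Embedding workload: one forward pass, returns a fixed-size vector."""
    return f"{OLLAMA_URL}/api/embeddings", {"model": model, "prompt": chunk}


def extraction_request(chunk: str, model: str = "codestral:22b"):
    """Generative workload: the model must decode a whole JSON document token by token."""
    # Hypothetical prompt, for illustration only.
    prompt = (
        "Extract entities and relationships from the text below. "
        'Respond with JSON of the form {"nodes": [...], "edges": [...]}.\n\n'
        + chunk
    )
    payload = {
        "model": model,
        "prompt": prompt,
        "format": "json",   # ask Ollama to constrain the output to valid JSON
        "stream": False,    # wait for the complete response
    }
    return f"{OLLAMA_URL}/api/generate", payload


embed_url, embed_payload = embedding_request("GPS support plans overview")
gen_url, gen_payload = extraction_request("GPS support plans overview")
print(embed_url, sorted(embed_payload))
print(gen_url, sorted(gen_payload))
```

The embedding call's cost scales with input length only, while the generative call also pays for every output token it decodes, which is where the minutes-per-chunk figure comes from.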

## Arm B addendum — Claude backend (cloud, breaks ollama-parity) [tested]
To see if a capable cloud backend makes the doc-graph viable: re-ran the SAME 3 `msp-pricing/docs`
files with `--backend claude` (key from vault `projects/gururmm/anthropic-api.sops.yaml`,
`anthropic` pip dep added).
- **Build: succeeded in 120s** (vs local-ollama DNF). 41 nodes, 68 edges, 9 communities.
- **Cost reported by graphify: 7,813 in / 12,008 out tokens, ~$0.20 for 3 small docs** (~$0.068/doc).
  Extrapolated: the doc slice (wiki + msp-pricing + kittle + dataforth + gc docs) is hundreds of
  files, so roughly $10-$30 for the initial ingest; the full repo's docs (wiki + hundreds of client
  session-logs/reports) would be roughly $50-$200+. On top of that comes steady re-ingest as docs
  change: the SHA256 cache skips unchanged files, so steady-state cost covers changed/new docs only,
  a constant trickle in an active repo. Code stays free (AST).
- **Query quality (the decisive finding):** `graphify query` is a local, free, 1s BFS over
  graph.json. For "GPS pricing tiers and prices" it returned the **concept/relationship MAP**
  (all tier + plan NODES, cross-doc concept links like "GPS Support Plans (Cross-Document
  Concept)") in ~1,573 tokens — but **NOT the actual prices** ($19/$26/$39 absent; nodes carry
  label + src file, not leaf values). To get the facts you still Read the source file. GrepAI
  (Arm A D1) returned the file chunk WITH the prices and answered outright.
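
The cost extrapolation and the effect of the SHA256 cache can be sketched as follows. This is back-of-envelope arithmetic under stated assumptions: the per-doc figure is the ~$0.068 measured above, cost is assumed to scale linearly with file count, and the document counts, file names, and helper functions are hypothetical rather than Graphifyy internals.

```python
import hashlib

# Per-doc figure taken from the 3-doc Claude run above (~$0.068/doc).
COST_PER_DOC_USD = 0.068


def sha256_of(content: bytes) -> str:
    """Hex SHA256 digest of a document's raw bytes."""
    return hashlib.sha256(content).hexdigest()


def docs_to_reingest(docs: dict, cache: dict) -> list:
    """Paths whose hash is missing from, or differs from, the cached hash."""
    return sorted(
        path for path, content in docs.items()
        if cache.get(path) != sha256_of(content)
    )


def ingest_cost_usd(num_docs: int, cost_per_doc: float = COST_PER_DOC_USD) -> float:
    """Linear extrapolation: every doc is assumed to cost the measured average."""
    return num_docs * cost_per_doc


# Hypothetical example: two cached docs, then one is edited and one is added.
docs = {"a.md": b"v1", "b.md": b"v1"}
cache = {path: sha256_of(content) for path, content in docs.items()}
docs["a.md"] = b"v2"
docs["c.md"] = b"v1"
changed = docs_to_reingest(docs, cache)  # a.md changed, c.md is new, b.md is skipped

print("initial ingest of 300 docs (USD):", round(ingest_cost_usd(300), 2))
print("re-ingest list:", changed)
print("re-ingest cost (USD):", round(ingest_cost_usd(len(changed)), 3))
```

Real documents vary widely in length, so the linear model is only a rough guide; session logs and PDFs may cost noticeably more per file than the three small markdown docs measured here.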

### What this means
- **Fact/content retrieval (the common day-to-day query):** GrepAI is more direct (returns
  content). Graphifyy returns a map -> you still open the file. Extra hop.
- **Structural/relationship retrieval (architecture, impact, cross-doc concept links):**
  this is Graphifyy's genuine edge, and the cross-document concept synthesis is useful, but it is
  the rarer query type and, for code, overlaps GrepAI's RPG.

## Cost/benefit verdict (Mike's day-to-day)
- Day-to-day is mostly MSP ops + bursts of dev. The retrieval that helps is code (dev bursts) +
  docs/knowledge (client history). GrepAI already serves code well (18/20) and is zero-setup.
- Graphifyy's code side ≈ redundant with GrepAI. Its doc side (the differentiator) can't be
  cheaply/locally indexed here. Net marginal benefit is low; the standing index/maintenance cost
  (and the second-system overhead) is real.
- **Recommendation: do not adopt fleet-wide; do not replace GrepAI.** Revisit only if (a) a fast
  local generative model + GPU make doc-graph indexing cheap, or (b) the doc/PDF knowledge-graph
  becomes a must-have and a metered cloud-backend ingest budget is acceptable.

## Reversal / cleanup
- Re-enable GrepAI: restore `enabledMcpjsonServers` (backup at
  `projects/graphifyy-eval/settings.local.json.bak`), then restart the session.
- Remove Graphifyy: `py -m pip uninstall -y graphifyy` (plus `openai` and `anthropic` if
  unwanted). Delete the scratch graphs in `projects/graphifyy-eval/out/`.