# grepai

[![Go](https://github.com/yoanbernabeu/grepai/actions/workflows/ci.yml/badge.svg)](https://github.com/yoanbernabeu/grepai/actions/workflows/ci.yml) [![Go Report Card](https://goreportcard.com/badge/github.com/yoanbernabeu/grepai)](https://goreportcard.com/report/github.com/yoanbernabeu/grepai) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE) [![Documentation](https://img.shields.io/badge/docs-yoanbernabeu.github.io%2Fgrepai-blue)](https://yoanbernabeu.github.io/grepai/)

> **[Full documentation available here](https://yoanbernabeu.github.io/grepai/)** — Installation guides, configuration options, AI agent integration, and more.

**A privacy-first, CLI-native way to semantically search your codebase.**

Search code by *what it does*, not just what it's called. `grepai` indexes the meaning of your code using vector embeddings, enabling natural language queries that find conceptually related code—even when naming conventions vary.

## Why grepai?

`grep` was built in 1973 for exact text matching. Modern codebases need semantic understanding.

|                      | `grep` / `ripgrep`     | `grepai`                     |
|----------------------|------------------------|------------------------------|
| **Search type**      | Exact text / regex     | Semantic understanding       |
| **Query**            | `"func.*Login"`        | `"user authentication flow"` |
| **Finds**            | Exact pattern matches  | Conceptually related code    |
| **AI Agent context** | Requires many searches | Fewer, more relevant results |

### Built for AI Agents

grepai is designed to provide **high-quality context** to AI coding assistants. By returning semantically relevant code chunks, your agents spend less time searching and more time coding.

## Getting Started

### Installation

```bash
curl -sSL https://raw.githubusercontent.com/yoanbernabeu/grepai/main/install.sh | sh
```

Or download from [Releases](https://github.com/yoanbernabeu/grepai/releases).
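If you prefer to inspect the installer before running it, the same script can be downloaded first and reviewed with standard `curl` options:

```bash
# Download the installer, review it, then run it (equivalent to the one-liner above)
curl -sSL https://raw.githubusercontent.com/yoanbernabeu/grepai/main/install.sh -o install.sh
less install.sh
sh install.sh
```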
### Quick Start

```bash
grepai init                     # Initialize in your project
grepai watch                    # Start background indexing daemon
grepai search "error handling"  # Search semantically
grepai trace callers "Login"    # Find who calls a function
```

## Commands

| Command                         | Description                            |
|---------------------------------|----------------------------------------|
| `grepai init`                   | Initialize grepai in current directory |
| `grepai watch`                  | Start real-time file watcher daemon    |
| `grepai search <query>`         | Search codebase with natural language  |
| `grepai trace <type> <function>`| Analyze call graph (callers/callees)   |
| `grepai status`                 | Browse index state interactively       |
| `grepai agent-setup`            | Configure AI agent integration         |
| `grepai update`                 | Update grepai to the latest version    |

```bash
grepai search "authentication" -n 5       # Limit results (default: 10)
grepai search "authentication" --json     # JSON output for AI agents
grepai search "authentication" --json -c  # Compact JSON (~80% fewer tokens)
```

### Background Daemon

Run the watcher as a background process:

```bash
grepai watch --background  # Start in background
grepai watch --status      # Check if running
grepai watch --stop        # Stop gracefully
```

Logs are stored in OS-specific directories:

| Platform | Log Directory                 |
|----------|-------------------------------|
| Linux    | `~/.local/state/grepai/logs/` |
| macOS    | `~/Library/Logs/grepai/`      |
| Windows  | `%LOCALAPPDATA%\grepai\logs\` |

Use `--log-dir /custom/path` to override (must be passed to all commands):

```bash
grepai watch --background --log-dir /custom/path  # Start in background
grepai watch --status --log-dir /custom/path      # Check if running
grepai watch --stop --log-dir /custom/path        # Stop gracefully
```

### Self-Update

Keep grepai up to date:

```bash
grepai update --check  # Check for available updates
grepai update          # Download and install latest version
grepai update --force  # Force update even if already on latest
```

The update command:

- Fetches the latest release from GitHub
- Verifies checksum integrity
- Replaces the binary automatically
- Works on all supported platforms (Linux, macOS, Windows)

### Call Graph Analysis

Find function relationships in your codebase:

```bash
grepai trace callers "Login"                 # Who calls Login?
grepai trace callees "HandleRequest"         # What does HandleRequest call?
grepai trace graph "ProcessOrder" --depth 3  # Full call graph
```

Output as JSON for AI agents:

```bash
grepai trace callers "Login" --json
```
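Taken together, search and trace cover a typical exploration loop for a script or an AI agent. A sketch of that loop using only the flags shown above; the query string and the function name are illustrative:

```bash
# 1. Find code related to a concept, as compact JSON to keep agent context small
grepai search "order processing" --json -c

# 2. Drill into a function surfaced by the search (name is illustrative)
grepai trace callers "ProcessOrder" --json
grepai trace graph "ProcessOrder" --depth 3
```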
## AI Agent Integration

grepai integrates natively with popular AI coding assistants. Run `grepai agent-setup` to auto-configure.

| Agent        | Configuration File                  |
|--------------|-------------------------------------|
| Cursor       | `.cursorrules`                      |
| Windsurf     | `.windsurfrules`                    |
| Claude Code  | `CLAUDE.md` / `.claude/settings.md` |
| Gemini CLI   | `GEMINI.md`                         |
| OpenAI Codex | `AGENTS.md`                         |

### MCP Server Mode

grepai can run as an MCP (Model Context Protocol) server, making it available as a native tool for AI agents:

```bash
grepai mcp-serve  # Start MCP server (stdio transport)
```

Configure in your AI tool's MCP settings:

```json
{
  "mcpServers": {
    "grepai": {
      "command": "grepai",
      "args": ["mcp-serve"]
    }
  }
}
```

Available MCP tools:

- `grepai_search` — Semantic code search
- `grepai_trace_callers` — Find function callers
- `grepai_trace_callees` — Find function callees
- `grepai_trace_graph` — Build call graph
- `grepai_index_status` — Check index health

### Claude Code Subagent

For enhanced exploration capabilities in Claude Code, create a specialized subagent:

```bash
grepai agent-setup --with-subagent
```

This creates `.claude/agents/deep-explore.md` with:

- Semantic search via `grepai search`
- Call graph tracing via `grepai trace`
- Workflow guidance for code exploration

Claude Code automatically uses this agent for deep codebase exploration tasks.

## Configuration

Stored in `.grepai/config.yaml`:

```yaml
embedder:
  provider: ollama                  # ollama | lmstudio | openai
  model: nomic-embed-text
  endpoint: http://localhost:11434  # Custom endpoint (for Azure OpenAI, etc.)
  dimensions: 768                   # Vector dimensions (depends on model)

store:
  backend: gob                      # gob | postgres

chunking:
  size: 512
  overlap: 50

search:
  boost:
    enabled: true                   # Structural boosting for better relevance

trace:
  mode: fast                        # fast (regex) | precise (tree-sitter)

external_gitignore: ""              # Path to external gitignore (e.g., ~/.config/git/ignore)
```

> **Note**: Old configs without `endpoint` or `dimensions` are automatically updated with sensible defaults.

### Search Boost (enabled by default)

grepai automatically adjusts search scores based on file paths. Patterns are language-agnostic:

| Category  | Patterns                                                       | Factor |
|-----------|----------------------------------------------------------------|--------|
| Tests     | `/tests/`, `/test/`, `__tests__`, `_test.`, `.test.`, `.spec.` | ×0.5   |
| Mocks     | `/mocks/`, `/mock/`, `.mock.`                                  | ×0.4   |
| Fixtures  | `/fixtures/`, `/testdata/`                                     | ×0.4   |
| Generated | `/generated/`, `.generated.`, `.gen.`                          | ×0.4   |
| Docs      | `.md`, `/docs/`                                                | ×0.6   |
| Source    | `/src/`, `/lib/`, `/app/`                                      | ×1.1   |

Customize or disable in `.grepai/config.yaml`. See the [documentation](https://yoanbernabeu.github.io/grepai/configuration/) for details.

### Hybrid Search (optional)

Enable hybrid search to combine vector similarity with text matching:

```yaml
search:
  hybrid:
    enabled: true
    k: 60
```

Uses [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) to merge results. Useful when queries contain exact identifiers.

### Embedding Providers

**Ollama (Default)** — Privacy-first, runs locally:

```bash
ollama pull nomic-embed-text
```

**LM Studio** — Local, OpenAI-compatible API:

```bash
# Start LM Studio and load an embedding model
# Default endpoint: http://127.0.0.1:1234
```

**OpenAI** — Cloud-based:

```bash
export OPENAI_API_KEY=sk-...
```
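Switching to a non-default provider happens through the `embedder` keys shown in the Configuration section above. As a sketch, an LM Studio setup might look like this; the model name and dimensions are illustrative and depend on the embedding model you load:

```yaml
embedder:
  provider: lmstudio                # ollama | lmstudio | openai
  endpoint: http://127.0.0.1:1234   # LM Studio's default endpoint (see above)
  model: nomic-embed-text-v1.5      # illustrative: whichever embedding model is loaded
  dimensions: 768                   # must match the chosen model's vector size
```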
### Storage Backends

- **GOB (Default)**: File-based, zero config
- **PostgreSQL + pgvector**: For large monorepos
- **Qdrant**: Docker-based vector database

## Requirements

- Ollama, LM Studio, or OpenAI API key (for embeddings)
- Go 1.22+ (only for building from source)

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## License

[MIT License](LICENSE) - Yoan Bernabeu 2026