Add /import command — generic folder ingestion with smart classification

Slash command that accepts any folder path, scans all files, classifies
by content (client work, project code, credentials, session logs, tools,
docs), sanitizes credentials into SOPS vault, presents a placement plan
for approval, then executes.

Handles Claude Code session data (delegates to tools/import-sessions.py),
existing project detection, duplicate checks, and credential extraction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 19:25:29 -07:00
parent 8a094529ab
commit f5acf9f453

132
.claude/commands/import.md Normal file

@@ -0,0 +1,132 @@
# /import — Ingest a folder into ClaudeTools
Import any folder of data into the ClaudeTools structure. Claude analyzes each file's content, classifies it, proposes placement, sanitizes credentials, and organizes everything into the correct locations.
## Usage
```
/import <path>                     Import a folder
/import <path> --dry-run           Show plan without executing
/import <path> --client <name>     Hint: this data belongs to a specific client
/import <path> --project <name>    Hint: this data belongs to a specific project
```
## Arguments
The first argument is a folder path to ingest. Everything inside (recursive) is scanned and classified.
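For illustration only, the argument grammar above can be written down as an `argparse` parser. The `build_parser` name is hypothetical and nothing in this command requires such a parser; the sketch just makes explicit that `--dry-run` is a flag while `--client` and `--project` each take a name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the usage lines of /import."""
    parser = argparse.ArgumentParser(
        prog="/import", description="Ingest a folder into ClaudeTools"
    )
    parser.add_argument("path", help="folder to ingest (scanned recursively)")
    parser.add_argument("--dry-run", action="store_true",
                        help="show the plan without executing")
    parser.add_argument("--client", metavar="NAME",
                        help="hint: this data belongs to a specific client")
    parser.add_argument("--project", metavar="NAME",
                        help="hint: this data belongs to a specific project")
    return parser
```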
## Process
Follow these steps IN ORDER. Do not skip any step.
### Step 1: Scan
Read the source folder recursively. For each file, note:
- Filename + extension
- Size
- First ~200 lines of content (for text files)
- Binary vs text detection
Skip files >50 MB (flag them for manual review).
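The scan can be sketched in Python as follows. This is a minimal illustration, not the command's actual implementation: the `ScannedFile` record, the function names, and the NUL-byte heuristic for binary detection are assumptions. Only the 200-line preview and the 50 MB limit come from the step above.

```python
from dataclasses import dataclass, field
from pathlib import Path

MAX_SIZE_BYTES = 50 * 1024 * 1024  # files above 50 MB are skipped and flagged
PREVIEW_LINES = 200                # roughly the first 200 lines of text files


@dataclass
class ScannedFile:
    path: Path
    size: int
    is_binary: bool = False
    too_large: bool = False
    preview: list = field(default_factory=list)


def looks_binary(path: Path, sample_size: int = 8192) -> bool:
    """Assumed heuristic: a NUL byte in the first few KB means binary."""
    with path.open("rb") as handle:
        return b"\x00" in handle.read(sample_size)


def scan_folder(root: Path) -> list:
    """Walk root recursively and record size, type, and a text preview."""
    results = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        entry = ScannedFile(path=path, size=path.stat().st_size)
        if entry.size > MAX_SIZE_BYTES:
            entry.too_large = True  # flag for manual review, content not read
        elif looks_binary(path):
            entry.is_binary = True
        else:
            with path.open("r", encoding="utf-8", errors="replace") as handle:
                for line_number, line in enumerate(handle):
                    if line_number >= PREVIEW_LINES:
                        break
                    entry.preview.append(line.rstrip("\n"))
        results.append(entry)
    return results
```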
### Step 2: Classify
For each file, determine its category based on content analysis:
| Category | Signals | Destination |
|---|---|---|
| **Session log** | Conversation transcript, dated entries, "accomplished", "session" | `session-logs/` or `projects/*/session-logs/` or `clients/*/session-logs/` |
| **Client work** | Client name mentioned, ticket/case references, client-specific infra | `clients/<client>/` |
| **Project code** | Source code, configs, build files, READMEs | `projects/<project>/` |
| **Credentials** | Passwords, API keys, tokens, connection strings, SSH keys | `D:\vault\` (SOPS encrypted) |
| **Infrastructure docs** | Server configs, network diagrams, IP lists, runbooks | `credentials.md` update or memory entry |
| **Tool/script** | Standalone utility, automation script, helper | `tools/` or `projects/msp-tools/` |
| **Documentation** | Guides, how-tos, notes, procedures | Project-specific docs or root docs |
| **Unknown** | Can't classify | Flag for user decision |
If `--client` or `--project` was specified, weight classification toward that target.
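As a rough illustration of signal-based classification with a hint, the sketch below counts keyword matches per category and adds a bonus to the hinted category. Everything here is an assumption made for the example: the `SIGNALS` patterns, the `HINT_BONUS` weight, and the `classify` function. In practice the classification is done by reading the content, not by keyword counting.

```python
import re
from typing import Optional

# Hypothetical keyword signals, loosely based on the table above.
SIGNALS = {
    "credentials": [r"password", r"api[_ -]?key", r"-----BEGIN"],
    "session-log": [r"\bsession\b", r"\baccomplished\b", r"\b\d{4}-\d{2}-\d{2}\b"],
    "client-work": [r"\bticket\b", r"\bcase\s+#?\d+"],
    "project-code": [r"^\s*(?:fn|def|class|import|use)\s", r"^\[package\]"],
    "documentation": [r"^#{1,3} ", r"\bhow[- ]to\b", r"\bprocedure\b"],
}
HINT_BONUS = 2  # assumed weight given to the --client / --project hint


def classify(text: str, hint: Optional[str] = None) -> str:
    """Return the best-scoring category, or 'unknown' if nothing matched."""
    flags = re.IGNORECASE | re.MULTILINE
    scores = {
        category: sum(len(re.findall(p, text, flags)) for p in patterns)
        for category, patterns in SIGNALS.items()
    }
    if hint in scores:
        scores[hint] += HINT_BONUS  # weight classification toward the hint
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"
```

With this weighting, a file that shows no signals at all falls into the hinted category instead of "unknown", which is one possible reading of "weight classification toward that target".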
### Step 3: Credential extraction
Before placing ANY file, scan for sensitive data:
- Passwords (inline, in configs, in notes)
- API keys / tokens (any string matching `[A-Za-z0-9_\-]{20,}` near words like key/token/secret)
- Connection strings (jdbc:, postgres://, mysql://, mongodb://)
- SSH private keys (`-----BEGIN`)
- Certificate private keys
For each credential found:
1. Show the user: "Found credential in `<file>`: `<context>` — move to vault?"
2. If approved: create a SOPS-encrypted vault entry and replace the inline value with a vault reference
3. If declined: leave the value as-is, but warn that the credential will remain in plaintext
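A line-by-line scanner for the patterns listed above might look like the sketch below. The `PATTERNS` table and the `find_credentials` function are illustrative assumptions. The keyword pattern is deliberately loose, so it will produce false positives, which is acceptable here because every finding is shown to the user before anything is changed.

```python
import re

# Assumed detection patterns, derived from the bullet list above.
PATTERNS = {
    # key/token/secret followed on the same line by a 20+ character token
    "api-key-or-token": re.compile(
        r"(?i)(?:key|token|secret).{0,40}?[A-Za-z0-9_\-]{20,}"
    ),
    "connection-string": re.compile(
        r"(?i)\b(?:jdbc:|postgres://|mysql://|mongodb://)\S+"
    ),
    # PEM header of SSH or certificate private keys
    "private-key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "password": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def find_credentials(text: str) -> list:
    """Return (line_number, kind) pairs for every suspicious line."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, kind))
    return findings
```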
### Step 4: Present plan
Show a table:
```
SOURCE                     DESTINATION                              ACTION
─────────────────────────────────────────────────────────────────────────────────────────
notes/client-acme.md     → clients/acme/notes.md                    copy
scripts/backup-check.ps1 → tools/backup-check.ps1                   copy
creds.txt                → D:\vault\clients\acme.sops.yaml          vault + delete source
session-2026-04-10.md    → clients/acme/session-logs/2026-04-10.md  copy
my-tool/src/main.rs      → projects/msp-tools/howard-tools/src/     copy (new project)
random-binary.exe        → (SKIP - 85 MB, too large)                flag
unknown-doc.pdf          → (UNKNOWN - needs your input)             ask
```
Ask: "Does this plan look right? I can adjust any placement before executing."
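If the plan is held as structured data, rendering it as an aligned table is straightforward. The `PlanEntry` record and the `render_plan` function below are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class PlanEntry:
    source: str
    destination: str
    action: str


def render_plan(entries: list) -> str:
    """Format plan entries as a SOURCE / DESTINATION / ACTION table."""
    src_w = max([len("SOURCE")] + [len(e.source) for e in entries])
    dst_w = max([len("DESTINATION")] + [len(e.destination) for e in entries])
    # Three spaces in the header stand in for the " → " separator in the rows.
    header = f"{'SOURCE':<{src_w}}   {'DESTINATION':<{dst_w}}  ACTION"
    lines = [header, "─" * len(header)]
    for e in entries:
        lines.append(f"{e.source:<{src_w}} → {e.destination:<{dst_w}}  {e.action}")
    return "\n".join(lines)
```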
### Step 5: Execute
After approval:
1. Create destination directories as needed
2. Copy files to destinations (never move them; the source folder is the user's data)
3. Encrypt credential files via SOPS
4. Update `MEMORY.md` if new knowledge was gained
5. Update project `CONTEXT.md` files if project state changed
6. Update `credentials.md` if infrastructure details were discovered
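The copy and encryption steps can be sketched as below. Both function names are hypothetical. The `encrypt_in_place` helper assumes that the `sops` CLI is installed and on `PATH`, and that a `.sops.yaml` creation rule (or equivalent key configuration) already covers the vault file; neither assumption is confirmed by this document.

```python
import shutil
import subprocess
from pathlib import Path


def copy_into_place(source: Path, destination: Path) -> Path:
    """Copy one file, creating parent directories; the source is left untouched."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copy2 also preserves timestamps
    return destination


def encrypt_in_place(vault_file: Path) -> None:
    """Assumption: sops is on PATH and its key configuration covers vault_file."""
    subprocess.run(
        ["sops", "--encrypt", "--in-place", str(vault_file)], check=True
    )
```

Using a copy instead of a move keeps the source folder intact, so a wrong placement can be corrected by re-running the import.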
### Step 6: Report
Write a summary showing:
- Files imported: N
- Credentials vaulted: N
- New directories created: list
- Skipped files: list with reasons
- Suggested follow-ups (e.g., "review clients/acme/ for completeness")
Commit the imported files with message: `import: ingested <N> files from <source_path>`
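One way to keep the report and the commit message consistent is to derive both from the same record. The `ImportReport` class and its field names are assumptions for this sketch; only the summary items and the commit message format are taken from the step above.

```python
from dataclasses import dataclass, field


@dataclass
class ImportReport:
    source_path: str
    imported: list = field(default_factory=list)
    credentials_vaulted: int = 0
    new_directories: list = field(default_factory=list)
    skipped: dict = field(default_factory=dict)  # path -> reason
    follow_ups: list = field(default_factory=list)

    def commit_message(self) -> str:
        return f"import: ingested {len(self.imported)} files from {self.source_path}"

    def summary(self) -> str:
        lines = [
            f"Files imported: {len(self.imported)}",
            f"Credentials vaulted: {self.credentials_vaulted}",
            "New directories created: " + (", ".join(self.new_directories) or "none"),
        ]
        lines += [f"Skipped: {path} ({reason})" for path, reason in self.skipped.items()]
        lines += [f"Follow-up: {item}" for item in self.follow_ups]
        return "\n".join(lines)
```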
## Special cases
### Claude Code session data (~/.claude/projects/)
If the source folder IS a Claude Code projects directory (contains `.jsonl` files):
- Use `tools/import-sessions.py` to extract summaries first
- Then apply the standard classification to the summaries
- Don't import raw JSONL (too large, mostly noise)
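Detecting this case and keeping raw JSONL out of the import can be sketched as below. Both helper names are hypothetical, and the sketch deliberately does not call `tools/import-sessions.py`, whose command-line interface is not described here.

```python
from pathlib import Path


def is_claude_projects_dir(root: Path) -> bool:
    """True if any .jsonl file exists under root (the signal named above)."""
    return any(root.rglob("*.jsonl"))


def files_to_classify(root: Path) -> list:
    """All files under root except raw .jsonl transcripts."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() != ".jsonl"
    )
```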
### Existing project detection
If imported code has a `Cargo.toml`, `package.json`, `pyproject.toml`, or similar:
- Detect the project name from the manifest
- Check if it already exists under `projects/`
- If new: propose creating a new project directory
- If existing: propose merging into the existing project
### Duplicate detection
Before copying, check if a file with the same name already exists at the destination:
- If content is identical: skip (report as "already present")
- If content differs: ask user which version to keep, or keep both with suffix
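Content comparison is usually done by hashing. The sketch below is one possible approach: the function names, the returned labels (`copy`, `skip`, `ask`), and the `-1`, `-2` suffix scheme are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resolve_duplicate(source: Path, destination: Path) -> str:
    """Return 'copy' (no conflict), 'skip' (identical), or 'ask' (differs)."""
    if not destination.exists():
        return "copy"
    same_size = source.stat().st_size == destination.stat().st_size
    if same_size and sha256_of(source) == sha256_of(destination):
        return "skip"  # report as "already present"
    return "ask"


def suffixed_path(destination: Path) -> Path:
    """First free name of the form stem-N.ext, used when keeping both versions."""
    counter = 1
    while True:
        candidate = destination.with_name(
            f"{destination.stem}-{counter}{destination.suffix}"
        )
        if not candidate.exists():
            return candidate
        counter += 1
```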
## File placement rules
Follow the conventions in `.claude/FILE_PLACEMENT_GUIDE.md`. Key rules:
- Dataforth work → `projects/dataforth-dos/`
- GuruRMM work → `projects/msp-tools/guru-rmm/`
- Client work → `clients/<client-name>/`
- General session logs → `session-logs/`
- Credentials → SOPS vault at `D:\vault\`, NEVER in plaintext in the repo