claudetools/docs/api/credentials/CREDENTIAL_SCANNER_GUIDE.md

Credential Scanner and Importer Guide

Module: api/utils/credential_scanner.py
Purpose: Scan for credential files and import them into the ClaudeTools credential vault with automatic encryption
Status: Production Ready


Overview

The Credential Scanner and Importer discovers credential files and imports their contents into the ClaudeTools database. Every credential is encrypted with AES-256-GCM before storage, and each import writes an audit log entry for compliance review.

Key Features

  • Multi-format support: Markdown, .env, text files
  • Automatic encryption: Uses existing credential_service for AES-256-GCM encryption
  • Type detection: Auto-detects API keys, passwords, connection strings, tokens
  • Audit logging: Every import operation is logged with full traceability
  • Client association: Optional linking to specific clients
  • Safe parsing: Never logs plaintext credential values

Supported File Formats

1. Markdown Files (.md)

Structured format using headers and key-value pairs:

## Gitea Admin
Username: admin
Password: SecurePass123!
URL: https://git.example.com
Notes: Main admin account

## Database Server
Type: connection_string
Connection String: mysql://dbuser:dbpass@192.168.1.50:3306/mydb
Notes: Production database

## OpenAI API
API Key: sk-1234567890abcdefghijklmnopqrstuvwxyz
Notes: Production API key

Recognized keys:

  • Username, User, Login → username field
  • Password, Pass, Pwd → password field
  • API Key, API_Key, ApiKey, Key → api_key field
  • Token, Access Token, Bearer → token field
  • Client Secret, Secret → client_secret field
  • Connection String, Conn_Str → connection_string field
  • URL, Host, Server, Address → url (auto-detects internal/external)
  • Port → custom_port field
  • Notes, Description → notes field
  • Type, Credential_Type → credential_type field
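The header-plus-key-value format and the alias list above can be sketched as a small parser. This is a minimal illustration and not the code in credential_scanner.py: the names KEY_ALIASES and parse_sections are hypothetical, key matching is assumed to be case-insensitive, and every value is left as a string (the real module may convert Port to an integer and split URL into internal/external fields).

```python
from typing import Dict, List

# Alias table copied from the "Recognized keys" list; keys are lowercased.
KEY_ALIASES: Dict[str, str] = {
    "username": "username", "user": "username", "login": "username",
    "password": "password", "pass": "password", "pwd": "password",
    "api key": "api_key", "api_key": "api_key", "apikey": "api_key", "key": "api_key",
    "token": "token", "access token": "token", "bearer": "token",
    "client secret": "client_secret", "secret": "client_secret",
    "connection string": "connection_string", "conn_str": "connection_string",
    "url": "url", "host": "url", "server": "url", "address": "url",
    "port": "custom_port",
    "notes": "notes", "description": "notes",
    "type": "credential_type", "credential_type": "credential_type",
}


def parse_sections(text: str) -> List[Dict[str, str]]:
    """Split text into '## Name' sections and map 'Key: value' lines to fields."""
    credentials: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            # A new header closes the previous section, if any.
            if current:
                credentials.append(current)
            current = {"service_name": line[3:].strip()}
        elif ":" in line and current:
            # Split on the first colon only, so URLs keep their "://".
            key, _, value = line.partition(":")
            field = KEY_ALIASES.get(key.strip().lower())
            if field:
                current[field] = value.strip()
    if current:
        credentials.append(current)
    return credentials
```

Lines that appear before the first "## " header, and keys that are not in the alias table, are silently skipped in this sketch.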

2. Environment Files (.env)

Standard environment variable format:

# Database Configuration
DATABASE_URL=mysql://user:pass@host:3306/db

# API Keys
OPENAI_API_KEY=sk-1234567890abcdefghij
GITHUB_TOKEN=ghp_abc123def456ghi789

# Secrets
SECRET_KEY=super_secret_key_12345

Behavior:

  • Each KEY=value pair creates a separate credential
  • Service name derived from KEY (e.g., DATABASE_URL → "Database Url")
  • Credential type auto-detected from value pattern
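As a rough sketch of the behavior listed above, the service name can be derived by replacing underscores with spaces and title-casing the result, which reproduces the DATABASE_URL → "Database Url" example. The function names below are hypothetical, and the handling of comments, blank lines, and surrounding quotes is an assumption rather than documented scanner behavior.

```python
from typing import Dict, List


def service_name_from_key(key: str) -> str:
    """Turn an env-style key such as DATABASE_URL into 'Database Url'."""
    return key.replace("_", " ").title()


def parse_env(text: str) -> List[Dict[str, str]]:
    """Return one {'service_name', 'value'} dict per KEY=value line."""
    entries: List[Dict[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only; values may contain "=" themselves.
        key, _, value = line.partition("=")
        value = value.strip().strip('"').strip("'")  # assumed quote handling
        entries.append({
            "service_name": service_name_from_key(key.strip()),
            "value": value,
        })
    return entries
```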

3. Text Files (.txt)

Same format as Markdown, but uses .txt extension:

# Server Passwords

## Web Server
Username: webadmin
Password: Web@dmin2024!
Host: 192.168.1.100
Port: 22

## Backup Server
Username: backup
Password: BackupSecure789
Host: 10.0.0.50

Credential Type Detection

The scanner automatically detects credential types based on value patterns:

Pattern                         | Detected Type     | Field
--------------------------------|-------------------|------------------
sk-* (20+ chars)                | api_key           | api_key
api_* (20+ chars)               | api_key           | api_key
ghp_* (36 chars)                | api_key           | api_key
gho_* (36 chars)                | api_key           | api_key
xoxb-*                          | api_key           | api_key
-----BEGIN * PRIVATE KEY-----   | ssh_key           | password
mysql://...                     | connection_string | connection_string
postgresql://...                | connection_string | connection_string
Server=...;Database=...         | connection_string | connection_string
JWT (3 parts, 50+ chars)        | jwt               | token
ya29.*, ey*, oauth*             | oauth             | token
Default                         | password          | password
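The table can be read as an ordered rule list, where the first matching rule wins. The sketch below is one possible reading, not the actual _detect_credential_type() implementation. It assumes "20+ chars" means total length, that "36 chars" for ghp_/gho_ means 36 alphanumerics after the prefix, and that the JWT rule is checked before the broader ey* OAuth rule so that JWTs are not swallowed by it.

```python
import re
from typing import Tuple


def detect_credential_type(value: str) -> Tuple[str, str]:
    """Return (credential_type, field) for a raw value, first match wins."""
    v = value.strip()
    # API keys: prefix plus a minimum overall length (assumed interpretation).
    if v.startswith(("sk-", "api_")) and len(v) >= 20:
        return "api_key", "api_key"
    if re.match(r"^gh[po]_[A-Za-z0-9]{36}$", v) or v.startswith("xoxb-"):
        return "api_key", "api_key"
    # PEM private keys are typed ssh_key but stored in the password field.
    if re.search(r"-----BEGIN [A-Z ]*PRIVATE KEY-----", v):
        return "ssh_key", "password"
    if v.startswith(("mysql://", "postgresql://")) or re.search(r"Server=.+;Database=.+", v):
        return "connection_string", "connection_string"
    # JWT: three non-empty dot-separated parts, 50+ characters overall.
    parts = v.split(".")
    if len(v) >= 50 and len(parts) == 3 and all(parts):
        return "jwt", "token"
    if v.startswith(("ya29.", "ey", "oauth")):
        return "oauth", "token"
    return "password", "password"
```

Because ordinary passwords can also start with "ey" or "oauth", an explicit Type: field in the source file is the more reliable option when the detected type matters.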

API Reference

Function 1: scan_for_credential_files(base_path: str)

Find all credential files in a directory tree.

Parameters:

  • base_path (str): Root directory to search from

Returns:

  • List[str]: Absolute paths to credential files found

Scanned file names:

  • credentials.md, credentials.txt
  • passwords.md, passwords.txt
  • secrets.md, secrets.txt
  • auth.md, auth.txt
  • .env, .env.local, .env.production, .env.development, .env.staging

Excluded directories:

  • .git, .svn, node_modules, venv, __pycache__, .venv, dist, build
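A directory scan with these two lists can be sketched with os.walk, pruning excluded directories in place so they are never entered. The function and constant names are hypothetical, and the case-insensitive file name match is an assumption; the real scan_for_credential_files() may differ in both.

```python
import os
from typing import List

CREDENTIAL_FILE_NAMES = {
    "credentials.md", "credentials.txt", "passwords.md", "passwords.txt",
    "secrets.md", "secrets.txt", "auth.md", "auth.txt",
    ".env", ".env.local", ".env.production", ".env.development", ".env.staging",
}
EXCLUDED_DIRS = {".git", ".svn", "node_modules", "venv", "__pycache__", ".venv", "dist", "build"}


def find_credential_files(base_path: str) -> List[str]:
    """Walk base_path, skip excluded directories, return absolute matching paths."""
    matches: List[str] = []
    for root, dirs, files in os.walk(base_path):
        # Prune in place so os.walk never descends into excluded directories.
        dirs[:] = [d for d in dirs if d not in EXCLUDED_DIRS]
        for name in files:
            if name.lower() in CREDENTIAL_FILE_NAMES:
                matches.append(os.path.abspath(os.path.join(root, name)))
    return sorted(matches)
```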

Example:

from api.utils.credential_scanner import scan_for_credential_files

files = scan_for_credential_files("C:/Projects/ClientA")
# Returns: ["C:/Projects/ClientA/credentials.md", "C:/Projects/ClientA/.env"]

Function 2: parse_credential_file(file_path: str)

Extract credentials from a file and return structured data.

Parameters:

  • file_path (str): Absolute path to credential file

Returns:

  • List[Dict]: List of credential dictionaries

Credential Dictionary Format:

{
    "service_name": "Gitea Admin",
    "credential_type": "password",
    "username": "admin",
    "password": "SecurePass123!",  # or api_key, token, etc.
    "internal_url": "192.168.1.100",
    "custom_port": 3000,
    "notes": "Main admin account"
}
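The internal_url field in the dictionary above comes from the URL/Host/Server/Address keys, which the guide says are auto-detected as internal or external. The guide does not describe how that detection works; one plausible sketch uses the standard ipaddress module to treat private and loopback addresses as internal. The function name, the external_url field name, and the handling of host names are all assumptions.

```python
import ipaddress
from urllib.parse import urlparse


def classify_url(value: str) -> str:
    """Return 'internal_url' for private/loopback hosts, else 'external_url'."""
    # Accept bare hosts ("10.0.0.50", "10.0.0.50:22") as well as full URLs.
    host = urlparse(value if "://" in value else f"//{value}").hostname or ""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: only obviously local names count as internal here.
        internal = host == "localhost" or host.endswith(".local")
        return "internal_url" if internal else "external_url"
    return "internal_url" if (ip.is_private or ip.is_loopback) else "external_url"
```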

Example:

from api.utils.credential_scanner import parse_credential_file

creds = parse_credential_file("C:/Projects/credentials.md")
for cred in creds:
    print(f"Service: {cred['service_name']}")
    print(f"Type: {cred['credential_type']}")

Function 3: import_credentials_to_db(db, credentials, client_id=None, user_id="system_import", ip_address=None)

Import credentials into the database with automatic encryption.

Parameters:

  • db (Session): SQLAlchemy database session
  • credentials (List[Dict]): List of credential dictionaries from parse_credential_file()
  • client_id (Optional[str]): UUID string to associate credentials with a client
  • user_id (str): User ID for audit logging (default: "system_import")
  • ip_address (Optional[str]): IP address for audit logging

Returns:

  • int: Count of successfully imported credentials

Security:

  • All sensitive fields automatically encrypted using AES-256-GCM
  • Audit log entry created for each import (action: "create")
  • Never logs plaintext credential values
  • Uses existing credential_service encryption infrastructure

Example:

from api.database import SessionLocal
from api.utils.credential_scanner import parse_credential_file, import_credentials_to_db

db = SessionLocal()
try:
    creds = parse_credential_file("C:/Projects/credentials.md")
    count = import_credentials_to_db(
        db=db,
        credentials=creds,
        client_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890",
        user_id="mike@example.com",
        ip_address="192.168.1.100"
    )
    print(f"Imported {count} credentials")
finally:
    db.close()

Function 4: scan_and_import_credentials(base_path, db, client_id=None, user_id="system_import", ip_address=None)

Scan for credential files and import all found credentials in one operation.

Parameters:

  • base_path (str): Root directory to scan
  • db (Session): Database session
  • client_id (Optional[str]): Client UUID to associate credentials with
  • user_id (str): User ID for audit logging
  • ip_address (Optional[str]): IP address for audit logging

Returns:

  • Dict[str, int]: Summary statistics
    • files_found: Number of credential files found
    • credentials_parsed: Total credentials parsed from all files
    • credentials_imported: Number successfully imported to database

Example:

from api.database import SessionLocal
from api.utils.credential_scanner import scan_and_import_credentials

db = SessionLocal()
try:
    results = scan_and_import_credentials(
        base_path="C:/Projects/ClientA",
        db=db,
        client_id="client-uuid-here",
        user_id="mike@example.com"
    )

    print(f"Files found: {results['files_found']}")
    print(f"Credentials parsed: {results['credentials_parsed']}")
    print(f"Credentials imported: {results['credentials_imported']}")
finally:
    db.close()

Usage Examples

Example 1: Quick Import

from api.database import SessionLocal
from api.utils.credential_scanner import scan_and_import_credentials

db = SessionLocal()
try:
    results = scan_and_import_credentials(
        "C:/Projects/ClientProject",
        db,
        client_id="your-client-uuid"
    )
    print(f"Imported {results['credentials_imported']} credentials")
finally:
    db.close()

Example 2: Preview Before Import

from api.utils.credential_scanner import scan_for_credential_files, parse_credential_file

# Find files
files = scan_for_credential_files("C:/Projects/ClientProject")
print(f"Found {len(files)} files")

# Preview credentials
for file_path in files:
    creds = parse_credential_file(file_path)
    print(f"\n{file_path}:")
    for cred in creds:
        print(f"  - {cred['service_name']} ({cred['credential_type']})")

Example 3: Manual Import with Error Handling

from api.database import SessionLocal
from api.utils.credential_scanner import (
    scan_for_credential_files,
    parse_credential_file,
    import_credentials_to_db
)

db = SessionLocal()
try:
    # Scan
    files = scan_for_credential_files("C:/Projects/ClientProject")

    # Parse and import each file separately
    for file_path in files:
        try:
            creds = parse_credential_file(file_path)
            count = import_credentials_to_db(db, creds, client_id="uuid-here")
            print(f"✓ Imported {count} from {file_path}")
        except Exception as e:
            print(f"✗ Failed to import {file_path}: {e}")
            continue

except Exception as e:
    print(f"Error: {e}")
finally:
    db.close()

Example 4: Command-Line Import Tool

See example_credential_import.py:

# Preview without importing
python example_credential_import.py /path/to/project --preview

# Import with client association
python example_credential_import.py /path/to/project --client-id "uuid-here"

Testing

Run the test suite:

python test_credential_scanner.py

Tests included:

  1. Scan for credential files
  2. Parse credential files (all formats)
  3. Import credentials to database
  4. Full workflow (scan + parse + import)
  5. Markdown format variations

Security Considerations

Encryption

All credentials are encrypted before storage:

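  • Algorithm: AES-256-GCM, as stated elsewhere in this guide (not Fernet, which is AES-128-CBC with HMAC-SHA256; confirm against api/utils/crypto.py)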
  • Key management: Stored in environment variable ENCRYPTION_KEY
  • Per-field encryption: password, api_key, client_secret, token, connection_string

Audit Trail

Every import operation creates audit log entries:

  • Action: "create"
  • User ID: From function parameter
  • IP address: From function parameter
  • Timestamp: Auto-generated
  • Details: Service name, credential type

Logging Safety

  • Plaintext credentials are NEVER logged
  • File paths and counts are logged
  • Service names (non-sensitive) are logged
  • Errors are logged without credential values
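When extending the scanner or writing wrapper scripts, the same rule can be followed by masking sensitive fields before anything reaches a log call. This is a minimal sketch; redact is a hypothetical helper, and the field set is taken from the per-field encryption list in this guide.

```python
from typing import Any, Dict

# Fields that the guide lists as encrypted; treat them as never loggable.
SENSITIVE_FIELDS = {"password", "api_key", "client_secret", "token", "connection_string"}


def redact(cred: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy that is safe to log: sensitive values are masked."""
    return {k: ("***" if k in SENSITIVE_FIELDS else v) for k, v in cred.items()}
```

For example, logging redact(cred) instead of cred keeps service_name and credential_type visible while hiding the secret itself.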

Best Practices

  1. Delete source files after successful import
  2. Verify imports using the API or database queries
  3. Use client_id to associate credentials with clients
  4. Review audit logs regularly for compliance
  5. Rotate credentials after initial import if they were stored in plaintext

Integration with ClaudeTools

Credential Service

The scanner uses api/services/credential_service.py for all database operations:

  • create_credential() - Handles encryption and audit logging
  • Automatic validation via Pydantic schemas
  • Foreign key enforcement (client_id, service_id, infrastructure_id)

Database Schema

Credentials are stored in the credentials table:

  • id - UUID primary key
  • service_name - Display name
  • credential_type - Type (password, api_key, etc.)
  • username - Username (optional)
  • password_encrypted - AES-256-GCM encrypted password
  • api_key_encrypted - Encrypted API key
  • token_encrypted - Encrypted token
  • connection_string_encrypted - Encrypted connection string
  • Plus 20+ other fields for metadata

Audit Logging

Audit logs stored in credential_audit_log table:

  • credential_id - Reference to credential
  • action - "create", "view", "update", "delete", "decrypt"
  • user_id - User performing action
  • ip_address - Source IP
  • timestamp - When action occurred
  • details - JSON metadata

Troubleshooting

No files found

Problem: scan_for_credential_files() returns empty list

Solutions:

  • Verify the base path exists and is a directory
  • Check file names match expected patterns (credentials.md, .env, etc.)
  • Ensure files are not in excluded directories (node_modules, .git, etc.)

Parsing errors

Problem: parse_credential_file() returns empty list

Solutions:

  • Verify file format matches expected structure (headers, key-value pairs)
  • Check for encoding issues (must be UTF-8)
  • Ensure key names are recognized (see "Recognized keys" section)
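The encoding check can be done quickly before digging into the file format. The helper below is a hypothetical diagnostic, not part of the scanner: it returns the file's text if it decodes as UTF-8 and None otherwise.

```python
from typing import Optional


def read_utf8_or_none(path: str) -> Optional[str]:
    """Return the file's text if it is valid UTF-8, otherwise None."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except UnicodeDecodeError:
        # Typical cause: the file was saved as UTF-16 or a legacy code page.
        return None
```

If this returns None, re-save the file as UTF-8 in an editor and parse it again.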

Import failures

Problem: import_credentials_to_db() fails or imports fewer credentials than were parsed

Solutions:

  • Check database connection is active
  • Verify client_id exists if provided (foreign key constraint)
  • Check encryption key is configured (ENCRYPTION_KEY environment variable)
  • Review logs for specific validation errors

Type detection issues

Problem: Credentials imported with wrong type

Solutions:

  • Manually specify Type: field in credential file
  • Update detection patterns in _detect_credential_type()
  • Use explicit field names (e.g., "API Key:" instead of "Key:")

Extending the Scanner

Add New File Format

from typing import Dict, List


def _parse_custom_format(content: str) -> List[Dict]:
    """Parse credentials from custom format."""
    credentials = []

    # Your parsing logic here: append one dict per credential found

    return credentials

# Then add a branch to the extension dispatch in parse_credential_file():
#     elif file_ext == '.custom':
#         credentials = _parse_custom_format(content)

Add New Credential Type Pattern

# Add to API_KEY_PATTERNS, SSH_KEY_PATTERN, or CONNECTION_STRING_PATTERNS
API_KEY_PATTERNS.append(r"^custom_[a-zA-Z0-9]{20,}")

# Or add detection logic to _detect_credential_type()

Add Custom Field Mapping

# In _parse_markdown_credentials(), add mapping:
elif key in ['custom_field', 'alt_name']:
    current_cred['custom_field'] = value

Production Deployment

Environment Setup

# Required environment variable
export ENCRYPTION_KEY="64-character-hex-string"

# Generate new key:
python -c "from api.utils.crypto import generate_encryption_key; print(generate_encryption_key())"
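Before starting the API it can help to confirm that ENCRYPTION_KEY really is a 64-character hex string (32 bytes). The sketch below uses only the standard library; new_hex_key is a stand-in for illustration and is not claimed to match what generate_encryption_key() in api/utils/crypto.py does internally.

```python
import secrets
import string


def new_hex_key() -> str:
    """Generate 32 random bytes, rendered as 64 hex characters."""
    return secrets.token_hex(32)


def is_valid_key(value: object) -> bool:
    """Check that value is a 64-character hexadecimal string."""
    return (
        isinstance(value, str)
        and len(value) == 64
        and all(c in string.hexdigits for c in value)
    )
```

A deployment script could call is_valid_key(os.environ.get("ENCRYPTION_KEY")) and stop early if it returns False.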

Import Workflow

  1. Scan client project directories
  2. Preview credentials before import
  3. Import with client association
  4. Verify import success via API
  5. Delete source credential files
  6. Rotate credentials if needed
  7. Document import in client notes

Automation Example

# Automated import script for all clients
import os

from api.database import SessionLocal
from api.models.client import Client
from api.utils.credential_scanner import scan_and_import_credentials

db = SessionLocal()
try:
    clients = db.query(Client).all()

    for client in clients:
        project_path = f"C:/Projects/{client.name}"
        if os.path.exists(project_path):
            results = scan_and_import_credentials(
                project_path,
                db,
                client_id=str(client.id)
            )
            print(f"{client.name}: {results['credentials_imported']} imported")
finally:
    db.close()

Related Documentation

  • API Specification: .claude/API_SPEC.md
  • Credential Schema: .claude/SCHEMA_CREDENTIALS.md
  • Credential Service: api/services/credential_service.py
  • Encryption Utils: api/utils/crypto.py
  • Database Models: api/models/credential.py

Last Updated: 2026-01-16
Version: 1.0
Author: ClaudeTools Development Team