Conversation Parser Usage Guide

Complete guide for using the ClaudeTools conversation transcript parser and intelligent categorizer.

Overview

The conversation parser extracts, analyzes, and categorizes conversation data from Claude Desktop/Code sessions. It intelligently classifies conversations as MSP Work, Development, or General and compresses them for efficient database storage.

Main Functions

1. parse_jsonl_conversation(file_path: str)

Parse conversation files (.jsonl or .json) and extract structured data.

Returns:

{
    "messages": [{"role": str, "content": str, "timestamp": str}, ...],
    "metadata": {"title": str, "model": str, "created_at": str, ...},
    "file_paths": [str, ...],           # Auto-extracted from content
    "tool_calls": [{"tool": str, "count": int}, ...],
    "duration_seconds": int,
    "message_count": int
}

Example:

from api.utils.conversation_parser import parse_jsonl_conversation

conversation = parse_jsonl_conversation("/path/to/conversation.jsonl")
print(f"Found {conversation['message_count']} messages")
print(f"Duration: {conversation['duration_seconds']} seconds")

2. categorize_conversation(messages: List[Dict])

Intelligently categorize conversation content using weighted keyword analysis.

Returns: "msp", "development", or "general"

Categorization Logic:

MSP Keywords (higher weight = stronger signal):

  • Client/Infrastructure: client, customer, site, firewall, network, server
  • Services: support, ticket, incident, billable, invoice
  • Microsoft 365: office365, azure, exchange, sharepoint, teams
  • MSP-specific: managed service, service desk, RDS, terminal server

Development Keywords:

  • API/Backend: api, endpoint, fastapi, flask, rest, webhook
  • Database: database, migration, alembic, sqlalchemy, postgresql
  • Code: implement, refactor, debug, test, pytest, function, class
  • Tools: docker, kubernetes, ci/cd, deployment
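
The gist of the weighting can be sketched as follows; the keyword sets, weights, and threshold here are illustrative assumptions, not the module's actual tables:

# Hedged sketch of weighted keyword scoring; weights and threshold are
# assumptions, not the values used in conversation_parser.py
MSP_KEYWORDS = {"client": 3, "site": 2, "firewall": 3, "ticket": 3, "office365": 3}
DEV_KEYWORDS = {"api": 3, "fastapi": 4, "sqlalchemy": 3, "migration": 3, "pytest": 3}

def score_messages(messages, keywords):
    text = " ".join(m["content"].lower() for m in messages)
    return sum(weight for kw, weight in keywords.items() if kw in text)

def rough_categorize(messages, threshold=5):
    msp = score_messages(messages, MSP_KEYWORDS)
    dev = score_messages(messages, DEV_KEYWORDS)
    if msp < threshold and dev < threshold:
        return "general"
    return "msp" if msp >= dev else "development"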

Example:

from api.utils.conversation_parser import categorize_conversation

# MSP conversation
messages = [
    {"role": "user", "content": "Client firewall blocking Office365"},
    {"role": "assistant", "content": "Checking client site configuration"}
]
category = categorize_conversation(messages)  # Returns "msp"

# Development conversation
messages = [
    {"role": "user", "content": "Build FastAPI endpoint with PostgreSQL"},
    {"role": "assistant", "content": "Creating API using SQLAlchemy"}
]
category = categorize_conversation(messages)  # Returns "development"

3. extract_context_from_conversation(conversation: Dict)

Extract dense, compressed context suitable for database storage.

Returns:

{
    "category": str,                    # "msp", "development", or "general"
    "summary": Dict,                    # From compress_conversation_summary()
    "tags": List[str],                  # Auto-extracted technology/topic tags
    "decisions": List[Dict],            # Key decisions with rationale
    "key_files": List[str],            # Top 20 file paths mentioned
    "key_tools": List[str],            # Top 10 tools used
    "metrics": {
        "message_count": int,
        "duration_seconds": int,
        "file_count": int,
        "tool_count": int,
        "decision_count": int,
        "quality_score": float         # 0-10 quality rating
    },
    "raw_metadata": Dict               # Original metadata
}

Quality Score Calculation:

  • More messages = higher quality (up to 5 points)
  • Decisions indicate depth (up to 2 points)
  • File mentions indicate concrete work (up to 2 points)
  • Sessions >5 minutes (+1 point)
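
Putting those rules together, a minimal sketch of the scoring (the caps follow the list above; the per-item rates are assumptions, not the parser's actual values):

def estimate_quality_score(message_count, decision_count, file_count, duration_seconds):
    # Caps of 5/2/2/+1 follow the rules above; per-item rates are assumed
    score = min(5.0, message_count * 0.25)    # assumption: 4 messages per point
    score += min(2.0, decision_count * 0.5)   # assumption: 2 decisions per point
    score += min(2.0, file_count * 0.2)       # assumption: 5 files per point
    if duration_seconds > 300:                # session longer than 5 minutes
        score += 1.0
    return round(score, 1)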

Example:

from api.utils.conversation_parser import (
    parse_jsonl_conversation,
    extract_context_from_conversation
)

# Parse and extract context
conversation = parse_jsonl_conversation("/path/to/file.jsonl")
context = extract_context_from_conversation(conversation)

print(f"Category: {context['category']}")
print(f"Tags: {context['tags']}")
print(f"Quality: {context['metrics']['quality_score']}/10")
print(f"Decisions: {len(context['decisions'])}")

4. scan_folder_for_conversations(base_path: str)

Recursively find all conversation files in a directory.

Features:

  • Finds both .jsonl and .json files
  • Automatically skips config files (config.json, settings.json)
  • Skips common non-conversation files (package.json, tsconfig.json)
  • Cross-platform path handling

Returns: List of absolute file paths
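
In spirit, the scan behaves like this hedged sketch (the skip list and traversal details are assumptions, not the module's actual implementation):

from pathlib import Path

# Assumed skip list; the module's actual exclusions may differ
SKIP_NAMES = {"config.json", "settings.json", "package.json", "tsconfig.json"}

def rough_scan(base_path):
    return [
        str(path.resolve())
        for path in Path(base_path).rglob("*.json*")
        if path.suffix in {".json", ".jsonl"} and path.name not in SKIP_NAMES
    ]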

Example:

from api.utils.conversation_parser import scan_folder_for_conversations

# Scan Claude Code sessions
files = scan_folder_for_conversations(
    r"C:\Users\MikeSwanson\claude-projects"
)

print(f"Found {len(files)} conversation files")
for file in files[:5]:
    print(f"  - {file}")

Complete Workflow Example

Batch Process Conversation Folder

from api.utils.conversation_parser import (
    scan_folder_for_conversations,
    parse_jsonl_conversation,
    extract_context_from_conversation
)

# 1. Scan for conversation files
base_path = r"C:\Users\MikeSwanson\claude-projects"
files = scan_folder_for_conversations(base_path)

# 2. Process each conversation
contexts = []
for file_path in files:
    try:
        # Parse conversation
        conversation = parse_jsonl_conversation(file_path)

        # Extract context
        context = extract_context_from_conversation(conversation)

        # Add source file
        context["source_file"] = file_path

        contexts.append(context)

        print(f"Processed: {file_path}")
        print(f"  Category: {context['category']}")
        print(f"  Messages: {context['metrics']['message_count']}")
        print(f"  Quality: {context['metrics']['quality_score']}/10")

    except Exception as e:
        print(f"Error processing {file_path}: {e}")

# 3. Categorize by type
msp_contexts = [c for c in contexts if c['category'] == 'msp']
dev_contexts = [c for c in contexts if c['category'] == 'development']

print(f"\nSummary:")
print(f"  MSP conversations: {len(msp_contexts)}")
print(f"  Development conversations: {len(dev_contexts)}")

Using the Batch Helper Function

from api.utils.conversation_parser import batch_process_conversations

def progress_callback(file_path, context):
    """Called for each processed file"""
    print(f"Processed: {context['category']} - {context['metrics']['quality_score']}/10")

# Process all conversations with callback
contexts = batch_process_conversations(
    r"C:\Users\MikeSwanson\claude-projects",
    output_callback=progress_callback
)

print(f"Total processed: {len(contexts)}")

Integration with Database

Insert Context into Database

from sqlalchemy.orm import Session
from api.models import ContextSnippet
from api.utils.conversation_parser import (
    parse_jsonl_conversation,
    extract_context_from_conversation
)

def import_conversation_to_db(db: Session, file_path: str):
    """Import a conversation file into the database."""

    # 1. Parse and extract context
    conversation = parse_jsonl_conversation(file_path)
    context = extract_context_from_conversation(conversation)

    # 2. Create context snippet for summary
    summary_snippet = ContextSnippet(
        content=str(context['summary']),
        snippet_type="session_summary",
        tags=context['tags'],
        importance=min(10, int(context['metrics']['quality_score'])),
        metadata={
            "category": context['category'],
            "source_file": file_path,
            "message_count": context['metrics']['message_count'],
            "duration_seconds": context['metrics']['duration_seconds']
        }
    )
    db.add(summary_snippet)

    # 3. Create decision snippets
    for decision in context['decisions']:
        decision_snippet = ContextSnippet(
            content=f"{decision['decision']} - {decision['rationale']}",
            snippet_type="decision",
            tags=context['tags'][:5],
            importance=7 if decision['impact'] == 'high' else 5,
            metadata={
                "category": context['category'],
                "impact": decision['impact'],
                "source_file": file_path
            }
        )
        db.add(decision_snippet)

    db.commit()
    print(f"Imported conversation from {file_path}")

CLI Quick Test

The module includes a standalone CLI for quick testing:

# Test a specific conversation file
python api/utils/conversation_parser.py /path/to/conversation.jsonl

# Output:
# Conversation: Build authentication system
# Category: development
# Messages: 15
# Duration: 1200s (20m)
# Tags: development, fastapi, postgresql, auth, api
# Quality: 7.5/10
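
The entry point amounts to something like this sketch (argument handling and the exact output lines are assumptions):

# Hedged sketch of the module's __main__ block; details are assumed
if __name__ == "__main__":
    import sys

    conversation = parse_jsonl_conversation(sys.argv[1])
    context = extract_context_from_conversation(conversation)
    print(f"Category: {context['category']}")
    print(f"Messages: {context['metrics']['message_count']}")
    print(f"Duration: {context['metrics']['duration_seconds']}s")
    print(f"Quality: {context['metrics']['quality_score']}/10")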

Categorization Examples

MSP Conversation

User: Client at BGBuilders site reported VPN connection issues
Assistant: I'll check the firewall configuration and VPN settings for the client

Category: msp
Score Logic: client (3), site (2), vpn (2), firewall (3) = 10 points

Development Conversation

User: Build a FastAPI REST API with PostgreSQL and implement JWT authentication
Assistant: I'll create the API endpoints using SQLAlchemy ORM and add JWT token support

Category: development
Score Logic: fastapi (4), api (3), postgresql (3), jwt (auth tag), sqlalchemy (3) = 13+ points

General Conversation

User: What's the best way to organize my project files?
Assistant: I recommend organizing by feature rather than by file type

Category: general
Score Logic: no strong MSP or development keywords; both scores stay low

Advanced Features

File Path Extraction

Automatically extracts file paths from conversation content:

conversation = parse_jsonl_conversation("/path/to/file.jsonl")
print(conversation['file_paths'])
# ['api/auth.py', 'api/models.py', 'tests/test_auth.py']

Supports:

  • Windows absolute paths: C:\Users\...\file.py
  • Unix absolute paths: /home/user/file.py
  • Relative paths: ./api/file.py, ../utils/helper.py
  • Code paths: api/auth.py, src/models.py
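
Patterns along these lines could match the formats above; they are illustrative assumptions, not the parser's actual regexes:

import re

# Assumed patterns, one per supported format above
PATH_PATTERNS = [
    re.compile(r"[A-Za-z]:\\[\w\\.\- ]+\.\w+"),         # Windows absolute
    re.compile(r"/(?:[\w.\-]+/)+[\w.\-]+\.\w+"),        # Unix absolute
    re.compile(r"\.{1,2}/(?:[\w.\-]+/)*[\w.\-]+\.\w+"), # relative ./ and ../
    re.compile(r"\b(?:[\w\-]+/)+[\w\-]+\.\w+\b"),       # bare code paths
]

def find_paths(text):
    hits = []
    for pattern in PATH_PATTERNS:
        hits.extend(pattern.findall(text))
    return sorted(set(hits))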

Tool Call Tracking

Automatically tracks which tools were used:

conversation = parse_jsonl_conversation("/path/to/file.jsonl")
print(conversation['tool_calls'])
# [
#   {"tool": "write", "count": 5},
#   {"tool": "read", "count": 3},
#   {"tool": "bash", "count": 2}
# ]
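
A hedged sketch of such aggregation, assuming tool invocations have already been collected as a flat list of names:

from collections import Counter

def summarize_tool_calls(tool_names):
    # Fold raw invocations into {"tool", "count"} records, most-used first,
    # mirroring the output shape shown above
    return [
        {"tool": tool, "count": count}
        for tool, count in Counter(tool_names).most_common()
    ]

summarize_tool_calls(["write", "read", "write", "bash", "write", "read"])
# [{'tool': 'write', 'count': 3}, {'tool': 'read', 'count': 2}, {'tool': 'bash', 'count': 1}]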

Best Practices

  1. Use quality scores to filter: Only import high-quality conversations (score > 5.0)
  2. Batch process in chunks: Process large folders in batches to manage memory
  3. Add source file tracking: Always include source_file in context for traceability
  4. Validate before import: Check message_count > 0 before importing to database
  5. Use callbacks for progress: Implement progress callbacks for long-running batch jobs
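
For instance, combining practices 1 and 4, a hedged pre-import filter might look like:

# Thresholds follow the practices above; adjust to taste
usable = [
    c for c in contexts
    if c['metrics']['message_count'] > 0
    and c['metrics']['quality_score'] > 5.0
]
print(f"Keeping {len(usable)} of {len(contexts)} conversations")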

Error Handling

Wrap parsing in a small helper so empty or malformed files can be skipped cleanly:

from api.utils.conversation_parser import parse_jsonl_conversation

def safe_parse(file_path):
    """Parse a conversation file, returning None on any failure."""
    try:
        conversation = parse_jsonl_conversation(file_path)

        if conversation['message_count'] == 0:
            print("Warning: Empty conversation, skipping")
            return None

        # Process conversation...
        return conversation

    except FileNotFoundError:
        print(f"File not found: {file_path}")
    except ValueError as e:
        print(f"Invalid file format: {e}")
    except Exception as e:
        print(f"Unexpected error: {e}")
    return None

Related Files

  • context_compression.py: Provides the compression utilities used by the parser
  • test_conversation_parser.py: Comprehensive test suite with examples
  • Database models: api/models.py contains the ContextSnippet model used for storage

Future Enhancements

Potential improvements for future versions:

  1. Multi-language detection: Identify primary programming language
  2. Sentiment analysis: Detect problem-solving vs. exploratory conversations
  3. Entity extraction: Extract specific client names, project names, technologies
  4. Time-based patterns: Identify working hours, session patterns
  5. Conversation linking: Link related conversations by topic/project