
Reasoning Cache

Intelligent caching system for reasoning results with Redis persistence, compression, and cost optimization.

Features

  • Redis Persistence: Persistent caching with Redis backend
  • Compression: LZ4 compression for efficient storage
  • TTL Support: Time-to-live for cache entries
  • Cost Tracking: Track cache hits and cost savings
  • Fallback Support: In-memory fallback when Redis unavailable
  • Statistics: Comprehensive cache performance metrics
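
To make the fallback behavior concrete, the sketch below shows the general shape of an in-memory TTL cache. This is an illustrative stand-in only, not the library's actual fallback implementation; `TinyTTLCache` and its lazy-eviction strategy are assumptions for the example.

```python
import time

class TinyTTLCache:
    """Illustrative in-memory TTL cache (not the library's real fallback)."""

    def __init__(self, ttl_seconds=3600):
        self.ttl = ttl_seconds
        self._data = {}  # key -> (expires_at, value)

    def store(self, key, value):
        # Record an absolute expiry time alongside the value
        self._data[key] = (time.monotonic() + self.ttl, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._data[key]  # lazy eviction on read
            return None
        return value
```

The real `ReasoningCache` adds compression, size limits, and statistics on top of this idea, and switches to Redis when it is reachable.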

Quick Start

from recoagent.reasoning import ReasoningCache, CacheConfig

# Configure cache
config = CacheConfig(
    redis_url="redis://localhost:6379/0",
    ttl_seconds=3600,
    max_size_mb=100,
    enable_compression=True,
    compression_level=9
)

# Initialize cache
cache = ReasoningCache(config)

# Store reasoning result
cache.store(
    query="What is machine learning?",
    result={
        "conclusion": "Machine learning is a subset of AI...",
        "confidence": 0.95,
        "reasoning_trace": ["Step 1", "Step 2", "Step 3"]
    },
    context={"subject": "ai", "difficulty": "medium"},
    cost=0.05
)

# Retrieve reasoning result
result = cache.get(
    query="What is machine learning?",
    context={"subject": "ai", "difficulty": "medium"}
)

if result:
    print(f"Answer: {result['conclusion']}")
    print(f"Confidence: {result['confidence']:.2f}")
    print(f"Cost Saved: ${result.get('cost', 0):.4f}")

Configuration

Basic Configuration

config = CacheConfig(
    redis_url="redis://localhost:6379/0",
    ttl_seconds=3600,        # 1 hour TTL
    max_size_mb=100,         # 100 MB max size
    enable_compression=True,
    compression_level=9      # Highest compression
)

Advanced Configuration

config = CacheConfig(
    redis_url="redis://localhost:6379/0",
    ttl_seconds=7200,        # 2 hours TTL
    max_size_mb=500,         # 500 MB max size
    enable_compression=True,
    compression_level=6,     # Balanced compression
    # Additional Redis options
    redis_options={
        "socket_timeout": 5,
        "socket_connect_timeout": 5,
        "retry_on_timeout": True
    }
)

Environment Variables

# Set via environment variables
export REASONING_CACHE_REDIS_URL="redis://localhost:6379/0"
export REASONING_CACHE_TTL_SECONDS="3600"
export REASONING_CACHE_MAX_SIZE_MB="100"
export REASONING_CACHE_ENABLE_COMPRESSION="true"
export REASONING_CACHE_COMPRESSION_LEVEL="9"
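
If your version of `CacheConfig` does not read these variables automatically, they can be parsed into constructor arguments by hand. The helper below is a hypothetical sketch (the function name, defaults, and boolean parsing are assumptions, not part of the library):

```python
import os

def cache_config_from_env():
    """Parse REASONING_CACHE_* environment variables into keyword
    arguments suitable for CacheConfig(**kwargs). Hypothetical helper --
    check whether your CacheConfig already reads the environment itself."""
    return {
        "redis_url": os.environ.get(
            "REASONING_CACHE_REDIS_URL", "redis://localhost:6379/0"),
        "ttl_seconds": int(
            os.environ.get("REASONING_CACHE_TTL_SECONDS", "3600")),
        "max_size_mb": int(
            os.environ.get("REASONING_CACHE_MAX_SIZE_MB", "100")),
        "enable_compression": os.environ.get(
            "REASONING_CACHE_ENABLE_COMPRESSION", "true").lower() == "true",
        "compression_level": int(
            os.environ.get("REASONING_CACHE_COMPRESSION_LEVEL", "9")),
    }

# Usage (assuming CacheConfig accepts these keyword arguments):
# config = CacheConfig(**cache_config_from_env())
```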

Cache Operations

Storing Results

# Store simple result
cache.store(
    query="What is 2+2?",
    result={"conclusion": "4", "confidence": 1.0},
    context={"subject": "math"},
    cost=0.01
)

# Store complex result
cache.store(
    query="Explain quantum computing",
    result={
        "conclusion": "Quantum computing uses quantum mechanical phenomena...",
        "confidence": 0.92,
        "reasoning_trace": [
            "Step 1: Define quantum mechanics",
            "Step 2: Explain superposition",
            "Step 3: Describe quantum gates"
        ],
        "metadata": {
            "sources": ["textbook", "research_paper"],
            "difficulty": "advanced"
        }
    },
    context={"subject": "physics", "level": "advanced"},
    cost=0.15
)

Retrieving Results

# Retrieve result
result = cache.get(
    query="What is 2+2?",
    context={"subject": "math"}
)

if result:
    print(f"Answer: {result['conclusion']}")
    print(f"Confidence: {result['confidence']:.2f}")
    print(f"Cost Saved: ${result.get('cost', 0):.4f}")
else:
    print("Cache miss - need to compute")

Cache Management

# Clear all cache
cache.clear()

# Get cache statistics
stats = cache.get_stats()
print(f"Cache Hits: {stats['hits']}")
print(f"Cache Misses: {stats['misses']}")
print(f"Hit Rate: {stats['hit_rate']:.2%}")
print(f"Total Stores: {stats['stores']}")
print(f"Evictions: {stats['evictions']}")
print(f"Cache Type: {stats['cache_type']}")
print(f"Current Size: {stats['current_size']}")

Integration Examples

With DSPy Reasoning

from recoagent.reasoning import DSPyReasoningEngine, ReasoningCache

# Initialize reasoning engine with cache
cache = ReasoningCache(config)
engine = DSPyReasoningEngine(
    enable_caching=True,
    cache=cache
)

# Use cached reasoning
result = engine.reason(
    query="What is machine learning?",
    use_cache=True  # Enable caching
)

With Cost Tracking

from packages.observability import CostCategory, get_cost_tracker

# Track cache cost savings
cost_tracker = get_cost_tracker()

# Store with cost tracking
cache.store(
    query="Expensive reasoning problem",
    result=reasoning_result,
    context=context,
    cost=0.10  # Original cost
)

# Track cost savings
cost_tracker.add_cost_entry(
    category=CostCategory.LLM_TOKENS,
    provider="cache",
    model="reasoning_cache",
    operation="cache_hit",
    cost_usd=-0.10,  # Negative cost (savings)
    metadata={"cache_type": "reasoning"}
)

With Workflows

from packages.observability import trace_workflow

@trace_workflow(name="cached_reasoning_workflow")
async def reasoning_workflow(problem):
    # Check cache first
    cached_result = cache.get(
        query=problem,
        context={"workflow": "reasoning_workflow"}
    )

    if cached_result:
        print("Using cached result")
        return cached_result

    # Compute if not cached
    result = await compute_reasoning(problem)

    # Store in cache
    cache.store(
        query=problem,
        result=result,
        context={"workflow": "reasoning_workflow"},
        cost=result.get('cost', 0)
    )

    return result

Performance Optimization

Compression Settings

# High compression (more CPU, less storage)
config = CacheConfig(
    redis_url="redis://localhost:6379/0",
    enable_compression=True,
    compression_level=9  # Highest compression
)

# Balanced compression
config = CacheConfig(
    redis_url="redis://localhost:6379/0",
    enable_compression=True,
    compression_level=6  # Balanced
)

# No compression (faster, more storage)
config = CacheConfig(
    redis_url="redis://localhost:6379/0",
    enable_compression=False
)

TTL Optimization

# Short TTL for dynamic content
config = CacheConfig(
    redis_url="redis://localhost:6379/0",
    ttl_seconds=300  # 5 minutes
)

# Long TTL for stable content
config = CacheConfig(
    redis_url="redis://localhost:6379/0",
    ttl_seconds=86400  # 24 hours
)

# Variable TTL based on content
cache.store(
    query="Current weather",
    result=weather_result,
    context=context,
    cost=0.01,
    ttl_seconds=600  # 10 minutes for weather
)

cache.store(
    query="Historical fact",
    result=fact_result,
    context=context,
    cost=0.01,
    ttl_seconds=86400  # 24 hours for facts
)

Memory Management

# Configure memory limits
config = CacheConfig(
    redis_url="redis://localhost:6379/0",
    max_size_mb=1000,  # 1 GB limit
    enable_compression=True,
    compression_level=6
)

# Monitor memory usage
stats = cache.get_stats()
if stats['current_size'] > 800:  # 80% of limit
    print("Warning: Cache approaching size limit")
    # Consider clearing old entries

Monitoring and Analytics

Cache Statistics

# Get comprehensive statistics
stats = cache.get_stats()

print("=== Cache Statistics ===")
print(f"Cache Type: {stats['cache_type']}")
print(f"Total Hits: {stats['hits']}")
print(f"Total Misses: {stats['misses']}")
print(f"Hit Rate: {stats['hit_rate']:.2%}")
print(f"Total Stores: {stats['stores']}")
print(f"Evictions: {stats['evictions']}")
print(f"Current Size: {stats['current_size']}")

# Calculate cost savings
if stats['hits'] > 0:
    avg_cost_per_hit = 0.05  # Average cost per LLM call
    total_savings = stats['hits'] * avg_cost_per_hit
    print(f"Estimated Cost Savings: ${total_savings:.2f}")

Performance Metrics

import time

# Measure cache lookup time
start_time = time.time()
result = cache.get(query="test query", context={})
hit_time = time.time() - start_time

if result:
    print(f"Cache Hit Time: {hit_time:.4f} seconds")
else:
    print("Cache Miss")
    # Measure cache store time
    start_time = time.time()
    cache.store(query="test query", result={"answer": "test"}, context={})
    store_time = time.time() - start_time
    print(f"Cache Store Time: {store_time:.4f} seconds")

Error Handling

Connection Errors

try:
    cache = ReasoningCache(config)
except ConnectionError as e:
    print(f"Redis connection failed: {e}")
    # Fall back to the in-memory cache
    config.redis_url = None
    cache = ReasoningCache(config)

Cache Errors

# Handle cache errors gracefully
try:
    result = cache.get(query="test", context={})
except Exception as e:
    print(f"Cache error: {e}")
    # Fall back to computation
    result = compute_reasoning("test")

Fallback Strategy

# Implement fallback strategy
def get_reasoning_with_fallback(query, context):
    # Try cache first
    try:
        result = cache.get(query=query, context=context)
        if result:
            return result
    except Exception as e:
        print(f"Cache error: {e}")

    # Fall back to computation
    result = compute_reasoning(query)

    # Try to store in cache
    try:
        cache.store(query=query, result=result, context=context)
    except Exception as e:
        print(f"Cache store error: {e}")

    return result

Best Practices

  1. Set Appropriate TTL: Based on content volatility
  2. Use Compression: For large reasoning results
  3. Monitor Performance: Track hit rates and costs
  4. Handle Errors: Implement fallback strategies
  5. Optimize Keys: Use consistent query/context keys
  6. Clean Up: Regularly clear old entries
  7. Test Thoroughly: Test with various scenarios
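
For practice 5, the goal is that logically identical lookups produce identical cache keys. The library's internal key scheme is not documented here, so the helper below is a hypothetical sketch of the idea: normalize the query and canonicalize the context before hashing.

```python
import hashlib
import json

def make_cache_key(query: str, context: dict) -> str:
    """Derive a stable key from a query/context pair.
    Hypothetical helper: collapses whitespace, lowercases the query,
    and sorts context keys so equivalent lookups hash identically."""
    normalized = " ".join(query.lower().split())
    blob = json.dumps({"q": normalized, "ctx": context}, sort_keys=True)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()
```

Without this kind of normalization, "What is ML?" and "what is  ML?" would occupy separate cache entries and halve your hit rate for that query.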

Troubleshooting

Common Issues

  1. Redis Connection: Check Redis server and connection string
  2. Memory Issues: Monitor cache size and Redis memory
  3. Compression Errors: Check LZ4 installation
  4. TTL Issues: Verify TTL settings and Redis configuration

Debug Mode

# Enable debug logging
import logging
logging.getLogger('recoagent.reasoning.reasoning_cache').setLevel(logging.DEBUG)

Health Check

# Check cache health
def check_cache_health():
    try:
        # Test basic operations
        test_key = "health_check"
        test_result = {"test": "value"}

        # Store
        cache.store(
            query=test_key,
            result=test_result,
            context={}
        )

        # Retrieve
        retrieved = cache.get(query=test_key, context={})

        # Clean up
        cache.clear()

        if retrieved == test_result:
            print("✅ Cache is healthy")
            return True
        else:
            print("❌ Cache data corruption")
            return False

    except Exception as e:
        print(f"❌ Cache health check failed: {e}")
        return False

API Reference

CacheConfig

| Parameter | Type | Description |
| --- | --- | --- |
| redis_url | str | Redis connection URL |
| ttl_seconds | int | Time-to-live in seconds |
| max_size_mb | int | Maximum cache size in MB |
| enable_compression | bool | Enable LZ4 compression |
| compression_level | int | Compression level (0-9) |

ReasoningCache

| Method | Description |
| --- | --- |
| store(query, result, context, cost) | Store reasoning result |
| get(query, context) | Retrieve reasoning result |
| clear() | Clear all cache entries |
| get_stats() | Get cache statistics |

Statistics

| Metric | Description |
| --- | --- |
| hits | Number of cache hits |
| misses | Number of cache misses |
| hit_rate | Cache hit rate (fraction of lookups served from cache) |
| stores | Number of cache stores |
| evictions | Number of evictions |
| cache_type | Cache backend type |
| current_size | Current cache size |