# Production API

**Location:** `apps/api/`
**Tech Stack:** FastAPI, OpenAPI, Docker
**Purpose:** Production-grade RESTful API for deploying RecoAgent at scale
## 🎯 Overview
The Production API is a FastAPI-based REST API designed for enterprise deployments. It provides secure, scalable access to RecoAgent with built-in authentication, rate limiting, and comprehensive observability.
Use this when:
- Deploying to production environments
- Need RESTful endpoints for integration
- Require authentication and access control
- Want OpenAPI/Swagger documentation
- Need to scale horizontally
## ⚡ Quick Start
```bash
# Navigate to the API directory
cd apps/api

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export OPENAI_API_KEY="your-key"
export DATABASE_URL="postgresql://..."

# Run the server
uvicorn main:app --reload

# Open the API docs: http://localhost:8000/docs
```
## 📡 Key Endpoints
### Query Agent
```http
POST /api/v1/query
Content-Type: application/json
Authorization: Bearer <token>

{
  "question": "What is hybrid search?",
  "session_id": "user123",
  "include_sources": true
}
```
Response:
```json
{
  "answer": "Hybrid search combines...",
  "sources": [...],
  "confidence": 0.92,
  "latency_ms": 850,
  "cost": 0.012
}
```
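For example, the endpoint can be called from Python with `requests` (a minimal sketch; the base URL, token handling, and timeout are assumptions):

```python
# Minimal client sketch; request/response fields follow the examples above.
import requests

TOKEN = "your-jwt-token"  # obtained from POST /auth/token

resp = requests.post(
    "http://localhost:8000/api/v1/query",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "question": "What is hybrid search?",
        "session_id": "user123",
        "include_sources": True,
    },
    timeout=30,
)
resp.raise_for_status()
data = resp.json()
print(data["answer"], f'(confidence: {data["confidence"]})')
```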
### Health Check
```http
GET /health
```

Response:

```json
{
  "status": "healthy",
  "version": "1.0.0",
  "uptime_seconds": 12345
}
```
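For reference, a minimal FastAPI handler producing this shape might look like the sketch below (illustrative only; the real implementation lives in `apps/api/`, and the uptime tracking shown here is an assumption):

```python
# Sketch of a health endpoint; field names match the response above.
import time
from fastapi import FastAPI

app = FastAPI()
START_TIME = time.time()

@app.get("/health")
def health() -> dict:
    return {
        "status": "healthy",
        "version": "1.0.0",
        "uptime_seconds": int(time.time() - START_TIME),
    }
```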
### Metrics

```http
GET /metrics
```

Returns metrics in the Prometheus exposition format.
## 🔐 Authentication
The API supports multiple authentication methods:
### API Key Authentication

```bash
curl -H "X-API-Key: your-api-key" \
  http://localhost:8000/api/v1/query
```
JWT Bearer Tokensโ
# Get token
POST /auth/token
{
"username": "user",
"password": "pass"
}
# Use token
curl -H "Authorization: Bearer <token>" \
http://localhost:8000/api/v1/query
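Server-side, tokens like this are typically minted with the signing key and expiry from the configuration below. A sketch using PyJWT (the library choice and claim names are assumptions, not the service's confirmed implementation):

```python
# Token-minting sketch; SECRET_KEY, JWT_ALGORITHM, and
# ACCESS_TOKEN_EXPIRE_MINUTES mirror the .env settings below.
from datetime import datetime, timedelta, timezone
import jwt  # PyJWT

SECRET_KEY = "your-secret-key-here"
JWT_ALGORITHM = "HS256"
ACCESS_TOKEN_EXPIRE_MINUTES = 30

def create_access_token(username: str) -> str:
    expires = datetime.now(timezone.utc) + timedelta(
        minutes=ACCESS_TOKEN_EXPIRE_MINUTES
    )
    return jwt.encode(
        {"sub": username, "exp": expires}, SECRET_KEY, algorithm=JWT_ALGORITHM
    )
```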
## ⚙️ Configuration

Create a `.env` file:
```bash
# LLM Configuration
OPENAI_API_KEY=sk-...
LLM_MODEL=gpt-4
EMBEDDING_MODEL=text-embedding-ada-002

# API Configuration
API_HOST=0.0.0.0
API_PORT=8000
API_WORKERS=4

# Database
DATABASE_URL=postgresql://user:pass@localhost/recoagent

# Redis (for caching)
REDIS_URL=redis://localhost:6379

# Security
SECRET_KEY=your-secret-key-here
JWT_ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=30

# Rate Limiting
RATE_LIMIT_PER_MINUTE=100
RATE_LIMIT_PER_HOUR=1000

# Monitoring
LANGSMITH_API_KEY=your-langsmith-key
ENABLE_METRICS=true
```
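One common way to load these variables into a FastAPI app is a `pydantic-settings` class. The sketch below is an assumption about how the config could be wired, not the project's actual loader:

```python
# Settings sketch; env var names match the .env file above
# (pydantic-settings matches them case-insensitively).
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    openai_api_key: str
    llm_model: str = "gpt-4"
    database_url: str
    redis_url: str = "redis://localhost:6379"
    secret_key: str
    access_token_expire_minutes: int = 30
    rate_limit_per_minute: int = 100

settings = Settings()
```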
## 🚀 Production Deployment
### Docker Deployment

```dockerfile
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

```bash
# Build and run
docker build -t recoagent-api .
docker run -p 8000:8000 --env-file .env recoagent-api
```
### Kubernetes Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: recoagent-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: recoagent-api
  template:
    metadata:
      labels:
        app: recoagent-api
    spec:
      containers:
        - name: api
          image: recoagent-api:latest
          ports:
            - containerPort: 8000
          env:
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: recoagent-secrets
                  key: openai-api-key
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "2000m"
```
## 📊 Monitoring & Observability
### Prometheus Metrics

```text
# Available metrics
recoagent_requests_total
recoagent_request_duration_seconds
recoagent_llm_calls_total
recoagent_llm_cost_total
recoagent_errors_total
```
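Metrics like these can be defined with `prometheus_client`. The sketch below reuses the metric names listed above, while the label sets are assumptions about how the service partitions them:

```python
# Counter/histogram sketch using prometheus_client.
from prometheus_client import Counter, Histogram

REQUESTS_TOTAL = Counter(
    "recoagent_requests_total", "Total API requests", ["endpoint", "status"]
)
REQUEST_DURATION = Histogram(
    "recoagent_request_duration_seconds", "Request latency in seconds", ["endpoint"]
)

# Inside a request handler:
# REQUESTS_TOTAL.labels(endpoint="/api/v1/query", status="200").inc()
# with REQUEST_DURATION.labels(endpoint="/api/v1/query").time():
#     ...  # handle the request
```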
### LangSmith Tracing
All requests are automatically traced in LangSmith for debugging.
### Structured Logging
```json
{
  "timestamp": "2024-10-09T10:30:00Z",
  "level": "INFO",
  "request_id": "abc123",
  "endpoint": "/api/v1/query",
  "latency_ms": 850,
  "cost": 0.012,
  "user_id": "user123"
}
```
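A minimal stdlib formatter that emits records in this shape looks like the sketch below (field names copied from the example above; the service may use a dedicated library such as structlog instead):

```python
# JSON log formatter sketch; extra_fields carries request-scoped values
# such as request_id, endpoint, latency_ms, cost, and user_id.
import json
import logging
from datetime import datetime, timezone

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            **getattr(record, "extra_fields", {}),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logging.getLogger().addHandler(handler)
```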
## 🛡️ Security Features

| Feature | Status | Description |
|---|---|---|
| Authentication | ✅ | API keys + JWT tokens |
| Rate Limiting | ✅ | Per-user and global limits |
| Input Validation | ✅ | Pydantic models |
| SQL Injection Prevention | ✅ | Parameterized queries |
| CORS | ✅ | Configurable origins |
| HTTPS | ✅ | TLS/SSL support |
| Secrets Management | ✅ | Environment variables |
| Audit Logging | ✅ | All requests logged |
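As an illustration of the Pydantic-based input validation, a request model for `/api/v1/query` might look like this (field names follow the request example above; the exact constraints are assumptions):

```python
# Request model sketch; FastAPI rejects invalid payloads with a 422
# before they reach the handler.
from pydantic import BaseModel, Field

class QueryRequest(BaseModel):
    question: str = Field(..., min_length=1, max_length=4000)
    session_id: str
    include_sources: bool = True
```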
## 🔥 Performance Tips

### 1. Enable Caching

```bash
# Redis caching for responses
ENABLE_REDIS_CACHE=true
CACHE_TTL_SECONDS=3600
```
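A sketch of what Redis response caching can look like with `redis-py`, keyed on a hash of the question (the key scheme and TTL wiring are assumptions, not the service's confirmed cache design):

```python
# Cache-aside sketch; TTL mirrors CACHE_TTL_SECONDS above.
import hashlib
import json
import redis

r = redis.Redis.from_url("redis://localhost:6379")
TTL = 3600

def cached_answer(question: str, compute):
    key = "query:" + hashlib.sha256(question.encode()).hexdigest()
    hit = r.get(key)
    if hit is not None:
        return json.loads(hit)          # cache hit: skip the LLM call
    result = compute(question)          # cache miss: compute and store
    r.setex(key, TTL, json.dumps(result))
    return result
```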
### 2. Optimize Workers

```bash
# Adjust worker count based on load
uvicorn main:app --workers 4 --host 0.0.0.0
```
### 3. Use Connection Pooling

```bash
# PostgreSQL connection pool
DATABASE_POOL_SIZE=20
DATABASE_MAX_OVERFLOW=10
```
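These settings map directly onto SQLAlchemy's engine options; a sketch (the engine may be constructed elsewhere in the codebase):

```python
# Connection pool sketch; kwargs mirror the env vars above.
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://user:pass@localhost/recoagent",
    pool_size=20,        # DATABASE_POOL_SIZE
    max_overflow=10,     # DATABASE_MAX_OVERFLOW
    pool_pre_ping=True,  # drop dead connections before reuse
)
```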
## 📚 API Documentation

### OpenAPI Docs

Visit http://localhost:8000/docs for the interactive Swagger UI.

### ReDoc

Visit http://localhost:8000/redoc for the alternative ReDoc documentation.
## 🧪 Testing

```bash
# Run tests
pytest tests/

# Load testing
locust -f locustfile.py --host=http://localhost:8000
```
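Individual endpoints can also be exercised with FastAPI's `TestClient`; a minimal sketch (assumes the `app` object in `apps/api/main.py`; adapt to the fixtures in `tests/`):

```python
# Endpoint test sketch; asserts against the /health response shown earlier.
from fastapi.testclient import TestClient

from main import app

client = TestClient(app)

def test_health():
    resp = client.get("/health")
    assert resp.status_code == 200
    assert resp.json()["status"] == "healthy"
```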
## 🐛 Troubleshooting

| Issue | Solution |
|---|---|
| API not starting | Check that port 8000 is free: `lsof -i :8000` |
| 401 Unauthorized | Verify the API key or JWT token |
| 429 Rate Limited | Wait for the window to reset, or raise the rate limits |
| Slow responses | Check LLM API latency; enable caching |
| Memory errors | Increase worker memory limits |
## 🔗 Related Docs
- Deployment Guide
- Authentication Guide
- Caching Guide
- Worker App (for async processing)
Ready to deploy? Check out the Deployment Guide! 🚀