Production API

Location: apps/api/
Tech Stack: FastAPI, OpenAPI, Docker
Purpose: Production-grade RESTful API for deploying RecoAgent at scale

🎯 Overview

The Production API is a FastAPI-based REST API designed for enterprise deployments. It provides secure, scalable access to RecoAgent with built-in authentication, rate limiting, and comprehensive observability.

Use this when:

  • You're deploying to production environments
  • You need RESTful endpoints for integration
  • You require authentication and access control
  • You want OpenAPI/Swagger documentation
  • You need to scale horizontally

⚡ Quick Start

# Navigate to API directory
cd apps/api

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export OPENAI_API_KEY="your-key"
export DATABASE_URL="postgresql://..."

# Run server
uvicorn main:app --reload

# Open API docs: http://localhost:8000/docs

📡 Key Endpoints

Query Agent

POST /api/v1/query
Content-Type: application/json
Authorization: Bearer <token>

{
  "question": "What is hybrid search?",
  "session_id": "user123",
  "include_sources": true
}

Response:

{
  "answer": "Hybrid search combines...",
  "sources": [...],
  "confidence": 0.92,
  "latency_ms": 850,
  "cost": 0.012
}
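
For programmatic access, here is a minimal Python client sketch, assuming the request/response shapes above; the host and token are placeholders to adjust for your deployment:

# Minimal client sketch; host and token are placeholders.
import requests

API_URL = "http://localhost:8000/api/v1/query"
TOKEN = "<token>"  # obtained from /auth/token (see Authentication below)

response = requests.post(
    API_URL,
    json={
        "question": "What is hybrid search?",
        "session_id": "user123",
        "include_sources": True,
    },
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
response.raise_for_status()

result = response.json()
print(result["answer"], f"(confidence: {result['confidence']})")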

Health Check

GET /health

# Response
{
  "status": "healthy",
  "version": "1.0.0",
  "uptime_seconds": 12345
}

Metrics

GET /metrics

# Returns Prometheus-format metrics

๐Ÿ” Authenticationโ€‹

The API supports multiple authentication methods:

API Key Authentication

curl -X POST http://localhost:8000/api/v1/query \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is hybrid search?"}'

JWT Bearer Tokens

# Get token
POST /auth/token
{
  "username": "user",
  "password": "pass"
}

# Use token
curl -X POST http://localhost:8000/api/v1/query \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is hybrid search?"}'
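
Server-side, the API-key check can be expressed as a FastAPI dependency. A minimal sketch, assuming the X-API-Key header above; the in-memory key set is a stand-in for a real key store:

# Sketch of an API-key dependency; replace the in-memory set with a real key store.
from fastapi import Depends, FastAPI, HTTPException, Security
from fastapi.security import APIKeyHeader

api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False)
VALID_KEYS = {"your-api-key"}  # placeholder; load from a secret store in production

async def require_api_key(api_key: str | None = Security(api_key_header)) -> str:
    if api_key not in VALID_KEYS:
        raise HTTPException(status_code=401, detail="Invalid or missing API key")
    return api_key

app = FastAPI()

@app.post("/api/v1/query")
async def query(payload: dict, _key: str = Depends(require_api_key)):
    ...  # handle the query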

⚙️ Configuration

Create a .env file:

# LLM Configuration
OPENAI_API_KEY=sk-...
LLM_MODEL=gpt-4
EMBEDDING_MODEL=text-embedding-ada-002

# API Configuration
API_HOST=0.0.0.0
API_PORT=8000
API_WORKERS=4

# Database
DATABASE_URL=postgresql://user:pass@localhost/recoagent

# Redis (for caching)
REDIS_URL=redis://localhost:6379

# Security
SECRET_KEY=your-secret-key-here
JWT_ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=30

# Rate Limiting
RATE_LIMIT_PER_MINUTE=100
RATE_LIMIT_PER_HOUR=1000

# Monitoring
LANGSMITH_API_KEY=your-langsmith-key
ENABLE_METRICS=true
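
These keys map naturally onto a typed settings object. A sketch using pydantic-settings (v2), with field names mirroring the .env entries above; treat it as illustrative rather than the app's actual config module:

# Sketch: load the .env above into a typed settings object (pydantic-settings v2).
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    openai_api_key: str
    llm_model: str = "gpt-4"
    api_host: str = "0.0.0.0"
    api_port: int = 8000
    database_url: str
    redis_url: str = "redis://localhost:6379"
    secret_key: str
    access_token_expire_minutes: int = 30
    rate_limit_per_minute: int = 100

settings = Settings()  # reads .env; raises a validation error if required keys are missing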

🚀 Production Deployment

Docker Deployment

# Dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
EXPOSE 8000

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

# Build and run
docker build -t recoagent-api .
docker run -p 8000:8000 --env-file .env recoagent-api
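
For a production-like stack locally, the container can be wired to Postgres and Redis with Docker Compose. A sketch; the service names, images, and credentials are assumptions:

# docker-compose.yml (sketch; images and credentials are illustrative)
services:
  api:
    build: .
    ports:
      - "8000:8000"
    env_file: .env
    depends_on:
      - db
      - redis
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: recoagent
  redis:
    image: redis:7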

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: recoagent-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: recoagent-api
  template:
    metadata:
      labels:
        app: recoagent-api
    spec:
      containers:
        - name: api
          image: recoagent-api:latest
          ports:
            - containerPort: 8000
          env:
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: recoagent-secrets
                  key: openai-api-key
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "2000m"
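
The Deployment needs a Service in front of it to receive traffic. A minimal sketch; switch the type to LoadBalancer or add an Ingress as your cluster requires:

apiVersion: v1
kind: Service
metadata:
  name: recoagent-api
spec:
  type: ClusterIP  # or LoadBalancer, depending on how traffic enters the cluster
  selector:
    app: recoagent-api
  ports:
    - port: 80
      targetPort: 8000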

📊 Monitoring & Observability

Prometheus Metrics

# Available metrics
recoagent_requests_total
recoagent_request_duration_seconds
recoagent_llm_calls_total
recoagent_llm_cost_total
recoagent_errors_total
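
These counters and histograms follow the standard prometheus_client pattern. A sketch of how such metrics can be defined and recorded; the label names are assumptions, not the app's actual instrumentation:

# Sketch: defining and recording metrics with prometheus_client.
from prometheus_client import Counter, Histogram

REQUESTS = Counter(
    "recoagent_requests_total", "Total API requests", ["endpoint", "status"]
)
LATENCY = Histogram(
    "recoagent_request_duration_seconds", "Request latency in seconds", ["endpoint"]
)

def record_request(endpoint: str, status: int, duration_s: float) -> None:
    REQUESTS.labels(endpoint=endpoint, status=str(status)).inc()
    LATENCY.labels(endpoint=endpoint).observe(duration_s)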

LangSmith Tracing

When LANGSMITH_API_KEY is set, all requests are automatically traced in LangSmith for debugging.

Structured Logging

{
  "timestamp": "2024-10-09T10:30:00Z",
  "level": "INFO",
  "request_id": "abc123",
  "endpoint": "/api/v1/query",
  "latency_ms": 850,
  "cost": 0.012,
  "user_id": "user123"
}
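
Log lines in this shape can be produced with the standard library alone. A sketch; the formatter and field names are illustrative, not the app's actual logging setup:

# Sketch: emit JSON log lines in the shape above using only the standard library.
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        line = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        line.update(getattr(record, "fields", {}))  # per-request fields, if attached
        return json.dumps(line)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("recoagent.api")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info(
    "query handled",
    extra={"fields": {"endpoint": "/api/v1/query", "latency_ms": 850}},
)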

🛡️ Security Features

| Feature | Status | Description |
| --- | --- | --- |
| Authentication | ✅ | API keys + JWT tokens |
| Rate Limiting | ✅ | Per-user and global limits |
| Input Validation | ✅ | Pydantic models |
| SQL Injection Prevention | ✅ | Parameterized queries |
| CORS | ✅ | Configurable origins |
| HTTPS | ✅ | TLS/SSL support |
| Secrets Management | ✅ | Environment variables |
| Audit Logging | ✅ | All requests logged |

🔥 Performance Tips

1. Enable Caching

# Redis caching for responses
ENABLE_REDIS_CACHE=true
CACHE_TTL_SECONDS=3600
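
The idea behind the flag: reuse an answer when the same question arrives within the TTL, skipping the LLM call entirely. A sketch with redis-py; the key scheme is an assumption:

# Sketch: cache answers in Redis, keyed by a hash of the question.
import hashlib
import json

import redis

cache = redis.Redis.from_url("redis://localhost:6379")
CACHE_TTL_SECONDS = 3600

def cached_answer(question: str, compute) -> dict:
    key = "query:" + hashlib.sha256(question.encode()).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)  # cache hit: skip the LLM call entirely
    result = compute(question)  # cache miss: run the full pipeline
    cache.set(key, json.dumps(result), ex=CACHE_TTL_SECONDS)
    return result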

2. Optimize Workers

# Adjust worker count to the host; one worker per CPU core is a common starting point
uvicorn main:app --workers 4 --host 0.0.0.0

3. Use Connection Pooling

# PostgreSQL connection pool
DATABASE_POOL_SIZE=20
DATABASE_MAX_OVERFLOW=10
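
With SQLAlchemy, these values map directly onto the engine's pool arguments. A sketch of the wiring; the env-var names follow the config above:

# Sketch: map the pool env vars onto SQLAlchemy's engine arguments.
import os

from sqlalchemy import create_engine

engine = create_engine(
    os.environ["DATABASE_URL"],
    pool_size=int(os.getenv("DATABASE_POOL_SIZE", "20")),
    max_overflow=int(os.getenv("DATABASE_MAX_OVERFLOW", "10")),
    pool_pre_ping=True,  # discard stale connections before handing them out
)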

📖 API Documentation

OpenAPI Docs

Visit http://localhost:8000/docs for interactive Swagger UI.

ReDoc

Visit http://localhost:8000/redoc for alternative documentation.

🧪 Testing

# Run tests
pytest tests/

# Load testing
locust -f locustfile.py --host=http://localhost:8000
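
Endpoint tests typically run against the app in-process with FastAPI's TestClient. A sketch asserting the health payload shown earlier:

# Sketch: exercise the health endpoint in-process.
from fastapi.testclient import TestClient

from main import app  # the FastAPI app in apps/api/main.py

client = TestClient(app)

def test_health():
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json()["status"] == "healthy"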

🆘 Troubleshooting

| Issue | Solution |
| --- | --- |
| API not starting | Check that port 8000 is free: `lsof -i :8000` |
| 401 Unauthorized | Verify API key or JWT token |
| 429 Rate Limited | Wait or increase rate limits |
| Slow responses | Check LLM API latency, enable caching |
| Memory errors | Increase worker memory limits |

Ready to deploy? Check out the Deployment Guide! 🚀