LLM Providers

Multi-LLM provider factory supporting OpenAI, Anthropic, Google, Groq, and other providers with automatic fallback, intelligent routing, and cost optimization.
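The fallback pattern at the heart of the factory can be sketched in a few lines of plain Python. This is a hypothetical illustration of the idea, not recoagent's implementation: try the primary provider first, then walk the fallback list in order until a call succeeds.

```python
# Minimal sketch of provider fallback (illustrative, not recoagent's code).
def invoke_with_fallback(providers, prompt):
    """Try each (name, callable) pair in order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as e:  # a real implementation would narrow this
            errors.append((name, e))
    raise RuntimeError(f"All providers failed: {errors}")

# Demo with stub providers: the first always fails, the second succeeds.
def failing(prompt):
    raise TimeoutError("simulated timeout")

def working(prompt):
    return f"answer to: {prompt}"

stubs = [("openai", failing), ("anthropic", working)]
name, answer = invoke_with_fallback(stubs, "What is ML?")
print(name, "->", answer)  # anthropic -> answer to: What is ML?
```

A production implementation would catch only provider-specific errors (timeouts, rate limits) rather than bare Exception, and would record which provider ultimately served the request.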

Core Classes

ProviderFactory

Description: Factory for creating and managing multiple LLM providers

Parameters:

  • config (MultiLLMConfig): Multi-LLM configuration
  • enable_fallback (bool): Enable automatic fallback (default: True)
  • routing_strategy (str): Routing strategy ("cost", "latency", "quality", "round_robin")

Returns: ProviderFactory instance

Example:

from recoagent.llm import ProviderFactory, MultiLLMConfig

# Create configuration
config = MultiLLMConfig(
    primary_provider="openai",
    fallback_providers=["anthropic", "google"],
    routing_strategy="cost",
    enable_cost_tracking=True
)

# Create provider factory
factory = ProviderFactory(config=config)

# Get provider with automatic routing
llm = factory.get_provider()
response = llm.invoke("What is machine learning?")

MultiLLMConfig

Description: Configuration for multi-LLM setup

Parameters:

  • primary_provider (str): Primary provider name
  • fallback_providers (List[str]): List of fallback providers
  • routing_strategy (str): Routing strategy
  • cost_limits (Dict[str, float]): Cost limits per provider
  • latency_limits (Dict[str, float]): Latency limits per provider

Returns: MultiLLMConfig instance

Example:

from recoagent.llm import MultiLLMConfig

# Create comprehensive configuration
config = MultiLLMConfig(
    primary_provider="openai",
    fallback_providers=["anthropic", "google", "groq"],
    routing_strategy="cost",
    cost_limits={
        "openai": 0.10,
        "anthropic": 0.15,
        "google": 0.05
    },
    latency_limits={
        "openai": 5.0,
        "anthropic": 8.0,
        "google": 3.0
    },
    enable_cost_tracking=True,
    enable_latency_tracking=True
)

LLMRouter

Description: Intelligent router for selecting optimal LLM provider

Parameters:

  • routing_strategy (str): Routing strategy
  • cost_tracker (CostTracker): Cost tracking component
  • latency_tracker (LatencyTracker): Latency tracking component
  • quality_tracker (QualityTracker): Quality tracking component

Returns: LLMRouter instance

Example:

from recoagent.llm import LLMRouter, CostTracker, LatencyTracker, QualityTracker

# Create tracking components
cost_tracker = CostTracker()
latency_tracker = LatencyTracker()
quality_tracker = QualityTracker()

# Create router
router = LLMRouter(
    routing_strategy="cost",
    cost_tracker=cost_tracker,
    latency_tracker=latency_tracker,
    quality_tracker=quality_tracker
)

# Route request
provider = router.route_request(
    request_type="chat",
    priority="high",
    context={"user_tier": "premium"}
)

Usage Examples

Basic Multi-Provider Setup

from recoagent.llm import ProviderFactory, MultiLLMConfig

# Create basic configuration
config = MultiLLMConfig(
    primary_provider="openai",
    fallback_providers=["anthropic", "google"],
    routing_strategy="round_robin"
)

# Create factory
factory = ProviderFactory(config=config)

# Get provider
llm = factory.get_provider()
response = llm.invoke("Explain quantum computing")

print(f"Provider used: {factory.get_current_provider()}")
print(f"Response: {response.content}")

Cost-Optimized Routing

from recoagent.llm import ProviderFactory, MultiLLMConfig

# Create cost-optimized configuration
config = MultiLLMConfig(
    primary_provider="groq",  # Cheapest
    fallback_providers=["google", "openai", "anthropic"],
    routing_strategy="cost",
    cost_limits={
        "groq": 0.01,
        "google": 0.05,
        "openai": 0.10,
        "anthropic": 0.15
    },
    enable_cost_tracking=True
)

# Create factory
factory = ProviderFactory(config=config)

# Process multiple requests with cost optimization
queries = [
    "What is AI?",
    "Explain machine learning",
    "How do neural networks work?"
]

total_cost = 0
for query in queries:
    llm = factory.get_provider()
    response = llm.invoke(query)

    cost = factory.get_last_request_cost()
    total_cost += cost

    print(f"Query: {query}")
    print(f"Provider: {factory.get_current_provider()}")
    print(f"Cost: ${cost:.4f}")
    print(f"Response: {response.content[:100]}...")
    print("---")

print(f"Total cost: ${total_cost:.4f}")

Latency-Optimized Routing

from recoagent.llm import ProviderFactory, MultiLLMConfig

# Create latency-optimized configuration
config = MultiLLMConfig(
    primary_provider="google",  # Fastest
    fallback_providers=["groq", "openai", "anthropic"],
    routing_strategy="latency",
    latency_limits={
        "google": 2.0,
        "groq": 3.0,
        "openai": 5.0,
        "anthropic": 8.0
    },
    enable_latency_tracking=True
)

# Create factory
factory = ProviderFactory(config=config)

# Process requests with latency optimization
import time

queries = [
    "Quick question about AI",
    "Brief explanation of ML",
    "Short answer about neural networks"
]

for query in queries:
    start_time = time.time()

    llm = factory.get_provider()
    response = llm.invoke(query)

    end_time = time.time()
    latency = end_time - start_time

    print(f"Query: {query}")
    print(f"Provider: {factory.get_current_provider()}")
    print(f"Latency: {latency:.2f}s")
    print(f"Response: {response.content[:100]}...")
    print("---")

Quality-Based Routing

from recoagent.llm import ProviderFactory, MultiLLMConfig

# Create quality-optimized configuration
config = MultiLLMConfig(
    primary_provider="anthropic",  # Highest quality
    fallback_providers=["openai", "google", "groq"],
    routing_strategy="quality",
    quality_thresholds={
        "anthropic": 0.9,
        "openai": 0.8,
        "google": 0.7,
        "groq": 0.6
    },
    enable_quality_tracking=True
)

# Create factory
factory = ProviderFactory(config=config)

# Process complex queries with quality focus
complex_queries = [
    "Explain the mathematical foundations of deep learning",
    "Compare different approaches to natural language processing",
    "Analyze the ethical implications of AI development"
]

for query in complex_queries:
    llm = factory.get_provider()
    response = llm.invoke(query)

    quality_score = factory.get_last_quality_score()

    print(f"Query: {query}")
    print(f"Provider: {factory.get_current_provider()}")
    print(f"Quality Score: {quality_score:.3f}")
    print(f"Response: {response.content[:200]}...")
    print("---")

Advanced Fallback System

from recoagent.llm import ProviderFactory, MultiLLMConfig

# Create configuration with comprehensive fallback
config = MultiLLMConfig(
    primary_provider="openai",
    fallback_providers=["anthropic", "google", "groq"],
    routing_strategy="cost",
    enable_fallback=True,
    fallback_conditions=["error", "timeout", "cost_limit", "quality_threshold"],
    timeout_seconds=10,
    max_retries=3
)

# Create factory
factory = ProviderFactory(config=config)

# Test fallback scenarios
test_queries = [
    "What is machine learning?",
    "Explain deep learning concepts",
    "How do neural networks work?"
]

for query in test_queries:
    try:
        llm = factory.get_provider()
        response = llm.invoke(query)

        print(f"Query: {query}")
        print(f"Provider: {factory.get_current_provider()}")
        print("Success: True")
        print(f"Response: {response.content[:100]}...")

    except Exception as e:
        print(f"Query: {query}")
        print(f"Error: {str(e)}")
        print(f"Fallback triggered: {factory.was_fallback_used()}")
        print(f"Final provider: {factory.get_current_provider()}")

    print("---")
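The fallback_conditions above can be read as a predicate over each request's outcome. The decision logic can be sketched in self-contained Python (hypothetical names, not recoagent internals):

```python
# Hypothetical sketch: decide whether a request outcome should trigger fallback.
def should_fall_back(outcome, conditions, cost_limit=None, quality_threshold=None):
    """outcome: dict with optional 'error', 'timed_out', 'cost', 'quality' keys."""
    if "error" in conditions and outcome.get("error"):
        return True
    if "timeout" in conditions and outcome.get("timed_out"):
        return True
    if ("cost_limit" in conditions and cost_limit is not None
            and outcome.get("cost", 0.0) > cost_limit):
        return True
    if ("quality_threshold" in conditions and quality_threshold is not None
            and outcome.get("quality", 1.0) < quality_threshold):
        return True
    return False

conditions = ["error", "timeout", "cost_limit", "quality_threshold"]
print(should_fall_back({"cost": 0.12}, conditions, cost_limit=0.10))  # True
print(should_fall_back({"cost": 0.02, "quality": 0.95}, conditions,
                       cost_limit=0.10, quality_threshold=0.9))       # False
```

Each condition is checked independently, so any single violation is enough to route the retry to the next provider in the fallback list.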

Custom Provider Integration

from recoagent.llm import ProviderFactory, MultiLLMConfig
from langchain.schema.language_model import BaseLanguageModel

# Create custom provider
class CustomLLMProvider(BaseLanguageModel):
    """Custom LLM provider implementation."""

    def __init__(self, model_name: str, api_key: str):
        self.model_name = model_name
        self.api_key = api_key

    def _generate(self, messages, **kwargs):
        # Custom implementation
        return "Custom LLM response"

    @property
    def _llm_type(self):
        return "custom"

# Create configuration with custom provider
config = MultiLLMConfig(
    primary_provider="custom",
    fallback_providers=["openai", "anthropic"],
    custom_providers={
        "custom": CustomLLMProvider("custom-model", "api-key")
    }
)

# Create factory
factory = ProviderFactory(config=config)

# Use custom provider
llm = factory.get_provider("custom")
response = llm.invoke("Test custom provider")

print(f"Provider: {factory.get_current_provider()}")
print(f"Response: {response}")

Real-time Provider Monitoring

from recoagent.llm import ProviderFactory, MultiLLMConfig
import asyncio

# Create configuration with monitoring
config = MultiLLMConfig(
    primary_provider="openai",
    fallback_providers=["anthropic", "google", "groq"],
    routing_strategy="cost",
    enable_monitoring=True,
    monitoring_interval=60  # seconds
)

# Create factory
factory = ProviderFactory(config=config)

async def monitor_providers():
    """Monitor provider performance in real-time."""
    while True:
        # Get provider statistics
        stats = factory.get_provider_statistics()

        print("=== Provider Statistics ===")
        for provider, metrics in stats.items():
            print(f"{provider}:")
            print(f"  Success rate: {metrics['success_rate']:.2%}")
            print(f"  Avg latency: {metrics['avg_latency']:.2f}s")
            print(f"  Avg cost: ${metrics['avg_cost']:.4f}")
            print(f"  Quality score: {metrics['quality_score']:.3f}")

        # Check for provider issues
        for provider, metrics in stats.items():
            if metrics['success_rate'] < 0.9:
                print(f"⚠️ Warning: {provider} success rate below 90%")

            if metrics['avg_latency'] > 10.0:
                print(f"⚠️ Warning: {provider} latency above 10s")

        await asyncio.sleep(60)  # Monitor every minute

# Run monitoring
asyncio.run(monitor_providers())
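The statistics printed above can be understood as simple aggregates over recorded request samples. A self-contained sketch of that aggregation (hypothetical field names, not recoagent's actual bookkeeping):

```python
# Hypothetical sketch of per-provider statistics from recorded samples.
def aggregate(samples):
    """samples: list of dicts with 'ok' (bool), 'latency' (s), 'cost' ($)."""
    n = len(samples)
    return {
        "success_rate": sum(s["ok"] for s in samples) / n,
        "avg_latency": sum(s["latency"] for s in samples) / n,
        "avg_cost": sum(s["cost"] for s in samples) / n,
    }

samples = [
    {"ok": True, "latency": 1.2, "cost": 0.002},
    {"ok": True, "latency": 0.8, "cost": 0.001},
    {"ok": False, "latency": 10.0, "cost": 0.0},
]
stats = aggregate(samples)
print(f"success_rate={stats['success_rate']:.2%}")  # success_rate=66.67%
```

A rolling window (e.g. the last N requests per provider) rather than an all-time average keeps the alerts in the monitoring loop responsive to recent degradation.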

API Reference

ProviderFactory Methods

get_provider(provider_name: str = None) -> BaseLanguageModel

Get LLM provider instance

Parameters:

  • provider_name (str, optional): Specific provider name

Returns: LLM provider instance

get_current_provider() -> str

Get currently active provider name

Returns: Provider name

get_last_request_cost() -> float

Get cost of last request

Returns: Cost in USD

get_last_quality_score() -> float

Get quality score of last request

Returns: Quality score (0.0-1.0)

was_fallback_used() -> bool

Check if fallback was used in last request

Returns: True if fallback was used

MultiLLMConfig Methods

add_provider(provider_name: str, config: Dict) -> None

Add new provider configuration

Parameters:

  • provider_name (str): Provider name
  • config (Dict): Provider configuration

update_cost_limits(limits: Dict[str, float]) -> None

Update cost limits for providers

Parameters:

  • limits (Dict): Cost limits per provider

update_latency_limits(limits: Dict[str, float]) -> None

Update latency limits for providers

Parameters:

  • limits (Dict): Latency limits per provider
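The update methods above merge new per-provider limits over the existing ones. The intended semantics can be sketched with a plain dict (an illustration of the merge behavior, not recoagent's code):

```python
# Hypothetical sketch of update_cost_limits / update_latency_limits semantics.
def update_limits(current, updates):
    """Merge new per-provider limits over existing ones; others are unchanged."""
    merged = dict(current)
    merged.update(updates)
    return merged

limits = {"openai": 0.10, "anthropic": 0.15}
print(update_limits(limits, {"anthropic": 0.12, "groq": 0.01}))
# {'openai': 0.1, 'anthropic': 0.12, 'groq': 0.01}
```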

LLMRouter Methods

route_request(request_type: str, priority: str, context: Dict) -> str

Route request to optimal provider

Parameters:

  • request_type (str): Type of request
  • priority (str): Request priority
  • context (Dict): Request context

Returns: Selected provider name

update_routing_strategy(strategy: str) -> None

Update routing strategy

Parameters:

  • strategy (str): New routing strategy
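As a self-contained illustration of the "cost" strategy, the router's choice can be sketched as picking the cheapest provider whose recent latency is within its limit (hypothetical scoring, not the shipped router):

```python
# Hypothetical sketch of cost-based routing with a latency guard.
def pick_provider(costs, latencies, latency_limits):
    """Return the cheapest provider whose observed latency is within limits."""
    eligible = [p for p in costs
                if latencies.get(p, 0.0) <= latency_limits.get(p, float("inf"))]
    if not eligible:
        raise RuntimeError("no eligible provider")
    return min(eligible, key=lambda p: costs[p])

costs = {"groq": 0.01, "google": 0.05, "openai": 0.10}
latencies = {"groq": 6.0, "google": 1.5, "openai": 2.0}
limits = {"groq": 3.0, "google": 2.0, "openai": 5.0}
print(pick_provider(costs, latencies, limits))  # groq is too slow -> google
```

The "latency" and "quality" strategies follow the same shape with a different sort key; "round_robin" simply cycles through the eligible providers.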

See Also