
Recommendation Evaluation

An evaluation framework for recommendation systems, covering ranking metrics, A/B testing, and performance analysis.

Overview

The recommendation evaluation system supports both offline evaluation against historical data and online evaluation through A/B tests.

Core Features

  • Evaluation Metrics: Precision, Recall, NDCG (Normalized Discounted Cumulative Gain), MAP (Mean Average Precision), and more
  • A/B Testing: Online evaluation and experimentation
  • Offline Evaluation: Historical data evaluation
  • Performance Analysis: Detailed performance insights
  • Statistical Testing: Significance testing for improvements
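To make the ranking metrics listed above concrete, here is a minimal sketch of how Precision@k, Recall@k, NDCG@k, and average precision are commonly defined for a single user with binary relevance. The function names and the input shapes (a ranked list of item IDs and a set of relevant item IDs) are assumptions made for illustration; they are not taken from the recoagent implementation, which may compute these metrics differently.

```python
import math
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(position + 2)  # position is 0-based, so rank 1 -> log2(2)
        for position, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant item occurs."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


# Hypothetical data for one user
ranked_items = ["a", "b", "c", "d"]
relevant_items = {"a", "c", "e"}
print(f"Precision@2: {precision_at_k(ranked_items, relevant_items, 2):.3f}")
print(f"Recall@4: {recall_at_k(ranked_items, relevant_items, 4):.3f}")
print(f"NDCG@3: {ndcg_at_k(ranked_items, relevant_items, 3):.3f}")
print(f"AP: {average_precision(ranked_items, relevant_items):.3f}")
```

MAP is the mean of average_precision over all evaluated users; the other metrics are usually averaged across users in the same way.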

Usage Examples

Basic Evaluation

from recoagent.recommendations.evaluation import RecommendationEvaluator

# Create evaluator
evaluator = RecommendationEvaluator()

# Evaluate recommendations
# (`recommendations` and `ground_truth` must be prepared by the caller beforehand)
metrics = evaluator.evaluate(
    recommendations=recommendations,
    ground_truth=ground_truth,
    metrics=["precision", "recall", "ndcg"],
)

print(f"Precision: {metrics['precision']:.3f}")
print(f"Recall: {metrics['recall']:.3f}")
print(f"NDCG: {metrics['ndcg']:.3f}")

Advanced A/B Testing

from recoagent.recommendations.evaluation import ABTestEvaluator

# Create A/B test evaluator
ab_evaluator = ABTestEvaluator()

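# Run A/B test
# (`control_recs`, `treatment_recs`, and `user_assignments` must be prepared by the caller beforehand)
ab_results = ab_evaluator.run_ab_test(
    control_recommendations=control_recs,
    treatment_recommendations=treatment_recs,
    user_assignments=user_assignments,
    evaluation_period=30,  # days
)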

# Analyze results
significance = ab_evaluator.test_significance(ab_results)
print(f"Significant improvement: {significance}")
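The documentation does not say which statistical test test_significance applies. As an illustration of what a significance check on A/B results can look like, the sketch below runs a two-sided two-proportion z-test on conversion counts using only the standard library. The function name, its inputs, and the choice of test are assumptions for this example, not the library's method.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    conv_control: int,
    n_control: int,
    conv_treatment: int,
    n_treatment: int,
) -> Tuple[float, float]:
    """Two-sided z-test for a difference in conversion rates.

    Returns (z, p_value). A positive z means the treatment rate is higher.
    """
    if n_control <= 0 or n_treatment <= 0:
        raise ValueError("Both groups need at least one user")

    rate_control = conv_control / n_control
    rate_treatment = conv_treatment / n_treatment

    # Pooled rate under the null hypothesis that both groups convert equally
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    if std_err == 0:
        # Every user (or no user) converted in both groups: no evidence of a difference
        return 0.0, 1.0

    z = (rate_treatment - rate_control) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 100/1000 conversions in control, 150/1000 in treatment
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}, significant at 5%: {p < 0.05}")
```

A z-test of this kind suits binary outcomes such as clicks or purchases with reasonably large groups; continuous outcomes such as revenue per user usually call for a t-test or a bootstrap instead.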

API Reference

RecommendationEvaluator Methods

evaluate(recommendations: List, ground_truth: List, metrics: List[str]) -> Dict

Computes the requested metrics for a set of recommendations against ground truth.

Parameters:

  • recommendations (List): Generated recommendations
  • ground_truth (List): Ground truth data
  • metrics (List[str]): Metrics to compute

Returns: Dictionary of metric values

ABTestEvaluator Methods

run_ab_test(control_recommendations: List, treatment_recommendations: List, user_assignments: Dict, evaluation_period: int) -> ABTestResults

Runs an A/B test comparing control and treatment recommendations.

Parameters:

  • control_recommendations (List): Control group recommendations
  • treatment_recommendations (List): Treatment group recommendations
  • user_assignments (Dict): User group assignments
  • evaluation_period (int): Evaluation period in days

Returns: An ABTestResults object containing the A/B test results
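The structure of user_assignments is not specified beyond being a Dict. One common way to build such a mapping is deterministic hash-based bucketing, sketched below. It assumes the dictionary maps user IDs to the strings "control" or "treatment"; that shape, the assign_groups helper, and the experiment name are hypothetical and should be checked against what run_ab_test actually expects.

```python
import hashlib
from typing import Dict, Iterable


def assign_groups(
    user_ids: Iterable[str],
    experiment: str,
    treatment_share: float = 0.5,
) -> Dict[str, str]:
    """Deterministically assign each user to "control" or "treatment".

    Hashing the experiment name together with the user ID gives every user
    a stable bucket, so the same user always lands in the same group for a
    given experiment, while different experiments split users independently.
    """
    assignments: Dict[str, str] = {}
    for user_id in user_ids:
        digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
        # Map the first 32 bits of the hash to a number in [0, 1)
        bucket = int(digest[:8], 16) / 0x100000000
        assignments[user_id] = "treatment" if bucket < treatment_share else "control"
    return assignments


# Hypothetical user IDs and experiment name
user_assignments = assign_groups(["u1", "u2", "u3", "u4"], experiment="ranker_v2")
print(user_assignments)
```

Hashing avoids storing assignments separately and keeps them reproducible across sessions, which matters when the evaluation period spans many days.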

See Also