Recommendation Bandits
Multi-armed bandit algorithms for adaptive and efficient recommendation systems.
Overview
The recommendation bandits system provides multi-armed bandit algorithms for adaptive recommendation strategies, balancing exploration of uncertain items against exploitation of items known to perform well.
Core Features
- Multiple Algorithms: Thompson Sampling, UCB (Upper Confidence Bound), Epsilon-Greedy
- Adaptive Learning: Dynamic strategy adjustment
- Exploration vs Exploitation: Balanced exploration strategies
- Contextual Bandits: Context-aware recommendations
- Performance Optimization: Efficient bandit implementations
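To make the exploration-vs-exploitation trade-off concrete, here is a minimal, self-contained epsilon-greedy sketch (not the library's implementation, just an illustration of the idea): with probability epsilon the bandit explores a random arm, otherwise it exploits the arm with the best observed mean reward.

```python
import random

class EpsilonGreedyBandit:
    """Minimal epsilon-greedy sketch: explore with probability epsilon,
    otherwise exploit the arm with the highest observed mean reward."""

    def __init__(self, n_arms, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # number of pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm

    def select_arm(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.counts))  # explore
        # exploit: arm with the best observed mean
        return max(range(len(self.counts)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        self.counts[arm] += 1
        # incremental mean update: avoids storing the full reward history
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

Higher epsilon means more exploration; epsilon decaying over time is a common refinement once reward estimates stabilize.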
Usage Examples
Basic Bandit Algorithm
```python
from recoagent.recommendations.bandits import ThompsonSamplingBandit

# Create a Thompson Sampling bandit with 10 arms and a Beta(1, 1) prior
bandit = ThompsonSamplingBandit(
    n_arms=10,
    alpha=1.0,
    beta=1.0,
)

# Select an arm (recommendation)
arm = bandit.select_arm(context={"user_id": "user_123"})

# Update the bandit with the observed reward
bandit.update(arm, reward=0.8)
```
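The `alpha` and `beta` parameters above are the prior pseudo-counts of successes and failures. A minimal sketch of the underlying idea (an illustration, not the library's internals): each arm keeps a Beta posterior over its success probability, and selection draws one sample per arm and picks the largest.

```python
import random

class BetaThompsonBandit:
    """Sketch of Bernoulli Thompson Sampling: each arm keeps a
    Beta(alpha, beta) posterior over its success probability."""

    def __init__(self, n_arms, alpha=1.0, beta=1.0):
        # alpha/beta are prior pseudo-counts of successes/failures
        self.alpha = [alpha] * n_arms
        self.beta = [beta] * n_arms

    def select_arm(self):
        # sample one plausible success rate per arm, pick the best sample
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return samples.index(max(samples))

    def update(self, arm, reward):
        # treat a reward in [0, 1] as a fractional success
        self.alpha[arm] += reward
        self.beta[arm] += 1.0 - reward
```

Arms with little evidence have wide posteriors, so they still get sampled occasionally; arms with strong evidence of high reward dominate over time.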
Advanced Contextual Bandit
```python
from recoagent.recommendations.bandits import ContextualBandit

# Create a contextual bandit
# (context_dim should match the total length of the context feature vectors)
contextual_bandit = ContextualBandit(
    algorithm="linucb",
    context_dim=6,
    exploration_parameter=0.1,
)

# Select an arm with context
context = {
    "user_features": [0.1, 0.5, 0.3],
    "item_features": [0.2, 0.4, 0.6],
}
arm = contextual_bandit.select_arm(context=context)

# Update with the observed reward and the same context
contextual_bandit.update(arm, reward=0.9, context=context)
```
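For intuition about the `"linucb"` algorithm and its `exploration_parameter`, here is a self-contained sketch of disjoint LinUCB (an illustration under standard assumptions, not the library's code): each arm fits a ridge-regression estimate of reward as a linear function of the context, plus an upper-confidence bonus scaled by `alpha`.

```python
import numpy as np

class LinUCB:
    """Sketch of disjoint LinUCB: per-arm linear reward model
    plus a confidence bonus that encourages exploration."""

    def __init__(self, n_arms, context_dim, alpha=0.1):
        self.alpha = alpha
        # per-arm design matrix (starts as identity = ridge regularizer)
        self.A = [np.eye(context_dim) for _ in range(n_arms)]
        # per-arm accumulated reward-weighted contexts
        self.b = [np.zeros(context_dim) for _ in range(n_arms)]

    def select_arm(self, x):
        x = np.asarray(x, dtype=float)
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                            # ridge estimate
            bonus = self.alpha * np.sqrt(x @ A_inv @ x)  # confidence width
            scores.append(theta @ x + bonus)
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        x = np.asarray(x, dtype=float)
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```

A larger `alpha` widens the confidence bonus and makes the bandit explore rarely-seen contexts more aggressively.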
API Reference
ThompsonSamplingBandit Methods
`select_arm(context: Dict = None) -> int`

Select an arm using Thompson Sampling.

Parameters:
- `context` (Dict, optional): Context information

Returns: the selected arm index.

`update(arm: int, reward: float) -> None`

Update the bandit with an observed reward.

Parameters:
- `arm` (int): Arm index
- `reward` (float): Reward value
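The same `select_arm`/`update` contract applies to the UCB variant mentioned under Core Features. A minimal UCB1 sketch (illustrative, not the library's implementation) shows how the confidence bonus shrinks as an arm accumulates pulls:

```python
import math

class UCB1Bandit:
    """Sketch of UCB1 with the same select/update contract: score each
    arm by mean reward plus a bonus that shrinks with more pulls."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms    # pulls per arm
        self.values = [0.0] * n_arms  # mean reward per arm
        self.total = 0                # total pulls across arms

    def select_arm(self):
        # pull every arm once before applying the UCB formula
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm

        def ucb(arm):
            bonus = math.sqrt(2.0 * math.log(self.total) / self.counts[arm])
            return self.values[arm] + bonus

        return max(range(len(self.counts)), key=ucb)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.total += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

Unlike epsilon-greedy, UCB1 is deterministic given the observed rewards: under-explored arms win on the bonus term rather than by random chance.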
See Also
- Recommendation Algorithms - Core recommendation algorithms
- Recommendation Agents - Agent-based recommendation workflows