
Document Search & Summarization

Practical examples demonstrating the Document Search & Summarization service with hybrid retrieval, reranking, and grounded summarization.

Overview

This guide provides code examples and patterns for:

  • Setting up the document search pipeline
  • Executing searches with different profiles
  • Understanding hybrid retrieval (BM25 + Vector), with a score-fusion sketch after this list
  • Working with grounded summarization and citations
  • Production deployment patterns
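
At the retrieval stage, hybrid search fuses lexical (BM25) and dense (vector) scores, typically weighted by the alpha parameter mentioned in Troubleshooting below. A minimal sketch of the idea, not the library's internal implementation (score normalization and the dict shapes are assumptions):

# Illustrative alpha-weighted fusion of BM25 and vector scores.
# alpha=1.0 -> pure vector search, alpha=0.0 -> pure BM25.
def fuse_scores(bm25_hits: dict, vector_hits: dict, alpha: float = 0.5) -> list:
    # bm25_hits / vector_hits: doc_id -> score normalized to [0, 1]
    doc_ids = set(bm25_hits) | set(vector_hits)
    scored = [
        (doc_id,
         alpha * vector_hits.get(doc_id, 0.0) + (1 - alpha) * bm25_hits.get(doc_id, 0.0))
        for doc_id in doc_ids
    ]
    # Highest fused score first; a reranker can then reorder the top-k
    return sorted(scored, key=lambda pair: pair[1], reverse=True)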

Quick Start

from packages.rag.document_search import (
    DocumentSearchPipeline,
    ProfileType,
    OpenSearchDocumentStore
)

# Initialize store
store = OpenSearchDocumentStore(
    host="localhost",
    port=9200,
    index_name="documents"
)

# Create pipeline with balanced profile
pipeline = DocumentSearchPipeline.create_profile(
    profile=ProfileType.BALANCED,
    store=store,
    embedding_function=your_embedding_function
)

# Execute search and summarization
result = pipeline.execute(
    query="How do I reset my password?",
    filters={"category": "account"}
)

# Access results
print(f"Summary: {result.summary.text}")
print(f"Citations: {len(result.summary.citations)}")
print(f"Latency: {result.timing['total_ms']:.2f}ms")
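
The citation objects appear to expose at least a document_title (it is used in the support pattern below), and citations behaves like a dict. A minimal sketch for inspecting them, under those assumptions:

# Inspect the individual citations backing the summary
# (assumes result.summary.citations is a dict of citation objects
#  with a document_title attribute, as used later in this guide)
for citation_id, citation in result.summary.citations.items():
    print(f"[{citation_id}] {citation.document_title}")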

Example Dataset

All examples use a common test dataset with 10 queries across 3 user stories:

  1. Knowledge Assistant (4 queries) - Customer support scenarios
  2. Policy/Compliance QA (3 queries) - Regulatory compliance
  3. Engineering Docs (3 queries) - Technical documentation

Running Examples

# Run the main demo
python examples/document_search_demo.py

# Run specific examples (coming soon)
python examples/document_search/basic_usage_example.py
python examples/document_search/profile_comparison_example.py

Prerequisites

Required Dependencies

pip install opensearch-py langchain langchain-openai sumy nltk

Optional Dependencies

pip install sentence-transformers  # For local embeddings
pip install anthropic # For Claude models
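
If you go with local embeddings, the embedding_function passed to create_profile can be a thin wrapper around sentence-transformers. A minimal sketch, assuming the pipeline accepts any callable that maps text to a vector (the model name is only an example):

from sentence_transformers import SentenceTransformer

# Load a small local model once at startup (example model)
_model = SentenceTransformer("all-MiniLM-L6-v2")

def your_embedding_function(text: str) -> list[float]:
    # Encode a single query or document into a dense vector
    return _model.encode(text).tolist()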

OpenSearch Setup

# Using Docker
docker run -p 9200:9200 -p 9600:9600 \
  -e "discovery.type=single-node" \
  opensearchproject/opensearch:latest
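
Once the container is up, you can verify connectivity with opensearch-py. This sketch assumes the security plugin is disabled for local testing; otherwise add the auth/SSL settings your cluster requires:

from opensearchpy import OpenSearch

# Connect to the local single-node cluster started above
client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# Prints cluster name and version info if the node is reachable
print(client.info())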

Common Patterns

Pattern 1: Customer Support KB Search

# Customer support KB search, wrapped in a helper so the cited answer
# can be returned to the support agent (the function name is illustrative)
def answer_support_question(pipeline, customer_question):
    result = pipeline.execute(
        query=customer_question,
        filters={"category": "support", "status": "published"}
    )

    # Return cited answer to support agent
    return {
        "answer": result.summary.text,
        "sources": [c.document_title for c in result.summary.citations.values()],
        "confidence": result.summary.faithfulness
    }
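
Calling the helper from a ticket workflow might look like this (the question text is illustrative):

# Hand the cited answer to the agent-facing UI or ticket system
response = answer_support_question(pipeline, "How do I update my billing address?")
print(response["answer"])
print("Sources:", ", ".join(response["sources"]))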

Pattern 2: Compliance Research

# Use quality-first profile for compliance
pipeline = DocumentSearchPipeline.create_profile(
    profile=ProfileType.QUALITY_FIRST,
    store=store,
    embedding_function=embedding_fn
)

result = pipeline.execute(
    query=compliance_question,
    filters={"document_type": "policy", "approved": True}
)

# Verify high faithfulness
if result.summary.faithfulness < 0.95:
    log_warning("Faithfulness below compliance threshold")

Pattern 3: Latency-First Auto-Complete

# Use latency-first for auto-complete
pipeline = DocumentSearchPipeline.create_profile(
    profile=ProfileType.LATENCY_FIRST,
    store=store,
    embedding_function=embedding_fn
)

# Fast results for instant feedback
result = pipeline.execute(
    query=partial_query,
    filters={"recent": True}
)

# Should complete in < 250ms
assert result.slo_met

Evaluation

All examples include evaluation metrics:

from packages.rag.document_search.test_fixtures import get_all_fixtures

fixtures = get_all_fixtures()

for test_case in fixtures:
    result = pipeline.execute(
        query=test_case.query,
        filters=test_case.filters
    )

    # Check SLO compliance
    assert result.slo_met
    assert result.summary.faithfulness >= test_case.slo_requirements["faithfulness"]
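
Beyond per-query asserts, a run-level summary is often useful. A minimal sketch that aggregates only fields already shown above (timing["total_ms"] and summary.faithfulness):

# Aggregate run-level metrics across the fixture set (illustrative)
latencies = []
faithfulness_scores = []

for test_case in fixtures:
    result = pipeline.execute(query=test_case.query, filters=test_case.filters)
    latencies.append(result.timing["total_ms"])
    faithfulness_scores.append(result.summary.faithfulness)

p95_latency = sorted(latencies)[int(0.95 * (len(latencies) - 1))]
print(f"p95 latency: {p95_latency:.2f}ms")
print(f"mean faithfulness: {sum(faithfulness_scores) / len(faithfulness_scores):.3f}")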

Troubleshooting

Issue: Slow searches

Solution: Check profile configuration, enable caching, optimize filters

Issue: Low faithfulness

Solution: Verify source quality, adjust confidence threshold, use extractive mode

Issue: Poor relevance

Solution: Tune alpha parameter, enable query expansion, adjust reranking

Issue: High costs

Solution: Use extractive summarization, implement caching, optimize LLM calls
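
Two of the fixes above (slow searches and high costs) come down to caching repeated queries. A minimal sketch of a cache wrapper around the pipeline; the key scheme is an assumption, and a production setup would use a TTL-based external cache rather than an in-process lru_cache:

import json
from functools import lru_cache

# Cache keyed by query + serialized filters; filters are JSON-encoded
# so the cache key is hashable and stable across calls
@lru_cache(maxsize=1024)
def _cached_execute(query: str, filters_json: str):
    return pipeline.execute(query=query, filters=json.loads(filters_json))

def cached_search(query: str, filters: dict):
    return _cached_execute(query, json.dumps(filters, sort_keys=True))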

Next Steps

  1. Read the Architecture Guide
  2. Explore the Complete Guide
  3. Check the Storage & Indexing guide
  4. See API Integration examples

Additional Resources