RAG Document Search
Document search pipeline with advanced search capabilities and summarization features.
Overview
The RAG document search system indexes documents, searches them with full-text, semantic, or hybrid queries, and can summarize the results it returns.
Core Features
- Document Indexing: Efficient document indexing and storage
- Advanced Search: Full-text, semantic, and hybrid search
- Summarization: Automatic document summarization
- Caching: Smart caching for improved performance
- Analytics: Search analytics and insights
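Hybrid search has to merge a full-text ranking with a semantic ranking into one list. The sketch below shows one common way to do that, reciprocal rank fusion (RRF). It is an illustration only: this page does not say which fusion method `DocumentSearchEngine` uses, and the function name, the constant `k = 60`, and the sample rankings are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document ids into one ranking.

    Hypothetical helper, not part of the library described on this page.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); higher ranks contribute more
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, descending; break ties by id so the order is deterministic
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


full_text_hits = ["doc1", "doc2", "doc3"]  # assumed keyword ranking
semantic_hits = ["doc2", "doc3", "doc1"]   # assumed embedding ranking
fused = reciprocal_rank_fusion([full_text_hits, semantic_hits])
print(fused)
```

RRF uses only rank positions, so the two retrievers do not need comparable score scales.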
Usage Examples
Basic Document Search
from recoagent.rag.document_search import DocumentSearchEngine
# Create document search engine
search_engine = DocumentSearchEngine()
# Index documents
search_engine.index_documents([
    {"id": "doc1", "content": "Document content...", "metadata": {"title": "Doc 1"}},
    {"id": "doc2", "content": "Another document...", "metadata": {"title": "Doc 2"}},
])
# Search documents
results = search_engine.search("query text", limit=10)
Advanced Search with Summarization
# Search with summarization
search_results = search_engine.search_with_summarization(
    query="machine learning algorithms",
    summarize_results=True,
    summary_length=200,
)
# Get summarized results
for result in search_results:
    print(f"Title: {result.title}")
    print(f"Summary: {result.summary}")
    print(f"Relevance: {result.relevance_score}")
API Reference
DocumentSearchEngine Methods
index_documents(documents: List[Dict]) -> None
Index documents for search
Parameters:
documents (List[Dict]): List of documents to index. In the usage example above, each document is a dict with "id", "content", and "metadata" keys.
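Because `index_documents` takes plain dicts, it can help to check a batch before indexing it. The following is a minimal sketch, assuming the document shape shown in the usage example (`id`, `content`, and an optional `metadata` dict). The helper `validate_documents` is hypothetical, and whether the library enforces these rules itself is not stated here.

```python
from typing import Any, Dict, List


def validate_documents(documents: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems; an empty list means the batch looks well-formed.

    Hypothetical pre-flight check; the required keys are inferred from the
    usage example, not from a documented schema.
    """
    problems: List[str] = []
    seen_ids = set()
    for position, doc in enumerate(documents):
        doc_id = doc.get("id")
        if not isinstance(doc_id, str) or not doc_id:
            problems.append(f"document {position}: missing or empty 'id'")
        elif doc_id in seen_ids:
            problems.append(f"document {position}: duplicate id {doc_id!r}")
        else:
            seen_ids.add(doc_id)
        if not isinstance(doc.get("content"), str):
            problems.append(f"document {position}: 'content' must be a string")
        # Treat metadata as optional, but require a dict when it is present
        if not isinstance(doc.get("metadata", {}), dict):
            problems.append(f"document {position}: 'metadata' must be a dict")
    return problems


batch = [
    {"id": "doc1", "content": "Document content...", "metadata": {"title": "Doc 1"}},
    {"id": "doc1", "content": 42},
]
for problem in validate_documents(batch):
    print(problem)
```

Collecting every problem before returning, rather than raising on the first one, lets a caller fix a whole batch in one pass.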
search(query: str, limit: int = 10) -> List[SearchResult]
Search documents
Parameters:
query (str): Search query
limit (int): Maximum results
Returns: List of search results
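Since `search` is keyed by just a query string and a limit, repeated calls are easy to memoize on the caller's side. The sketch below wraps any callable with that shape in an LRU cache. It is not the engine's built-in caching, whose behavior is not documented on this page. `make_cached_search` and `fake_search` are hypothetical names, and the query normalization assumes that search is case-insensitive.

```python
from functools import lru_cache
from typing import Callable, List, Tuple


def make_cached_search(search_fn: Callable[[str, int], List], maxsize: int = 128):
    """Wrap a search callable so identical (query, limit) calls reuse earlier results."""

    @lru_cache(maxsize=maxsize)
    def _cached(query: str, limit: int) -> Tuple:
        # Store a tuple so callers cannot mutate the cached entry
        return tuple(search_fn(query, limit))

    def cached_search(query: str, limit: int = 10) -> List:
        # Assumption: queries differing only in case or outer whitespace are equivalent
        return list(_cached(query.strip().lower(), limit))

    # Expose the cache controls so the cache can be cleared after re-indexing
    cached_search.cache_info = _cached.cache_info
    cached_search.cache_clear = _cached.cache_clear
    return cached_search


calls: List[str] = []


def fake_search(query: str, limit: int) -> List[str]:
    """Stand-in for a real search backend; records each call it receives."""
    calls.append(query)
    return [f"result for {query}"][:limit]


search = make_cached_search(fake_search)
first = search("Machine Learning")
second = search("  machine learning ")  # served from the cache
print(len(calls), search.cache_info().hits)
```

To try this with the engine, pass `lambda q, n: search_engine.search(q, limit=n)` as `search_fn`, and call `cache_clear()` after indexing new documents so stale results are not served.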
See Also
- RAG Retrievers - Document retrieval
- RAG Chunkers - Document chunking