MongoDB Atlas Vector Search API Reference
Complete API reference for MongoDB Atlas Vector Search integration with RecoAgent.
MongoDBAtlasVectorStore
Constructor
MongoDBAtlasVectorStore(
    uri: str,
    database: str = "recoagent",
    collection: str = "documents",
    vector_search_index: str = "vector_index",
    embedding_dim: int = 3072,
    max_pool_size: int = 100,
    min_pool_size: int = 10,
    max_idle_time_ms: int = 30000,
    connect_timeout_ms: int = 10000,
    server_selection_timeout_ms: int = 10000
)
Parameters:
- uri (str): MongoDB connection URI
- database (str): Database name
- collection (str): Collection name
- vector_search_index (str): Vector search index name
- embedding_dim (int): Embedding dimension size
- max_pool_size (int): Maximum connection pool size
- min_pool_size (int): Minimum connection pool size
- max_idle_time_ms (int): Maximum idle time for connections
- connect_timeout_ms (int): Connection timeout in milliseconds
- server_selection_timeout_ms (int): Server selection timeout in milliseconds
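The pool and timeout parameters resemble standard MongoDB driver options. As a minimal sketch, assuming the store forwards them to a PyMongo MongoClient (the source does not confirm this), the mapping might look like the following. build_client_options is a hypothetical helper, not part of RecoAgent; the camelCase keys are PyMongo's keyword argument names.

```python
from typing import Any, Dict


def build_client_options(
    max_pool_size: int = 100,
    min_pool_size: int = 10,
    max_idle_time_ms: int = 30000,
    connect_timeout_ms: int = 10000,
    server_selection_timeout_ms: int = 10000,
) -> Dict[str, Any]:
    """Hypothetical helper: map constructor arguments to PyMongo option names."""
    if min_pool_size > max_pool_size:
        raise ValueError("min_pool_size must not exceed max_pool_size")
    return {
        "maxPoolSize": max_pool_size,
        "minPoolSize": min_pool_size,
        "maxIdleTimeMS": max_idle_time_ms,
        "connectTimeoutMS": connect_timeout_ms,
        "serverSelectionTimeoutMS": server_selection_timeout_ms,
    }


# Defaults mirror the constructor signature; only the pool size is overridden.
options = build_client_options(max_pool_size=50)
```

With PyMongo installed, a dictionary like this could be passed as MongoClient(uri, **options).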
Methods
add_documents(documents: List[VectorDocument]) -> bool
Add documents to the vector store.
Parameters:
- documents (List[VectorDocument]): List of documents to add
Returns:
- bool: True if successful, False otherwise
Example:
documents = [
    VectorDocument(
        id="doc1",
        content="Machine learning content",
        embedding=[0.1, 0.2, 0.3, ...],
        metadata={"category": "AI"}
    )
]
success = vector_store.add_documents(documents)
add_documents_async(documents: List[VectorDocument]) -> bool
Add documents asynchronously to the vector store.
Parameters:
- documents (List[VectorDocument]): List of documents to add
Returns:
- bool: True if successful, False otherwise
Example:
success = await vector_store.add_documents_async(documents)
search(query_embedding: List[float], k: int = 5, include_metadata: bool = True, filter_metadata: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]
Search documents using vector similarity.
Parameters:
- query_embedding (List[float]): Query vector embedding
- k (int): Number of results to return
- include_metadata (bool): Whether to include metadata in results
- filter_metadata (Optional[Dict]): Metadata filters to apply
Returns:
- List[Dict[str, Any]]: List of search results
Example:
query_embedding = [0.1, 0.2, 0.3, ...]
results = vector_store.search(
    query_embedding=query_embedding,
    k=10,
    include_metadata=True,
    filter_metadata={"category": "AI"}
)
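To make the search parameters more concrete, here is a rough sketch of the kind of Atlas $vectorSearch aggregation pipeline such a call could translate into. The source does not show the store's internals, so this is an assumption: build_vector_search_pipeline is a hypothetical helper, the "embedding" path and the "metadata." prefix are assumed field names, and the numCandidates heuristic is illustrative only.

```python
from typing import Any, Dict, List, Optional


def build_vector_search_pipeline(
    query_embedding: List[float],
    k: int = 5,
    index: str = "vector_index",
    path: str = "embedding",  # assumed name of the stored embedding field
    include_metadata: bool = True,
    filter_metadata: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: build an Atlas $vectorSearch aggregation pipeline."""
    stage: Dict[str, Any] = {
        "index": index,
        "path": path,
        "queryVector": query_embedding,
        # Oversample candidates relative to k; Atlas caps numCandidates at 10000.
        "numCandidates": min(max(k * 10, 100), 10000),
        "limit": k,
    }
    if filter_metadata:
        # Assumes metadata lives in a "metadata" sub-document and that the
        # filtered fields are declared as filter fields in the index definition.
        stage["filter"] = {
            f"metadata.{key}": {"$eq": value}
            for key, value in filter_metadata.items()
        }
    projection: Dict[str, Any] = {
        "content": 1,
        "score": {"$meta": "vectorSearchScore"},
    }
    if include_metadata:
        projection["metadata"] = 1
    return [{"$vectorSearch": stage}, {"$project": projection}]


pipeline = build_vector_search_pipeline(
    [0.1, 0.2, 0.3], k=10, filter_metadata={"category": "AI"}
)
```

A pipeline of this shape would be run with collection.aggregate(pipeline) in PyMongo.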
search_async(query_embedding: List[float], k: int = 5, include_metadata: bool = True, filter_metadata: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]
Search documents asynchronously using vector similarity.
Parameters:
- query_embedding (List[float]): Query vector embedding
- k (int): Number of results to return
- include_metadata (bool): Whether to include metadata in results
- filter_metadata (Optional[Dict]): Metadata filters to apply
Returns:
- List[Dict[str, Any]]: List of search results
Example:
results = await vector_store.search_async(
    query_embedding=query_embedding,
    k=10,
    include_metadata=True
)
hybrid_search(query_text: str, query_embedding: List[float], k: int = 5, text_weight: float = 0.3, vector_weight: float = 0.7, filter_metadata: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]
Perform hybrid search combining text and vector search.
Parameters:
- query_text (str): Text query for text search
- query_embedding (List[float]): Vector embedding for vector search
- k (int): Number of results to return
- text_weight (float): Weight for text search score
- vector_weight (float): Weight for vector search score
- filter_metadata (Optional[Dict]): Metadata filters to apply
Returns:
- List[Dict[str, Any]]: List of hybrid search results
Example:
results = vector_store.hybrid_search(
    query_text="machine learning algorithms",
    query_embedding=query_embedding,
    k=10,
    text_weight=0.3,
    vector_weight=0.7
)
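The source does not say how the text and vector scores are combined. One common approach, sketched here purely as an assumption, is to min-max normalize each score set and take a weighted sum. The functions min_max_normalize and fuse_scores are hypothetical and are not part of the documented API.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores into [0, 1]; if all scores are equal, map each to 1.0."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}


def fuse_scores(
    text_scores: Dict[str, float],
    vector_scores: Dict[str, float],
    k: int = 5,
    text_weight: float = 0.3,
    vector_weight: float = 0.7,
) -> List[Tuple[str, float]]:
    """Weighted sum of normalized scores; a missing score counts as 0."""
    text_norm = min_max_normalize(text_scores)
    vector_norm = min_max_normalize(vector_scores)
    combined = {
        doc_id: text_weight * text_norm.get(doc_id, 0.0)
        + vector_weight * vector_norm.get(doc_id, 0.0)
        for doc_id in set(text_norm) | set(vector_norm)
    }
    # Sort by descending score, breaking ties by document id for determinism.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))[:k]


ranked = fuse_scores({"a": 2.0, "b": 1.0}, {"a": 0.5, "c": 0.9})
```

Normalizing first matters because text relevance scores and vector similarity scores usually live on different scales.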
faceted_search(query_embedding: List[float], facets: List[str], k: int = 5, filter_metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]
Perform faceted search with metadata aggregation.
Parameters:
- query_embedding (List[float]): Query vector embedding
- facets (List[str]): List of facet fields to aggregate
- k (int): Number of results to return
- filter_metadata (Optional[Dict]): Metadata filters to apply
Returns:
- Dict[str, Any]: Dictionary containing results and facets
Example:
results = vector_store.faceted_search(
    query_embedding=query_embedding,
    facets=["category", "year", "difficulty"],
    k=10
)
print(f"Results: {results['results']}")
print(f"Facets: {results['facets']}")
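To illustrate what metadata aggregation means here, the sketch below counts facet values over a list of results. It assumes each result is a dictionary with a "metadata" key, which the source does not specify; count_facets is a hypothetical helper and the real method may aggregate on the server instead.

```python
from collections import Counter
from typing import Any, Dict, List


def count_facets(
    results: List[Dict[str, Any]], facets: List[str]
) -> Dict[str, Dict[Any, int]]:
    """Count how often each value of each facet field appears in the results."""
    counts: Dict[str, Counter] = {facet: Counter() for facet in facets}
    for result in results:
        metadata = result.get("metadata", {})
        for facet in facets:
            if facet in metadata:
                counts[facet][metadata[facet]] += 1
    return {facet: dict(counter) for facet, counter in counts.items()}


sample_results = [
    {"id": "doc1", "metadata": {"category": "AI", "year": 2023}},
    {"id": "doc2", "metadata": {"category": "AI", "year": 2024}},
    {"id": "doc3", "metadata": {"category": "DB"}},
]
facet_counts = count_facets(sample_results, ["category", "year"])
```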
delete_documents(document_ids: List[str]) -> bool
Delete documents by their IDs.
Parameters:
- document_ids (List[str]): List of document IDs to delete
Returns:
- bool: True if successful, False otherwise
Example:
success = vector_store.delete_documents(["doc1", "doc2", "doc3"])
delete_documents_async(document_ids: List[str]) -> bool
Delete documents asynchronously by their IDs.
Parameters:
- document_ids (List[str]): List of document IDs to delete
Returns:
- bool: True if successful, False otherwise
Example:
success = await vector_store.delete_documents_async(["doc1", "doc2"])
get_stats() -> Dict[str, Any]
Get collection statistics.
Returns:
- Dict[str, Any]: Dictionary containing collection statistics
Example:
stats = vector_store.get_stats()
print(f"Total documents: {stats['total_documents']}")
print(f"Storage size: {stats['storage_size']} bytes")
create_text_index(text_fields: List[str] = None) -> None
Create text index for hybrid search.
Parameters:
- text_fields (List[str]): List of fields to index for text search
Example:
vector_store.create_text_index(['content', 'title', 'description'])
close() -> None
Close database connections.
Example:
vector_store.close()
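Because the store exposes a close() method, the standard library's contextlib.closing can guarantee it is called even when an error occurs. The sketch below uses a stand-in class, FakeVectorStore, since it cannot assume a live Atlas cluster; whether the real store also supports the with statement directly is not stated in the source.

```python
from contextlib import closing
from typing import Any, Dict


class FakeVectorStore:
    """Stand-in for MongoDBAtlasVectorStore; only the calls used here exist."""

    def __init__(self) -> None:
        self.closed = False

    def get_stats(self) -> Dict[str, Any]:
        return {"total_documents": 0}

    def close(self) -> None:
        self.closed = True


store = FakeVectorStore()
with closing(store) as vector_store:
    stats = vector_store.get_stats()
# closing() has called store.close() by this point, even if the body raised.
```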
MongoDB Retrievers
MongoDBVectorRetriever
Basic vector search retriever.
Constructor
MongoDBVectorRetriever(
    vector_store: MongoDBAtlasVectorStore,
    embedding_model: str = "text-embedding-3-large"
)
Methods
retrieve(query: str, k: int = 5, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]
Retrieve documents using vector search.
Parameters:
- query (str): Search query
- k (int): Number of results to return
- filter_metadata (Optional[Dict]): Metadata filters
Returns:
- List[RetrievalResult]: List of retrieval results
retrieve_async(query: str, k: int = 5, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]
Retrieve documents asynchronously using vector search.
MongoDBHybridRetriever
Hybrid search retriever combining text and vector search.
Constructor
MongoDBHybridRetriever(
    vector_store: MongoDBAtlasVectorStore,
    config: MongoDBHybridConfig = None,
    embedding_model: str = "text-embedding-3-large"
)
Methods
retrieve(query: str, k: int = 5, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]
Retrieve documents using hybrid search.
retrieve_async(query: str, k: int = 5, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]
Retrieve documents asynchronously using hybrid search.
MongoDBFacetedRetriever
Faceted search retriever with metadata filtering.
Constructor
MongoDBFacetedRetriever(
    vector_store: MongoDBAtlasVectorStore,
    embedding_model: str = "text-embedding-3-large"
)
Methods
retrieve(query: str, k: int = 5, facets: List[str] = None, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]
Retrieve documents using faceted search.
Parameters:
- query (str): Search query
- k (int): Number of results to return
- facets (List[str]): List of facet fields
- filter_metadata (Optional[Dict]): Metadata filters
get_facets(query: str, facets: List[str], filter_metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]
Get facet information for a query.
MongoDBAdvancedRetriever
Advanced retriever supporting multiple search strategies.
Constructor
MongoDBAdvancedRetriever(
    vector_store: MongoDBAtlasVectorStore,
    config: MongoDBHybridConfig = None,
    embedding_model: str = "text-embedding-3-large"
)
Methods
retrieve(query: str, k: int = 5, search_type: str = "hybrid", facets: List[str] = None, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]
Retrieve documents using specified search strategy.
Parameters:
- query (str): Search query
- k (int): Number of results to return
- search_type (str): Search type ("vector", "hybrid", "faceted")
- facets (List[str]): List of facet fields (for faceted search)
- filter_metadata (Optional[Dict]): Metadata filters
Search Types:
- "vector": Pure vector similarity search
- "hybrid": Combined text and vector search
- "faceted": Faceted search with metadata aggregation
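Since search_type is a plain string, callers may want to validate it before invoking the retriever. The following sketch shows one way to dispatch on the three documented values. The handler functions are placeholders, and how the retriever reacts to an unknown value is not documented, so raising ValueError is an assumption.

```python
from typing import Callable, Dict


def run_vector(query: str) -> str:
    return f"vector:{query}"  # placeholder for a pure vector search


def run_hybrid(query: str) -> str:
    return f"hybrid:{query}"  # placeholder for a combined text and vector search


def run_faceted(query: str) -> str:
    return f"faceted:{query}"  # placeholder for a faceted search


STRATEGIES: Dict[str, Callable[[str], str]] = {
    "vector": run_vector,
    "hybrid": run_hybrid,
    "faceted": run_faceted,
}


def dispatch(query: str, search_type: str = "hybrid") -> str:
    """Route the query to the handler registered for search_type."""
    try:
        handler = STRATEGIES[search_type]
    except KeyError:
        valid = ", ".join(sorted(STRATEGIES))
        raise ValueError(
            f"Unknown search_type {search_type!r}; expected one of: {valid}"
        ) from None
    return handler(query)


outcome = dispatch("machine learning", search_type="vector")
```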
retrieve_async(query: str, k: int = 5, search_type: str = "hybrid", facets: List[str] = None, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]
Retrieve documents asynchronously using specified search strategy.
get_facets(query: str, facets: List[str], filter_metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]
Get facet information for a query.
create_text_index(text_fields: List[str] = None) -> None
Create text index for hybrid search.
get_stats() -> Dict[str, Any]
Get collection statistics.
Configuration Classes
MongoDBHybridConfig
Configuration for hybrid search.
from dataclasses import dataclass

@dataclass
class MongoDBHybridConfig:
    text_weight: float = 0.3
    vector_weight: float = 0.7
    vector_k: int = 20
    text_k: int = 20
    final_k: int = 5
    enable_faceted_search: bool = True
    enable_metadata_filtering: bool = True
Fields:
- text_weight (float): Weight for text search score
- vector_weight (float): Weight for vector search score
- vector_k (int): Number of vector search results
- text_k (int): Number of text search results
- final_k (int): Final number of results to return
- enable_faceted_search (bool): Enable faceted search
- enable_metadata_filtering (bool): Enable metadata filtering
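The defaults suggest that the two weights sum to 1 and that final_k is drawn from a larger candidate pool, although the source does not state either as a requirement. As a sketch under those assumptions, a local mirror of the numeric fields (HybridConfigSketch, a hypothetical name) could check them like this:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class HybridConfigSketch:
    """Local mirror of the numeric MongoDBHybridConfig fields, for illustration."""

    text_weight: float = 0.3
    vector_weight: float = 0.7
    vector_k: int = 20
    text_k: int = 20
    final_k: int = 5

    def normalized_weights(self) -> Tuple[float, float]:
        """Return (text, vector) weights rescaled to sum to 1."""
        total = self.text_weight + self.vector_weight
        if self.text_weight < 0 or self.vector_weight < 0 or total == 0:
            raise ValueError("weights must be non-negative and not both zero")
        return self.text_weight / total, self.vector_weight / total

    def validate(self) -> None:
        """Check that the candidate pool can supply final_k results."""
        if min(self.vector_k, self.text_k, self.final_k) < 1:
            raise ValueError("vector_k, text_k and final_k must be at least 1")
        if self.final_k > self.vector_k + self.text_k:
            raise ValueError("final_k exceeds the combined candidate pool")


config = HybridConfigSketch(text_weight=1.0, vector_weight=3.0)
config.validate()
weights = config.normalized_weights()
```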
Data Types
VectorDocument
Document structure for vector storage.
from dataclasses import dataclass
from typing import Any, Dict, List

@dataclass
class VectorDocument:
    id: str
    content: str
    embedding: List[float]
    metadata: Dict[str, Any]
Fields:
- id (str): Unique document identifier
- content (str): Document content
- embedding (List[float]): Vector embedding
- metadata (Dict[str, Any]): Document metadata
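Each embedding presumably needs to match the store's embedding_dim (3072 by default). The source does not say whether add_documents checks this, so a caller-side check like the sketch below may be useful. The class is a local mirror of VectorDocument and find_dimension_mismatches is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class VectorDocument:
    """Local mirror of the documented dataclass, so the sketch is self-contained."""

    id: str
    content: str
    embedding: List[float]
    metadata: Dict[str, Any]


def find_dimension_mismatches(
    documents: List[VectorDocument], embedding_dim: int = 3072
) -> List[str]:
    """Return the ids of documents whose embedding length differs from embedding_dim."""
    return [doc.id for doc in documents if len(doc.embedding) != embedding_dim]


docs = [
    VectorDocument("good", "text", [0.1, 0.2, 0.3], {"category": "AI"}),
    VectorDocument("bad", "text", [0.1, 0.2], {}),
]
# A tiny dimension keeps the example readable; real embeddings are much longer.
mismatched = find_dimension_mismatches(docs, embedding_dim=3)
```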
RetrievalResult
Result structure for retrieval operations.
from dataclasses import dataclass

@dataclass
class RetrievalResult:
    chunk: Chunk
    score: float
    retrieval_method: str
Fields:
- chunk (Chunk): Retrieved document chunk
- score (float): Relevance score
- retrieval_method (str): Method used for retrieval
Error Handling
Common Exceptions
- ConnectionFailure: MongoDB connection failed
- ServerSelectionTimeoutError: Server selection timeout
- OperationFailure: Database operation failed
- IndexNotFound: Vector search index not found
- AuthenticationFailed: Authentication failed
Error Handling Example
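# Assumes these are the PyMongo exception classes of the same name.
from pymongo.errors import ConnectionFailure, ServerSelectionTimeoutError

try:
    results = vector_store.search(query_embedding, k=10)
except ServerSelectionTimeoutError as e:
    # Caught first: in PyMongo this is a subclass of ConnectionFailure,
    # so the more general handler below would otherwise shadow it.
    print(f"Server selection timeout: {e}")
except ConnectionFailure as e:
    print(f"Connection failed: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")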
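Connection failures and timeouts are often transient, so retrying with exponential backoff is a common companion to the handlers above. The sketch below is not part of RecoAgent: with_retries is a hypothetical helper, and Python's built-in ConnectionError stands in for the driver exceptions so the example runs without a database.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on the given exceptions with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Usage: an operation that fails twice and then succeeds.
calls = {"count": 0}
delays = []


def flaky_search() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


# delays.append replaces time.sleep so the example finishes instantly.
outcome = with_retries(flaky_search, sleep=delays.append)
```

In real use, retry_on could be set to the driver's exception classes and operation to a lambda wrapping vector_store.search.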
Performance Considerations
Query Optimization
- Use appropriate k values for your use case
- Apply metadata filtering to reduce search space
- Use connection pooling for high concurrency
- Consider async operations for better throughput
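As an illustration of the async suggestion, the sketch below runs several searches concurrently with asyncio.gather while a semaphore caps the number in flight, so the connection pool is not exhausted. FakeAsyncStore is a stand-in with a search_async method shaped like the documented one; search_many and max_concurrency are hypothetical names.

```python
import asyncio
from typing import Any, Dict, List


class FakeAsyncStore:
    """Stand-in for the real store; returns synthetic results without I/O."""

    async def search_async(
        self, query_embedding: List[float], k: int = 5
    ) -> List[Dict[str, Any]]:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [{"id": f"doc{i}", "score": 1.0 / (i + 1)} for i in range(k)]


async def search_many(
    store: FakeAsyncStore,
    embeddings: List[List[float]],
    k: int = 5,
    max_concurrency: int = 8,
) -> List[List[Dict[str, Any]]]:
    """Run one search per embedding, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(embedding: List[float]) -> List[Dict[str, Any]]:
        async with semaphore:
            return await store.search_async(query_embedding=embedding, k=k)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(bounded(e) for e in embeddings))


all_results = asyncio.run(
    search_many(FakeAsyncStore(), [[0.1, 0.2], [0.3, 0.4]], k=3)
)
```

Keeping max_concurrency at or below max_pool_size is a reasonable starting point, though the best value depends on the workload.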
Memory Management
- Monitor connection pool usage
- Close connections when done
- Use batch operations for large datasets
- Consider pagination for large result sets
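For the batching suggestion, a small helper that splits a large list into fixed-size chunks is often enough. The sketch below is generic Python and assumes nothing about RecoAgent; batched is a hypothetical helper and the batch size is arbitrary.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


batches = list(batched(list(range(7)), batch_size=3))
```

Each batch could then be passed to add_documents in turn, which bounds both memory use and the size of each write.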
This API reference provides comprehensive documentation for all MongoDB Atlas Vector Search functionality in RecoAgent.