MongoDB Atlas Vector Search API Reference

Complete API reference for MongoDB Atlas Vector Search integration with RecoAgent.

MongoDBAtlasVectorStore

Constructor

MongoDBAtlasVectorStore(
    uri: str,
    database: str = "recoagent",
    collection: str = "documents",
    vector_search_index: str = "vector_index",
    embedding_dim: int = 3072,
    max_pool_size: int = 100,
    min_pool_size: int = 10,
    max_idle_time_ms: int = 30000,
    connect_timeout_ms: int = 10000,
    server_selection_timeout_ms: int = 10000
)

Parameters:

  • uri (str): MongoDB connection URI
  • database (str): Database name
  • collection (str): Collection name
  • vector_search_index (str): Vector search index name
  • embedding_dim (int): Embedding dimension size
  • max_pool_size (int): Maximum connection pool size
  • min_pool_size (int): Minimum connection pool size
  • max_idle_time_ms (int): Maximum idle time for connections in milliseconds
  • connect_timeout_ms (int): Connection timeout in milliseconds
  • server_selection_timeout_ms (int): Server selection timeout in milliseconds
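
The pool and timeout parameters resemble the connection options of PyMongo's MongoClient. As a rough sketch, assuming (this reference does not confirm it) that the store forwards them to the driver, the mapping might look like the following. to_mongo_client_options is a hypothetical helper and not part of RecoAgent; the keys of the returned dict are existing MongoClient keyword arguments.

```python
from typing import Any, Dict


def to_mongo_client_options(
    max_pool_size: int = 100,
    min_pool_size: int = 10,
    max_idle_time_ms: int = 30000,
    connect_timeout_ms: int = 10000,
    server_selection_timeout_ms: int = 10000,
) -> Dict[str, Any]:
    """Hypothetical helper: translate the documented constructor
    arguments into PyMongo MongoClient keyword arguments."""
    if min_pool_size > max_pool_size:
        raise ValueError("min_pool_size must not exceed max_pool_size")
    return {
        "maxPoolSize": max_pool_size,
        "minPoolSize": min_pool_size,
        "maxIdleTimeMS": max_idle_time_ms,
        "connectTimeoutMS": connect_timeout_ms,
        "serverSelectionTimeoutMS": server_selection_timeout_ms,
    }


options = to_mongo_client_options(max_pool_size=50)
# With PyMongo installed: client = pymongo.MongoClient(uri, **options)
```

Rejecting a minimum pool size larger than the maximum is a choice made for this sketch; the reference does not say how the store validates these values.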

Methods

add_documents(documents: List[VectorDocument]) -> bool

Add documents to the vector store.

Parameters:

  • documents (List[VectorDocument]): List of documents to add

Returns:

  • bool: True if successful, False otherwise

Example:

documents = [
    VectorDocument(
        id="doc1",
        content="Machine learning content",
        embedding=[0.1, 0.2, 0.3, ...],
        metadata={"category": "AI"}
    )
]
success = vector_store.add_documents(documents)

add_documents_async(documents: List[VectorDocument]) -> bool

Add documents asynchronously to the vector store.

Parameters:

  • documents (List[VectorDocument]): List of documents to add

Returns:

  • bool: True if successful, False otherwise

Example:

success = await vector_store.add_documents_async(documents)

search(query_embedding: List[float], k: int = 5, include_metadata: bool = True, filter_metadata: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]

Search documents using vector similarity.

Parameters:

  • query_embedding (List[float]): Query vector embedding
  • k (int): Number of results to return
  • include_metadata (bool): Whether to include metadata in results
  • filter_metadata (Optional[Dict]): Metadata filters to apply

Returns:

  • List[Dict[str, Any]]: List of search results

Example:

query_embedding = [0.1, 0.2, 0.3, ...]
results = vector_store.search(
    query_embedding=query_embedding,
    k=10,
    include_metadata=True,
    filter_metadata={"category": "AI"}
)
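
For orientation, a search like this could be expressed as an Atlas $vectorSearch aggregation pipeline. The sketch below is an assumption about what such a pipeline might contain, not RecoAgent's actual implementation: build_vector_search_pipeline is a hypothetical helper, and the "embedding" path, the "metadata." prefix, and the numCandidates heuristic are illustrative choices.

```python
from typing import Any, Dict, List, Optional


def build_vector_search_pipeline(
    query_embedding: List[float],
    k: int = 5,
    index: str = "vector_index",
    path: str = "embedding",  # assumed name of the field holding the vectors
    include_metadata: bool = True,
    filter_metadata: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: build a $vectorSearch aggregation pipeline."""
    stage: Dict[str, Any] = {
        "index": index,
        "path": path,
        "queryVector": query_embedding,
        # Oversample approximate-nearest-neighbor candidates (heuristic),
        # staying within the documented upper bound of 10000.
        "numCandidates": min(max(k * 10, 100), 10000),
        "limit": k,
    }
    if filter_metadata:
        # Assumes metadata is stored under a "metadata" sub-document and that
        # each filtered field is declared as a filter field in the index.
        stage["filter"] = {
            f"metadata.{key}": {"$eq": value}
            for key, value in filter_metadata.items()
        }
    projection: Dict[str, Any] = {
        "content": 1,
        "score": {"$meta": "vectorSearchScore"},
    }
    if include_metadata:
        projection["metadata"] = 1
    return [{"$vectorSearch": stage}, {"$project": projection}]


pipeline = build_vector_search_pipeline(
    [0.1, 0.2, 0.3], k=10, filter_metadata={"category": "AI"}
)
# With PyMongo installed: results = list(collection.aggregate(pipeline))
```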

search_async(query_embedding: List[float], k: int = 5, include_metadata: bool = True, filter_metadata: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]

Search documents asynchronously using vector similarity.

Parameters:

  • query_embedding (List[float]): Query vector embedding
  • k (int): Number of results to return
  • include_metadata (bool): Whether to include metadata in results
  • filter_metadata (Optional[Dict]): Metadata filters to apply

Returns:

  • List[Dict[str, Any]]: List of search results

Example:

results = await vector_store.search_async(
    query_embedding=query_embedding,
    k=10,
    include_metadata=True
)

hybrid_search(query_text: str, query_embedding: List[float], k: int = 5, text_weight: float = 0.3, vector_weight: float = 0.7, filter_metadata: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]

Perform hybrid search combining text and vector search.

Parameters:

  • query_text (str): Text query for text search
  • query_embedding (List[float]): Vector embedding for vector search
  • k (int): Number of results to return
  • text_weight (float): Weight for text search score
  • vector_weight (float): Weight for vector search score
  • filter_metadata (Optional[Dict]): Metadata filters to apply

Returns:

  • List[Dict[str, Any]]: List of hybrid search results

Example:

results = vector_store.hybrid_search(
    query_text="machine learning algorithms",
    query_embedding=query_embedding,
    k=10,
    text_weight=0.3,
    vector_weight=0.7
)
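
This reference does not state how the text and vector scores are combined. One common approach, shown here only as an assumption, is to min-max normalize each score set and take a weighted sum. fuse_scores and min_max_normalize are hypothetical helpers, and the input scores are made-up sample values.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so text and vector scores are comparable."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}


def fuse_scores(
    text_scores: Dict[str, float],
    vector_scores: Dict[str, float],
    k: int = 5,
    text_weight: float = 0.3,
    vector_weight: float = 0.7,
) -> List[Tuple[str, float]]:
    """Weighted sum of normalized scores; a document missing from one
    result set contributes 0 for that component."""
    text_norm = min_max_normalize(text_scores)
    vector_norm = min_max_normalize(vector_scores)
    combined = {
        doc_id: text_weight * text_norm.get(doc_id, 0.0)
        + vector_weight * vector_norm.get(doc_id, 0.0)
        for doc_id in set(text_norm) | set(vector_norm)
    }
    ranked = sorted(combined.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


ranked = fuse_scores(
    text_scores={"doc1": 2.0, "doc2": 8.0},
    vector_scores={"doc1": 0.9, "doc3": 0.5},
    k=2,
)
```

With the default weights, a document that ranks well only in vector search still outranks one that ranks well only in text search.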

faceted_search(query_embedding: List[float], facets: List[str], k: int = 5, filter_metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]

Perform faceted search with metadata aggregation.

Parameters:

  • query_embedding (List[float]): Query vector embedding
  • facets (List[str]): List of facet fields to aggregate
  • k (int): Number of results to return
  • filter_metadata (Optional[Dict]): Metadata filters to apply

Returns:

  • Dict[str, Any]: Dictionary containing results and facets

Example:

results = vector_store.faceted_search(
    query_embedding=query_embedding,
    facets=["category", "year", "difficulty"],
    k=10
)

print(f"Results: {results['results']}")
print(f"Facets: {results['facets']}")
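
To illustrate what "metadata aggregation" could mean, the sketch below counts facet values across a list of results on the client side. It is an assumption for illustration: the actual shape of results['facets'] is not specified in this reference, count_facets is a hypothetical helper, and the sample results are invented.

```python
from collections import Counter
from typing import Any, Dict, List


def count_facets(
    results: List[Dict[str, Any]], facets: List[str]
) -> Dict[str, Dict[Any, int]]:
    """Count how often each value of each facet field occurs in the
    metadata of the given results."""
    counts = {facet: Counter() for facet in facets}
    for result in results:
        metadata = result.get("metadata", {})
        for facet in facets:
            if facet in metadata:
                counts[facet][metadata[facet]] += 1
    return {facet: dict(counter) for facet, counter in counts.items()}


sample_results = [
    {"id": "doc1", "metadata": {"category": "AI", "year": 2023}},
    {"id": "doc2", "metadata": {"category": "AI", "year": 2024}},
    {"id": "doc3", "metadata": {"category": "DB"}},
]
facet_counts = count_facets(sample_results, ["category", "year"])
```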

delete_documents(document_ids: List[str]) -> bool

Delete documents by their IDs.

Parameters:

  • document_ids (List[str]): List of document IDs to delete

Returns:

  • bool: True if successful, False otherwise

Example:

success = vector_store.delete_documents(["doc1", "doc2", "doc3"])

delete_documents_async(document_ids: List[str]) -> bool

Delete documents asynchronously by their IDs.

Parameters:

  • document_ids (List[str]): List of document IDs to delete

Returns:

  • bool: True if successful, False otherwise

Example:

success = await vector_store.delete_documents_async(["doc1", "doc2"])

get_stats() -> Dict[str, Any]

Get collection statistics.

Returns:

  • Dict[str, Any]: Dictionary containing collection statistics

Example:

stats = vector_store.get_stats()
print(f"Total documents: {stats['total_documents']}")
print(f"Storage size: {stats['storage_size']} bytes")

create_text_index(text_fields: List[str] = None) -> None

Create text index for hybrid search.

Parameters:

  • text_fields (List[str]): List of fields to index for text search

Example:

vector_store.create_text_index(['content', 'title', 'description'])
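
If the text index is a standard MongoDB text index (an assumption; the store may instead use an Atlas Search index), the field list would translate into a compound key specification like the one below. text_index_keys is a hypothetical helper; the (field, "text") pairs are the form PyMongo's create_index accepts for text indexes.

```python
from typing import List, Tuple


def text_index_keys(text_fields: List[str]) -> List[Tuple[str, str]]:
    """Build a key specification for one compound text index
    covering all of the given fields."""
    if not text_fields:
        raise ValueError("at least one text field is required")
    return [(field, "text") for field in text_fields]


keys = text_index_keys(["content", "title", "description"])
# With PyMongo installed: collection.create_index(keys)
```

MongoDB allows at most one text index per collection, which is why the fields are combined into a single specification.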

close() -> None

Close database connections.

Example:

vector_store.close()
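
Because the store exposes close(), the standard library's contextlib.closing can ensure the connections are released even when an error occurs. The sketch uses a stand-in class, StubStore, so that it runs without a database; it is not the real MongoDBAtlasVectorStore.

```python
from contextlib import closing
from typing import Any, Dict


class StubStore:
    """Stand-in for MongoDBAtlasVectorStore; only what the example
    needs is modeled."""

    def __init__(self) -> None:
        self.closed = False

    def get_stats(self) -> Dict[str, Any]:
        return {"total_documents": 0}

    def close(self) -> None:
        self.closed = True


store = StubStore()
with closing(store) as vector_store:
    stats = vector_store.get_stats()
# close() has been called at this point, even if the block had raised.
```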

MongoDB Retrievers

MongoDBVectorRetriever

Basic vector search retriever.

Constructor

MongoDBVectorRetriever(
    vector_store: MongoDBAtlasVectorStore,
    embedding_model: str = "text-embedding-3-large"
)

Methods

retrieve(query: str, k: int = 5, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]

Retrieve documents using vector search.

Parameters:

  • query (str): Search query
  • k (int): Number of results to return
  • filter_metadata (Optional[Dict]): Metadata filters

Returns:

  • List[RetrievalResult]: List of retrieval results

retrieve_async(query: str, k: int = 5, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]

Retrieve documents asynchronously using vector search.

MongoDBHybridRetriever

Hybrid search retriever combining text and vector search.

Constructor

MongoDBHybridRetriever(
    vector_store: MongoDBAtlasVectorStore,
    config: MongoDBHybridConfig = None,
    embedding_model: str = "text-embedding-3-large"
)

Methods

retrieve(query: str, k: int = 5, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]

Retrieve documents using hybrid search.

retrieve_async(query: str, k: int = 5, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]

Retrieve documents asynchronously using hybrid search.

MongoDBFacetedRetriever

Faceted search retriever with metadata filtering.

Constructor

MongoDBFacetedRetriever(
    vector_store: MongoDBAtlasVectorStore,
    embedding_model: str = "text-embedding-3-large"
)

Methods

retrieve(query: str, k: int = 5, facets: List[str] = None, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]

Retrieve documents using faceted search.

Parameters:

  • query (str): Search query
  • k (int): Number of results to return
  • facets (List[str]): List of facet fields
  • filter_metadata (Optional[Dict]): Metadata filters

get_facets(query: str, facets: List[str], filter_metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]

Get facet information for a query.

MongoDBAdvancedRetriever

Advanced retriever supporting multiple search strategies.

Constructor

MongoDBAdvancedRetriever(
    vector_store: MongoDBAtlasVectorStore,
    config: MongoDBHybridConfig = None,
    embedding_model: str = "text-embedding-3-large"
)

Methods

retrieve(query: str, k: int = 5, search_type: str = "hybrid", facets: List[str] = None, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]

Retrieve documents using specified search strategy.

Parameters:

  • query (str): Search query
  • k (int): Number of results to return
  • search_type (str): Search type ("vector", "hybrid", "faceted")
  • facets (List[str]): List of facet fields (for faceted search)
  • filter_metadata (Optional[Dict]): Metadata filters

Search Types:

  • "vector": Pure vector similarity search
  • "hybrid": Combined text and vector search
  • "faceted": Faceted search with metadata aggregation

retrieve_async(query: str, k: int = 5, search_type: str = "hybrid", facets: List[str] = None, filter_metadata: Optional[Dict[str, Any]] = None) -> List[RetrievalResult]

Retrieve documents asynchronously using specified search strategy.
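
Selecting a strategy from the search_type values listed above can be pictured as a dispatch table. The sketch below is a guess at the general pattern, not the retriever's actual code: make_dispatcher and the three stub strategies are hypothetical, and raising ValueError for an unknown value is an assumption about the error behavior.

```python
from typing import Callable, Dict, List

Strategy = Callable[[str, int], List[str]]


def make_dispatcher(strategies: Dict[str, Strategy]) -> Callable[..., List[str]]:
    """Return a retrieve() function that picks a strategy by name."""

    def retrieve(query: str, k: int = 5, search_type: str = "hybrid") -> List[str]:
        if search_type not in strategies:
            raise ValueError(
                f"Unknown search_type {search_type!r}; "
                f"expected one of {sorted(strategies)}"
            )
        return strategies[search_type](query, k)

    return retrieve


# Stub strategies that only record which path was taken.
retrieve = make_dispatcher(
    {
        "vector": lambda query, k: [f"vector:{query}:{k}"],
        "hybrid": lambda query, k: [f"hybrid:{query}:{k}"],
        "faceted": lambda query, k: [f"faceted:{query}:{k}"],
    }
)

default_result = retrieve("machine learning")  # uses "hybrid"
vector_result = retrieve("machine learning", k=3, search_type="vector")
```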

get_facets(query: str, facets: List[str], filter_metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]

Get facet information for a query.

create_text_index(text_fields: List[str] = None) -> None

Create text index for hybrid search.

get_stats() -> Dict[str, Any]

Get collection statistics.

Configuration Classes

MongoDBHybridConfig

Configuration for hybrid search.

@dataclass
class MongoDBHybridConfig:
    text_weight: float = 0.3
    vector_weight: float = 0.7
    vector_k: int = 20
    text_k: int = 20
    final_k: int = 5
    enable_faceted_search: bool = True
    enable_metadata_filtering: bool = True

Fields:

  • text_weight (float): Weight for text search score
  • vector_weight (float): Weight for vector search score
  • vector_k (int): Number of vector search results
  • text_k (int): Number of text search results
  • final_k (int): Final number of results to return
  • enable_faceted_search (bool): Enable faceted search
  • enable_metadata_filtering (bool): Enable metadata filtering

Data Types

VectorDocument

Document structure for vector storage.

@dataclass
class VectorDocument:
    id: str
    content: str
    embedding: List[float]
    metadata: Dict[str, Any]

Fields:

  • id (str): Unique document identifier
  • content (str): Document content
  • embedding (List[float]): Vector embedding
  • metadata (Dict[str, Any]): Document metadata
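
Since the store is configured with a fixed embedding_dim, it can help to check embedding lengths before calling add_documents. The sketch below redefines VectorDocument locally so that it runs on its own; find_dimension_mismatches is a hypothetical helper, and whether the store performs a similar check itself is not stated in this reference.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class VectorDocument:  # local mirror of the documented dataclass
    id: str
    content: str
    embedding: List[float]
    metadata: Dict[str, Any]


def find_dimension_mismatches(
    documents: List[VectorDocument], embedding_dim: int = 3072
) -> List[str]:
    """Return the IDs of documents whose embedding length differs
    from the configured dimension."""
    return [doc.id for doc in documents if len(doc.embedding) != embedding_dim]


docs = [
    VectorDocument("doc1", "ok", [0.0] * 3072, {"category": "AI"}),
    VectorDocument("doc2", "too short", [0.1, 0.2, 0.3], {}),
]
bad_ids = find_dimension_mismatches(docs)
```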

RetrievalResult

Result structure for retrieval operations.

@dataclass
class RetrievalResult:
    chunk: Chunk
    score: float
    retrieval_method: str

Fields:

  • chunk (Chunk): Retrieved document chunk
  • score (float): Relevance score
  • retrieval_method (str): Method used for retrieval

Error Handling

Common Exceptions

  • ConnectionFailure: MongoDB connection failed
  • ServerSelectionTimeoutError: Server selection timeout
  • OperationFailure: Database operation failed
  • IndexNotFound: Vector search index not found
  • AuthenticationFailed: Authentication failed

Error Handling Example

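# Assumes these are PyMongo's exception classes.
from pymongo.errors import ConnectionFailure, ServerSelectionTimeoutError

try:
    results = vector_store.search(query_embedding, k=10)
# ServerSelectionTimeoutError subclasses ConnectionFailure, so catch it first.
except ServerSelectionTimeoutError as e:
    print(f"Server selection timeout: {e}")
except ConnectionFailure as e:
    print(f"Connection failed: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")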

Performance Considerations

Query Optimization

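  • Keep k as small as your use case allows; larger values return more documents per query
  • Apply metadata filtering (filter_metadata) to reduce the search space
  • Size the connection pool (max_pool_size, min_pool_size) for the expected concurrency
  • Consider the async methods (for example search_async) for better throughput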
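
As one way to act on the async recommendation, several searches can run concurrently with asyncio.gather while a semaphore caps the number in flight. The sketch uses StubAsyncStore in place of a real store so that it runs without a database; search_many and max_concurrency are hypothetical names, and the returned results are placeholders.

```python
import asyncio
from typing import Any, Dict, List


class StubAsyncStore:
    """Stand-in for a store with a search_async coroutine."""

    async def search_async(
        self, query_embedding: List[float], k: int = 5
    ) -> List[Dict[str, Any]]:
        await asyncio.sleep(0)  # yield control, as real I/O would
        return [{"id": f"doc{i}", "score": 1.0 / (i + 1)} for i in range(k)]


async def search_many(
    store: StubAsyncStore,
    embeddings: List[List[float]],
    k: int = 5,
    max_concurrency: int = 10,
) -> List[List[Dict[str, Any]]]:
    """Run one search per embedding, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def search_one(embedding: List[float]) -> List[Dict[str, Any]]:
        async with semaphore:
            return await store.search_async(query_embedding=embedding, k=k)

    # gather preserves the order of the input embeddings.
    return await asyncio.gather(*(search_one(e) for e in embeddings))


all_results = asyncio.run(
    search_many(StubAsyncStore(), [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], k=2)
)
```

Keeping max_concurrency at or below the pool's max_pool_size is a reasonable starting point, so that concurrent searches do not queue for connections.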

Memory Management

  • Monitor connection pool usage
  • Call close() when the store is no longer needed
  • Add and delete documents in batches when working with large datasets
  • Consider pagination for large result sets
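
A minimal sketch of batching, assuming a batch size is chosen by the caller: the hypothetical helper batched splits a sequence into fixed-size chunks, each of which could then be passed to add_documents or delete_documents. The reference does not recommend a specific batch size.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


batches = list(batched(list(range(10)), 4))
# Each batch could then be sent separately, for example:
#     for batch in batched(documents, 500):
#         vector_store.add_documents(batch)
```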

This API reference provides comprehensive documentation for all MongoDB Atlas Vector Search functionality in RecoAgent.