`relevanceai.api.endpoints.recommend`

Recommend services.

Module Contents

Classes

RecommendClient

Base class for all relevanceai client utilities

class relevanceai.api.endpoints.recommend.RecommendClient(project, api_key)

Bases: relevanceai.base._Base

Base class for all relevanceai client utilities

vector(self, dataset_id: str, positive_document_ids: dict = {}, negative_document_ids: dict = {}, vector_fields=[], approximation_depth: int = 0, vector_operation: str = 'sum', sum_fields: bool = True, page_size: int = 20, page: int = 1, similarity_metric: str = 'cosine', facets: list = [], filters: list = [], min_score: float = 0, select_fields: list = [], include_vector: bool = False, include_count: bool = True, asc: bool = False, keep_search_history: bool = False, hundred_scale: bool = False)

Vector search based recommendations work by extracting the vectors of the specified document IDs, performing vector operations on them, and then searching the dataset with the resulting vector. This supports not only plain recommendations but also personalized and weighted recommendations.

Here are a couple of different scenarios and what the queries would look like for those:

Recommendations personalized by a single liked product:

>>> positive_document_ids=['A']

-> Document ID A Vector = Search Query

Recommendations personalized by multiple liked products:

>>> positive_document_ids=['A', 'B']

-> Document ID A Vector + Document ID B Vector = Search Query

Recommendations personalized by multiple liked and disliked products:

>>> positive_document_ids=['A', 'B'], negative_document_ids=['C', 'D']

-> (Document ID A Vector + Document ID B Vector) - (Document ID C Vector + Document ID D Vector) = Search Query

Recommendations personalized by multiple liked and disliked products with weights:

>>> positive_document_ids={'A':0.5, 'B':1}, negative_document_ids={'C':0.6, 'D':0.4}

-> (Document ID A Vector * 0.5 + Document ID B Vector * 1) - (Document ID C Vector * 0.6 + Document ID D Vector * 0.4) = Search Query

You can change the operator between vectors with vector_operation:

>>> positive_document_ids=['A', 'B'], negative_document_ids=['C', 'D'], vector_operation='multiply'

-> (Document ID A Vector * Document ID B Vector) - (Document ID C Vector * Document ID D Vector) = Search Query
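
The scenarios above can be sketched in plain Python. This is only an illustration of the arithmetic, not the service's implementation: `combine_vectors` and `query_vector` are hypothetical helper names, the two-dimensional vectors are made up, only the 'sum' and 'multiply' operations are modelled, and a bare list of IDs is assumed to mean a weight of 1 for each document.

```python
from functools import reduce


def combine_vectors(vectors, weights, operation="sum"):
    """Combine weighted vectors element-wise with 'sum' or 'multiply'."""
    # Scale every vector by its weight first.
    scaled = [[w * x for x in vec] for vec, w in zip(vectors, weights)]
    if operation == "sum":
        return [sum(column) for column in zip(*scaled)]
    if operation == "multiply":
        return [reduce(lambda a, b: a * b, column) for column in zip(*scaled)]
    raise ValueError(f"unsupported operation: {operation}")


def query_vector(doc_vectors, positive, negative=None, operation="sum"):
    """Build (combined positive vectors) - (combined negative vectors)."""

    def as_weights(ids):
        # Assumption: a plain list of IDs means every ID has weight 1.
        return dict(ids) if isinstance(ids, dict) else {i: 1.0 for i in ids}

    pos = as_weights(positive)
    neg = as_weights(negative or {})
    result = combine_vectors(
        [doc_vectors[i] for i in pos], list(pos.values()), operation
    )
    if neg:
        minus = combine_vectors(
            [doc_vectors[i] for i in neg], list(neg.values()), operation
        )
        result = [p - n for p, n in zip(result, minus)]
    return result


# Made-up document vectors, keyed by document ID.
toy_vectors = {
    "A": [1.0, 0.0],
    "B": [0.0, 2.0],
    "C": [1.0, 1.0],
    "D": [2.0, 0.0],
}

# (A + B) - (C + D)
query = query_vector(toy_vectors, ["A", "B"], ["C", "D"])
print(query)  # [-2.0, 1.0]
```

Passing dictionaries instead of lists, for example `{'A': 0.5, 'B': 1}`, applies the per-document weights before the vectors are combined.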

Parameters

dataset_id (string) – Unique name of dataset
positive_document_ids (dict) – Positive document IDs to personalize the results with. The vectors of these documents are retrieved and used in the operation.
negative_document_ids (dict) – Negative document IDs to personalize the results with. The vectors of these documents are retrieved and used in the operation.
vector_fields (list) – The vector fields to search in. Either an array of strings (automatically equally weighted), e.g. ['check_vector_', 'yellow_vector_'], or a dictionary mapping each field to a float weight, e.g. {'check_vector_': 0.2, 'yellow_vector_': 0.5}
approximation_depth (int) – Used for approximate search to speed up search. The higher the number, the faster the search, but potentially the less accurate.
vector_operation (string) – Aggregation for the vectors when using positive and negative document IDs, choose from ['mean', 'sum', 'min', 'max', 'divide', 'multiply']
sum_fields (bool) – Whether to sum the similarity scores of multiple vector fields into one score or keep them separate
page_size (int) – Size of each page of results
page (int) – Page of the results
similarity_metric (string) – Similarity metric, choose from ['cosine', 'l1', 'l2', 'dp']
facets (list) – Fields to include in the facets, if [] then all
filters (list) – Query for filtering the search results
min_score (float) – Minimum score for similarity metric
select_fields (list) – Fields to include in the search results, empty array/list means all fields.
include_vector (bool) – Include vectors in the search results
include_count (bool) – Include the total count of results in the search results
asc (bool) – Whether to sort results in ascending or descending order
keep_search_history (bool) – Whether to store the history into VecDB. This will increase the storage costs over time.
hundred_scale (bool) – Whether to scale up the metric by 100
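
A call using the parameters above might look like the following sketch. The `vector` method name and its keyword arguments are taken from the signature documented here; `recommend_from_feedback` is a hypothetical wrapper, and nothing is assumed about the shape of the response. The client is passed in as an argument so the snippet does not depend on how a `RecommendClient` is constructed or authenticated.

```python
def recommend_from_feedback(client, dataset_id, liked, disliked=None,
                            vector_fields=None, page_size=20):
    """Request recommendations from weighted liked/disliked document IDs.

    `client` is any object exposing the `vector` method documented above,
    for example a RecommendClient instance.
    """
    return client.vector(
        dataset_id=dataset_id,
        positive_document_ids=liked,           # e.g. {"A": 0.5, "B": 1}
        negative_document_ids=disliked or {},  # e.g. {"C": 0.6, "D": 0.4}
        vector_fields=vector_fields or [],
        vector_operation="sum",
        similarity_metric="cosine",
        page_size=page_size,
        page=1,
    )
```

Keeping the defaults explicit (`vector_operation`, `similarity_metric`, `page`) makes it easier to see which settings a given recommendation request relied on.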

diversity(self, dataset_id: str, cluster_vector_field: str, n_clusters: int, positive_document_ids: dict = {}, negative_document_ids: dict = {}, vector_fields=[], approximation_depth: int = 0, vector_operation: str = 'sum', sum_fields: bool = True, page_size: int = 20, page: int = 1, similarity_metric: str = 'cosine', facets: list = [], filters: list = [], min_score: float = 0, select_fields: list = [], include_vector: bool = False, include_count: bool = True, asc: bool = False, keep_search_history: bool = False, hundred_scale: bool = False, search_history_id: str = None, n_init: int = 5, n_iter: int = 10, return_as_clusters: bool = False)