relevanceai.api.endpoints.centroids
Module Contents
Classes
CentroidsClient | Base class for all relevanceai client utilities
- class relevanceai.api.endpoints.centroids.CentroidsClient(project, api_key)
Bases: relevanceai.base._Base
Base class for all relevanceai client utilities
- docs_closest_to_center
- docs_furthest_from_center
- list(self, dataset_id: str, vector_fields: List, alias: str = 'default', page_size: int = 5, cursor: str = None, include_vector: bool = False, base_url='https://gateway-api-aueast.relevance.ai/latest')
Retrieve the cluster centroids
- Parameters
dataset_id (string) – Unique name of dataset
vector_fields (list) – The vector field where a clustering task was run.
alias (string) – Alias is used to name a cluster
page_size (int) – Size of each page of results
cursor (string) – Cursor to paginate the document retrieval
include_vector (bool) – Include vectors in the search results
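The page_size and cursor parameters suggest cursor-based pagination. The following is a minimal sketch of walking through every page, not the library's own helper. It assumes the response is a dict with "documents" and "cursor" keys, which this page does not confirm. FakeCentroidsClient and iter_centroids are hypothetical names, used so the sketch runs without network access.

```python
from typing import Any, Dict, Iterator, List, Optional


def iter_centroids(client: Any, dataset_id: str, vector_fields: List[str],
                   alias: str = "default", page_size: int = 5) -> Iterator[Dict[str, Any]]:
    """Yield every centroid by following the cursor from page to page."""
    cursor: Optional[str] = None
    while True:
        page = client.list(dataset_id, vector_fields, alias=alias,
                           page_size=page_size, cursor=cursor)
        # ASSUMPTION: the response is a dict with "documents" and "cursor" keys.
        documents = page.get("documents", [])
        yield from documents
        cursor = page.get("cursor")
        if not documents or cursor is None:
            break


class FakeCentroidsClient:
    """Hypothetical in-memory stand-in so the sketch runs offline."""

    def __init__(self, centroids: List[Dict[str, Any]]) -> None:
        self._centroids = centroids

    def list(self, dataset_id, vector_fields, alias="default", page_size=5,
             cursor=None, include_vector=False):
        # The fake cursor is simply the offset of the next page as a string.
        start = int(cursor) if cursor else 0
        end = start + page_size
        next_cursor = str(end) if end < len(self._centroids) else None
        return {"documents": self._centroids[start:end], "cursor": next_cursor}


fake = FakeCentroidsClient([{"_id": f"cluster-{i}"} for i in range(12)])
all_ids = [c["_id"] for c in iter_centroids(fake, "my_dataset", ["text_vector_"])]
```

With a real CentroidsClient the same loop should apply, provided the response keys match the assumption above.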
- get(self, dataset_id: str, cluster_ids: List, vector_fields: List, alias: str = 'default', page_size: int = 5, cursor: str = None)
Retrieve the cluster centroids by IDs
- Parameters
dataset_id (string) – Unique name of dataset
cluster_ids (list) – List of cluster IDs
vector_fields (list) – The vector field where a clustering task was run.
alias (string) – Alias is used to name a cluster
page_size (int) – Size of each page of results
cursor (string) – Cursor to paginate the document retrieval
- insert(self, dataset_id: str, cluster_centers: List, vector_fields: List, alias: str = 'default')
Insert your own cluster centroids so they can be used in approximate search settings and cluster aggregations.
- Parameters
dataset_id (string) – Unique name of dataset
cluster_centers (list) – Cluster centers with the key being the index number
vector_fields (list) – The vector field where a clustering task was run.
alias (string) – Alias is used to name a cluster
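One way to prepare cluster_centers is to average the vectors assigned to each cluster. The sketch below does this in plain Python. The entry layout (a dict with an "_id" and the vector stored under the vector field name) is an assumption, since this page does not spell out the expected shape. build_cluster_centers and the "cluster-N" IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def build_cluster_centers(vectors: Sequence[Sequence[float]],
                          labels: Sequence[int],
                          vector_field: str) -> List[Dict[str, object]]:
    """Average the vectors of each cluster and emit one entry per centroid."""
    grouped: Dict[int, List[Sequence[float]]] = defaultdict(list)
    for vector, label in zip(vectors, labels):
        grouped[label].append(vector)

    centers: List[Dict[str, object]] = []
    for label in sorted(grouped):
        members = grouped[label]
        # Column-wise mean over all member vectors of this cluster.
        mean = [sum(column) / len(members) for column in zip(*members)]
        # ASSUMPTION: each centre is a dict with "_id" plus the vector field.
        centers.append({"_id": f"cluster-{label}", vector_field: mean})
    return centers


vectors = [[0.0, 0.0], [2.0, 2.0], [10.0, 10.0]]
labels = [0, 0, 1]
cluster_centers = build_cluster_centers(vectors, labels, "text_vector_")
```

The resulting list could then be passed as the cluster_centers argument of insert, if the layout assumption holds for your deployment.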
- documents(self, dataset_id: str, cluster_ids: List, vector_fields: List, alias: str = 'default', page_size: int = 5, cursor: str = None, page: int = 1, include_vector: bool = False, similarity_metric: str = 'cosine')
Retrieve the documents belonging to the given clusters by cluster IDs
- Parameters
dataset_id (string) – Unique name of dataset
cluster_ids (list) – List of cluster IDs
vector_fields (list) – The vector field where a clustering task was run.
alias (string) – Alias is used to name a cluster
page_size (int) – Size of each page of results
cursor (string) – Cursor to paginate the document retrieval
page (int) – Page of the results
include_vector (bool) – Include vectors in the search results
similarity_metric (string) – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]
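For reference, the four similarity_metric options correspond to standard vector comparisons. The sketch below computes each one in plain Python. It illustrates the underlying formulas only. How the service turns the l1 and l2 distances into a ranking score is not documented here and is not modelled. The METRICS mapping is a hypothetical convenience.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def dot_product(a: Vector, b: Vector) -> float:
    """'dp': sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Vector, b: Vector) -> float:
    """'cosine': dot product divided by the product of the vector lengths."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def l1(a: Vector, b: Vector) -> float:
    """'l1': Manhattan distance, the sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2(a: Vector, b: Vector) -> float:
    """'l2': Euclidean distance, the square root of summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical lookup keyed by the metric names listed in the parameter docs.
METRICS: Dict[str, Callable[[Vector, Vector], float]] = {
    "cosine": cosine,
    "l1": l1,
    "l2": l2,
    "dp": dot_product,
}

cosine_same = METRICS["cosine"]([1.0, 0.0], [1.0, 0.0])
l1_value = METRICS["l1"]([0.0, 0.0], [3.0, 4.0])
l2_value = METRICS["l2"]([0.0, 0.0], [3.0, 4.0])
dp_value = METRICS["dp"]([1.0, 2.0], [3.0, 4.0])
```

Note that cosine and dp grow with similarity, while l1 and l2 are distances that shrink as vectors get closer.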
- metadata(self, dataset_id: str, vector_fields: List, alias: str = 'default', metadata: Optional[Dict[str, Any]] = None)
If metadata is None, retrieves metadata about a dataset, notably its description, data source, etc. Otherwise, stores the given metadata about your cluster.
- Parameters
dataset_id (string) – Unique name of dataset
vector_fields (list) – The vector field where a clustering task was run.
alias (string) – Alias is used to name a cluster
metadata (Optional[dict]) – If None, it will retrieve the metadata, otherwise it will overwrite the metadata of the cluster
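The read-or-overwrite behaviour of the metadata argument can be pictured with a small in-memory stand-in. This is an illustrative sketch only: InMemoryClusterMetadata is hypothetical, and the return values (an empty dict when nothing is stored, the stored dict after a write) are assumptions rather than documented responses.

```python
from typing import Any, Dict, List, Optional, Tuple


class InMemoryClusterMetadata:
    """Hypothetical stand-in that mimics the None-versus-dict dispatch."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str, str], Dict[str, Any]] = {}

    def metadata(self, dataset_id: str, vector_fields: List[str],
                 alias: str = "default",
                 metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        key = (dataset_id, ",".join(vector_fields), alias)
        if metadata is None:
            # Read path: return whatever was stored (ASSUMPTION: {} if nothing).
            return self._store.get(key, {})
        # Write path: overwrite, do not merge, as the parameter docs describe.
        self._store[key] = dict(metadata)
        return self._store[key]


store = InMemoryClusterMetadata()
before = store.metadata("my_dataset", ["text_vector_"])
store.metadata("my_dataset", ["text_vector_"], metadata={"description": "kmeans, k=10"})
store.metadata("my_dataset", ["text_vector_"], metadata={"source": "reviews"})
after = store.metadata("my_dataset", ["text_vector_"])
```

Because the second write overwrites the first, only the "source" key survives in this sketch; pass the full dict each time if you want to keep earlier keys.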
- list_closest_to_center(self, dataset_id: str, vector_fields: List, cluster_ids: List = [], alias: str = 'default', centroid_vector_fields: List = ['centroid_vector_'], select_fields: List = [], approx: int = 0, sum_fields: bool = True, page_size: int = 1, page: int = 1, similarity_metric: str = 'cosine', filters: List = [], facets: List = [], min_score: int = 0, include_vector: bool = False, include_count: bool = True, include_facets: bool = False)
List of documents closest to the centre.
- Parameters
dataset_id (string) – Unique name of dataset
vector_fields (list) – The vector field where a clustering task was run.
cluster_ids (list) – Any of the cluster ids
alias (string) – Alias is used to name a cluster
centroid_vector_fields (list) – Vector fields stored
select_fields (list) – Fields to include in the search results, empty array/list means all fields
approx (int) – Used for approximate search to speed up search. The higher the number, the faster the search but potentially the less accurate
sum_fields (bool) – Whether to sum the similarity search scores of multiple vectors into one score or keep them separate
page_size (int) – Size of each page of results
page (int) – Page of the results
similarity_metric (string) – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]
filters (list) – Query for filtering the search results
facets (list) – Fields to include in the facets, if [] then all
min_score (int) – Minimum score for similarity metric
include_vector (bool) – Include vectors in the search results
include_count (bool) – Include the total count of results in the search results
include_facets (bool) – Include facets in the search results
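Conceptually, listing documents closest to the centre means scoring each document vector against the centroid and returning the best-scoring page. The sketch below reproduces that idea locally with cosine similarity. It is not the service's implementation: closest_to_center, the "_score" field, and the sample field name "v_" are hypothetical.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_to_center(documents: List[Dict[str, Any]], centroid: Sequence[float],
                      vector_field: str, page_size: int = 1, page: int = 1,
                      min_score: float = 0.0) -> List[Dict[str, Any]]:
    """Rank documents by similarity to the centroid and return one page."""
    scored = []
    for doc in documents:
        score = cosine_similarity(doc[vector_field], centroid)
        if score >= min_score:
            # "_score" is a hypothetical field name for the similarity value.
            scored.append({**doc, "_score": score})
    # Highest similarity first; sorting ascending would give the furthest.
    scored.sort(key=lambda d: d["_score"], reverse=True)
    start = (page - 1) * page_size
    return scored[start:start + page_size]


docs = [
    {"_id": "a", "v_": [1.0, 0.0]},
    {"_id": "b", "v_": [1.0, 1.0]},
    {"_id": "c", "v_": [0.0, 1.0]},
]
top = closest_to_center(docs, [1.0, 0.0], "v_", page_size=2)
top_ids = [d["_id"] for d in top]
```

Raising min_score trims weak matches before paging, which is why a page can come back shorter than page_size.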
- list_furthest_from_center(self, dataset_id: str, vector_fields: str, cluster_ids: List = [], alias: str = 'default', select_fields: List = [], approx: int = 0, sum_fields: bool = True, page_size: int = 1, page: int = 1, similarity_metric: str = 'cosine', filters: List = [], facets: List = [], min_score: int = 0, include_vector: bool = False, include_count: bool = True, include_facets: bool = False)
List of documents furthest from the centre.
- Parameters
dataset_id (string) – Unique name of dataset
vector_fields (list) – The vector field where a clustering task was run.
cluster_ids (list) – Any of the cluster ids
alias (string) – Alias is used to name a cluster
select_fields (list) – Fields to include in the search results, empty array/list means all fields
approx (int) – Used for approximate search to speed up search. The higher the number, the faster the search but potentially the less accurate
sum_fields (bool) – Whether to sum the similarity search scores of multiple vectors into one score or keep them separate
page_size (int) – Size of each page of results
page (int) – Page of the results
similarity_metric (string) – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]
filters (list) – Query for filtering the search results
facets (list) – Fields to include in the facets, if [] then all
min_score (int) – Minimum score for similarity metric
include_vector (bool) – Include vectors in the search results
include_count (bool) – Include the total count of results in the search results
include_facets (bool) – Include facets in the search results
- delete(self, dataset_id: str, vector_fields: List, alias: str = 'default')
Delete centroids by dataset ID, vector field and alias
- Parameters
dataset_id (string) – Unique name of dataset
vector_fields (list) – The vector field where a clustering task was run.
alias (string) – Alias is used to name a cluster
- update(self, dataset_id: str, vector_fields: List, id: str, update: dict = {}, alias: str = 'default')
Update a centroid by dataset ID, vector field, alias and centroid ID
- Parameters
dataset_id (string) – Unique name of dataset
vector_fields (list) – The vector field where a clustering task was run.
alias (string) – Alias is used to name a cluster
id (string) – The centroid ID
update (dict) – The update to be applied to the document