relevanceai.api.endpoints.tagger

Tagger services

Module Contents

Classes

TaggerClient

Base class for all relevanceai client utilities

class relevanceai.api.endpoints.tagger.TaggerClient(project, api_key)

Bases: relevanceai.base._Base

Base class for all relevanceai client utilities

tag(self, data: str, tag_dataset_id: str, encoder: str, tag_field: str = None, approximation_depth: int = 0, sum_fields: bool = True, page_size: int = 20, page: int = 1, similarity_metric: str = 'cosine', filters: list = [], min_score: float = 0, include_search_relevance: bool = False, search_relevance_cutoff_aggressiveness: int = 1, asc: bool = False, include_score: bool = False)

Tag documents or vectors

Parameters
  • data (string) – Image URL, text, or any other data suited to the encoder

  • tag_dataset_id (string) – Name of the dataset you want to tag

  • encoder (string) – Which encoder to use.

  • tag_field (string) – The field used to tag in a dataset. If None, automatically uses the one stated in the encoder.

  • approximation_depth (int) – Used for approximate search to speed up search. The higher the number, the faster the search, but potentially the less accurate.

  • sum_fields (bool) – Whether to sum the similarity search scores of multiple vectors into one score or keep them separate

  • page_size (int) – Size of each page of results

  • page (int) – Page of the results

  • similarity_metric (string) – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • filters (list) – Query for filtering the search results

  • min_score (float) – Minimum score for similarity metric

  • include_search_relevance (bool) – Whether to calculate a search_relevance cutoff score to flag relevant and less relevant results

  • search_relevance_cutoff_aggressiveness (int) – How aggressive the search_relevance cutoff score is (the higher the value, the fewer results will be relevant)

  • asc (bool) – Whether to sort results in ascending order (True) or descending order (False)

  • include_score (bool) – Whether to include score
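
The parameters above can be assembled into keyword arguments before calling tag. The following is a minimal sketch, not part of the library: build_tag_kwargs is a hypothetical helper, and the data, dataset name, and encoder values are placeholders. It checks similarity_metric against the documented choices and creates a fresh filters list per call rather than sharing one default list.

```python
# Values listed for similarity_metric in the parameter docs above.
ALLOWED_METRICS = {"cosine", "l1", "l2", "dp"}


def build_tag_kwargs(data, tag_dataset_id, encoder, *,
                     similarity_metric="cosine", page_size=20, page=1,
                     min_score=0.0, filters=None, include_score=False):
    """Hypothetical helper: collect keyword arguments for TaggerClient.tag."""
    if similarity_metric not in ALLOWED_METRICS:
        raise ValueError(
            f"similarity_metric must be one of {sorted(ALLOWED_METRICS)}, "
            f"got {similarity_metric!r}"
        )
    return {
        "data": data,
        "tag_dataset_id": tag_dataset_id,
        "encoder": encoder,
        "similarity_metric": similarity_metric,
        "page_size": page_size,
        "page": page,
        "min_score": min_score,
        # Copy into a new list so separate calls never share filter state.
        "filters": list(filters) if filters is not None else [],
        "include_score": include_score,
    }


# Placeholder values; substitute your own data, dataset, and encoder.
kwargs = build_tag_kwargs(
    "a red running shoe", "product-tags", "text", include_score=True
)

# With a configured client, the call would then look like:
#   client = TaggerClient(project, api_key)
#   results = client.tag(**kwargs)
```

Parameters not set by the helper (for example tag_field or approximation_depth) keep the defaults shown in the signature.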

diversity(self, data: str, tag_dataset_id: str, encoder: str, cluster_vector_field: str, n_clusters: int, tag_field: str = None, approximation_depth: int = 0, sum_fields: bool = True, page_size: int = 20, page: int = 1, similarity_metric: str = 'cosine', filters: list = [], min_score: float = 0, include_search_relevance: bool = False, search_relevance_cutoff_aggressiveness: int = 1, asc: bool = False, include_score: bool = False, n_init: int = 5, n_iter: int = 10)

Tag the data, cluster the resulting tags, and return one tag from each cluster (starting from the closest tag)

Parameters
  • data (string) – Image URL, text, or any other data suited to the encoder

  • tag_dataset_id (string) – Name of the dataset you want to tag

  • encoder (string) – Which encoder to use.

  • cluster_vector_field (string) – The field to cluster on.

  • n_clusters (int) – Number of clusters to form.

  • tag_field (string) – The field used to tag in a dataset. If None, automatically uses the one stated in the encoder.

  • approximation_depth (int) – Used for approximate search to speed up search. The higher the number, the faster the search, but potentially the less accurate.

  • sum_fields (bool) – Whether to sum the similarity search scores of multiple vectors into one score or keep them separate

  • page_size (int) – Size of each page of results

  • page (int) – Page of the results

  • similarity_metric (string) – Similarity Metric, choose from [‘cosine’, ‘l1’, ‘l2’, ‘dp’]

  • filters (list) – Query for filtering the search results

  • min_score (float) – Minimum score for similarity metric

  • include_search_relevance (bool) – Whether to calculate a search_relevance cutoff score to flag relevant and less relevant results

  • search_relevance_cutoff_aggressiveness (int) – How aggressive the search_relevance cutoff score is (the higher the value, the fewer results will be relevant)

  • asc (bool) – Whether to sort results in ascending order (True) or descending order (False)

  • include_score (bool) – Whether to include score

  • n_init (int) – Number of runs with different centroid seeds

  • n_iter (int) – Number of iterations in each run
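
The selection step described for diversity (one tag from each cluster, starting from the closest) can be illustrated locally. This is a sketch of the idea only, not the service's implementation: one_per_cluster is a hypothetical function, the tags and scores are made-up sample data, and the cluster labels are assumed to have been computed already (for example by a k-means run governed by n_clusters, n_init, and n_iter).

```python
def one_per_cluster(tags, labels):
    """Keep the best-scoring tag from each cluster, closest first.

    tags   -- list of (tag, score) pairs; a higher score means a closer match
    labels -- cluster id for each tag, in the same order as tags
    """
    # Rank every tag by score, highest (closest) first.
    ranked = sorted(zip(tags, labels), key=lambda pair: pair[0][1], reverse=True)
    seen = set()
    picked = []
    for (tag, score), label in ranked:
        # The first tag seen for a cluster is that cluster's closest tag.
        if label not in seen:
            seen.add(label)
            picked.append((tag, score))
    return picked


# Made-up sample data: five scored tags assigned to three clusters.
tags = [("sneaker", 0.91), ("trainer", 0.89), ("red", 0.75),
        ("crimson", 0.70), ("sale", 0.40)]
labels = [0, 0, 1, 1, 2]

diverse = one_per_cluster(tags, labels)
print(diverse)  # [('sneaker', 0.91), ('red', 0.75), ('sale', 0.4)]
```

Near-duplicate tags such as "sneaker" and "trainer" fall into the same cluster, so only the closer of the two is returned.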