relevanceai.api.endpoints.tasks
Tasks Module
Module Contents
Classes
TasksClient – Base class for all relevanceai client utilities
- class relevanceai.api.endpoints.tasks.TasksClient(project: str, api_key: str)
Bases: relevanceai.base._Base
Base class for all relevanceai client utilities
- create(self, dataset_id, task_name, task_parameters)
Create a task on a dataset. Tasks extend VecDb AI with additional functionality and more flexible ways of searching.
- Parameters
dataset_id (string) – Unique name of dataset
task_name (string) – Name of task to complete
task_parameters (dict) – Parameters of task to complete
- status(self, dataset_id: str, task_id: str)
Get the status of a collection-level job: whether it is starting, running, failed or finished.
- Parameters
dataset_id (string) – Unique name of dataset
task_id (string) – Unique name of task
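Because a job moves through the states listed above, a caller usually polls until it reaches a terminal one. The following is a minimal sketch of such a loop, not the client's own implementation. The helper name `wait_for_task`, the `max_pings` and `sleep` parameters, and the response shape `{"status": ...}` are all assumptions made for illustration; only the four state names and the `time_between_ping` parameter name come from this page.

```python
import time
from typing import Callable, Dict

# Assumption: these strings mirror the states named in the description of
# status(); the exact values returned by the service are not confirmed here.
TERMINAL_STATES = {"failed", "finished"}


def wait_for_task(
    get_status: Callable[[], Dict[str, str]],
    time_between_ping: float = 10,
    max_pings: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status() until it reports a terminal state, then return it."""
    for _ in range(max_pings):
        state = get_status().get("status", "")
        if state in TERMINAL_STATES:
            return state
        sleep(time_between_ping)
    raise TimeoutError(f"task not finished after {max_pings} pings")


# Stand-in for something like `lambda: client.status(dataset_id, task_id)`.
# The response shape {"status": ...} is hypothetical.
_fake_responses = iter(["starting", "running", "running", "finished"])


def fake_status() -> Dict[str, str]:
    return {"status": next(_fake_responses)}


final_state = wait_for_task(fake_status, time_between_ping=0, sleep=lambda s: None)
print(final_state)
```

Passing the status call in as a function keeps the loop independent of the client, so it can be exercised without network access.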
- list(self, dataset_id: str, show_active_only: bool = True)
List the history of all jobs, including each job's job_id, parameters, start time, etc.
- Parameters
dataset_id (string) – Unique name of dataset
show_active_only (bool) – Whether to show only active jobs
- _loop_status_until_finish(self, dataset_id: str, task_id: str, verbose: bool = True, time_between_ping: int = 10)
- _check_status_until_finish(self, dataset_id: str, task_id: str, status_checker: bool = True, verbose: bool = True, time_between_ping: int = 10)
- create_cluster_task(self, dataset_id, vector_field: str, n_clusters: int, alias: str = 'default', refresh: bool = False, n_iter: int = 10, n_init: int = 5, status_checker: bool = True, verbose: bool = True, time_between_ping: int = 10)
Start a task which creates clusters for a dataset based on a vector field.
- Parameters
dataset_id (string) – Unique name of dataset
vector_field (string) – The field to cluster on
alias (string) – Alias is used to name a cluster
n_clusters (int) – Number of clusters to be specified
n_iter (int) – Number of iterations in each run
n_init (int) – Number of runs to run with different centroid seeds
refresh (bool) – Whether to rerun task on the whole dataset or just the ones missing the output
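The parameters `n_clusters`, `n_iter` and `n_init` resemble those of k-means clustering, although this page does not state which algorithm the service runs. Under that assumption, the sketch below shows how the three parameters could interact: each of the `n_init` runs starts from different random centroids, refines them for `n_iter` iterations, and the run with the lowest total squared distance wins. The function `kmeans` and its `seed` argument are illustrative only.

```python
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def _sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _assign(points: List[Vector], centroids: List[Vector]) -> List[int]:
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: _sq_dist(p, centroids[c]))
        for p in points
    ]


def kmeans(
    points: List[Vector],
    n_clusters: int,
    n_iter: int = 10,
    n_init: int = 5,
    seed: int = 0,
) -> Tuple[List[int], float]:
    """Plain k-means (an assumed algorithm); returns (labels, inertia)."""
    rng = random.Random(seed)
    best_labels: List[int] = []
    best_inertia = float("inf")
    for _ in range(n_init):  # one run per set of starting centroids
        centroids = [list(p) for p in rng.sample(points, n_clusters)]
        for _ in range(n_iter):  # refine the centroids
            labels = _assign(points, centroids)
            for c in range(n_clusters):
                members = [p for p, lab in zip(points, labels) if lab == c]
                if members:  # an empty cluster keeps its old centroid
                    centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
        labels = _assign(points, centroids)
        inertia = sum(_sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
        if inertia < best_inertia:  # keep the tightest clustering seen so far
            best_labels, best_inertia = labels, inertia
    return best_labels, best_inertia


vectors = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [5.0, 5.1], [5.1, 5.0], [4.9, 5.2]]
labels, inertia = kmeans(vectors, n_clusters=2)
print(labels)
```

The first three vectors end up sharing one label and the last three the other; which group receives label 0 depends on the random initialisation.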
- create_numeric_encoder_task(self, dataset_id: str, fields: list, vector_name: str = '_vector_', status_checker: bool = True, verbose: bool = True, time_between_ping: int = 10)
Within a collection encode the specified dictionary field in every document into vectors.
For example: a dictionary that represents a person’s characteristics visiting a store:
>>> document 1 field: {"person_characteristics" : {"height":180, "age":40, "weight":70}}
>>> document 2 field: {"person_characteristics" : {"age":32, "purchases":10, "visits": 24}}
>>> -> <Encode the dictionaries to vectors> ->
>>> | height | age | weight | purchases | visits |
>>> |--------|-----|--------|-----------|--------|
>>> | 180    | 40  | 70     | 0         | 0      |
>>> | 0      | 32  | 0      | 10        | 24     |
>>> document 1 dictionary vector: {"person_characteristics_vector_": [180, 40, 70, 0, 0]}
>>> document 2 dictionary vector: {"person_characteristics_vector_": [0, 32, 0, 10, 24]}
- Parameters
dataset_id (string) – Unique name of dataset
fields (list) – The numeric fields to encode into vectors
vector_name (string) – The name of the vector field created
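The dictionary-to-vector encoding in the example above can be reproduced locally with a few lines of Python. This is a minimal sketch of the idea, not the service's implementation: the function name `encode_numeric_dicts` is hypothetical, and ordering the columns by first appearance is an assumption chosen to match the table in the example.

```python
from typing import Dict, List, Tuple


def encode_numeric_dicts(
    dicts: List[Dict[str, float]],
) -> Tuple[List[str], List[List[float]]]:
    """Turn dictionaries of numbers into equal-length vectors.

    Columns are ordered by first appearance (an assumption); a key that is
    missing from a dictionary contributes 0 to its vector.
    """
    columns: List[str] = []
    for d in dicts:
        for key in d:
            if key not in columns:
                columns.append(key)
    vectors = [[d.get(key, 0) for key in columns] for d in dicts]
    return columns, vectors


documents = [
    {"person_characteristics": {"height": 180, "age": 40, "weight": 70}},
    {"person_characteristics": {"age": 32, "purchases": 10, "visits": 24}},
]
columns, vectors = encode_numeric_dicts(
    [doc["person_characteristics"] for doc in documents]
)
print(columns)
print(vectors)
```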
- create_encode_categories_task(self, dataset_id: str, fields: list, status_checker: bool = True, verbose: bool = True, time_between_ping: int = 10)
Within a collection encode the specified array field in every document into vectors.
For example, array that represents a movie’s categories:
>>> document 1 array field: {"category" : ["sci-fi", "thriller", "comedy"]}
>>> document 2 array field: {"category" : ["sci-fi", "romance", "drama"]}
>>> -> <Encode the arrays to vectors> ->
>>> | sci-fi | thriller | comedy | romance | drama |
>>> |--------|----------|--------|---------|-------|
>>> | 1      | 1        | 1      | 0       | 0     |
>>> | 1      | 0        | 0      | 1       | 1     |
>>> document 1 array vector: {"movie_categories_vector_": [1, 1, 1, 0, 0]}
>>> document 2 array vector: {"movie_categories_vector_": [1, 0, 0, 1, 1]}
- Parameters
dataset_id (string) – Unique name of dataset
fields (list) – The array fields to encode into vectors
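The array encoding in the example above is a multi-hot encoding: one column per distinct category, set to 1 when the document contains it. The sketch below reproduces the example under the assumption that columns are ordered by first appearance; the function name `encode_categories` is hypothetical and does not come from the library.

```python
from typing import List, Tuple


def encode_categories(arrays: List[List[str]]) -> Tuple[List[str], List[List[int]]]:
    """Multi-hot encode lists of category labels.

    The vocabulary is ordered by first appearance (an assumption made to
    match the table in the example).
    """
    vocabulary: List[str] = []
    for array in arrays:
        for category in array:
            if category not in vocabulary:
                vocabulary.append(category)
    vectors = [
        [1 if category in array else 0 for category in vocabulary]
        for array in arrays
    ]
    return vocabulary, vectors


documents = [
    {"category": ["sci-fi", "thriller", "comedy"]},
    {"category": ["sci-fi", "romance", "drama"]},
]
vocabulary, category_vectors = encode_categories(
    [doc["category"] for doc in documents]
)
print(vocabulary)
print(category_vectors)
```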
- create_encode_text_task(self, dataset_id: str, field: str, alias: str = 'default', refresh: bool = False, status_checker: bool = True, verbose: bool = True, time_between_ping: int = 10)
Start a task which encodes a text field.
- Parameters
dataset_id (string) – Unique name of dataset
field (string) – The field to encode
alias (string) – Alias used to name a vector field. Belongs in field_{alias}vector
refresh (bool) – Whether to rerun task on the whole dataset or just the ones missing the output
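The `refresh` flag decides which documents a task touches. The sketch below illustrates that selection on an in-memory list, assuming a document counts as "missing the output" when the output field is absent. The helper `documents_to_encode` and the field name `title_vector_` are hypothetical; the real selection happens inside the service.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]


def documents_to_encode(
    documents: List[Document], output_field: str, refresh: bool = False
) -> List[Document]:
    """Pick the documents a task would process.

    refresh=True  -> every document is processed again.
    refresh=False -> only documents that lack `output_field`.
    """
    if refresh:
        return list(documents)
    return [doc for doc in documents if output_field not in doc]


docs = [
    {"_id": "a", "title": "first", "title_vector_": [0.1, 0.2]},  # already encoded
    {"_id": "b", "title": "second"},  # output missing
]

only_missing = [d["_id"] for d in documents_to_encode(docs, "title_vector_")]
everything = [d["_id"] for d in documents_to_encode(docs, "title_vector_", refresh=True)]
print(only_missing, everything)
```

Leaving `refresh` at its default avoids recomputing vectors that already exist, which matters when only a few new documents have been added to a large dataset.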
- create_encode_textimage_task(self, dataset_id: str, field: str, alias: str = 'default', refresh: bool = False, status_checker: bool = True, verbose: bool = True, time_between_ping: int = 10)
Start a task which encodes a text field for image representation.
- Parameters
dataset_id (string) – Unique name of dataset
field (string) – The field to encode
alias (string) – Alias used to name a vector field. Belongs in field_{alias}vector
refresh (bool) – Whether to rerun task on the whole dataset or just the ones missing the output
- create_encode_imagetext_task(self, dataset_id: str, field: str, alias: str = 'default', refresh: bool = False, status_checker: bool = True, verbose: bool = True, time_between_ping: int = 10)
Start a task which encodes an image field for text representation.
- Parameters
dataset_id (string) – Unique name of dataset
field (string) – The field to encode
alias (string) – Alias used to name a vector field. Belongs in field_{alias}vector
refresh (bool) – Whether to rerun task on the whole dataset or just the ones missing the output