relevanceai.api.batch.batch_retrieve
Batch Retrieve
Module Contents
Classes
BatchRetrieveClient | API Client
Attributes
- relevanceai.api.batch.batch_retrieve.BYTE_TO_MB
- relevanceai.api.batch.batch_retrieve.LIST_SIZE_MULTIPLIER = 3
- class relevanceai.api.batch.batch_retrieve.BatchRetrieveClient(project: str, api_key: str)
Bases:
relevanceai.api.endpoints.client.APIClient, relevanceai.api.batch.chunk.Chunker
API Client
- get_documents(self, dataset_id: str, number_of_documents: int = 20, filters: list = [], cursor: str = None, batch_size: int = 1000, sort: list = [], select_fields: list = [], include_vector: bool = True)
Retrieve documents with filters. A filter query sets conditions, and only documents that match them are retrieved. Filters are also used in advanced search to narrow the documents that are searched.
To combine multiple filters with OR, add {"strict": "must_or"} inside the query.
- Parameters
dataset_id (string) – Unique name of dataset
number_of_documents (int) – Number of documents to retrieve
select_fields (list) – Fields to include in the search results. An empty list means all fields.
cursor (string) – Cursor to paginate the document retrieval
batch_size (int) – Number of documents to retrieve per iteration
include_vector (bool) – Include vectors in the search results
sort (list) – Fields to sort by. For each field, sort by descending or ascending. Sorting a datetime field in descending order returns the most recent documents first.
filters (list) – Query for filtering the search results
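As a rough illustration of the OR switch described for get_documents, the sketch below builds a filters list in which every entry carries "strict": "must_or". The helper make_filter and the key names field, filter_type, condition and condition_value are assumptions made for this example and are not confirmed by this page; only the "strict": "must_or" pair comes from the text above.

```python
from typing import Any, Dict, List


def make_filter(
    field: str,
    filter_type: str,
    condition: str,
    condition_value: Any,
    match_any: bool = False,
) -> Dict[str, Any]:
    """Build one filter entry (hypothetical helper; key names are assumed)."""
    entry: Dict[str, Any] = {
        "field": field,
        "filter_type": filter_type,
        "condition": condition,
        "condition_value": condition_value,
    }
    if match_any:
        # Documented switch for combining filters with OR instead of AND.
        entry["strict"] = "must_or"
    return entry


# Match documents whose (assumed) "category" field is either value.
filters: List[Dict[str, Any]] = [
    make_filter("category", "exact_match", "==", "shoes", match_any=True),
    make_filter("category", "exact_match", "==", "boots", match_any=True),
]

# With a configured client, the list would be passed along these lines:
# client.get_documents(
#     dataset_id="sample_dataset",
#     number_of_documents=50,
#     filters=filters,
#     include_vector=False,
# )
```

Setting include_vector=False in the commented call is only a suggestion for keeping responses small when the vectors themselves are not needed.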
- get_all_documents(self, dataset_id: str, chunk_size: int = 1000, filters: List = [], sort: List = [], select_fields: List = [], include_vector: bool = True, show_progress_bar: bool = True)
Retrieve all documents with filters. A filter query sets conditions, and only documents that match them are retrieved. Filters are also used in advanced search to narrow the documents that are searched. For more details see documents.get_where.
Example
>>> from relevanceai import Client
>>> client = Client()
>>> client.get_all_documents(dataset_id="sample_dataset")
- Parameters
dataset_id (string) – Unique name of dataset
chunk_size (int) – Number of documents to retrieve per retrieval
include_vector (bool) – Include vectors in the search results
sort (list) – Fields to sort by. For each field, sort by descending or ascending. If you are using descending by datetime, it will get the most recent ones.
filters (list) – Query for filtering the search results
select_fields (list) – Fields to include in the search results, empty array/list means all fields.
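The chunk_size parameter suggests that get_all_documents fetches the dataset in pages. The following is a minimal sketch of that general pattern, not the library's implementation: InMemoryClient, get_page and fetch_all are hypothetical names, and the response shape (a dict with "documents" and "cursor" keys) is an assumption.

```python
from typing import Any, Dict, List, Optional


class InMemoryClient:
    """Hypothetical stand-in that mimics cursor-style paging over a list."""

    def __init__(self, documents: List[Dict[str, Any]]) -> None:
        self._documents = list(documents)

    def get_page(self, page_size: int, cursor: Optional[str] = None) -> Dict[str, Any]:
        # The cursor is just the next start offset, encoded as a string.
        start = int(cursor) if cursor is not None else 0
        end = start + page_size
        return {"documents": self._documents[start:end], "cursor": str(end)}


def fetch_all(client: InMemoryClient, chunk_size: int = 1000) -> List[Dict[str, Any]]:
    """Collect every document by requesting chunk_size documents at a time."""
    collected: List[Dict[str, Any]] = []
    cursor: Optional[str] = None
    while True:
        page = client.get_page(page_size=chunk_size, cursor=cursor)
        if not page["documents"]:
            break  # an empty page means the dataset is exhausted
        collected.extend(page["documents"])
        cursor = page["cursor"]
    return collected


docs = [{"_id": str(i)} for i in range(2500)]
all_docs = fetch_all(InMemoryClient(docs), chunk_size=1000)
print(len(all_docs))
```

A smaller chunk_size means more requests but smaller responses, which can help when documents carry large vectors.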
- get_number_of_documents(self, dataset_id, filters=[])
Get number of documents in a dataset. Filter can be used to select documents that match the conditions set in a filter query. For more details see documents.get_where.
- Parameters
dataset_id (string) – Unique name of dataset
filters (list) – Filters to select documents
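One possible use of the document count is to estimate how many chunked requests a full retrieval would take. The sketch below assumes that get_number_of_documents returns an integer, which this page does not state; number_of_chunks is a hypothetical helper and the total is a placeholder value.

```python
import math


def number_of_chunks(total_documents: int, chunk_size: int = 1000) -> int:
    """Return how many requests of chunk_size are needed to cover the total."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return math.ceil(total_documents / chunk_size)


# With a configured client the total might come from (return type assumed):
# total = client.get_number_of_documents(dataset_id="sample_dataset")
total = 2500  # placeholder value for illustration

print(number_of_chunks(total, chunk_size=1000))
```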
- get_vector_fields(self, dataset_id)
Return the list of valid vector fields in the dataset.
- Parameters
dataset_id (string) – Unique name of dataset