Gen3 Index Class

class gen3.index.Gen3Index(endpoint=None, auth_provider=None, service_location='index')[source]

Bases: object

A class for interacting with the Gen3 Index services.

Parameters:
  • endpoint (str) – public endpoint for reading/querying indexd - only necessary if auth_provider not provided

  • auth_provider (Gen3Auth) – A Gen3Auth class instance or indexd basic creds tuple

Examples

This creates a Gen3Index instance pointed at the sandbox commons, using the credentials.json downloaded from the commons profile page.

>>> from gen3.auth import Gen3Auth
>>> from gen3.index import Gen3Index
>>> auth = Gen3Auth(refresh_file="credentials.json")
>>> index = Gen3Index(auth_provider=auth)

async async_create_record(hashes, size, did=None, urls=None, file_name=None, metadata=None, baseid=None, acl=None, urls_metadata=None, version=None, authz=None, _ssl=None, description=None, content_created_date=None, content_updated_date=None)[source]

Asynchronous function to create a record in indexd.

Parameters:
  • hashes (dict) – {hash type: hash value,} e.g. hashes={'md5': 'ab167e49d25b488939b1ede42752458b'}

  • size (int) – file size metadata associated with a given uuid

  • did (str) – UUID to assign to the new indexd record

  • urls (list) – list of URLs from which the file identified by the UUID can be downloaded

  • acl (list) – access control list

  • authz (str) – RBAC string

  • file_name (str) – name of the file associated with a given UUID

  • metadata (dict) – additional key value metadata for this entry

  • urls_metadata (dict) – metadata attached to each url

  • baseid (str) – optional baseid to group this entry with its previous versions

  • version (str) – entry version string

  • description (str) – optional description of the object

  • content_created_date (datetime) – optional creation date and time of the content being indexed

  • content_updated_date (datetime) – optional update date and time of the content being indexed

Returns:

json representation of an entry in indexd

Return type:

Document
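
As a minimal sketch of how this coroutine might be driven, the helper below registers several entries concurrently. The name register_all, the max_concurrent limit, and the shape of entries are illustrative assumptions, not part of the library; index is expected to be a Gen3Index instance whose credentials permit writes.

```python
import asyncio


async def register_all(index, entries, max_concurrent=5):
    """Create one indexd record per entry, a few at a time.

    `entries` is a hypothetical list of dicts whose keys match the
    documented keyword arguments, for example
    {"hashes": {"md5": "..."}, "size": 123, "urls": ["s3://bucket/key"]}.
    """
    limiter = asyncio.Semaphore(max_concurrent)

    async def register_one(entry):
        # The semaphore caps how many requests are in flight at once.
        async with limiter:
            return await index.async_create_record(**entry)

    # gather() returns results in the same order as `entries`.
    return await asyncio.gather(*(register_one(entry) for entry in entries))
```

The helper would be run with asyncio.run(register_all(index, entries)); the concurrency cap is a design choice to avoid flooding the service, not a documented requirement.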

async async_get_record(guid=None, _ssl=None)[source]

Asynchronous function to request a record from indexd.

Parameters:

guid (str) – record guid

Returns:

indexd record

Return type:

dict

async async_get_records_from_checksum(checksum, checksum_type='md5', _ssl=None)[source]

Asynchronous function to request records from indexd matching checksum.

Parameters:
  • checksum (str) – indexd checksum to request

  • checksum_type (str) – type of checksum, defaults to md5

Returns:

List of indexd records

Return type:

List[dict]

async async_get_records_on_page(limit=None, page=None, _ssl=None)[source]

Asynchronous function to request a page from indexd.

Parameters:

page (int/str) – indexd page to request

Returns:

List of indexd records from the page

Return type:

List[dict]
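
A minimal sketch of walking through pages with this coroutine follows. It assumes that pages are numbered from 0 and that an empty list marks the end of the data; neither is stated above, so check both against your indexd instance. The name collect_pages and the max_pages guard are hypothetical.

```python
import asyncio


async def collect_pages(index, limit=100, max_pages=10):
    """Gather records page by page until a page comes back empty.

    Assumptions: page numbering starts at 0 and an empty page means
    there is nothing left to read.
    """
    records = []
    for page in range(max_pages):
        batch = await index.async_get_records_on_page(limit=limit, page=page)
        if not batch:
            break
        records.extend(batch)
    return records
```

The max_pages guard keeps the loop bounded if the service never returns an empty page.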

async async_get_with_params(params, _ssl=None)[source]

Return a document object corresponding to the supplied parameters

  • need to include all the hashes in the request

  • need to handle the query param ‘hash’: ‘hash_type:hash’

Parameters:
  • params (dict) – params to search with

  • _ssl (None, optional) – whether or not to use ssl

Returns:

json representation of an entry in indexd

Return type:

Document

async async_query_urls(pattern, _ssl=None)[source]

Asynchronous function to query urls from indexd.

Parameters:

pattern (str) – pattern to match against indexd urls

Returns:

indexd records with urls matching pattern

Return type:

List[records]

async async_update_record(guid, file_name=None, urls=None, version=None, metadata=None, acl=None, authz=None, urls_metadata=None, _ssl=None, description=None, content_created_date=None, content_updated_date=None, **kwargs)[source]

Asynchronous function to update a record in indexd.

Parameters:
  • guid (str) – record id

  • body – json/dictionary format: the index record information that needs to be updated. Size and hash cannot be updated; create a new version for that

create_blank(uploader, file_name=None)[source]

Create a blank record

Parameters:
  • format (json - json in the) –

  • { – ‘uploader’: type(string) ‘file_name’: type(string) (optional*)

  • }

create_new_version(guid, hashes, size, did=None, urls=None, file_name=None, metadata=None, acl=None, urls_metadata=None, version=None, authz=None, description=None, content_created_date=None, content_updated_date=None)[source]

Add new version for the document associated to the provided uuid

Since data content is immutable, when you want to change the size or hash, a new index document with a new uuid needs to be created as its new version. That uuid is returned in the did field of the response. The old index document is not deleted.

Parameters:
  • guid (str) – record id

  • hashes (dict) – {hash type: hash value,} e.g. hashes={'md5': 'ab167e49d25b488939b1ede42752458b'}

  • size (int) – file size metadata associated with a given uuid

  • did (str) – UUID to assign to the new indexd record

  • urls (list) – list of URLs from which the file identified by the UUID can be downloaded

  • file_name (str) – name of the file associated with a given UUID

  • metadata (dict) – additional key value metadata for this entry

  • acl (list) – access control list

  • urls_metadata (dict) – metadata attached to each url

  • version (str) – entry version string

  • authz (str) – RBAC string

  • description (str) – optional description of the object

  • content_created_date (datetime) – optional creation date and time of the content being indexed

  • content_updated_date (datetime) – optional update date and time of the content being indexed

  • body – json/dictionary format: the metadata object that needs to be added to the store. Providing size and at least one hash is necessary and sufficient. Note: it is a good idea to add a version number
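
The requirement that size and at least one hash be provided can be checked before the request is sent. The sketch below is one way to do that; add_version is a hypothetical helper, and it assumes the call succeeds and returns a dict-like response carrying the new uuid in its did field, as described above.

```python
def add_version(index, guid, hashes, size, version=None):
    """Create a new version of `guid` and return the new record's did."""
    if not hashes:
        raise ValueError("at least one hash is required")
    if size is None or size < 0:
        raise ValueError("size must be a non-negative integer")
    response = index.create_new_version(
        guid, hashes=hashes, size=size, version=version
    )
    # Assumption: the response is dict-like and includes "did".
    return response["did"]
```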

create_record(hashes, size, did=None, urls=None, file_name=None, metadata=None, baseid=None, acl=None, urls_metadata=None, version=None, authz=None, description=None, content_created_date=None, content_updated_date=None)[source]

Create a new record and add it to the index

Parameters:
  • hashes (dict) – {hash type: hash value,} e.g. hashes={'md5': 'ab167e49d25b488939b1ede42752458b'}

  • size (int) – file size metadata associated with a given uuid

  • did (str) – UUID to assign to the new indexd record

  • urls (list) – list of URLs from which the file identified by the UUID can be downloaded

  • acl (list) – access control list

  • authz (str) – RBAC string

  • file_name (str) – name of the file associated with a given UUID

  • metadata (dict) – additional key value metadata for this entry

  • urls_metadata (dict) – metadata attached to each url

  • baseid (str) – optional baseid to group this entry with its previous versions

  • version (str) – entry version string

  • description (str) – optional description of the object

  • content_created_date (datetime) – optional creation date and time of the content being indexed

  • content_updated_date (datetime) – optional update date and time of the content being indexed

Returns:

json representation of an entry in indexd

Return type:

Document
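
The hashes and size arguments are usually derived from the file being indexed. The sketch below computes both with the standard library and passes them to create_record. The helper names are hypothetical, and the choice of md5 plus sha256 is an assumption; confirm which hash types your indexd deployment accepts.

```python
import hashlib
import os


def describe_file(path, algorithms=("md5", "sha256"), chunk_size=1024 * 1024):
    """Return (hashes, size) for a local file, reading it in chunks."""
    digests = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    hashes = {name: digest.hexdigest() for name, digest in digests.items()}
    return hashes, os.path.getsize(path)


def register_file(index, path, urls):
    """Create an indexd record for the file at `path`."""
    hashes, size = describe_file(path)
    return index.create_record(
        hashes=hashes,
        size=size,
        urls=urls,
        file_name=os.path.basename(path),
    )
```

Reading in chunks keeps memory use flat for large files, and all digests are updated in a single pass over the data.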

delete_record(guid)[source]

Delete an entry from the index

Parameters:

guid (str) – record id

Returns: Nothing

get(guid, dist_resolution=True)[source]

Get the metadata associated with the given id, alias, or distributed identifier

Parameters:
  • guid (str) – record id

  • dist_resolution (bool) – optional; specify whether we want distributed resolution or not

get_all_records(limit=None, paginate=False, start=None)[source]

Get a list of all records

get_guids_prefix()[source]

Get the prefix for GUIDs if there is one

Returns:

prefix for this instance

Return type:

str

get_latest_version(guid, has_version=False)[source]

Get the metadata of the latest index record version associated with the given id

Parameters:
  • guid (str) – record id

  • has_version (bool) – optional; exclude entries without a version

get_record(guid)[source]

Get the metadata associated with a given id

get_record_doc(guid)[source]

Get the metadata associated with a given id

get_records(dids)[source]

Get a list of documents given a list of dids

Parameters:

dids (list) – a list of record ids

Returns:

json representing index records

Return type:

list
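
For long lists of dids it may be preferable to send several smaller requests. The following is a sketch only: get_records_in_batches and the batch_size default are hypothetical, and it assumes each call returns a list as documented above.

```python
def get_records_in_batches(index, dids, batch_size=100):
    """Fetch records for `dids` in slices of at most `batch_size`."""
    dids = list(dids)
    records = []
    for start in range(0, len(dids), batch_size):
        records.extend(index.get_records(dids[start:start + batch_size]))
    return records
```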

get_records_on_page(limit=None, page=None)[source]

Get a list of all records given the page and page size limit

get_stats()[source]

Return basic info about the records in indexd

get_urls(size=None, hashes=None, guids=None)[source]

Get a list of urls that match query params

Parameters:
  • size (int) – object size

  • hashes (str) – hashes specified as algorithm:value

  • guids (list) – list of ids
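
The algorithm:value form of the hashes argument can be illustrated with a small wrapper. This is a sketch; urls_for_checksum is a hypothetical name and the shape of the return value is not specified above, so it is passed through unchanged.

```python
def urls_for_checksum(index, algorithm, digest, size=None):
    """Look up URLs for one checksum, passed as 'algorithm:value'."""
    return index.get_urls(size=size, hashes=f"{algorithm}:{digest}")
```

For example, urls_for_checksum(index, "md5", "ab167e49d25b488939b1ede42752458b") sends hashes as "md5:ab167e49d25b488939b1ede42752458b".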

get_valid_guids(count=None)[source]

Get a list of valid GUIDs without indexing

Parameters:

count (int) – number of GUIDs to request

Returns:

list of valid indexd GUIDs

Return type:

List[str]

get_version()[source]

Return the version of indexd

get_versions(guid)[source]

Get the metadata of the index record versions associated with the given id

Parameters:

guid (str) – record id

get_with_params(params=None)[source]

Return a document object corresponding to the supplied parameters, such as {'hashes': {'md5': '...'}, 'size': '...', 'metadata': {'file_state': '...'}}.

  • need to include all the hashes in the request

  • index client like signpost or indexd will need to handle the query param ‘hash’: ‘hash_type:hash’
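
A sketch of assembling params in the shape shown above, leaving out keys that were not supplied. find_by_md5 is a hypothetical helper, and the value types accepted for size and file_state are an assumption.

```python
def find_by_md5(index, md5, size=None, file_state=None):
    """Query for a document by md5, optionally narrowing by size and state."""
    params = {"hashes": {"md5": md5}}
    if size is not None:
        params["size"] = size
    if file_state is not None:
        params["metadata"] = {"file_state": file_state}
    return index.get_with_params(params)
```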

is_healthy()[source]

Return whether indexd is healthy

query_urls(pattern)[source]

Query all record URLs for given pattern

Parameters:

pattern (str) – pattern to match against indexd urls

Returns:

indexd records with urls matching pattern

Return type:

List[records]

update_blank(guid, rev, hashes, size, urls=None, authz=None)[source]

Update only hashes and size for a blank index record

Parameters:
  • guid (string) – record id

  • rev (string) – data revision - simple consistency mechanism

  • hashes (dict) – {hash type: hash value,} e.g. hashes={'md5': 'ab167e49d25b488939b1ede42752458b'}

  • size (int) – file size metadata associated with a given uuid
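
The two blank-record calls are typically used together: create_blank reserves a record, and update_blank fills in hashes and size once the file is known. The sketch below assumes the value returned by create_blank is dict-like and carries did and rev keys, which is not stated above; fill_blank is a hypothetical name.

```python
def fill_blank(index, uploader, hashes, size, file_name=None):
    """Create a blank record, then set its hashes and size."""
    blank = index.create_blank(uploader, file_name=file_name)
    # Assumption: the blank record exposes "did" and "rev".
    return index.update_blank(
        blank["did"], blank["rev"], hashes=hashes, size=size
    )
```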

update_record(guid, file_name=None, urls=None, version=None, metadata=None, acl=None, authz=None, urls_metadata=None, description=None, content_created_date=None, content_updated_date=None)[source]

Update an existing entry in the index

Parameters:
  • guid (str) – record id

  • body – json/dictionary format: the index record information that needs to be updated. Size and hash cannot be updated; create a new version for that
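
A sketch of a common update, adding one URL to an existing record without dropping the ones already there. It assumes get_record returns a dict with a urls list; add_url is a hypothetical helper.

```python
def add_url(index, guid, url):
    """Append `url` to a record's URLs unless it is already present."""
    record = index.get_record(guid)
    # Assumption: the record is a dict with a "urls" list.
    urls = list(record.get("urls") or [])
    if url in urls:
        return record
    urls.append(url)
    return index.update_record(guid, urls=urls)
```

Reading the current list first matters here because the sketch treats the urls argument as the complete new list, not as an addition.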