Gen3 Index Class

class gen3.index.Gen3Index(endpoint=None, auth_provider=None, service_location='index')

Bases: object

A class for interacting with the Gen3 Index services.

Parameters:

endpoint (str) – public endpoint for reading/querying indexd - only necessary if auth_provider not provided
auth_provider (Gen3Auth) – A Gen3Auth class instance or indexd basic creds tuple

Examples

This creates a Gen3Index instance pointed at the sandbox commons, using the credentials.json file downloaded from the commons profile page.

>>> from gen3.auth import Gen3Auth
>>> from gen3.index import Gen3Index
>>> auth = Gen3Auth(refresh_file="credentials.json")
>>> index = Gen3Index(auth_provider=auth)

async async_create_record(hashes, size, did=None, urls=None, file_name=None, metadata=None, baseid=None, acl=None, urls_metadata=None, version=None, authz=None, _ssl=None, description=None, content_created_date=None, content_updated_date=None)

Asynchronous function to create a record in indexd.

Parameters:

hashes (dict) – {hash type: hash value}, e.g. hashes={'md5': 'ab167e49d25b488939b1ede42752458b'}
size (int) – file size metadata associated with a given uuid
did (str) – UUID to use for the new indexd record
urls (list) – list of URLs where you can download the UUID
acl (list) – access control list
authz (str) – RBAC string
file_name (str) – name of the file associated with a given UUID
metadata (dict) – additional key value metadata for this entry
urls_metadata (dict) – metadata attached to each url
baseid (str) – optional baseid that groups this entry with its previous versions
version (str) – entry version string
description (str) – optional description of the object
content_created_date (datetime) – optional creation date and time of the content being indexed
content_updated_date (datetime) – optional update date and time of the content being indexed

Returns:

json representation of an entry in indexd

Return type:

Document
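The `hashes` and `size` arguments have to be derived from the file content before a record can be created. The following is a minimal sketch using only the standard library; `build_record_fields` is a hypothetical helper that is not part of the SDK, and the URL and file name are placeholder values.

```python
import hashlib


def build_record_fields(content: bytes, urls=None, file_name=None):
    """Hypothetical helper: derive `hashes` and `size` from raw file content."""
    fields = {
        # indexd expects a {hash type: hash value} mapping
        "hashes": {"md5": hashlib.md5(content).hexdigest()},
        "size": len(content),
    }
    if urls is not None:
        fields["urls"] = list(urls)
    if file_name is not None:
        fields["file_name"] = file_name
    return fields


fields = build_record_fields(
    b"hello",
    urls=["s3://example-bucket/hello.txt"],  # placeholder location
    file_name="hello.txt",
)
```

With an authenticated client, the resulting dict could then be unpacked into the call, for example `index.create_record(**fields)`.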

async async_get_record(guid=None, _ssl=None)

Asynchronous function to request a record from indexd.

Parameters: guid (str) – record guid
Returns: indexd record
Return type: dict

async async_get_records_from_checksum(checksum, checksum_type='md5', _ssl=None)

Asynchronous function to request records from indexd matching checksum.

Parameters:

checksum (str) – indexd checksum to request
checksum_type (str) – type of checksum, defaults to md5

Returns:

List of indexd records

Return type:

List[dict]
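Because the lookup is a coroutine, several checksums can be queried concurrently. The sketch below assumes only the signature documented above; `FakeIndex` is a hypothetical stand-in so the example runs without a commons, and `lookup_checksums` is an illustrative helper rather than an SDK function.

```python
import asyncio


class FakeIndex:
    """Hypothetical stand-in for Gen3Index, backed by an in-memory list."""

    def __init__(self, records):
        self._records = records

    async def async_get_records_from_checksum(self, checksum, checksum_type="md5", _ssl=None):
        return [r for r in self._records if r["hashes"].get(checksum_type) == checksum]


async def lookup_checksums(index, checksums, checksum_type="md5"):
    """Query several checksums concurrently; returns {checksum: [records]}."""
    results = await asyncio.gather(
        *(
            index.async_get_records_from_checksum(c, checksum_type=checksum_type)
            for c in checksums
        )
    )
    return dict(zip(checksums, results))


fake = FakeIndex(
    [
        {"did": "guid-1", "hashes": {"md5": "aaa"}},
        {"did": "guid-2", "hashes": {"md5": "bbb"}},
    ]
)
found = asyncio.run(lookup_checksums(fake, ["aaa", "ccc"]))
```

A real `Gen3Index` instance could be passed in place of `fake`, since the helper relies only on the documented method name and arguments.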

async async_get_records_on_page(limit=None, page=None, _ssl=None)

Asynchronous function to request a page from indexd.

Parameters:

limit (int) – page size limit
page (int/str) – indexd page to request

Returns:

List of indexd records from the page

Return type:

List[dict]
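Page requests can be chained to walk through every record. This sketch assumes that pages are numbered from 0 and that an empty list marks the end, neither of which is confirmed on this page; `FakePagedIndex` and `collect_all_records` are hypothetical names used only for illustration.

```python
import asyncio


class FakePagedIndex:
    """Hypothetical stand-in that serves records in fixed-size pages."""

    def __init__(self, records):
        self._records = records

    async def async_get_records_on_page(self, limit=None, page=None, _ssl=None):
        start = int(page) * limit
        return self._records[start:start + limit]


async def collect_all_records(index, limit=2):
    """Request pages 0, 1, 2, ... until one comes back empty (assumed end marker)."""
    records, page = [], 0
    while True:
        batch = await index.async_get_records_on_page(limit=limit, page=page)
        if not batch:
            return records
        records.extend(batch)
        page += 1


fake = FakePagedIndex([{"did": f"guid-{i}"} for i in range(5)])
all_records = asyncio.run(collect_all_records(fake, limit=2))
```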

async async_get_with_params(params, _ssl=None)

Return a document object corresponding to the supplied parameters.

All hashes need to be included in the request. The index service needs to handle the query param 'hash': 'hash_type:hash'.

Parameters:

params (dict) – params to search with
_ssl (None, optional) – whether or not to use ssl

Returns:

json representation of an entry in indexd

Return type:

Document

async async_query_urls(pattern, _ssl=None)

Asynchronous function to query urls from indexd.

Parameters: pattern (str) – pattern to match against indexd urls
Returns: indexd records with urls matching pattern
Return type: List[records]

async async_update_record(guid, file_name=None, urls=None, version=None, metadata=None, acl=None, authz=None, urls_metadata=None, _ssl=None, description=None, content_created_date=None, content_updated_date=None, **kwargs)

Asynchronous function to update a record in indexd.

Parameters:

guid (str) – record id
body – JSON/dictionary format; the index record information that needs to be updated. Size and hashes cannot be updated; create a new version for that.

create_blank(uploader, file_name=None)

Create a blank record

Parameters:

uploader (str) – uploader of the record
file_name (str) – optional file name

The values are sent as JSON in the format {'uploader': type(string), 'file_name': type(string) (optional)}.

create_new_version(guid, hashes, size, did=None, urls=None, file_name=None, metadata=None, acl=None, urls_metadata=None, version=None, authz=None, description=None, content_created_date=None, content_updated_date=None)

Add a new version of the document associated with the provided uuid

Since data content is immutable, when you want to change the size or hash, a new index document with a new uuid needs to be created as its new version. That uuid is returned in the did field of the response. The old index document is not deleted.

Parameters:

guid (str) – record id
hashes (dict) – {hash type: hash value}, e.g. hashes={'md5': 'ab167e49d25b488939b1ede42752458b'}
size (int) – file size metadata associated with a given uuid
did (str) – UUID to use for the new indexd record
urls (list) – list of URLs where you can download the UUID
file_name (str) – name of the file associated with a given UUID
metadata (dict) – additional key value metadata for this entry
acl (list) – access control list
urls_metadata (dict) – metadata attached to each url
version (str) – entry version string
authz (str) – RBAC string
description (str) – optional description of the object
content_created_date (datetime) – optional creation date and time of the content being indexed
content_updated_date (datetime) – optional update date and time of the content being indexed
body – JSON/dictionary format; the metadata object that needs to be added to the store. Providing size and at least one hash is necessary and sufficient. Note: it is a good idea to add a version number.
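The versioning flow described above can be sketched as follows. `FakeIndex` is a hypothetical stand-in so the example runs offline, `publish_new_version` is an illustrative helper, and the response is assumed to be a dict with a `did` key, in line with the statement that the new uuid is returned in the did field.

```python
class FakeIndex:
    """Hypothetical stand-in for Gen3Index; it only records the call it receives."""

    def create_new_version(self, guid, hashes, size, **kwargs):
        self.last_call = {"guid": guid, "hashes": hashes, "size": size, **kwargs}
        # Assumed response shape: the new uuid sits in the `did` field.
        return {"did": "new-guid-for-" + guid}


def publish_new_version(index, guid, hashes, size, version):
    """Create a new version for changed content and return the new record id."""
    response = index.create_new_version(guid, hashes=hashes, size=size, version=version)
    return response["did"]


index = FakeIndex()
new_guid = publish_new_version(
    index,
    guid="old-guid",
    hashes={"md5": "ab167e49d25b488939b1ede42752458b"},
    size=2048,
    version="2",
)
```

Passing an explicit `version` follows the note above that adding a version number is a good idea.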

create_record(hashes, size, did=None, urls=None, file_name=None, metadata=None, baseid=None, acl=None, urls_metadata=None, version=None, authz=None, description=None, content_created_date=None, content_updated_date=None)

Create a new record and add it to the index

Parameters:

hashes (dict) – {hash type: hash value}, e.g. hashes={'md5': 'ab167e49d25b488939b1ede42752458b'}
size (int) – file size metadata associated with a given uuid
did (str) – UUID to use for the new indexd record
urls (list) – list of URLs where you can download the UUID
acl (list) – access control list
authz (str) – RBAC string
file_name (str) – name of the file associated with a given UUID
metadata (dict) – additional key value metadata for this entry
urls_metadata (dict) – metadata attached to each url
baseid (str) – optional baseid that groups this entry with its previous versions
version (str) – entry version string
description (str) – optional description of the object
content_created_date (datetime) – optional creation date and time of the content being indexed
content_updated_date (datetime) – optional update date and time of the content being indexed

Returns:

json representation of an entry in indexd

Return type:

Document

delete_record(guid)

Delete an entry from the index

Parameters: guid (str) – record id
Returns: Nothing

get(guid, dist_resolution=True)

Get the metadata associated with the given id, alias, or distributed identifier

Parameters:

guid (str) – record id
dist_resolution (bool) – optional; specify whether distributed resolution is wanted or not

get_all_records(limit=None, paginate=False, start=None)

Get a list of all records

get_guids_prefix()

Get the prefix for GUIDs if there is one

Returns: prefix for this instance
Return type: str

get_latest_version(guid, has_version=False)

Get the metadata of the latest index record version associated with the given id

Parameters:

guid (str) – record id
has_version (bool) – optional; exclude entries without a version

get_record(guid)

Get the metadata associated with a given id

get_record_doc(guid)

Get the metadata associated with a given id

get_records(dids)

Get a list of documents given a list of dids

Parameters: dids (list) – a list of record ids
Returns: json representing index records
Return type: list

get_records_on_page(limit=None, page=None)

Get a list of all records given the page and page size limit

get_stats()

Return basic info about the records in indexd

get_urls(size=None, hashes=None, guids=None)

Get a list of urls that match query params

Parameters:

size (int) – object size
hashes (str) – hashes specified as algorithm:value
guids (list) – list of ids

get_valid_guids(count=None)

Get a list of valid GUIDs without indexing

Parameters: count (int) – number of GUIDs to request
Returns: list of valid indexd GUIDs
Return type: List[str]

get_version()

Return the version of indexd

get_versions(guid)

Get the metadata of the index record versions associated with the given id

Parameters: guid (str) – record id

get_with_params(params=None)

Return a document object corresponding to the supplied parameters, such as {'hashes': {'md5': '...'}, 'size': '...', 'metadata': {'file_state': '...'}}.

All hashes need to be included in the request. An index client like signpost or indexd needs to handle the query param 'hash': 'hash_type:hash'.
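To make the relation between the nested `params` dict and the 'hash': 'hash_type:hash' query param concrete, here is one possible flattening. This is a sketch, not necessarily what the SDK does internally: `flatten_params` is a hypothetical helper, encoding `metadata` entries as 'key:value' is an assumption, and the `file_state` value is a placeholder.

```python
def flatten_params(params):
    """Hypothetical helper: turn nested search params into flat query params."""
    flat = {}
    for key, value in (params or {}).items():
        if key == "hashes":
            # {'md5': 'abc'} becomes the query param 'hash': ['md5:abc']
            flat["hash"] = [f"{hash_type}:{digest}" for hash_type, digest in value.items()]
        elif isinstance(value, dict):
            # Assumption: other nested dicts are encoded as 'key:value' strings too
            flat[key] = [f"{k}:{v}" for k, v in value.items()]
        else:
            flat[key] = value
    return flat


query = flatten_params(
    {
        "hashes": {"md5": "ab167e49d25b488939b1ede42752458b"},
        "size": "2048",
        "metadata": {"file_state": "registered"},  # placeholder value
    }
)
```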

is_healthy()

Return whether indexd is healthy or not

query_urls(pattern)

Query all record URLs for given pattern

Parameters: pattern (str) – pattern to match against indexd urls
Returns: indexd records with urls matching pattern
Return type: List[records]

update_blank(guid, rev, hashes, size, urls=None, authz=None)

Update only hashes and size for a blank index record

Parameters:

guid (str) – record id
rev (str) – data revision, a simple consistency mechanism
hashes (dict) – {hash type: hash value}, e.g. hashes={'md5': 'ab167e49d25b488939b1ede42752458b'}
size (int) – file size metadata associated with a given uuid
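A blank record is typically filled in once the content's hashes and size are known. The sketch below pairs `create_blank` with `update_blank` using the signatures documented on this page; the response shapes (a dict carrying `did` and `rev`) are assumptions not confirmed here, and `FakeIndex`, `fill_blank_record`, and the uploader value are hypothetical.

```python
class FakeIndex:
    """Hypothetical stand-in for Gen3Index; the response shapes are assumptions."""

    def create_blank(self, uploader, file_name=None):
        return {"did": "blank-guid", "rev": "rev-1"}

    def update_blank(self, guid, rev, hashes, size, urls=None, authz=None):
        return {"did": guid, "rev": rev, "hashes": hashes, "size": size}


def fill_blank_record(index, uploader, file_name, hashes, size):
    """Create a blank record, then attach hashes and size once they are known."""
    blank = index.create_blank(uploader, file_name=file_name)
    # `rev` is passed back so the service can check the record has not changed.
    return index.update_blank(blank["did"], blank["rev"], hashes=hashes, size=size)


filled = fill_blank_record(
    FakeIndex(),
    uploader="uploader@example.org",
    file_name="data.bin",
    hashes={"md5": "ab167e49d25b488939b1ede42752458b"},
    size=2048,
)
```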

update_record(guid, file_name=None, urls=None, version=None, metadata=None, acl=None, authz=None, urls_metadata=None, description=None, content_created_date=None, content_updated_date=None)

Update an existing entry in the index

Parameters:

guid (str) – record id
body – JSON/dictionary format; the index record information that needs to be updated. Size and hashes cannot be updated; create a new version for that.
