Gen3 Index Class
- class gen3.index.Gen3Index(endpoint=None, auth_provider=None, service_location='index')
Bases: object
A class for interacting with the Gen3 Index services.
- Parameters:
endpoint (str) – public endpoint for reading/querying indexd; only necessary if auth_provider is not provided
auth_provider (Gen3Auth) – A Gen3Auth class instance or indexd basic creds tuple
Examples
This creates a Gen3Index instance pointed at the sandbox commons, using the credentials.json downloaded from the commons profile page.
>>> auth = Gen3Auth(refresh_file="credentials.json")
>>> index = Gen3Index(auth)
- async async_create_record(hashes, size, did=None, urls=None, file_name=None, metadata=None, baseid=None, acl=None, urls_metadata=None, version=None, authz=None, _ssl=None, description=None, content_created_date=None, content_updated_date=None)
Asynchronous function to create a record in indexd.
- Parameters:
hashes (dict) – {hash type: hash value,} e.g.
hashes={'md5': 'ab167e49d25b488939b1ede42752458b'}
size (int) – file size metadata associated with a given uuid
did (str) – UUID to assign to the new indexd record
urls (list) – list of URLs from which the file identified by the UUID can be downloaded
acl (list) – access control list
authz (list) – RBAC strings
file_name (str) – name of the file associated with a given UUID
metadata (dict) – additional key value metadata for this entry
urls_metadata (dict) – metadata attached to each url
baseid (str) – optional baseid to group with previous entries' versions
version (str) – entry version string
description (str) – optional description of the object
content_created_date (datetime) – optional creation date and time of the content being indexed
content_updated_date (datetime) – optional update date and time of the content being indexed
- Returns:
json representation of an entry in indexd
- Return type:
Document
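As an illustration of how the asynchronous variant might be used, the sketch below creates several records concurrently. The helper name `create_records_concurrently`, the shape of the `files` argument, and the choice of `asyncio.gather` are assumptions made for this example; `index` is assumed to be a Gen3Index instance.

```python
import asyncio


async def create_records_concurrently(index, files):
    """Create one indexd record per entry in `files`.

    `files` is a hypothetical list of dicts with the keys "hashes" and
    "size" and, optionally, "urls" and "file_name".
    `index` is assumed to be a Gen3Index instance.
    """
    tasks = [
        index.async_create_record(
            hashes=entry["hashes"],
            size=entry["size"],
            urls=entry.get("urls"),
            file_name=entry.get("file_name"),
        )
        for entry in files
    ]
    # gather() awaits every coroutine and returns results in input order
    return await asyncio.gather(*tasks)
```

From synchronous code, the coroutine could be driven with `asyncio.run(create_records_concurrently(index, files))`.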
- async async_get_record(guid=None, _ssl=None)
Asynchronous function to request a record from indexd.
- Parameters:
guid (str) – record guid
- Returns:
indexd record
- Return type:
dict
- async async_get_records_from_checksum(checksum, checksum_type='md5', _ssl=None)
Asynchronous function to request records from indexd matching checksum.
- Parameters:
checksum (str) – indexd checksum to request
checksum_type (str) – type of checksum, defaults to md5
- Returns:
List of indexd records
- Return type:
List[dict]
- async async_get_records_on_page(limit=None, page=None, _ssl=None)
Asynchronous function to request a page from indexd.
- Parameters:
page (int/str) – indexd page to request
- Returns:
List of indexd records from the page
- Return type:
List[dict]
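A paged read could be sketched as the loop below. The documentation above does not state how pages are numbered or how the last page is signalled, so this sketch assumes that pages count upward from `first_page` and that an empty list marks the end. The helper name `fetch_all_records` and the `max_pages` safety cap are hypothetical.

```python
import asyncio


async def fetch_all_records(index, limit=100, first_page=0, max_pages=1000):
    """Collect records page by page until an empty page is returned.

    Assumptions (not confirmed by the documentation): pages are numbered
    from `first_page` upward, and an empty list marks the end.
    `index` is assumed to be a Gen3Index instance.
    """
    records = []
    for page in range(first_page, first_page + max_pages):
        batch = await index.async_get_records_on_page(limit=limit, page=page)
        if not batch:
            break  # assumed end-of-data signal
        records.extend(batch)
    return records
```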
- async async_get_with_params(params, _ssl=None)
Return a document object corresponding to the supplied parameters.
All the hashes need to be included in the request.
The index service needs to handle the query param ‘hash’: ‘hash_type:hash’.
- Parameters:
params (dict) – params to search with
_ssl (None, optional) – whether or not to use ssl
- Returns:
json representation of an entry in indexd
- Return type:
Document
- async async_query_urls(pattern, _ssl=None)
Asynchronous function to query urls from indexd.
- Parameters:
pattern (str) – pattern to match against indexd urls
- Returns:
indexd records with urls matching pattern
- Return type:
List[records]
- async async_update_record(guid, file_name=None, urls=None, version=None, metadata=None, acl=None, authz=None, urls_metadata=None, _ssl=None, description=None, content_created_date=None, content_updated_date=None, **kwargs)
Asynchronous function to update a record in indexd.
- Parameters:
guid (str) – record id
body – json/dictionary format. Index record information that needs to be updated. Size and hash cannot be updated; use a new version for that.
- create_blank(uploader, file_name=None)
Create a blank record
- Parameters:
json – json in the format: {‘uploader’: type(string), ‘file_name’: type(string) (optional*)}
- create_new_version(guid, hashes, size, did=None, urls=None, file_name=None, metadata=None, acl=None, urls_metadata=None, version=None, authz=None, description=None, content_created_date=None, content_updated_date=None)
Add a new version of the document associated with the provided uuid
Since data content is immutable, when you want to change the size or hash, a new index document with a new uuid needs to be created as its new version. That uuid is returned in the did field of the response. The old index document is not deleted.
- Parameters:
guid (str) – record id
hashes (dict) – {hash type: hash value,} e.g.
hashes={'md5': 'ab167e49d25b488939b1ede42752458b'}
size (int) – file size metadata associated with a given uuid
did (str) – UUID to assign to the new indexd record
urls (list) – list of URLs from which the file identified by the UUID can be downloaded
file_name (str) – name of the file associated with a given UUID
metadata (dict) – additional key value metadata for this entry
acl (list) – access control list
urls_metadata (dict) – metadata attached to each url
version (str) – entry version string
authz (list) – RBAC strings
description (str) – optional description of the object
content_created_date (datetime) – optional creation date and time of the content being indexed
content_updated_date (datetime) – optional update date and time of the content being indexed
body – json/dictionary format. Metadata object that needs to be added to the store. Providing size and at least one hash is necessary and sufficient. Note: it is a good idea to add a version number.
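The versioning flow described above might look like the following sketch. The helper name `add_version` is hypothetical, and treating the response as a dict with a `did` key is an assumption based on the description of the `did` field; `index` is assumed to be a Gen3Index instance.

```python
def add_version(index, guid, new_hashes, new_size, version=None):
    """Register changed content as a new version of `guid`.

    Returns the GUID of the new index document, read from the `did`
    field of the response. Treating the response as a dict is an
    assumption of this sketch.
    """
    response = index.create_new_version(
        guid, hashes=new_hashes, size=new_size, version=version
    )
    # The old index document is not deleted; only a new one is added
    return response["did"]
```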
- create_record(hashes, size, did=None, urls=None, file_name=None, metadata=None, baseid=None, acl=None, urls_metadata=None, version=None, authz=None, description=None, content_created_date=None, content_updated_date=None)
Create a new record and add it to the index
- Parameters:
hashes (dict) – {hash type: hash value,} e.g.
hashes={'md5': 'ab167e49d25b488939b1ede42752458b'}
size (int) – file size metadata associated with a given uuid
did (str) – UUID to assign to the new indexd record
urls (list) – list of URLs from which the file identified by the UUID can be downloaded
acl (list) – access control list
authz (list) – RBAC strings
file_name (str) – name of the file associated with a given UUID
metadata (dict) – additional key value metadata for this entry
urls_metadata (dict) – metadata attached to each url
baseid (str) – optional baseid to group with previous entries' versions
version (str) – entry version string
description (str) – optional description of the object
content_created_date (datetime) – optional creation date and time of the content being indexed
content_updated_date (datetime) – optional update date and time of the content being indexed
- Returns:
json representation of an entry in indexd
- Return type:
Document
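Since `hashes` and `size` are the two required arguments, one way to derive them from a local file is sketched below. The helpers `describe_file` and `index_file` are hypothetical names, md5 is chosen only because it is the hash type used in the parameter example, and `index` is assumed to be a Gen3Index instance.

```python
import hashlib
import os


def describe_file(path, chunk_size=1024 * 1024):
    """Return (hashes, size) for a local file, in the shape create_record expects."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return {"md5": md5.hexdigest()}, os.path.getsize(path)


def index_file(index, path, urls=None, authz=None):
    """Create an index record for the file at `path` (illustrative helper)."""
    hashes, size = describe_file(path)
    return index.create_record(
        hashes=hashes,
        size=size,
        urls=urls,
        authz=authz,
        file_name=os.path.basename(path),
    )
```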
- delete_record(guid)
Delete an entry from the index
- Parameters:
guid (str) – record id
- Returns:
Nothing
- get(guid, dist_resolution=True)
Get the metadata associated with the given id, alias, or distributed identifier
- Parameters:
guid (str) – record id
dist_resolution (bool) – optional; specify whether distributed resolution is wanted or not
- get_guids_prefix()
Get the prefix for GUIDs if there is one
- Returns:
prefix for this instance
- Return type:
str
- get_latest_version(guid, has_version=False)
Get the metadata of the latest index record version associated with the given id
- Parameters:
guid (str) – record id
has_version (bool) – optional; exclude entries without a version
- get_records(dids)
Get a list of documents given a list of dids
- Parameters:
dids (list) – a list of record ids
- Returns:
json representing index records
- Return type:
list
- get_records_on_page(limit=None, page=None)
Get a list of all records given the page and page size limit
- get_urls(size=None, hashes=None, guids=None)
Get a list of urls that match query params
- Parameters:
size (int) – object size
hashes (str) – hashes specified as algorithm:value
guids (list) – list of ids
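The `algorithm:value` form of the `hashes` argument can be built with a small helper, as in this sketch. The names `hash_query` and `find_urls` are hypothetical, only a single hash is handled because the format for several hashes is not described here, and `index` is assumed to be a Gen3Index instance.

```python
def hash_query(algorithm, value):
    """Format one hash as the algorithm:value string described above."""
    return f"{algorithm}:{value}"


def find_urls(index, algorithm, value, size=None):
    """Look up urls for an object identified by a single hash (illustrative)."""
    return index.get_urls(size=size, hashes=hash_query(algorithm, value))
```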
- get_valid_guids(count=None)
Get a list of valid GUIDs without indexing
- Parameters:
count (int) – number of GUIDs to request
- Returns:
list of valid indexd GUIDs
- Return type:
List[str]
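One possible use of pre-generated GUIDs is to pass them as `did` when creating records, as sketched below. This pairing is an assumption for illustration, not a documented workflow; the helper name `create_with_pregenerated_guids` and the shape of `entries` are hypothetical, and `index` is assumed to be a Gen3Index instance.

```python
def create_with_pregenerated_guids(index, entries):
    """Create one record per entry, each with a GUID requested up front.

    `entries` is a hypothetical list of dicts with "hashes" and "size".
    """
    guids = index.get_valid_guids(count=len(entries))
    return [
        index.create_record(hashes=entry["hashes"], size=entry["size"], did=guid)
        for guid, entry in zip(guids, entries)
    ]
```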
- get_versions(guid)
Get the metadata of the index record versions associated with the given id
- Parameters:
guid (str) – record id
- get_with_params(params=None)
Return a document object corresponding to the supplied parameters, such as
{'hashes': {'md5': '...'}, 'size': '...', 'metadata': {'file_state': '...'}}.
All the hashes need to be included in the request.
An index client like signpost or indexd will need to handle the query param ‘hash’: ‘hash_type:hash’.
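A `params` dictionary in the shape shown above could be assembled as in this sketch, which drops any criterion that was not supplied. The helper names `build_params` and `find_document` are hypothetical, the keys are taken from the example dictionary, and `index` is assumed to be a Gen3Index instance.

```python
def build_params(hashes=None, size=None, metadata=None):
    """Build a params dict using the keys from the example above.

    Criteria left as None are omitted from the result.
    """
    candidates = {"hashes": hashes, "size": size, "metadata": metadata}
    return {key: value for key, value in candidates.items() if value is not None}


def find_document(index, **criteria):
    """Query the index with only the criteria that were provided (illustrative)."""
    return index.get_with_params(build_params(**criteria))
```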
- query_urls(pattern)
Query all record URLs for given pattern
- Parameters:
pattern (str) – pattern to match against indexd urls
- Returns:
indexd records with urls matching pattern
- Return type:
List[records]
- update_blank(guid, rev, hashes, size, urls=None, authz=None)
Update only hashes and size for a blank index record
- Parameters:
guid (str) – record id
rev (str) – data revision - simple consistency mechanism
hashes (dict) – {hash type: hash value,} e.g.
hashes={'md5': 'ab167e49d25b488939b1ede42752458b'}
size (int) – file size metadata associated with a given uuid
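The two blank-record calls can be combined as in the sketch below: a blank record is created first and then filled in once the hashes and size are known. It assumes that the response of `create_blank` is a dict exposing the record id as `did` and the revision as `rev`, which is not stated in this documentation. The helper name `register_upload` is hypothetical, and `index` is assumed to be a Gen3Index instance.

```python
def register_upload(index, uploader, file_name, hashes, size):
    """Create a blank record, then fill in its hashes and size.

    Assumption: the create_blank response is a dict with "did" and "rev".
    """
    blank = index.create_blank(uploader, file_name=file_name)
    # rev acts as the consistency check for the follow-up update
    return index.update_blank(blank["did"], blank["rev"], hashes=hashes, size=size)
```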
- update_record(guid, file_name=None, urls=None, version=None, metadata=None, acl=None, authz=None, urls_metadata=None, description=None, content_created_date=None, content_updated_date=None)
Update an existing entry in the index
- Parameters:
guid (str) – record id
body – json/dictionary format. Index record information that needs to be updated. Size and hash cannot be updated; use a new version for that.
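Because size and hash cannot be changed through an update, a caller might route changes as in this sketch: an in-place update when only descriptive fields change, and a new version when the content itself changed. The helper name `apply_changes` is hypothetical, and `index` is assumed to be a Gen3Index instance.

```python
def apply_changes(index, guid, hashes=None, size=None, **fields):
    """Route a change to update_record or create_new_version.

    `fields` holds keyword arguments accepted by both methods,
    such as urls, file_name, metadata, or version.
    """
    if hashes is not None or size is not None:
        # A new version requires size and at least one hash
        if not hashes or size is None:
            raise ValueError("a new version needs both hashes and size")
        return index.create_new_version(guid, hashes=hashes, size=size, **fields)
    return index.update_record(guid, **fields)
```

For example, `apply_changes(index, guid, urls=[...])` would update the record in place, while passing `hashes` and `size` would create a new version instead.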