Gen3 Submission Class
- class gen3.submission.Gen3Submission(endpoint=None, auth_provider=None)
Bases: object
Submit/Export/Query data from a Gen3 Submission system.
A class for interacting with the Gen3 submission services. Supports submitting and exporting from Sheepdog. Supports GraphQL queries through Peregrine.
- Parameters:
auth_provider (Gen3Auth) – A Gen3Auth class instance.
Examples
This creates a Gen3Submission instance pointed at the sandbox commons, using the credentials.json downloaded from the commons profile page.
>>> auth = Gen3Auth(refresh_file="credentials.json")
>>> sub = Gen3Submission(auth)
- create_program(json)
Create a program.
- Parameters:
json (object) – The json of the program to create.
Examples
This creates a program in the sandbox commons.
>>> Gen3Submission.create_program(json)
- create_project(program, json)
Create a project.
- Parameters:
program (str) – The program to create a project on.
json (object) – The json of the project to create.
Examples
This creates a project on the DCF program in the sandbox commons.
>>> Gen3Submission.create_project("DCF", json)
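The two calls above are usually made in sequence: the program first, then a project on it. The following is a minimal sketch of that order, assuming `sub` is a Gen3Submission instance created as in the class example. The helper names and the payload fields (`type`, `name`, `code`, `dbgap_accession_number`) are assumptions about a typical data dictionary and are not specified on this page; check the dictionary of your commons for the properties it actually requires.

```python
def build_program(name, dbgap_accession_number):
    """Return a program payload. Field names are assumed, not documented here."""
    return {
        "type": "program",
        "name": name,
        "dbgap_accession_number": dbgap_accession_number,
    }


def build_project(code, name, dbgap_accession_number):
    """Return a project payload. Field names are assumed, not documented here."""
    return {
        "type": "project",
        "code": code,
        "name": name,
        "dbgap_accession_number": dbgap_accession_number,
    }


def create_program_and_project(sub, program_json, project_json):
    """Create the program first, then the project on that program.

    Assumes the `program` argument of create_project is the program's name.
    """
    program_result = sub.create_program(program_json)
    project_result = sub.create_project(program_json["name"], project_json)
    return program_result, project_result
```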
- delete_node(program, project, node_name, batch_size=100, verbose=True)
Delete all records for a node from a project.
- Parameters:
program (str) – The program to delete from.
project (str) – The project to delete from.
node_name (str) – Name of the node to delete
batch_size (int, optional) – How many records to query and delete at a time. Default: 100.
verbose (bool, optional) – Whether to print progress logs. Default: True.
Examples
This deletes a node from the CCLE project in the sandbox commons.
>>> Gen3Submission.delete_node("DCF", "CCLE", "demographic")
- delete_nodes(program, project, ordered_node_list, batch_size=100, verbose=True)
Delete all records for a list of nodes from a project.
- Parameters:
program (str) – The program to delete from.
project (str) – The project to delete from.
ordered_node_list (list) – The list of nodes to delete, in reverse graph submission order
batch_size (int, optional) – How many records to query and delete at a time. Default: 100.
verbose (bool, optional) – Whether to print progress logs. Default: True.
Examples
This deletes a list of nodes from the CCLE project in the sandbox commons.
>>> Gen3Submission.delete_nodes("DCF", "CCLE", ["demographic", "subject", "experiment"])
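Because `ordered_node_list` must be in reverse graph submission order, it can help to keep the submission order in one place and derive the deletion order from it. This is a small sketch of that idea; `deletion_order` and `delete_all_nodes` are hypothetical helper names, and `sub` is assumed to be a Gen3Submission instance.

```python
def deletion_order(submission_order):
    """Return node names in reverse submission order (last submitted first)."""
    return list(reversed(submission_order))


def delete_all_nodes(sub, program, project, submission_order, batch_size=100):
    """Delete every listed node, starting with the node submitted last."""
    ordered_node_list = deletion_order(submission_order)
    sub.delete_nodes(
        program,
        project,
        ordered_node_list,
        batch_size=batch_size,
        verbose=False,
    )
    return ordered_node_list
```

With a submission order of `["experiment", "subject", "demographic"]`, this yields the same list as the example above.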
- delete_program(program)
Delete a program.
This deletes an empty program from the commons.
- Parameters:
program (str) – The program to delete.
Examples
This deletes the “DCF” program.
>>> Gen3Submission.delete_program("DCF")
- delete_project(program, project)
Delete a project.
This deletes an empty project from the commons.
- Parameters:
program (str) – The program containing the project to delete.
project (str) – The project to delete.
Examples
This deletes the “CCLE” project from the “DCF” program.
>>> Gen3Submission.delete_project("DCF", "CCLE")
- delete_record(program, project, uuid)
Delete a record from a project.
- Parameters:
program (str) – The program to delete from.
project (str) – The project to delete from.
uuid (str) – The uuid of the record to delete
Examples
This deletes a record from the CCLE project in the sandbox commons.
>>> Gen3Submission.delete_record("DCF", "CCLE", uuid)
- delete_records(program, project, uuids, batch_size=100)
Delete a list of records from a project.
- Parameters:
program (str) – The program to delete from.
project (str) – The project to delete from.
uuids (list) – The list of uuids of the records to delete
batch_size (int, optional) – How many records to delete at a time. Default: 100.
Examples
This deletes a list of records from the CCLE project in the sandbox commons.
>>> Gen3Submission.delete_records("DCF", "CCLE", ["uuid1", "uuid2"])
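To make the meaning of `batch_size` concrete, the sketch below splits a list of uuids into groups of at most `batch_size` items. It only illustrates the batching idea and is not the library's internal implementation; `batches` is a hypothetical helper name.

```python
def batches(uuids, batch_size=100):
    """Yield consecutive slices of `uuids`, each with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(uuids), batch_size):
        yield uuids[start:start + batch_size]
```

Under this reading, 250 uuids with the default `batch_size` would be handled as groups of 100, 100, and 50.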
- export_node(program, project, node_type, fileformat, filename=None)
Export all records in a single node type of a project.
- Parameters:
program (str) – The program to which records belong.
project (str) – The project to which records belong.
node_type (str) – The name of the node to export.
fileformat (str) – Export data as either ‘json’ or ‘tsv’
filename (str) – Name of the file to export to; if no filename is provided, prints data to screen
Examples
This exports all records in the “sample” node from the CCLE project in the sandbox commons.
>>> Gen3Submission.export_node("DCF", "CCLE", "sample", "tsv", filename="DCF-CCLE_sample_node.tsv")
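Once a node has been exported to a TSV file, the standard library is enough to load it back for inspection. This is a minimal sketch, assuming the exported file has a header row and tab-separated columns; `read_exported_tsv` and `export_and_load` are hypothetical helper names, and `sub` is assumed to be a Gen3Submission instance.

```python
import csv


def read_exported_tsv(path):
    """Read a TSV file into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def export_and_load(sub, program, project, node_type, path):
    """Export one node type to `path` as TSV, then load the rows from disk."""
    sub.export_node(program, project, node_type, "tsv", filename=path)
    return read_exported_tsv(path)
```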
- export_record(program, project, uuid, fileformat, filename=None)
Export a single record in the format given by fileformat.
- Parameters:
program (str) – The program the record is under.
project (str) – The project the record is under.
uuid (str) – The UUID of the record to export.
fileformat (str) – Export data as either ‘json’ or ‘tsv’
filename (str) – Name of the file to export to; if no filename is provided, prints data to screen
Examples
This exports a single record from the sandbox commons.
>>> Gen3Submission.export_record("DCF", "CCLE", "d70b41b9-6f90-4714-8420-e043ab8b77b9", "json", filename="DCF-CCLE_one_record.json")
- get_dictionary_all()
Returns the entire dictionary object for a commons.
This gets a json of the current dictionary schema for a commons.
Examples
This returns the dictionary schema for a commons.
>>> Gen3Submission.get_dictionary_all()
- get_dictionary_node(node_type)
Returns the dictionary schema for a specific node.
This gets the current json dictionary schema for a specific node type in a commons.
- Parameters:
node_type (str) – The node_type (or name of the node) to retrieve.
Examples
This returns the dictionary schema for the “subject” node.
>>> Gen3Submission.get_dictionary_node("subject")
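A common use of a node schema is to see which properties must be filled in before submitting. The sketch below assumes the returned schema is a dict with JSON-Schema-style `required` and `properties` keys; that shape is an assumption and is not stated on this page. `summarize_node_schema` is a hypothetical helper name.

```python
def summarize_node_schema(schema):
    """Split a node schema's property names into required and optional lists.

    Assumes `schema` has a "required" list and a "properties" dict.
    """
    required = list(schema.get("required", []))
    properties = sorted(schema.get("properties", {}))
    optional = [name for name in properties if name not in required]
    return {"required": required, "optional": optional}
```

It could be called as `summarize_node_schema(sub.get_dictionary_node("subject"))`, where `sub` is a Gen3Submission instance.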
- get_graphql_schema()
Returns the GraphQL schema for a commons.
This runs the GraphQL introspection query against a commons and returns the results.
Examples
This returns the GraphQL schema.
>>> Gen3Submission.get_graphql_schema()
- get_project_dictionary(program, project)
Get the dictionary schema for a given project.
- Parameters:
program – the name of the program the project is from
project – the name of the project you want the dictionary schema from
Example
>>> Gen3Submission.get_project_dictionary("DCF", "CCLE")
- get_project_manifest(program, project)
Get a project's file manifest.
- Parameters:
program – the name of the program the project is from
project – the name of the project you want the manifest from
Example
>>> Gen3Submission.get_project_manifest("DCF", "CCLE")
- get_projects(program)
List the registered projects for a given program.
- Parameters:
program – the name of the program you want the projects from
Example
This lists all the projects under the DCF program
>>> Gen3Submission.get_projects("DCF")
- open_project(program, project)
Mark a project open. Opening a project means uploads, deletions, etc. are allowed.
- Parameters:
program – the name of the program the project is from
project – the name of the project you want to ‘open’
Example
>>> Gen3Submission.open_project("DCF", "CCLE")
- query(query_txt, variables=None, max_tries=1)
Execute a GraphQL query against a Data Commons.
- Parameters:
query_txt (str) – Query text.
variables (object, optional) – Dictionary of variables to pass with the query.
max_tries (int, optional) – Number of times to retry if the request fails.
Examples
This executes a query to get the list of all the project codes for all the projects in the Data Commons.
>>> query = "{ project(first:0) { code } }"
>>> Gen3Submission.query(query)
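The query result usually needs unpacking before it is useful. The following sketch pulls the project codes out of the response, assuming `query` returns the parsed GraphQL response as a dict of the form `{"data": {"project": [{"code": ...}]}}`; that return shape is an assumption and is not stated on this page. `project_codes` and `list_project_codes` are hypothetical helper names, and `sub` is assumed to be a Gen3Submission instance.

```python
PROJECT_CODES_QUERY = "{ project(first:0) { code } }"


def project_codes(response):
    """Extract project codes from a GraphQL response dict (shape is assumed)."""
    projects = (response.get("data") or {}).get("project") or []
    return [project["code"] for project in projects]


def list_project_codes(sub, max_tries=3):
    """Run the project-code query, allowing several tries, and return the codes."""
    response = sub.query(PROJECT_CODES_QUERY, max_tries=max_tries)
    return project_codes(response)
```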
- submit_file(project_id, filename, chunk_size=30, row_offset=0)
Submit data in a spreadsheet file containing multiple records in rows to a Gen3 Data Commons.
- Parameters:
project_id (str) – The project_id to submit to.
filename (str) – The file containing data to submit. The format can be TSV, CSV or XLSX (first worksheet only for now).
chunk_size (integer) – The number of rows of data to submit for each request to the API.
row_offset (integer) – The number of rows of data to skip; ‘0’ starts submission from the first row and submits all data.
Examples
This submits a spreadsheet file containing multiple records in rows to the CCLE project in the sandbox commons.
>>> Gen3Submission.submit_file("DCF-CCLE", "data_spreadsheet.tsv")
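The sketch below illustrates how `chunk_size` and `row_offset` relate to the rows of the spreadsheet: rows before `row_offset` are skipped, and the rest are grouped into requests of at most `chunk_size` rows. It is an illustration of the parameter semantics described above, not the library's implementation. The helper names are hypothetical, and the `program-project` form of `project_id` is inferred from the "DCF-CCLE" example.

```python
def project_id_for(program, project):
    """Join program and project into a project_id, following the example above."""
    return f"{program}-{project}"


def chunk_ranges(total_rows, chunk_size=30, row_offset=0):
    """Return (start, end) data-row index pairs, one pair per request.

    `end` is exclusive; rows with an index below `row_offset` are skipped.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size, total_rows))
        for start in range(row_offset, total_rows, chunk_size)
    ]
```

Under this reading, a 70-row file with the defaults would go out as three requests, and a run resumed with `row_offset=60` would send only the last ten rows.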
- submit_record(program, project, json)
Submit record(s) to a project as json.
- Parameters:
program (str) – The program to submit to.
project (str) – The project to submit to.
json (object) – The json defining the record(s) to submit. For multiple records, the json should be an array of records.
Examples
This submits records to the CCLE project in the sandbox commons.
>>> Gen3Submission.submit_record("DCF", "CCLE", json)
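As noted above, a single record is submitted as one object and several records as an array. This is a minimal sketch of building such a payload, assuming `sub` is a Gen3Submission instance. The record fields `type` and `submitter_id` are assumptions about a typical data dictionary, and the helper names are hypothetical; real records usually also need links to parent nodes and other properties required by the dictionary.

```python
def build_records(node_type, submitter_ids):
    """Build one minimal record per submitter_id (field names are assumed)."""
    return [
        {"type": node_type, "submitter_id": submitter_id}
        for submitter_id in submitter_ids
    ]


def submit_records(sub, program, project, records):
    """Submit one record as an object, or several records as an array."""
    if not records:
        raise ValueError("no records to submit")
    payload = records[0] if len(records) == 1 else records
    return sub.submit_record(program, project, payload)
```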