Gen3 Query Class

class gen3.query.Gen3Query(auth_provider)

Bases: object

Query ElasticSearch data from a Gen3 system.

Parameters:

auth_provider (Gen3Auth) – A Gen3Auth class instance.

Examples

This creates a Gen3Query instance pointed at the sandbox commons, using the credentials.json file downloaded from the commons profile page.

>>> from gen3.auth import Gen3Auth
>>> from gen3.query import Gen3Query
>>> auth = Gen3Auth(endpoint, refresh_file="credentials.json")
>>> query = Gen3Query(auth)
graphql_query(query_string, variables=None)

Execute a GraphQL query against a Data Commons.

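Parameters:
  • query_string (str) – GraphQL query text.

  • variables (object, optional) – Variables to pass with the query (default: None).

Returns: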

{"data": {<data_type>: [<record>, <record>, …]}}

Return type:

Object

Examples

>>> query_string = "{ my_index { my_field } }"
>>> query.graphql_query(query_string)  # query is a Gen3Query instance
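
The variables parameter can carry values that would otherwise have to be interpolated into the query string. The sketch below assumes the commons exposes a Guppy-style schema in which a data type accepts a filter argument of type JSON. The helper name run_filtered_query is hypothetical, and client stands for any object with a graphql_query(query_string, variables=None) method, such as a Gen3Query instance.

```python
def run_filtered_query(client, data_type, fields, filter_object):
    """Run a GraphQL query, passing the filter as a variable.

    Hypothetical helper: `client` is assumed to provide
    graphql_query(query_string, variables=None), as Gen3Query does.
    """
    # Assumption: the schema declares `filter` as a JSON-typed argument.
    query_string = (
        "query ($filter: JSON) { "
        f"{data_type}(filter: $filter) {{ {' '.join(fields)} }} "
        "}"
    )
    return client.graphql_query(query_string, variables={"filter": filter_object})
```

With a Gen3Query instance named query, a call might look like run_filtered_query(query, "my_index", ["my_field"], {"=": {"my_field": "some_value"}}), where the index, field, and value are placeholders.
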
query(data_type, fields, first=None, offset=None, filters=None, filter_object=None, sort_object=None, accessibility=None, verbose=True)

Execute a query against a Data Commons.

Parameters:
  • data_type (str) – Data type to query.

  • fields (list) – List of fields to return.

  • first (int, optional) – Number of rows to return (default: 10).

  • offset (int, optional) – Starting position (default: 0).

  • filters (object, optional) – { field: value } object. Returns only records in which ALL listed fields EQUAL their respective values. If more complex filters are needed, use the filter_object parameter instead.

  • filter_object (object, optional) – Filter to apply. For syntax details, see https://github.com/uc-cdis/guppy/blob/master/doc/queries.md#filter.

  • sort_object (object, optional) – { field: sort method } object.

  • accessibility (str, optional) – One of "accessible" (default), "unaccessible", or "all". Only valid when querying a data type in "regular" tier access mode.

Returns:

{"data": {<data_type>: [<record>, <record>, …]}}

Return type:

Object

Examples

>>> query.query(  # query is a Gen3Query instance
...     data_type="subject",
...     first=50,
...     fields=[
...         "vital_status",
...         "submitter_id",
...     ],
...     filters={"vital_status": "Alive"},
...     sort_object={"submitter_id": "asc"},
... )
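
The filters shorthand matches records in which every listed field equals its value. In Guppy's filter syntax the same condition can be written as an AND of "=" clauses and passed as filter_object instead. A minimal sketch of that translation follows. The helper filters_to_filter_object is hypothetical and not part of the library, and how Gen3Query performs the conversion internally is not shown here.

```python
def filters_to_filter_object(filters):
    """Build a Guppy-style filter from a {field: value} mapping.

    Hypothetical helper for illustration: every field must equal its value.
    """
    return {"AND": [{"=": {field: value}} for field, value in filters.items()]}


filter_object = filters_to_filter_object(
    {"vital_status": "Alive", "project_id": "my_program-my_project"}
)
# filter_object == {"AND": [{"=": {"vital_status": "Alive"}},
#                           {"=": {"project_id": "my_program-my_project"}}]}
```

Writing the filter out this way makes it easier to move to other operators later, since only the individual clauses need to change.
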
raw_data_download(data_type, fields, filter_object=None, sort_fields=None, accessibility=None, first=None, offset=None)

Execute a raw data download against a Data Commons.

Parameters:
  • data_type (str) – Data type to download from.

  • fields (list) – List of fields to return.

  • filter_object (object, optional) – Filter to apply. For syntax details, see https://github.com/uc-cdis/guppy/blob/master/doc/queries.md#filter.

  • sort_fields (list, optional) – List of { field: sort method } objects.

  • accessibility (str, optional) – One of "accessible" (default), "unaccessible", or "all". Only valid when downloading from a data type in "regular" tier access mode.

  • first (int, optional) – Number of rows to return (default: all rows).

  • offset (int, optional) – Starting position (default: 0).

Returns:

[<record>, <record>, …]

Return type:

List

Examples

>>> query.raw_data_download(  # query is a Gen3Query instance
...     data_type="subject",
...     fields=[
...         "vital_status",
...         "submitter_id",
...         "project_id",
...     ],
...     filter_object={"=": {"project_id": "my_program-my_project"}},
...     sort_fields=[{"submitter_id": "asc"}],
...     accessibility="accessible",
... )
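
Because first and offset are both accepted, a large download can be fetched in pages rather than in a single request. The following is a sketch under assumptions: download_in_pages is a hypothetical helper, download is any callable that takes the keyword arguments of raw_data_download and returns a list of records, and the loop stops when a page comes back shorter than requested. Any limits the backend itself places on offset are not covered here.

```python
def download_in_pages(download, data_type, fields, page_size=1000, **kwargs):
    """Collect records page by page using first/offset.

    Hypothetical helper: `download` is assumed to accept data_type, fields,
    first and offset keywords and to return a list of records.
    """
    records = []
    offset = 0
    while True:
        page = download(
            data_type=data_type,
            fields=fields,
            first=page_size,
            offset=offset,
            **kwargs,
        )
        records.extend(page)
        if len(page) < page_size:  # short (or empty) page: nothing left
            break
        offset += page_size
    return records
```

With a Gen3Query instance named query, the bound method can be passed directly, for example download_in_pages(query.raw_data_download, "subject", ["submitter_id"], page_size=500). Supplying sort_fields as an extra keyword keeps the page order stable between requests.
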