arrow-left

All pages
gitbookPowered by GitBook
1 of 1

Loading...

Base: Access Tables via Python

You can access the databases and tables within the Base module using Python from your local machine. Once retrieved as e.g. pandas object, the data can be processed further. In this tutorial, we will describe how you could create a Python script which will retrieve the data and visualize it using Dash framework. The script will contain the following parts:

  • Importing dependencies and variables.

  • Function to fetch the data from Base table.

  • Creating and running the Dash app.

hashtag
Importing dependencies and variables

This part of the code imports the dependencies which have to be installed on your machine (possibly with pip). Furthermore, it imports the variables API_KEY and PROJECT_ID from the file named config.

hashtag
Function to fetch the data from Base table

We will be creating a function called fetch_data to obtain the data from Base table. It can be broken into several logically separated parts:

  • Retrieving the token to access the Base table together with other variables using API.

  • Establishing the connection using the token.

  • SQL query itself. In this particular example, we are extracting values from two tables Demo_Ingesting_Metrics and BB_PROJECT_PIPELINE_EXECUTIONS_DETAIL. The table Demo_Ingesting_Metrics contains various metrics from DRAGEN analyses (e.g. the number of bases with quality at least 30 Q30_BASES) and metadata in the column ica which needs to be flattened to access the value

Here is the corresponding snippet:

hashtag
Creating and running the Dash app

Once the data is fetched, it is visualized in an app. In this particular example, a scatter plot is presented with END_DATE as x axis and the choice of the customer from the dropdown as y axis.

Now we can create a single Python script called dashboard.py by concatenating the snippets and running it. The dashboard will be accessible in the browser on your machine.

Execution_reference
. Both tables are joined on this
Execution_reference
value.
  • Fetching the data using the connection and the SQL query.

  • from dash import Dash, html, dcc, callback, Output, Input
    import plotly.express as px
    
    from config import API_KEY, PROJECT_ID
    import requests
    import snowflake.connector
    import pandas as pd
    def fetch_data():
        # Your data fetching and processing code here
        # retrieving the Base oauth token
        url = 'https://ica.illumina.com/ica/rest/api/projects/' + PROJECT_ID +  '/base:connectionDetails'
    
        # set the API headers
        headers = {
                    'X-API-Key': API_KEY,
                    'accept': 'application/vnd.illumina.v3+json'
                    }
    
        response = requests.post(url, headers=headers)
        ctx = snowflake.connector.connect(
            account=response.json()['dnsName'].split('.snowflakecomputing.com')[0],
            authenticator='oauth',
            token=response.json()['accessToken'], 
            database=response.json()['databaseName'],
            role=response.json()['roleName'],
            warehouse=response.json()['warehouseName']
        )
        cur = ctx.cursor()
        sql = '''
        WITH flattened_Demo_Ingesting_Metrics AS (
            SELECT 
                flattened.value::STRING AS execution_reference_Demo_Ingesting_Metrics,
                t1.SAMPLEID,
                t1.VARIANTS_TOTAL_PASS,
                t1.VARIANTS_SNPS_PASS,
                t1.Q30_BASES,
                t1.READS_WITH_MAPQ_3040_PCT
            FROM 
                Demo_Ingesting_Metrics t1,
                LATERAL FLATTEN(input => t1.ica) AS flattened
            WHERE 
                flattened.key = 'Execution_reference'
        ) SELECT 
            f.execution_reference_Demo_Ingesting_Metrics,
            f.SAMPLEID,
            f.VARIANTS_TOTAL_PASS,
            f.VARIANTS_SNPS_PASS,
            t2."EXECUTION_REFERENCE",
            t2.END_DATE,
            f.Q30_BASES,
            f.READS_WITH_MAPQ_3040_PCT
        FROM 
            flattened_Demo_Ingesting_Metrics f
        JOIN 
            BB_PROJECT_PIPELINE_EXECUTIONS_DETAIL t2
        ON 
            f.execution_reference_Demo_Ingesting_Metrics = t2."EXECUTION_REFERENCE";
        '''
    
        cur.execute(sql)
        data = cur.fetch_pandas_all()
        return data
    
    df = fetch_data()
    
    app = Dash(__name__)
    #server = app.server
    
    
    app.layout = html.Div([
        html.H1("My Dash Dashboard"),
        
        html.Div([
            html.Label("Select X-axis:"),
            dcc.Dropdown(
                id='x-axis-dropdown',
                options=[{'label': col, 'value': col} for col in df.columns],
                value=df.columns[5]  # default value
            ),
            html.Label("Select Y-axis:"),
            dcc.Dropdown(
                id='y-axis-dropdown',
                options=[{'label': col, 'value': col} for col in df.columns],
                value=df.columns[2]  # default value
            ),
        ]),
        
        dcc.Graph(id='scatterplot')
    ])
    
    
    @callback(
        Output('scatterplot', 'figure'),
        Input('y-axis-dropdown', 'value')
    )
    def update_graph(value):
        return px.scatter(df, x='END_DATE', y=value, hover_name='SAMPLEID')
    
    if __name__ == '__main__':
        app.run(debug=True)