# Basic Git-Sourced Pipeline Example

## Demo Pipeline

This section describes how to configure the **nf-core demo** pipeline from <https://github.com/nf-core/demo/> to run in ICA.

{% hint style="info" %}
**The nf-core framework for community-curated bioinformatics pipelines.**\
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

*Nat Biotechnol.* 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).
{% endhint %}

### Configuring the Pipeline

1. Create a new pipeline at **Projects > your\_project > flow > Pipelines > Create > Nextflow > from Git**.
2. Fill out the following details:

<table><thead><tr><th width="203.04296875">Field</th><th>Value</th></tr></thead><tbody><tr><td><strong>Repository url</strong></td><td><strong>https://github.com/nf-core/demo</strong> (Tip: If you encounter <em>Invalid GitHub repository url</em>, you may have copied over a trailing space in the url.)</td></tr><tr><td><strong>Pipeline name</strong></td><td>The pipeline name is automatically extracted from the repository. You can manually change this if needed, for example to prevent duplicate names.</td></tr><tr><td><strong>Version number</strong></td><td>Enter the version number. You can use the version number from GitHub (1.1.0) or your own version (for example, 1 for the first local version).</td></tr><tr><td><strong>Storage size</strong></td><td>As this demo pipeline uses very few resources, the smallest size we can select (<strong>3XSmall</strong>) will be sufficient.</td></tr><tr><td><strong>Git credential</strong></td><td><p>This is a public repository, so we don't actually need a credential and this can be left blank. However, there is a limit to how many anonymous calls can be made to a GitHub repository, so if that limit is exceeded, you will encounter <em>repo lookup rate limit reached</em> and will not be able to import the pipeline.</p><ul><li>To prevent running into this limit, select a Git credential.</li><li>If you don't have one, click the <strong>create</strong> button next to the Git credential. Enter the value of your personal access token and ICA will automatically obtain the Git username for it.</li><li>If you have no access token, you can create one with the <strong>Create personal access token on github.com</strong> button.
The values there are already prefilled, but you can change the note at the top to easily identify for which purpose the token was generated.</li></ul></td></tr><tr><td><strong>Main file path</strong></td><td>The main.nf file containing the pipeline logic and data flow is in the root folder, so we keep this as <strong>main.nf</strong>.</td></tr><tr><td><strong>Config file path</strong></td><td>nextflow.config containing the execution parameters is in the root folder, so we need to enter <strong>nextflow.config</strong>. The inputForm file will be generated based on the parameters in this file.</td></tr><tr><td><strong>Schema file path</strong></td><td>The schema is <strong>nextflow_schema.json</strong>, so enter this value. The inputForm file will be generated based on the parameters in this file.</td></tr><tr><td><strong>Version</strong></td><td><p>You can enter the <strong>commit id</strong> of the version you want to use or use the tag to identify the version.</p><p>In this example, we use the <strong>tags</strong> to identify the version we want. From the screenshot below, you can see that there is a version 1.1.0 of the pipeline, so enter <strong>1.1.0</strong> in the tag field. When you enter this version in ICA, the commit-id for that tag will automatically be filled out. This commit-id is the long version, <strong>the short 7-character version is not supported.</strong></p></td></tr></tbody></table>

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-87701617c0c8bfcc71214e4fb66c82c37d7b87ca%2Fimage%20(5).png?alt=media" alt=""><figcaption><p>Method 1 : finding and copying the full commit id</p></figcaption></figure>

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-7ea774f2eefc0784be67dac2415d9c67222693a0%2Fimage%20(2).png?alt=media" alt="" width="563"><figcaption><p>Method 2 : finding and copying the tag</p></figcaption></figure>

### Creating the Input File

As per the [instructions](https://github.com/nf-core/demo/blob/master/README.md) on the nf-core demo page, **create a samplesheet.csv file** on your local machine with the following contents:

```
sample,fastq_1,fastq_2
SAMPLE1_PE,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R2.fastq.gz
SAMPLE2_PE,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R2.fastq.gz
SAMPLE3_SE,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz,
SAMPLE3_SE,https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz,
```
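If you prefer to create the file from a terminal, the following is a minimal sketch that writes the same contents and runs a basic column-count check. The `base` and `bad_rows` variable names are illustrative only, and the check is a simple sanity test, not a substitute for the validation the pipeline performs itself.

```shell
#!/bin/sh
set -eu

# Common URL prefix of the test FASTQ files listed in the samplesheet above.
base="https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon"

# Write samplesheet.csv to the current directory.
cat > samplesheet.csv <<EOF
sample,fastq_1,fastq_2
SAMPLE1_PE,${base}/sample1_R1.fastq.gz,${base}/sample1_R2.fastq.gz
SAMPLE2_PE,${base}/sample2_R1.fastq.gz,${base}/sample2_R2.fastq.gz
SAMPLE3_SE,${base}/sample1_R1.fastq.gz,
SAMPLE3_SE,${base}/sample2_R1.fastq.gz,
EOF

# Count rows that do not have exactly three comma-separated columns.
# Single-end rows keep a trailing comma, so they still count as three columns.
bad_rows=$(awk -F',' 'NF != 3 { n++ } END { print n + 0 }' samplesheet.csv)
echo "rows with an unexpected column count: ${bad_rows}"
```

A non-zero count usually points to a missing trailing comma on a single-end row.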

#### Using the GUI

1. In ICA, navigate to **Projects > your\_project > Data** and upload the created **samplesheet.csv** file to your project.

#### Using the CLI

If you have the [CLI](https://help.connected.illumina.com/connected-analytics/command-line-interface/cli-indexcommands) installed and configured on your system, you can use the commands below to upload the samplesheet. If you do not have an active CLI, please follow [these instructions](https://help.connected.illumina.com/connected-analytics/command-line-interface/cli-installation) first.

1. List your projects with the command `icav2 projects list`
2. From this list of projects, enter your demo project by using `icav2 projects enter <your_project_uuid>` with \<your\_project\_uuid> replaced with the uuid of your project.
3. If you have created the samplesheet.csv file and put it into your CLI directory, upload it to the root folder of your project with `icav2 projectdata upload samplesheet.csv`. If this name is already in use, you can rename the file during upload by using `icav2 projectdata upload samplesheet.csv /samplesheet2.csv`.
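The three steps above can be chained in a small script. The sketch below is one possible way to do so and only uses the commands listed above. The `run` and `upload_samplesheet` helpers and the `PROJECT_UUID` and `DRY_RUN` variables are hypothetical names introduced for this example. By default it prints the commands instead of executing them, so you can review them first.

```shell
#!/bin/sh
set -eu

# Placeholder: replace with the uuid shown by `icav2 projects list`.
PROJECT_UUID="${PROJECT_UUID:-your_project_uuid}"
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# List projects, enter the demo project, then upload the samplesheet.
upload_samplesheet() {
  run icav2 projects list
  run icav2 projects enter "$PROJECT_UUID"
  run icav2 projectdata upload samplesheet.csv
}

upload_samplesheet
```

Set `DRY_RUN=0` and a real `PROJECT_UUID` in your environment to execute the commands against your project.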

### Running the Analysis

#### Using the GUI

With the pipeline configured and the input file created, you are ready to run your analysis.

1. Go to **Projects > your\_project > flow > Pipelines.**
2. Select the created pipeline (the default name is nf-core/demo) and choose **Start analysis** at the top of the screen.
3. You will be presented with the form below.
   * Enter an identifier (**user reference**) for your analysis.
   * Select the **samplesheet.csv** file as **input** and start the analysis.

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-e5fefef0d022978fdea92de5e106162f181679d2%2Fimage%20(7).png?alt=media" alt=""><figcaption></figcaption></figure>

You can follow the status of your analysis at **Projects > your\_project > Flow > Analyses**.

{% hint style="info" %}
If your analysis fails with *Module path must start with / or ./ prefix -- Offending module: plugin/nf-schema*, then the Nextflow version is not set to the latest one. Edit it at **Projects > your\_project > flow > Pipelines > your\_pipeline > Details**.
{% endhint %}

#### Using the CLI

If you have the [CLI](https://help.connected.illumina.com/connected-analytics/command-line-interface/cli-indexcommands) installed and configured on your system, you can also use the commands below to work with pipelines. If you do not have an active CLI, please follow [these instructions](https://help.connected.illumina.com/connected-analytics/command-line-interface/cli-installation) first.

1. List your projects with the command `icav2 projects list` to retrieve their **project uuid**.
2. From this list of projects, enter your demo project by using `icav2 projects enter <your_project_uuid>` with \<your\_project\_uuid> replaced with the uuid of your project.
3. To list the pipelines in your project and see their **pipeline uuid**, use `icav2 projectpipelines list`
4. To retrieve the **file uuid**, use the command `icav2 projectdata list --file-name samplesheet.csv`. If you used a different name for your samplesheet file, use the name you gave it instead of samplesheet.csv.

{% hint style="info" %}
You can search for csv files in your project with the command `icav2 projectdata list --file-name csv --match-mode fuzzy`.
{% endhint %}

5. To run the pipeline with the input file,

* Select the **pipeline** (replace \<your\_pipeline\_uuid> with the uuid of your pipeline).
* Set the **storage size** (3XSmall). If this storage is not available in your subscription, you can use `icav2 analysisstorages list` to get a list of available storage sizes.
* Set the **user reference** to give your analysis a name. In this example we use MyDemoGitPipeline as name.
* Point to the **input file** (replace \<your\_file\_uuid> with the actual file uuid).
  * If you want to see a list of input parameters of your pipeline, use `icav2 projectpipelines input <your_pipeline_uuid>`. This shows the code of each input parameter. In our example, this is "input".

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-56b570a4cbfb31ee8048e1228b03f2f61e0e5def%2Fimage.png?alt=media" alt="" width="563"><figcaption></figcaption></figure>

The resulting command to run the pipeline is then:

{% code overflow="wrap" %}

```
icav2 projectpipelines start nextflowjson <your_pipeline_uuid> --storage-size 3XSmall --user-reference MyDemoGitPipeline --field-data "input":"<your_file_uuid>"
```

{% endcode %}
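If you run the pipeline repeatedly, it can help to assemble this command from variables. The following is a minimal sketch that prints the command shown above with your values filled in. The `build_start_command` helper and the `STORAGE_SIZE` and `USER_REFERENCE` variables are hypothetical names introduced for this example. The script only prints the command and does not start an analysis.

```shell
#!/bin/sh
set -eu

# Defaults taken from the example above; override them in your environment if needed.
STORAGE_SIZE="${STORAGE_SIZE:-3XSmall}"
USER_REFERENCE="${USER_REFERENCE:-MyDemoGitPipeline}"

# Print the start command for a given pipeline uuid ($1) and file uuid ($2).
build_start_command() {
  printf 'icav2 projectpipelines start nextflowjson %s --storage-size %s --user-reference %s --field-data "input":"%s"\n' \
    "$1" "$STORAGE_SIZE" "$USER_REFERENCE" "$2"
}

# Placeholders: replace with the uuids retrieved in the steps above.
build_start_command "<your_pipeline_uuid>" "<your_file_uuid>"
```

Review the printed command before pasting it into your terminal.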

### Optional: Editing the Input Form

Even though the import function will have created an input form based on the configured files, you can customise this form to make it easier to use by removing or defaulting parameters.

1. Go to **Projects > your\_project > flow > Pipelines > your\_pipeline > Edit**
2. Navigate to the **Inputform files** tab. This will open the inputForm.json file.
3. Replace the contents with this minimalist input file:

```json
{
  "fields": [
    {
      "id": "input",
      "type": "data",
      "label": "input",
      "helpText": "Select your input file",
      "maxValues": 1,
      "minValues": 1
    },
    {
      "id": "outdir",
      "type": "textbox",
      "label": "outdir",
      "helpText": "The output directory where the results will be saved.",
      "hidden": true,
      "value": "out",
      "minValues": 1
    }
  ]
}
```
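Because a stray brace or comma makes the form invalid, you may want to check a local copy of the JSON before pasting it into the editor. The sketch below assumes that `python3` is available on your system and that you saved the form as `inputForm.json` in the current directory. Both are assumptions for this example, and `is_valid_json` is a hypothetical helper name. It only checks JSON syntax, not whether ICA accepts the individual fields.

```shell
#!/bin/sh
set -eu

# Return success if the given file parses as JSON (requires python3 on the PATH).
is_valid_json() {
  python3 -m json.tool "$1" > /dev/null 2>&1
}

# Assumed location of your local copy; pass another path as the first argument.
form_file="${1:-inputForm.json}"

# Only report when the file exists, so the script is safe to run anywhere.
if [ -f "$form_file" ]; then
  if is_valid_json "$form_file"; then
    echo "$form_file: valid JSON"
  else
    echo "$form_file: invalid JSON, check for stray braces or commas"
  fi
fi
```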

This will result in a minimal input form which only needs the input file selection.

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-76cf8f542fd06d98b8cde40658ea33999c2aca71%2Fimage%20(8).png?alt=media" alt=""><figcaption></figcaption></figure>
