# Sun Grid Engine (SGE) on ICA Bench

## Running Jobs in a Bench SGE Cluster

Once a cluster is started, the cluster manager can be accessed from the workspace node.

### Job resources

Every cluster member has a fixed capacity, which is determined by the [Resource](https://help.connected.illumina.com/connected-analytics/project/p-bench/bench-clusters#configuration) model selected for that member.

The following complex values have been added to the SGE cluster environment and are requestable.

* `static_cores` (default: 1)
* `static_mem` (default: 2G)

These values are used to avoid oversubscription of a node, which can cause out-of-memory failures or make the node unresponsive. You need to ensure that your jobs do not exceed the limits they request.

To ensure stability of the system, some headroom is deducted from the total node capacity.
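As a rough illustration of how the two requests bound the number of jobs that can run side by side on one member, the sketch below divides the usable capacity by the per-job request. The function name `jobs_per_node` and every figure used with it are hypothetical; the headroom that Bench actually deducts is not stated here, so the values are placeholders only.

```shell
# Hypothetical helper: estimate how many jobs fit on one cluster member.
# Arguments (positive integers): node cores, node memory in GB,
# headroom cores, headroom memory in GB,
# requested static_cores, requested static_mem in GB.
jobs_per_node() {
    usable_cores=$(( $1 - $3 ))
    usable_mem=$(( $2 - $4 ))
    by_cores=$(( usable_cores / $5 ))
    by_mem=$(( usable_mem / $6 ))
    # The scarcer resource determines how many jobs fit.
    if [ "$by_cores" -lt "$by_mem" ]; then
        echo "$by_cores"
    else
        echo "$by_mem"
    fi
}
```

For example, `jobs_per_node 8 32 1 2 2 4` models an assumed 8-core, 32 GB member with 1 core and 2 GB of headroom and jobs requesting 2 cores and 4 GB each; cores are the limiting resource in that case.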

### Scaling

The `static_cores` and `static_mem` values are also used by the **SGE auto scaler** when running in **dynamic mode**. The auto scaler sums the resources requested by all pending jobs to decide whether to scale up or down within the defined range.

Cluster members remain in the cluster for at least 300 seconds. The auto scaler executes only one scale up/down operation at a time and waits until the cluster has stabilised before starting a new operation.

{% hint style="warning" %}
Job requests that require more resources than the capacity of the selected resource model will be ignored by the auto scaler and will wait indefinitely.
{% endhint %}
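Because an oversized request waits indefinitely, it can help to check a request against the member capacity before submitting. The following is a minimal sketch; the function name `request_fits` is hypothetical, and the usable capacity must be supplied by the caller because the exact headroom is not documented here.

```shell
# Hypothetical pre-submission check: does a request fit on a single member?
# Arguments (integers): usable cores, usable memory in GB,
# requested static_cores, requested static_mem in GB.
# Returns 0 when the request fits, non-zero otherwise.
request_fits() {
    [ "$3" -le "$1" ] && [ "$4" -le "$2" ]
}
```

A wrapper script could call `request_fits 7 30 2 4` and run `qsub` only when the check succeeds, printing a warning otherwise.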

The operation of the auto scaler can be monitored in the log file `/data/logs/sge-scaler.log`.

### Submitting jobs

Submitting a single job

```
qsub -l static_mem=1G -l static_cores=1 /data/myscript.sh
```

Submitting a job array

```
qsub -l static_mem=1G -l static_cores=1 -t 1-100 /data/myscript.sh
```
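For illustration, a job array script can use the `SGE_TASK_ID` environment variable, which Grid Engine sets for each array task, to select its share of the work. The sketch below is a hypothetical version of `/data/myscript.sh`; the `/data/input` directory, the `sample_<n>.txt` naming scheme, and the `input_for_task` helper are assumptions. The `#$` lines embed the same resource requests as the command line above.

```shell
#!/bin/sh
# Hypothetical /data/myscript.sh for the job array example.
# Lines starting with "#$" are read by qsub as embedded options.
#$ -l static_mem=1G
#$ -l static_cores=1

# Map the array task index to an input file (assumed naming scheme).
input_for_task() {
    echo "/data/input/sample_$1.txt"
}

# Grid Engine sets SGE_TASK_ID per array task; for non-array jobs it is the
# literal string "undefined", and it is unset outside the cluster.
task_id="${SGE_TASK_ID:-1}"
if [ "$task_id" = "undefined" ]; then
    task_id=1
fi

echo "task $task_id would process $(input_for_task "$task_id")"
```

With the requests embedded in the script, `qsub -t 1-100 /data/myscript.sh` should be sufficient to submit the array.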

{% hint style="info" %}
Do not limit the number of concurrently running tasks of a job (for example, with the `-tc` option of `qsub`), as this will result in unused cluster members.
{% endhint %}

### Monitoring members

Listing all members of the cluster

```
qhost
```

### Managing running/pending jobs

Listing all jobs in the cluster

```
qstat -f
```
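To get a quick overview of how many jobs are pending or running, the tabular output of plain `qstat` can be summarised. This is a sketch under assumptions: the helper name `count_job_states` is hypothetical, and it expects the default `qstat` layout (two header lines, state in the fifth column), not the queue-oriented layout printed by `qstat -f`.

```shell
# Hypothetical helper: count jobs per state from plain `qstat` output on stdin.
# Assumes the default tabular layout, where the first two lines are headers
# and the fifth column holds the state (for example "qw" pending, "r" running).
count_job_states() {
    awk 'NR > 2 && NF >= 5 { count[$5]++ }
         END { for (s in count) print s, count[s] }' | sort
}
```

It can be used as `qstat | count_job_states`. Note that the pending tasks of a job array typically appear as a single row, so the counts reflect rows rather than individual tasks.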

Showing the details of a job

```
qstat -f -j <jobId>
```

Deleting a job

```
qdel <jobId>
```

### Managing executed jobs

Showing the details of an executed job

```
qacct -j <jobId>
```
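The accounting record can also help when choosing `static_mem` for later submissions, since it reports the peak memory of a finished job. The sketch below filters a few fields from the record; the helper name `qacct_fields` is hypothetical, and it assumes the usual layout of one `name value` pair per line.

```shell
# Hypothetical helper: print selected fields from `qacct -j` output on stdin.
# Assumes each accounting field is printed as "name value" on its own line.
qacct_fields() {
    awk '$1 == "exit_status" || $1 == "maxvmem" || $1 == "ru_wallclock" {
             print $1 "=" $2
         }'
}
```

It can be used as `qacct -j <jobId> | qacct_fields`. A `maxvmem` value close to or above the requested `static_mem` suggests that the request should be raised.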

## SGE Reference documentation

SGE command line options and configuration details can be found in the [Grid Engine manual pages](https://gridengine.eu/mangridengine/manuals.html).
