Harshil Patel
Jun 23, 2022

Announcing Illumina DRAGEN integration with Nextflow Tower

DRAGEN is a platform provided by Illumina that offers accurate, comprehensive, and efficient secondary analysis of next-generation sequencing (NGS) data with a significant speed-up over tools that are commonly used for such tasks.

The improved performance offered by DRAGEN comes from combining Illumina's proprietary algorithms with hardware accelerators called field-programmable gate arrays (FPGAs). On AWS, for example, FPGAs are available via the F1 instance type.

Running DRAGEN on Nextflow Tower

Given the need for fast, high-quality data analysis in production settings, we have been working with the Illumina DRAGEN team to bring their technology to Nextflow users and to streamline the deployment process via Nextflow Tower.

We have extended the Tower Forge feature for AWS Batch to support DRAGEN. Tower Forge ensures that all of the appropriate components and settings are automatically provisioned when creating a Compute Environment for executing pipelines.

When deploying data analysis workflows, some tasks will need to run on standard instance types (e.g. for non-DRAGEN processing of samples) and others will need to run on F1 instances. If the DRAGEN feature is enabled, Tower Forge creates an additional AWS Batch compute queue that uses only F1 instances, and DRAGEN tasks are dispatched to that queue.
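
Tower Forge sets up this routing for you, so no manual configuration is needed. As a mental model only, the same idea in a plain Nextflow configuration could be sketched with a `withLabel` process selector. The queue names below are hypothetical placeholders, not the names Tower Forge generates.

```groovy
// nextflow.config (illustrative sketch; Tower Forge configures the real queues)
process {
    executor = 'awsbatch'

    // Default queue for tasks that run on standard instance types.
    // 'standard-queue' is a hypothetical name.
    queue = 'standard-queue'

    // Tasks carrying the 'dragen' label go to the queue backed by F1 instances.
    // 'dragen-f1-queue' is a hypothetical name.
    withLabel: 'dragen' {
        queue = 'dragen-f1-queue'
    }
}
```

The selector overrides the default `queue` only for labelled processes, which is why a single pipeline can mix DRAGEN and non-DRAGEN steps.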

Getting started

To showcase this integration, we have implemented a proof-of-concept pipeline called nf-dragen. To run it, sign in to Tower, navigate to the Community Showcase and select the “nf-dragen” pipeline.

You can run this pipeline at your convenience without any extra setup. Note, however, that it will be deployed in the Compute Environment owned by the Community Showcase.

To deploy the pipeline on your own AWS cloud infrastructure, please follow the instructions in the next section.

Deploy DRAGEN in your own workspace

DRAGEN is a commercial technology provided by Illumina, so you will need to purchase a license from them. To run on Tower, you will need to obtain the following information from Illumina:

  1. DRAGEN AWS private AMI ID
  2. DRAGEN license username
  3. DRAGEN license password

Tower Forge automates most of the tasks required to set up an AWS Batch Compute Environment. Please follow our guide for more details.

To enable DRAGEN acceleration, toggle the “Enable DRAGEN” option when setting up the Compute Environment via Tower Forge.

In the “DRAGEN AMI Id” field, enter the AWS AMI ID provided to you by Illumina.

Please ensure that the Region you select offers the F1 instances that DRAGEN requires.

Pipeline implementation & deployment

Please see the dragen.nf module implemented in the nf-dragen pipeline for reference. Any Nextflow processes that run DRAGEN must:

  1. Define label ‘dragen’

    The label directive allows you to annotate a process with mnemonic identifiers of your choice. Tower will use the dragen label to determine which processes need to be executed on DRAGEN F1 instances.

    process DRAGEN {
        label 'dragen'
    
        <truncated>
    }
    

    Please refer to the Nextflow label docs for more information.

  2. Define Secrets

    At Seqera, we use Secrets to safely encrypt sensitive information when running licensed software via Nextflow. This enables our team to use the DRAGEN software safely via the nf-dragen pipeline without having to worry about the setup or safe configuration of the license key. These Secrets will be provided securely to the “--lic-server” option when running DRAGEN on the CLI to validate the license.

    In the nf-dragen pipeline, we have defined two Secrets called DRAGEN_USERNAME and DRAGEN_PASSWORD, which you can add via the Tower UI by going to “Secrets -> Add Pipeline Secret”.

    Please refer to our recent blog post and documentation for more information about this feature.
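
The two requirements above can be combined in a single process. The following is a minimal sketch rather than the actual dragen.nf module: it assumes the license server accepts credentials in a `https://user:password@host` URL, and `LICENSE_SERVER_HOST` is a placeholder for the address supplied by Illumina. Input, reference and output options are left out.

```groovy
process DRAGEN {
    // Tells Tower to dispatch this task to the F1 queue.
    label 'dragen'

    // Expose the two Secrets to the task as environment variables.
    secret 'DRAGEN_USERNAME'
    secret 'DRAGEN_PASSWORD'

    script:
    // The escaped \$ leaves the variables for the shell to expand at run time,
    // so the credentials are not written into the rendered task script.
    // LICENSE_SERVER_HOST is a placeholder, and the URL layout is an assumption.
    """
    dragen --lic-server "https://\$DRAGEN_USERNAME:\$DRAGEN_PASSWORD@LICENSE_SERVER_HOST"
    """
}
```

Because the Secrets are referenced by name only, the same process definition works in any workspace where Secrets with those names have been added.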

Conclusion

DRAGEN technology by Illumina provides a significant speed-up for the secondary analysis tasks that recur in typical bioinformatics pipelines. Integration with Tower now greatly simplifies the implementation and deployment of Nextflow pipelines that use DRAGEN acceleration.

DRAGEN integration with Tower is currently available only on AWS, but we plan to extend it to other supported platforms, such as Azure, in the future.

For more information about DRAGEN integration with Nextflow, and all other features of Nextflow Tower, visit the Tower documentation or reach out to us to book a demo.