Addgene
Addgene is a global, non-profit repository created to help scientists share plasmids and other DNA-based tools. Plasmids are small, circular pieces of DNA found in bacteria and other single-cell organisms. Biomedical laboratories routinely use them to modify existing genes or introduce new ones into an organism, making them a powerful research tool. Applications for plasmids include advancing genetic research, creating disease-resistant crops, and developing new treatments for various diseases and conditions.
By using Nextflow to automate its sequence assembly and quality control (QC) workflows, Addgene scaled its internal sequencing and analysis to handle hundreds of plasmid genomes per week. This has improved research productivity while minimizing costs, enabling the organization to better serve the research community.
Addgene manages an extensive repository of DNA-based research reagents commonly used in the life sciences. When researchers publish papers, they typically deposit the associated plasmids with Addgene. Once deposited, these plasmids are accessible to other researchers for future experiments. Addgene also facilitates material transfer agreement (MTA) implementation, manages quality control and shipping, and maintains detailed and accurate records.
In its early days, Addgene would simply spot-check critical regions of deposited plasmids using Sanger sequencing. Since bringing an Illumina MiSeq™ sequencer in-house, Addgene has moved to full plasmid sequencing as part of its extensive QC process.
As Addgene began to sequence plasmids itself, it faced the challenge of automating the bioinformatic steps needed to transform raw sequence data into complete plasmid sequences ready for QC teams to analyze. The volume of data quickly overwhelmed existing processes.
After experimenting with several bioinformatic pipeline solutions, Addgene selected open-source Nextflow, a project maintained by Seqera, to automate its plasmid sequencing and analysis pipelines.
Nextflow provided multiple advantages over other pipeline orchestration software, including:
- Mixing scripting languages within pipelines to maximize the reuse of existing scripts and code.
- Mature container support, making pipelines portable across environments.
- The ability to easily switch between local and cloud-based execution environments without code changes.
- An extensive developer community and curated nf-core pipelines and modules that could be leveraged for Addgene’s purposes.
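The idea of switching execution environments without code changes can be illustrated with a toy sketch. The Python below is not Nextflow syntax and not Addgene's pipeline; the step names, profile names, and the `s3://example-bucket` path are all hypothetical. It only shows the general pattern of keeping pipeline steps separate from execution settings, so the same steps can be planned against a local or a cloud profile.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Profile:
    """Execution settings, kept apart from the pipeline steps (hypothetical)."""
    executor: str      # label for where tasks would run
    work_dir: str      # where intermediate files would be staged
    max_parallel: int  # upper bound on concurrent tasks


# Hypothetical profiles: only these values differ between environments.
PROFILES: Dict[str, Profile] = {
    "local": Profile(executor="local", work_dir="./work", max_parallel=4),
    "cloud": Profile(executor="cloud-batch",
                     work_dir="s3://example-bucket/work",
                     max_parallel=200),
}

# Hypothetical pipeline steps; identical regardless of the profile chosen.
STEPS: List[str] = ["trim_reads", "assemble_plasmid", "qc_report"]


def plan_run(samples: List[str], profile_name: str) -> List[str]:
    """Return one planned task description per (sample, step) pair."""
    profile = PROFILES[profile_name]
    return [
        f"[{profile.executor}] {step} {sample} -> "
        f"{profile.work_dir}/{sample}/{step}"
        for sample in samples
        for step in STEPS
    ]


if __name__ == "__main__":
    # Same samples and steps; only the profile name changes.
    for name in ("local", "cloud"):
        for task in plan_run(["plasmid_001", "plasmid_002"], name):
            print(task)
```

In this sketch, moving from a workstation to the cloud means passing a different profile name to `plan_run`; the list of steps is untouched. Nextflow applies the same separation through its own configuration mechanism rather than through Python code.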
Deploying Nextflow enabled Addgene to sequence, analyze, and ultimately share hundreds of QC-verified plasmids per week by tapping resources in the cloud.
“By scaling up Nextflow pipelines, we were able to achieve a 9-fold reduction in manual effort by simply consolidating and automating our workflow. The time and cost savings extended to our platform engineering as well, since we no longer needed to purchase and maintain our own infrastructure.”
Jason Niehaus, Director of Data Science at Addgene
Nextflow also provides seamless access to multiple compute environments simultaneously, enabling scientists to run analysis workflows on-premises or on their choice of cloud platforms.