From legacy scripts to ready-to-run Nextflow pipelines with Seqera AI
Write once, run anywhere
Bioinformaticians often develop in-house workflows in widely used languages such as Bash, Python, and R, choosing the one that best fits their project requirements, datasets, and computational resources. With Nextflow, you are no longer limited to a single scripting language: you can use whichever you prefer. By wrapping your scripts in Nextflow, you can fully describe a computational pipeline with all of its dependencies and run it in nearly any environment.
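To make the idea of wrapping a script concrete, here is a minimal sketch of a Nextflow DSL2 pipeline built around a one-line Bash command. The process name COUNT_LINES, the params.input glob, and the output file naming are hypothetical placeholders chosen for illustration; they do not come from any real pipeline.

```groovy
// main.nf: a minimal DSL2 sketch (DSL2 is the default in current Nextflow releases).
// COUNT_LINES, params.input, and the file names are hypothetical placeholders.

params.input = 'data/*.txt'

process COUNT_LINES {
    // Dependencies can be pinned here, for example with a `container`
    // or `conda` directive, so the step runs the same way everywhere.

    input:
    path sample

    output:
    path "${sample.baseName}.count.txt"

    // The script block holds the original command. It is Bash by default;
    // starting it with a shebang such as `#!/usr/bin/env python3` runs
    // the block with that interpreter instead.
    script:
    """
    wc -l < ${sample} > ${sample.baseName}.count.txt
    """
}

workflow {
    // One task is launched per file matched by the glob.
    samples_ch = channel.fromPath(params.input)
    COUNT_LINES(samples_ch)
    COUNT_LINES.out.view()
}
```

Assuming Nextflow is installed and some text files exist under data/, this could be launched with `nextflow run main.nf`. The same definition can then be pointed at a different executor or container engine through configuration, without touching the wrapped command.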
While converting such scripts into production-ready Nextflow code is the ideal solution, it isn’t always straightforward. Ensuring the generated code is not only functional but also scalable often requires a deep understanding of Nextflow’s syntax and best practices, creating a barrier to adoption for many scientists.
AI for code generation
AI is stepping in to assist bioinformaticians. In a LinkedIn poll, we asked the community what they are using to generate code. The results were telling: 40% of respondents selected ChatGPT, 26% selected ‘Other’ (which included “my human brain” as one method), 24% selected GitHub Copilot, and 10% picked Cursor.
While such AI tools can be highly effective for coding tasks, none have been specifically designed to handle the nuances of Nextflow. For example, when asked to write Nextflow code, ChatGPT defaults to the older DSL1 syntax instead of DSL2, even when the user explicitly requests DSL2. As a result, researchers still spend a lot of time debugging, rewriting, and validating their Nextflow code. This raises an important question: “How can bioinformaticians ensure their code will run efficiently and at scale?”
Enter Seqera AI: Purpose-built for bioinformaticians
Seqera AI was designed to fill this gap. Purpose-built for bioinformatics, Seqera AI has been extensively validated by scientists and has a comprehensive understanding of Nextflow, bioinformatics tools, and the broader scientific community. With Seqera AI, scientists can accurately convert workflow scripts (e.g. WDL, CWL, Bash) to Nextflow DSL2 by simply pasting the script and asking: ‘Can you please convert this script to Nextflow?’. This generates code in DSL2 syntax with high accuracy.
Our priority in developing Seqera AI was to uphold Nextflow best practices and promote good software standards, simplifying testing (for beginners and experts alike) to better support the community. One example is nf-test, a framework designed to simplify testing Nextflow pipelines in an efficient and automated way. With Seqera AI, you can easily generate nf-test scripts, ensuring your Nextflow processes are reliable and perform as expected. You can ask the question: ‘Can you please tell me how to write an nf-test?’.
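For readers who have not seen one, a process-level nf-test might look roughly like the sketch below. It is hand-written for illustration, not Seqera AI output, and it assumes a hypothetical process named COUNT_LINES defined in main.nf that takes a single input file, plus a hypothetical test file at tests/data/sample.txt.

```groovy
// tests/main.nf.test: a minimal nf-test sketch.
// COUNT_LINES, main.nf, and tests/data/sample.txt are hypothetical placeholders.
nextflow_process {

    name "Test process COUNT_LINES"
    script "main.nf"
    process "COUNT_LINES"

    test("Should count the lines of a small text file") {

        when {
            process {
                """
                // input[0] maps to the first input declared by the process
                input[0] = file("${projectDir}/tests/data/sample.txt")
                """
            }
        }

        then {
            // The task must finish without error...
            assert process.success
            // ...and its outputs must match the stored snapshot
            // (the snapshot is created on the first run).
            assert snapshot(process.out).match()
        }
    }
}
```

Assuming nf-test is installed and initialised in the project, this could be run with `nf-test test tests/main.nf.test`. Snapshot matching is one common choice; explicit assertions on individual output channels work as well.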
So far, we have demonstrated how you can easily and reliably generate Nextflow code using Seqera AI. But how can we guarantee that it will run? Each process in your Nextflow workflow includes the Seqera AI script testing module, which enables one-click testing in an AI sandbox. The AI spins up a micro-VM with Nextflow and Wave pre-installed, defines the environment, and identifies the appropriate nf-core test datasets for the process. It then writes the sample to a file and executes the code to verify that it runs successfully. Seqera AI also detects errors and will automatically self-correct the script to ensure your code will run.
You might be thinking, “I want to test this on my own machine.” To address this, we have also included a local testing guide that generates step-by-step instructions for running the code within your own environment.
Try Seqera AI now
Seqera AI is a bioinformatics agent purpose-built for the scientific lifecycle. With a comprehensive understanding of Nextflow, common bioinformatics tools, and the scientific community, Seqera AI enables you to generate, test, and validate Nextflow code faster, providing you with code that will actually run.
Try Seqera AI now for free!