
2025 Year in Review

In Episode 53 of the Nextflow podcast, Phil Ewels returns from a hiatus to deliver a comprehensive year-in-review for 2025. Joined by Rob Syme (Scientific Support Lead), Marcel Ribeiro Dantas (Senior Developer Advocate), and Rob Newman (Product Manager Lead), the team recaps the major developments across Nextflow, Seqera Platform, and the broader community.


Summary

Nextflow Language Evolution

2025 marked a pivotal year for Nextflow’s evolution from a Groovy DSL into a first-class programming language. The introduction of a new syntax parser, available via the NXF_SYNTAX_PARSER=v2 environment variable, enables stricter, more opinionated syntax that lays the foundation for future language features.
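As a sketch, opting in for a single run looks like this (the environment variable and value are as described above; the pipeline name is a placeholder):

```shell
# Opt in to the stricter v2 syntax parser for one run
NXF_SYNTAX_PARSER=v2 nextflow run <your-pipeline>
```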

The VSCode extension received a major upgrade with a proper language server that understands Nextflow syntax natively. Developers can now see errors inline as they type, with intelligent code completion and hover hints. Two new CLI commands, nextflow lint and nextflow format, help developers automatically identify deprecated patterns and standardize code formatting across teams.
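A hedged sketch of how the two commands might be used locally or in CI (exact flags may vary by version; check `nextflow lint -help`):

```shell
# Report deprecated patterns and syntax problems across the project
nextflow lint .

# Rewrite sources in place to the standardized formatting
nextflow format .
```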

One especially significant addition was lineage tracking in Nextflow 25.04. This feature writes JSON metadata to standardized locations, enabling full provenance tracking of workflow outputs. When chaining pipelines, you can now reference outputs by their lineage rather than by file paths, ensuring complete traceability back to the original inputs. Enable it simply with lineage.enabled=true in your config.
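The minimal config to switch it on, as described above:

```nextflow
// nextflow.config
lineage {
    enabled = true
}
```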

Workflow Inputs, Outputs, and Type Safety

The introduction of workflow outputs represents a fundamental shift in how Nextflow handles data publishing. Rather than scattering publishDir directives across processes, the workflow itself now controls what gets published. This keeps metadata and files together as channels, enabling automatic generation of sample sheets and structured output manifests.
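A hedged sketch of what this can look like. The publish/output syntax is still evolving between releases, and the process and channel names here are invented for illustration:

```nextflow
workflow {
    main:
    samples = Channel.of([id: 'sample1'], [id: 'sample2'])  // hypothetical metadata channel
    results = ALIGN(samples)                                // hypothetical process

    publish:
    alignments = results
}

output {
    alignments {
        path 'alignments'
        index {
            path 'samplesheet.csv'   // auto-generated manifest of published files
        }
    }
}
```

The index block is what yields the automatically generated sample sheet mentioned above: each published channel element contributes one row, with metadata and published paths together.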

On the input side, typed parameters bring schema-level validation directly into the Nextflow language. This moves toward parity with the nextflow_schema.json approach while providing native type checking for booleans, numbers, strings, and more. Looking ahead to 2026, variable typing will become an increasingly important part of Nextflow, enabling better developer experience through improved error detection and code intelligence.
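As a rough sketch of the direction (the typed-params syntax follows the preview in recent releases and may change; the parameter names are invented):

```nextflow
params {
    input: Path                      // e.g. a sample sheet
    save_intermediates: Boolean = false
    min_reads: Integer = 1000
}
```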

Plugin Ecosystem and AWS Performance

The Nextflow plugin ecosystem matured significantly with the launch of a new plugin registry at registry.nextflow.io. This replaces the previous JSON-file-on-GitHub approach with a proper API-backed service, improving reliability and enabling new features like static analysis integration with the VSCode extension. Creating plugins is now easier than ever with the nextflow plugins create command, which scaffolds a new plugin from an improved template repository.
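For example (subcommand name as given above; it prompts interactively for the remaining details):

```shell
# Scaffold a new plugin project from the improved template repository
nextflow plugins create
```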

Under the hood, Nextflow 25.10 shipped with a complete rewrite of AWS integrations using AWS SDK v2. Combined with the new virtual threads option (available in Java 21+), this dramatically improves S3 transfer performance and eliminates the connection pool exhaustion errors that could affect large-scale runs. Workflows that work on small data now scale seamlessly to massive workloads with no additional configuration.
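In outline, enabling both improvements for a run might look like this. Note that the variable name NXF_ENABLE_VIRTUAL_THREADS is our assumption based on current documentation, and the pipeline name is a placeholder:

```shell
# Virtual threads require Java 21 or newer
export NXF_ENABLE_VIRTUAL_THREADS=true
nextflow run <your-pipeline>
```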

Seqera Platform Innovations

The Seqera Platform saw extensive updates throughout 2025. A completely redesigned Pipeline Run Details page now features progress bars, dedicated task and container tabs, and deep integration with Data Explorer for browsing work directories directly in the browser.

New single-VM compute environments on AWS, Azure, and Google Cloud offer 4-6x faster startup times compared to batch services, making them ideal for development and smaller-scale production workloads. For those ready to go further, Seqera Compute provides fully managed compute environments with $100 free credits for institutional users.

Fusion Snapshots revolutionized spot instance usage by checkpointing running tasks to S3. When a spot instance gets reclaimed, tasks resume from their last checkpoint rather than restarting from the beginning, making the 90% cost savings of spot instances practical even for long-running processes.

Studios gained significant traction with features like git repository cloning, custom Dockerfiles, ARM64 support, and spot instance backing. Data Explorer expanded to support any S3-compatible API, opening the platform to HPC systems and alternative cloud providers.

AI and Automation

Seqera AI emerged as a powerful development companion, helping users explore SRA data, understand pipeline code, and even build new plugins. Marcel shared how he built a complete Nextflow plugin in just hours using Seqera AI, from code to documentation to registry publication.

The MCP (Model Context Protocol) server, developed over a few days by Paolo and team, enables integration of Seqera Platform with Claude, Cursor, and Copilot. Users can query run history, debug pipelines, and interact with their platform data through natural language.

For pipeline orchestration, Node-RED integration enables sophisticated automation workflows. When a pipeline completes, you can automatically trigger follow-up pipelines, launch Studios for analysis, send Slack notifications, or even flash smart lights in celebration.

Community and Company Growth

The nf-core community continued its rapid growth with three major tools releases (3.3, 3.4, 3.5) introducing features like the test datasets command, improved download capabilities, and Topics support. A published roadmap details the adoption timeline for new Nextflow syntax features, with most changes landing in nf-core by early 2027.

The Nextflow Summit expanded with events in Boston and Barcelona, the latter moving to an online format with parallel tracks to accommodate more speakers from around the world, including a much-anticipated talk from NASA.

The Ambassador program launched its fifth cohort, now spanning 40 countries with over 600 activities conducted in just 24 months. Training Weeks, offered quarterly with live Q&A and certificates, saw over 1,000 registrations in 2025 alone.

On the business front, Seqera secured Series B funding and achieved SOC 2 Type 2 certification, ensuring the company’s continued investment in Nextflow and the open-source ecosystem.

Full transcript

Welcome

Phil Ewels: Hello and welcome to the Nextflow podcast. You are listening to episode 53, we’re coming to you in 2026, and it’s good to be back. I need to start off this episode with a bit of an apology to anyone out there who’s a regular listener and may have wondered what happened to the Nextflow podcast in the second half of 2025.

I’m just looking back at the website here, and yeah, episode 52, with Australian BioCommons, was the last episode. Really good, but that was on June the third, and then it went a little bit dark in the second half of ’25. I do have an excuse. It’s not a very good one: I moved house and had lots of renovation work, and everyone tells you it’s gonna take ages.

We thought we had such a great plan. Yeah, it took ages. But I’m back: I’ve got a new house, I’ve got my recording setup back, and I’m feeling excited and enthusiastic for 2026, ready to rejuvenate the Nextflow podcast and come back to you with loads of great new content. And what better way to start off 2026 than with a recap episode?

Introductions

Phil Ewels: So that’s what we’ve got today. We’re gonna go back and look at all the best and brightest things that happened in 2025 in the Nextflow ecosystem, talk about some of our favorite bits, and pull out some of the juiciest facts for you. To help me on this adventure today, I’ve got three guests.

I’m really happy to welcome back to the show Rob Syme, who is Scientific Support Lead at Seqera. We’ve got Marcel, who’s Senior Developer Advocate, and Rob Newman, Product Manager Lead. Thanks guys for joining us, it’s great to have you on. I think all three of you have been on the podcast before, is that right?

Rob Syme: Yeah, I’m looking forward to cramming six months of podcast into 45 minutes. No problem at all.

Phil Ewels: Exactly. Catch-up time. And I think it’s not the first time we’ve had the two Robs on one episode either, so I’m sure it’s gonna be fun.

Rob Newman: Anytime they’re talking about data and replication, you get multiple Robs.

Phil Ewels: Exactly. Anyway, let’s kick in. We’ve got a few broad areas I thought we could touch on. There’s loads of stuff that happened in 2025 around Nextflow, of course, and Nextflow language evolution is really the big juicy topic there.

Rob Newman is gonna take us through some of the stuff that happened in Seqera Platform. And again, lots of updates and improvements there.

Marcel’s gonna take us through some of his highlights from the community, from the Nextflow community and also nf-core. But this is the Nextflow podcast. Rob Syme, what are your highlights from last year with Nextflow itself?

Rob Syme: I mean, you’ve sort of hit the nail on the head there describing this evolution of the Nextflow language. As every year, there were two major stable releases of Nextflow: one in April and one in October.

Nextflow Language Evolution

Rob Syme: And in both cases in 2025, those releases set in place the changes to evolve the Nextflow language. It has historically been this DSL built on top of Groovy, and 2025 saw Nextflow hitting its stride and becoming its own language, becoming a bit more opinionated about the syntax. And this is all building to add features for Nextflow developers and Nextflow users.

One of the big ones was the new syntax parser, which has been available for some time, particularly for users of the VSCode extension. So if you’re using VSCode to write Nextflow workflows, you would’ve noticed the red squiggly underlines saying, hey, this feature or this use of the language is not encouraged anymore. I’m sure plenty of you have seen that a lot.

Nextflow VSCode Extension

Phil Ewels: You mentioned VSCode. This is something else that was new in 2025, right? Maybe I’m jumping ahead here, but.

Rob Syme: Yeah, it’s a good point. The VSCode extension itself is new and extended, and comes with its own language parser. So now the VSCode extension understands Nextflow syntax natively and can provide not just syntax highlighting, but syntax correction, inline in your editor before you run Nextflow.

I mean, for as long as I’ve been involved in this community, this has been one of the top requested features. I think everyone has had the experience of an error like ‘unexpected curly brace’, saying something in the Nextflow workflow is amiss. But now it’s such a refreshing thing to be able to see it as you type, as you commit the error, rather than waiting till you run it.

Phil Ewels: The VSCode extension has been around for years, actually, and people used to say: why can’t it work the same as other VSCode extensions? Why can’t we have those squiggly lines? Now, I dunno how many people in the audience would’ve actually looked inside the VSCode extension before, but until now it was a TextMate syntax extension. Basically what that means is it was hundreds of lines of regular expressions. It was just incomprehensible and very dumb. And now, I mean, this is a game changer. Now it actually understands the syntax, it understands the code.

yeah. So cool.

Rob Syme: It’s one of those things that, for a long time, I thought was impossible. I thought, ah, this is just what it is to be a DSL on top of another language. And Ben has worked incredibly hard and done what I thought was impossible. And now it is this much more formal, strict syntax.

And the lovely thing about this is it gives you a hint as to what is in the future, ’cause it lays the foundations for features that we’re building inside the language. And if you’re interested in making sure your workflow will be compatible in the future, there are a couple of features that you now have access to.

The first is nextflow lint, which is something you can run on the command line. You can also bundle it into GitHub Actions, things like that, on every commit or every PR, and have it automatically identify language features which will be deprecated.

There’s nextflow format, which you can also run on the CLI, and which will fix as much as it can automatically, to just sort of bring your workspace up to speed.

And if you wanna run your workflows with the new syntax parser, it’s available from 25.04 with the environment variable NXF_SYNTAX_PARSER=v2. If you set that, then not only will you be developing with the new workflow syntax, but you’ll actually be running with the new workflow syntax, which not only disables the deprecated features but also gives you access to fun new features. Workflow outputs is actually now available as standard in all recent Nextflow versions, and typed inputs is one of the new ones.

Nextflow lint and format

Phil Ewels: There was a lot there, so let’s just break a couple of those down. But I wanna touch on a couple of things. format, firstly, is one of my favorite commands, because I have been a huge fan of Black for Python formatting, and markdownlint, Prettier, Ruff, all these for different languages. Isn’t that great?

Rob Syme: So great.

Phil Ewels: It’s a bit like the VSCode extension: we are all writing these other languages and, you know, we have our favorite features. One of the reasons I love autoformat is ’cause I just don’t have to think about it.

It just takes it outta my brain. There’s no arguing, PRs have smaller diffs, and now we have that in Nextflow, right? You can just chuck Nextflow code at it and it will move everything around and reorganize things in a really super standardized way.

And then you were talking about how there’s an environment variable that you have to pass to Nextflow so it uses the new language parser. That’s really important.

Rob Syme: I’ve seen a lot of people who are running Nextflow in production start on their test instances or their staging instances, running those same workflows with NXF_SYNTAX_PARSER=v2 enabled, just so they can make sure that everything will be up to scratch.

Phil Ewels: If people flick this, this switch on, and they’re using the new syntax parser, does that mean that their pipelines will no longer work with the, the older versions of Nextflow?

Rob Syme: No, no, no. Everything should be backwards compatible. Yeah,

Phil Ewels: Right. Sorry, I will stop interrupting you. You were flowing, you were on a roll. I’ll let you continue.

Rob Syme: No, it’s great. I appreciate the interruptions.

Nextflow Lineage

Rob Syme: So, you know, above and beyond the syntax parser, actually my favorite feature by a long way is lineage, which was available in 25.04.

Anyone who’s spoken to me in real life will probably have had an ear-bashing about this at some point. But I hate paths, and I am terrified by the fact that people refer to inputs as ‘the file in this location’, because that can change over time.

With the lineage tracking feature that’s now built into Nextflow, we build you nice JSON objects and write them to standardized locations, which is great for tracking.

But it also has this really cool feature, particularly if you’re chaining pipelines: if you have one Nextflow pipeline doing some pre-processing, and then a separate pipeline to take the outputs of that and do something else, then instead of referring to an input as a file in this location, you can say: I want, as an input to the second pipeline, the file that was produced by this other workflow. And so implicitly you have the full lineage, not only of this workflow, but of the inputs to that workflow, so you can trace all the way back.

It’s really important. It’s hard to understand in the abstract, but you see it in practice when you have to answer those questions. It’s like: okay, so this is the third workflow in a chain here. How was the input to the first workflow generated?

Being able to answer that question matters for legal reasons, but also just for scientific reasons, in case you wanted to double-check what the impacts of those inputs were. That’s really important.

Phil Ewels: I think you’re being quite modest here, as I think a big chunk of why this feature was developed was your influence.

I remember, Rob, I think it was the 2023 Seqera retreat, or maybe even earlier than that. I remember sitting down at a table outside the restaurant, I’m pretty sure Rob Newman was there as well, maybe Marcel, and putting my phone on the table recording and being like: right, Rob, tell me all about this lineage stuff again. Tell me all about your crazy blockchain ideas and all that.

And you’ve done an nf-core bytesize talk about this as well, right? If people want to go deeper.

Rob Syme: Yeah. And the work continues: I’m still hacking away at plugins and things like that. I would like to try and drive adoption of this and make sure people understand it. But yeah, I think it’s gonna be great. It is great.

Phil Ewels: It is a feature that I am really excited about with Nextflow, and I find it more difficult to explain than almost any other. When we talk about nextflow lint and format, it’s easy to get excited about them. It’s easy to convey why they’re gonna make people’s lives better.

With Nextflow lineage, it’s one of these really like low level things, which I think is gonna become fundamental to the future of how people use Nextflow.

But it’s really hard to explain why people should care about it today. You did a great job of it in your recap there. And it’s such a low-level feature, it’s got so much potential for us to build on top of it now.

Rob Syme: The nice thing about it is that if you’re a little curious, you can just turn it on with lineage.enabled=true in your Nextflow config and it’ll start writing the data out. Point it to an S3 path or a shared file system location; it costs basically nothing to enable. And then, should you at some time in the future be interested in provenance, you can query that collection of JSON files for the lineage information, or even build your own tools on top of it if that’s necessary.
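A sketch of pointing the lineage store at shared storage, as Rob describes. The store.location option name is an assumption based on current documentation, and the bucket path is a placeholder:

```nextflow
// nextflow.config
lineage {
    enabled = true
    store.location = 's3://my-bucket/lineage'  // or a shared filesystem path
}
```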

So yeah, try turning it on and just give it a go.

Phil Ewels: Another benefit of turning it on, one of the things I find myself talking about with lineage, is if you have a cache miss. So if something has not resumed properly and you dunno why, lineage makes it super easy to compare those two runs: it lists out everything that was in the cache hash, so you can see exactly why.

And you don’t have to do anything with dump channels, you don’t have to rerun any workflows, ’cause if you’re already saving lineage from every run, all that data’s already been put to disk.

Rob Syme: The task cache, yeah, it’s a great point. It’s the full provenance of the tasks, not just the workflow outputs.

Nextflow Plugin Registry

Rob Syme: So I talked a little bit about plugins there. One of my favorite features also that came out of 2025 is a new approach to plugins in the Nextflow ecosystem, and that is really a couple of features.

Firstly, there’s been some development on the little example plugin repository, which used to involve a series of Gradle commands that were a little bit intimidating. Certainly less intimidating these days, but in a pre-LLM era it was intimidating for some.

So we’ve built some features around the way plugins are built, to make that minimal plugin repository a lot more grokkable and easy to understand. But in addition to that, and sort of accompanying it, we have a new plugin registry, which is hosted by us at registry.nextflow.io.

And so this has modernized the plugin registry away from a JSON file on GitHub. Previously we were reliant on GitHub to host it, and if there were GitHub outages, plugins wouldn’t work. So we wanted to move to something that we control, so that we can provide a little bit more uptime.

So this is both a safer and more secure way to download plugins, but it also means that if you’re developing plugins, you can upload them to the registry and use it to advertise the fact that the plugin exists, through descriptions and readmes and a searchable interface for all the available Nextflow plugins.

So I recommend you check it out.

Marcel Ribeiro Dantas, Ph.D: It’s not a Seqera-only plugin repository, right? Anyone can upload their plugins, and I did it myself. And actually the experience is so nice. Claiming a plugin, which is the language we use there, is so easy and straightforward.

So if you have a plugin yourself, it’s very easy to upload there. And the great thing is that once it’s uploaded, it’s very easy to import into anyone else’s pipeline through the internet, right? Having a central registry, that’s the whole point of it. So I had a great experience claiming plugins there.

And if you have any plugin of your own, I really recommend you try it.

Phil Ewels: We can see that in effect now as well, because honestly, Nextflow plugins have been around for years and years, but there haven’t been very many. You had to be really dedicated to walk through the fire of plugin development. But since we released this new registry only a few months ago, we’ve seen the number of new plugins just explode. That’s so good, and it’s a sign of a healthy ecosystem, right? That’s what we want: for people to build on top of Nextflow.

Rob Syme: You have some plugins which are aimed at broad application, things like the co2footprint plugin, workflow-agnostic plugins that are useful for everyone. Then you’ve got plugins for particular ecosystems or products, like the Databricks plugin. And at the other end of the spectrum, you might have plugins for a particular group’s LIMS system, which are only really relevant for that particular group.

So whether you would like to develop a feature for global usage, or whether you just wanna make your life and your research group’s life a little bit easier by hooking into some of your own internal systems, the plugin system offers both of those groups of people something interesting.

Phil Ewels: There are two points in your recap, Rob, that I wanted to flag as well. One is the new template GitHub repo, which you mentioned. You’re right, we can now click on that, and it’s updated and it’s much better. There’s also a new Nextflow command: you can do “nextflow plugins create”, and Nextflow will prompt you for a few placeholder variables and then create a plugin for you based on that template repo, with a few things pre-filled. So that’s a small nice thing.

And there were also a couple of motivations for the new registry. You talked about scalability, that we were pulling this huge JSON file off GitHub every single time you run Nextflow, which is kind of not ideal, and uptime. But there are also some other fundamental features that the registry gives us.

It’s backed by a publicly accessible API, which we encourage anyone to build off. And one thing that does build off that API is, going back a bit, the language server: the VSCode extension uses it. When it does a static analysis of your Nextflow code, it looks at the config. And until now, if any of your config came from a plugin, the language server didn’t know about the code in that plugin. It could only know about it by running the plugin, which we don’t want to do, because we’re doing static analysis. So what it does now is query the registry. The registry has a static analysis of every plugin, and from that it can pull back all the different config scopes. So now we have a rich developer experience for configs around plugins, thanks to the plugin registry.

And that’s one example of new functionality that we can build in, which was impossible with the previous system.

Sorry, I’m nerding out.

AWS SDK v2 upgrade

Rob Syme: Well, if we’re in the mood for nerding out, there is one other important feature that I want to talk about, which was in 25.10, and that’s the switch to the new AWS SDK. That’s a lot of letters in quick succession, and it’s deep in Nextflow, but it’s a really important change.

So the AWS SDK, the software development kit, is the library that AWS provides to interact with the AWS API and services via Java. Version one reached end of life, I think at the end of last year, and 25.10 ships with the new nf-amazon plugin, which uses AWS SDK v2.

And this is more than just a version bump: this is a complete rewrite of all of the ways Nextflow interacts with AWS. Granted, AWS is only one of the many executors, but it’s a big part of the Nextflow ecosystem. All of the AWS services, S3, Batch, EC2, ECS, CloudWatch, SQS, all of these services under the hood have now changed.

The most important, I think, is probably the way Nextflow interacts with S3. We have this new high-performance S3 transfer manager, so all of the S3 interactions are much faster. And that’s really important, particularly for people running at serious scale, when you have thousands or tens of thousands of tasks running at once.

This is particularly relevant if you are also using the new virtual threads option, which you can turn on via an environment variable. This is a feature that’s only available in Java 21 and above, but you can turn it on in combination with the new AWS SDK v2.

It means that instead of having a pool of connections that Nextflow can use to communicate with AWS, which can be exhausted at times, with these two new options enabled you have really efficient communication with AWS. Instead of consuming a pool of resources, it’s very, very cheap to create and park the threads that are communicating with AWS. So you can have tens of thousands or even hundreds of thousands of tasks communicating with S3, and because the threads are so cheap, you’ll never exhaust that thread pool. Nextflow just parks the threads and resumes them as the responses come in from AWS.

It’s a lovely feature, which means that if your Nextflow workflow works on small data, it also works at massive scale, and there are no more configuration changes that you have to step up to as you scale.

Phil Ewels: You’re right. That was nerdy, but I enjoyed it.

Rob Syme: I have to talk about it here, because if I don’t talk about it here, you won’t even notice. That’s the wonderful thing about these features. It’s one of those features that took a lot of work. Again, thank you [Jorge Ejarque] for putting in the effort here, a lot of effort, so that people don’t notice when their Nextflow runs scale without issue. So I have to talk about it here.

And if you’ve ever had that connection pool exhausted or timeout error in Nextflow when doing massive runs: if you switch these on now, it won’t be an issue.

Phil Ewels: Yeah. And there’s a blog post as well, where we share some of the benchmarking results on that upload and download speed. So if you’re interested, you can read a bit more now.

Rob Syme: Faster, more efficient, better in every way. It’s fantastic.

Phil Ewels: The unsung heroes of open source development: people like Jorge, pushing their features out, and you never notice, Nextflow just doesn’t crash quite as much as it used to. They’re incremental improvements, but they’re super important.

Rob Syme: Yeah.

Workflow Inputs/Outputs and Static Types

Phil Ewels: You touched on it earlier, but it’s such an important feature. You talked about workflow inputs and outputs and static types. We’re gonna have a whole episode based on this. So if you’re interested, there’s a talk about it from the Nextflow Summit last year, Ben and I had a webinar, and we’re gonna go deep on it in a podcast in a month or two.

But Rob, just tell us a little bit more about what these are and why they’re important.

Rob Syme: So I think everyone who’s developed Nextflow is very familiar with the idea of publishing outputs from a workflow via the publishDir directive, which is attached inside of a process.

And this is really a holdover, an anachronism from way, way back in the day when the first version of the Nextflow language was developed.

But it becomes a little bit of a challenge when you’ve got modules and processes, which are sort of the belly of the beast of a workflow, being the things that emit the outputs, which are the external interface between the workflow and the consumers of that workflow, the people who are gonna be consuming the data.

So the Nextflow language has moved to this new system of workflow outputs. The workflow itself now has outputs: rather than publishing files from processes or modules, now the workflow itself, at that top level, is responsible for the publishing of files.

And my favorite thing about this feature is that you’re basically publishing channels rather than publishing files. The beautiful thing about Nextflow is that you have this dataflow system where you have the metadata and the files attached together, flowing through the workflow itself. So there’s no idea of having to reach out somewhere else to grab the metadata, and workflow outputs allow you to publish channels.

It gives you lovely things like automatically generating sample sheets, or JSON or YAML structured data, which has the metadata and the paths to those output files all together.

And the symmetrical end of this is typed inputs. So in Nextflow, optionally, now using the strict parser, you can define the inputs to the workflow.

So this is, inside of the language itself, inside of the workflow, sort of an amendment or an improvement on the nextflow_schema.json.

Phil Ewels: Exactly. Yeah, for inputs it’s basically a bit of a shuffle of how params and parameters are defined in a workflow. It used to be that parameters were just in a config, but then you referenced them directly, like within processes. So their scope was very wide-ranging.

Whereas now, params with workflow inputs, they, they go at workflow level. and they, they also have much more metadata. I remember sitting in the, the, the lunchroom at SciLifeLab lab with Paolo in like 2020, talking to ‘em about this new Nextflow schema thing I’d come up with and talking whether we could extend the config.

It’s taken us a while to get there, but now they’re, but we’re, we’re working towards a future where there’ll be basically one-to-one parity with the nextflow_schema.json file, and also the, the, the language itself.

So Nextflow will be fully aware of, types. So if, if it’s a Boolean and you know, or validate everything, whether it’s a number, it’s a string. And then those params, are much more powerful, better, much better defined, and we can then pass them into the workflow.
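A hedged sketch of what typed workflow inputs look like under the strict parser (the parameter names and defaults below are made up for illustration, and the exact declaration syntax may shift between releases):

```nextflow
// Illustrative typed params block (requires the v2 strict parser,
// e.g. NXF_SYNTAX_PARSER=v2; names and defaults are hypothetical)
params {
    input: Path                       // required, no default
    outdir: String = 'results'
    min_reads: Integer = 1000
    save_intermediates: Boolean = false
}
```

With declarations like these, passing `--min_reads banana` can be rejected up front rather than failing somewhere deep inside a process.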

And like you say, it's that interface between pipelines, right? Rob Newman, I think you're gonna touch on pipeline chaining in a little bit. It's these fundamentals which are kind of requirements for then doing more stuff downstream, outside of Nextflow itself.

And when I talk about these typed variables, I mean typed params, that's huge. And we will see this as we go into 2026, with more language features coming. The variable typing system is gonna become a really important part of Nextflow. It's so important, and it's so well liked in other languages, and it's now gonna start coming to Nextflow. It gives you type safety, and much richer intelligence about the code, so that Nextflow knows what kinds of variables are being passed around.

It can spot typos and errors ahead of time and just basically provide a much better developer experience.

Rob Syme: And a better experience for the users of the workflows. Nextflow is this beautiful language, and it's so lovely to develop in Nextflow. But the people who are using the workflows, who are running nextflow run nf-core/rnaseq on their data, they don't care how beautiful and elegant and fun it was to develop the workflow. They care about the data.

And so this is Nextflow pushing and being a little bit more opinionated, and helping people on either side of the workflow, not just, presumably, the listeners of this podcast, the people developing the workflows.

Phil Ewels: Yeah. And it's a very nice circular topic you brought us onto there, Rob, because it brings us back to Lineage. Again, I remember you saying to me in frustration years ago: Nextflow has all this information, it knows all this history, it's got all this metadata, and you get to the end and you publish the files and you just chuck it away.

But now, with workflow outputs and lineage, we're actually holding onto all of that, and we can now use it.
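For reference, switching lineage tracking on is a one-line config change:

```nextflow
// nextflow.config — enable lineage metadata (Nextflow 25.04+)
lineage {
    enabled = true
}
```

Once enabled, each run writes JSON lineage records alongside its outputs, and downstream pipelines can refer to published outputs by their lineage identifiers rather than by raw file paths (the exact chaining workflow depends on your setup, so see the lineage docs for details).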

Rob Newman: It’s also a little bit of watch this space as well, because I know this is 2025 retro, but in terms of 2026, we’re gonna be using that data as well in the, in the platform as well. I.

Phil Ewels: Yeah. Yeah, 2025 was all about foundations, I kind of feel like. And 2026 is hopefully where we see, see some of these bear bear fruits.

Seqera Platform: Pipeline Run Details

Phil Ewels: Okay. Rob Newman, tell us a little bit about some of your highlights of 2025.

Rob Newman: Yeah, I think the biggest one that affects most users of Nextflow who are also Platform users will probably be the pipeline-specific changes that we've made, and specifically the new pipeline run details page. That was a big change that was pushed out prior to the Boston Summit earlier this year.

And now we have a progress bar showing clear changes as the pipeline progresses. We also have a dedicated run info tab. We also have a tasks tab that is the default now, so you can easily drill down into those tasks. And for those of us that are using containers, there's now a containers tab as well, so you can really drill down and see what is being built.

So that was probably the biggest one coming out of the first half of the year, and it probably affects a lot of users.

Phil Ewels: You know, it’s a good feature when I can’t even remember what the old run page looked like anymore. Like I was trying to remember kind of what it looked like previously, and I’m so used to a new one now. It feels, it doesn’t annoy me. It doesn’t get in the way. It just, I just go, I just think about what, what it’s showing me.

And I think that’s a sign of a good, good design. yeah.

Rob Syme: It gets out of the way when it needs to and shows you what you need.

Marcel Ribeiro Dantas, Ph.D: There’s so much information that you would like to have, right. And to organize it in a way that looks good very, very difficult job.

Rob Newman: Yeah, I think think, like you were saying, it shows very high level like core or critical information that you need to know about your pipeline. But if you really want to, you can really, really drill down, the past level and, and really get fine grained information and to help you troubleshoot pipelines, especially.

Phil Ewels: Shout out also to, again, I feel like the podcast is the home for all the unloved, unnoticed features, and there's one here in the pipeline page which I think is one of my favorite features, actually. If you go to a task table and you click a task, and it brings up the modal with all the details, there's a little tab hidden there. I can't actually remember what it's called now, but it's called something like Data, in this little popup modal.

And that will load up Data Explorer, if it works on your compute environment, and show you all the files in that task's work directory. It's two clicks to get there, and you can just see it there in Platform. And I love it. And I think a lot of people don't realize it's there, which is probably on us, to make it a bit more obvious.

But it's so cool. Again, it's low friction, and I love that.

Rob Syme: Huge benefit, particularly for people who are used to operating on HPC. The Nextflow diehards who've grown up on HPC, where you can see into the work directory, bash .command.run to run a single task, and just grab that isolated, idempotent task, the unit of the workflow.

And then when they moved to cloud, they found their data is a long way away, it's not accessible or viewable in the same way. And then that feature that you're describing there, Phil, operating on the task working directory via Data Explorer, it gives you that sense of, oh, okay, I am back to my familiar. This thing is not obfuscated, it's not invisible, it's presented and here for me.

Dynamic resource labels

Rob Newman: I also think one of the features that maybe came in under the radar for many people is this idea of dynamic resource labels. You know, there's labels and there's resource labels, and there's also now the ability to do dynamic resource labels, which can include things like the Nextflow process ID and the workflow ID.

And if you really want to drill down and understand what is costing how much in your pipeline runs in the platform, you can now do that. With dynamic resource labels, they automatically resolve themselves at execution time.

So it allows you to just set one key and value, that resolves itself, and you can then go into your cost allocation reports in your cloud provider and really get an understanding of, even though it's in a shared compute environment, how much did this execution of this run cost?

It's a big deal, and again, it's a little bit hidden in the weeds as well.

Pipeline versioning

Rob Newman: Yeah. And then something that is really important for our users, that I've been discussing with users of Nextflow for almost a year now, is this idea of pipeline versioning.

This idea of versioning and pinning specific versions is really gonna allow enterprise customers especially to leverage different per-workflow configurations.

It was talked about at the Barcelona Summit, and if people are interested, it's in private preview right now. That's gonna be a really big deal for customers who are doing pipelines at scale.

Phil Ewels: I mean, going back to my previous life, I remember having lists of pipelines, like RNA-Seq version 3.1, RNA-Seq version 1.1, and so on, just in a huge list. Because we'd want to keep running one version while we were testing and validating the next version, and so on.

And now that’s all sort of sucked together in one place.

Rob Newman: Yeah. It's a really powerful feature.

Pipeline Chaining with Node-RED

Rob Newman: And then one of the other features, still on the topic of pipelines, that a lot of our users have been talking about, is this idea of pipeline chaining, or automation. And Phil, you did the Node-RED thing. Was it in an earlier podcast, or was it just a blog post?

Phil Ewels: It's just a blog post. I'm hoping to dig into it a bit in a future episode, 'cause obviously it's a bit of a pet project of mine.

Rob Newman: Yeah. I think you actually showed this during a meeting: when a pipeline execution successfully completes, you get flashing lights in your room, 'cause you've got smart lights attached to your home network. And that was pretty cool.

Phil Ewels: So the context here, for people who've never heard of this before: Node-RED is an open source package. It's been around for donkey's years, and it's Internet of Things, so it's used a lot in industry, if you've got a factory and things like that. Any event-driven system where you have a flow of data and you want to do stuff with it and build automation. And it's a kind of low-code system, so it's a drag-and-drop interface, but you can put in JavaScript and so on.

You know what, it's actually Evan who got me into this, because when I joined Seqera ages ago, or even before I joined, I noticed on his GitHub profile that he was a member of the Home Assistant organization. And I was like, what's that? And then I found out what it is: an open source platform for doing home automation. I have it running on a Raspberry Pi. I got into that, I got into Node-RED for my home stuff, and then I was like, oh, you could use this for automation of pipelines and Platform.

So my first demo was in Home Assistant, 'cause that's where I had it running, and I made all my lights flash and things like that. And yeah, I showed it in that meeting. What Rob's not really saying is that everyone was like, great, Phil, but is that useful?

Rob Newman: Well, sometimes you need to bring the sizzle, you know? So there's a little bit of sizzle there. It's showing people, in a very obvious way, all these different interactions across different network devices.

But what I think we're seeing is the evolution of that, where people are saying, oh, well, if there are different events that trigger other events, then why not, when a pipeline successfully executes, trigger another pipeline, or trigger another three pipelines?

And that is where the usefulness comes in, and we're starting to see customers really adopting that.

Phil Ewels: Yeah, so I'll come back to this in another episode in more detail, but if you're interested, just Google "Seqera Node-RED". There's now a website with docs, and there's a whole load of video tutorials I've done as well, including one with a disco at the end, if that's all you care about.

But yeah, you can do all sorts. You can spin up Studios as well, Rob, I know Studios is your kind of passion. So, you know, the pipeline finishes, you can launch a Studio and then send a message into Slack saying, hey, this pipeline finished, here's a running Studio with an R IDE or whatever, ready for downstream analysis.

You can do basically anything you can think of. It's really cool.

Rob Newman: Yeah, it's really extensible, which I think is really fantastic.

Single VM Compute Environments

Rob Newman: I guess talking about extensibility, it's maybe a good time to talk about all these new single cloud VM compute environment families that we've now released.

So on AWS, Azure, and Google Cloud, we have these small Compute Environments that you can build, which were really built to address the long launch delays when you are using batch computing services within the major cloud providers.

So now, with these new single VM Compute Environments, you get much, much faster startup times. I think four to six times faster in terms of compute startup times.

I think I actually did a poll earlier this year of all the scientific development team at Seqera, and I was like, hey, what is the biggest eureka moment that you had when you first started using Seqera? And everyone said Batch Forge, right? The ability to create Compute Environments on the fly. And Compute Environments are amazing: as those batch services provide new features, we want to incorporate those new features, and we get feature requests all the time for Compute Environments.

But a lot of times customers just want something super simple: give me a vanilla compute environment, provision it, go fast. That's all they really care about.

And these single cloud VM Compute Environments have been designed specifically for that. Super fast, super quick startup, very little configuration required by the user, many fewer dependencies. And they come with things like spot instance support out of the box.

So if you have a relatively small-scale pipeline with quite well-defined resources that you are gonna consume, those are by far the fastest way of executing your pipelines. Obviously, if you've got a very large pipeline, like a Sarek pipeline, for example, that is relatively complex, we still recommend that you use the batch services. But these single VM Compute Environments have been really powerful and really well adopted by customers.

Phil Ewels: That fast startup time just makes it so much easier to test stuff. You just kick it off and you wait a few seconds, rather than what, five, ten minutes for a batch service to get around to running your workflow. It's really good.

Rob Syme: There are lovely efficiency gains at that upper end with Batch Forge, when you might have a large team of people and suddenly you can share resources and pack lots of tasks into a single large virtual machine. But it's surprising how far along the spectrum that actually starts to make sense.

For a lot of work and a lot of teams, I imagine it actually is more cost effective to be running on single VM CEs. Not just the experience of the startup time, but even in a cost sense. But of course, it's lovely to know that when you pass that threshold, you can switch to Batch Forge, or even beyond that, switch to your own custom Terraform setup and build your own batch CEs. For all of those, you've got an option.

Rob Newman: Yeah, true. Actually, you did mention the Terraform piece there as well, which is...

Rob Syme: Love me some Terraform.

Rob Newman: Yeah. You can build all the resources in the platform using Terraform now, which is a big thing that we rolled out this year as well. I would encourage people to go and look that up.

Phil Ewels: Listen to the podcast! I've asked Ken and Adam to join me to record another episode pretty soon about Terraform. So we're gonna dive deep on why Ken cares so much about Terraform, 'cause it is a lot.

Rob Syme: Ken has convinced me. It is amazing, and I'm looking forward to that podcast.

Rob Newman: Yeah, it's a big deal, especially for customers who already have enterprise cloud deployments of other infrastructure and are using Terraform already. It just simplifies their provisioning, their security posture, everything related to that. It's really powerful.

Seqera Compute

Rob Newman: But speaking of compute, a big thing that we released this year was Seqera Compute, which is Seqera providing fully managed Compute Environments for you.

And now, literally, if you are a new user to the platform, you can sign up and you get a hundred dollars of free compute credits. What we do is, you add your pipeline, and at the very bottom level we create an AWS tenant on your behalf that we manage, and then we create buckets for you, and then we create Compute Environments for you that you can use.

Marcel's keeping me honest here: you need institutional emails for users to be able to sign up. But what we've seen is customers who are used to using Nextflow, and they're not sure what their compute environment needs even are, right?

They're not sure, how many CPUs do I want to allocate? What sort of memory do I need to allocate? What flavor of compute environment do I even need? It allows users to come in and really bootstrap and get a baseline understanding of what their compute requirements are, without any sort of lock-in. So they can just come in and slowly, incrementally learn by doing.

Marcel Ribeiro Dantas, Ph.D: So it's a huge opportunity that Seqera is providing for people to learn more about Seqera Platform and all the features. You can play with your own pipeline, with your data, and then you really get the value of the platform. So I think it's a great onboarding experience, in terms of what you can do with real data, real pipelines, and so on.

Phil Ewels: It was always a bit of a blocker when trying to show people in the past. Like, yeah, come and try out Seqera Platform. Okay, now go and create an AWS account and put in your credit card details. And everyone's like, not so sure about that. Whereas now you can get running in a couple of clicks.

And also, if your Seqera credits run out, then everything stops. It's not like AWS, where you can accidentally run up bills for thousands and thousands of dollars on your credit card. So it's easier, and it's also safer, which I like.

Fusion Snapshots

Rob Newman: And I think to round out this first section around compute, I would be remiss if I didn't talk about Fusion Snapshots as well. So if you're running a Nextflow pipeline and a spot instance gets reclaimed, Fusion now does a great job of snapshotting and checkpointing where you got to in that pipeline run.

Rob Syme: Yeah, it's fantastic, because up until Fusion Snapshots, Nextflow has had this inbuilt task resumption mechanism.

A workflow is broken down into tasks, and the task becomes the unit. So before snapshots, if a task got interrupted, it would resume from the beginning of the task, not from the beginning of the workflow.

But a lot of people have tasks and processes that can't be broken down into smaller pieces. This thing is gonna run for a day and a half, and that's the way the tool works; I, as a workflow author, can't break it down into a smaller piece.

And if your task runs for above six or twelve hours, it's hard to get the cost benefit of spot instances, because your task is going to get reclaimed, then restart again, and then get reclaimed again.

So what this does is, as the task progresses and a snapshot comes in, it saves the state of the running task to S3, and then on restart it resumes from that state, so you don't have to go back to the beginning of the task. So it's great for people who either don't want to, because it takes development time, or can't break their tasks into smaller pieces. And you get the cost benefit of spot, which is around 90% in most cases. It's substantially cheaper.

Phil Ewels: When I first started using clouds, the spot market was so different to how it is today. It used to be, when I ran things on AWS on spot, that basically almost nothing got reclaimed. So it was just a no-brainer to run on spot. It was obvious.

But if you look at some stats on interruption rates, AWS has tweaked how the spot market works, and AWS's capacity planning has basically got better over time. The spot market's got squeezed and squeezed, and now, when you run a large workflow on spot, it's almost guaranteed that at least some tasks will be reclaimed.

And so that balance, whether it's worth it anymore, has become much harder to justify. With Fusion Snapshots, we're back in the good old days, where it's just like, yeah, you tick that box and it's free money.

I've asked Lorenzo, from what was the Fusion team, to join me on the podcast pretty soon, so we're gonna do a deep dive on this, 'cause I think it's really interesting. He really knows his stuff, and I'm hoping he can teach me and explain it to me like I'm a six-year-old.

Rob Newman: He's also very deserving of the moniker of wizard. There are a few people who work at Seqera who deserve the term wizard, and he's definitely one of them. Yeah, I remember him giving a talk all about programming a very basic Casio digital watch.

Phil Ewels: Don't give it away! That's one of the ways I got him to agree to do a podcast.

Rob Newman: Nice, nice.

Seqera AI

Rob Newman: Well, I think moving on, one of the biggest things, and I don't think this is purely Nextflow or Seqera, is AI. Shout out to the growth and AI team that has been building Seqera AI. Especially supporting the ability to explore and pull in SRA-hosted data, and having pipeline mode, where you can load in GitHub repos and ask questions about those repositories. Like, what does this piece do? What does this process do? And get really useful and helpful feedback there.

And it's actually allowing our users to either customize existing pipelines to their own requirements or build brand new pipelines. It's been pretty amazing to see that explosion.

Phil Ewels: I was gonna say, Rob dropped a message in the chat saying we've done well to get this far through the podcast episode without mentioning AI. I'm sure 2025 is gonna be talked about as the year that AI exploded, right? Especially in anything to do with programming. Everything has changed. And I think what stands out to me about Seqera AI is that we have a pretty small team working on this, and we're up against some of the best-funded companies that have ever existed building this out. The fact that they've been able to pivot and stay fresh, and keep up to date with everything happening in this ridiculously fast-moving field, and also carve out a niche where Seqera AI is able to provide value over these massive players, I think is huge.

They've done an amazing job, and honestly, I can't quite believe how good Seqera AI is and how useful it is. And also the way that, if you are using Seqera Platform, it has its roots down into the system, so you can ask it about specific runs: explain to me why that one failed.

And that all goes back to our core mission of improving developer experience and user experience. Yeah.

Marcel Ribeiro Dantas, Ph.D: Just crazy. I've been working with AI for, I dunno, 20 years or something like that. My undergraduate degree, my master's, my PhD, everything. And when all this exploded, I played a bit with ChatGPT and stuff, but I wasn't sold on the idea, right? And then when Claude and Cursor and all these guys, and Lovable and all these others started appearing... I mean, now I feel guilty that I cannot tell this to more people in my 24 hours, right? Late 2025, I was bored. I was like, I'm gonna write an actual plugin, and let AI take care of it. In a matter of a few hours, I had it ready, with the documentation, with the code, with everything. The plugin was ready the next day. It was already in the Nextflow registry, Paolo had already accepted my claim request. All the documentation well written, commented, tested. I had hundreds of tests that I wrote together with Seqera AI to confirm everything was working, and it works great.

So now I'm like, everyone, if you wanna write a plugin, run Seqera AI, write your plugin with Seqera AI. It's gonna be great, it's gonna solve your problems, right? So I'm amazed at Seqera AI, really. I mean, we haven't had Seqera AI for over a year, and as a Seqeran I'm happy we have it, but in the past two months, I think I just fell completely in love with it.

Rob Syme: I think, as a developer, before AI I felt like a race car driver. And then with the early models, before sort of June last year, I felt like I'd been demoted to the guy who watches cars on a Scalextric track and just picks them up when they fall off. I felt that was my job.

But in late 2025, with Seqera AI and other models, it feels like okay, now we are really moving, this is genuinely useful. I love Marcel's example of plugin development, because it ties into what we were talking about earlier with the improved plugin development experience: a small idea or a big idea to improve the way you develop workflows, or the way your users use those workflows and track with them.

It's a great time to just give it a go. So I'm looking forward to a future plugin episode.

Phil Ewels: Sorry, Rob Newman. Carry on.

Rob Newman: Yeah,

Phil Ewels: Where, where, where were we?

Seqera MCP Server

Rob Newman: I think putting a bow on the Seqera AI piece was the MCP server as well, which came out at the tail end of last year, I think it was like October. And, again, talking about wizards, this is Paolo and a few other members of the team cooking this up in the evenings or on the weekends, over a few days.

And I don't want to speak for the demos; you should go and look at some of the presentations from the Barcelona Summit at the end of the year around the MCP server. Just this ability to plug your Seqera Platform into Claude, into Cursor, into Copilot, and ask questions of it. It has authentication built in, right? So it's super secure. Being able to look at your query history, debug pipelines, it's pretty amazing.

Data Explorer & s3 APIs

Rob Newman: Cool, cool. All right, I'll talk about a few other things. I guess a big one, initially, with Data Explorer: this has become the de facto way for users to explore data, either created by pipelines or to define inputs to pipelines. It's always been reserved for cloud-based object storage.

What the team have done is make Data Explorer work with any S3-compatible object storage API. There are a lot of HPC systems that have S3-compatible APIs, and now you can connect Data Explorer to those, which is a huge deal. It makes the platform accessible to a whole different range of users. And also other object storage providers, like MinIO. It's the de facto standard, and many different providers, Cloudflare comes to mind, Oracle comes to mind, are all using S3-compatible APIs. So now this ability to pull data from any of these providers or hosting services into the platform is literally a couple of clicks.
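On the Nextflow side, the same S3-compatibility idea shows up as pointing the S3 client at a custom endpoint. A minimal config sketch for something like MinIO might look like this (the endpoint URL is an example, and credentials would come through your usual AWS-style environment variables or Platform credentials):

```nextflow
// nextflow.config — use an S3-compatible service instead of AWS S3
aws {
    client {
        endpoint = 'https://minio.example.org:9000'  // example endpoint URL
        s3PathStyleAccess = true  // many S3-compatible stores need path-style URLs
    }
}
```

With that in place, `s3://bucket/path` URIs in the pipeline resolve against the custom endpoint rather than AWS, which is the same trick that lets Data Explorer browse these stores.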

Phil Ewels: Super cool.

Datasets

Rob Newman: Yeah. And then we also did a huge amount of work on datasets. Well, first of all, there's no quota anymore; that quota has been lifted. So for users who had a limit on a workspace, that's gone.

And now you have the ability to show and hide datasets. Prior to this work, if you had lots and lots of datasets, it was really hard to understand which ones you really cared about. So now you can show and hide different datasets dynamically. You can also add labels, and there's additional metadata that you can now query for in the dataset interface.

And then I think one of the really nice features is that you can now look at all of the runs that a dataset was used in. In the datasets tabular view, there's a new column, and you can just click on that, and it'll take you directly into the pipeline runs page and show you all the pipeline runs that have used that dataset. That sort of much more robust connective tissue across the platform has been really popular with our users.

Rob Syme: It's a lovely example of Seqera being a lot more interested in data. So both, where was this dataset used as input, and then, coming this year hopefully, where was this file produced? Pushing out either side of the workflow into the provenance and the ingestion.

Seqera Studios Updates

Rob Newman: Yeah. And then the last thing I think to talk about is Studios. That's been a lot of the focus of my attention since I joined Seqera, and I think this year was when we really started to see wholesale adoption of it: the number of users using it, the number of people who are building really cool, interesting, customized Studios that we never really imagined people doing. The flexibility and extensibility of Dockerfiles, and the ability to build your own analysis environment, like BYOC, bring your own container, is just amazing.

And there are a couple of good blog posts from 2025 from the scientific development team that have talked about things like Marimo, or CELLxGENE, or even just Shiny applications that provide these really rich dashboards for interacting with data. It's been really amazing to see what our users have been building.

And we've also added things like support for the single VM Compute Environments that I talked about earlier. We've enriched the interface with resource labels. You can now define environment variables. It supports ARM64 architectures. Spot instances are now supported.

And then I guess the pièce de résistance at the end of the year was building the Git integration. So with any Git provider that you have, in a similar way to pipelines, you can now drop a path to a repo into Studios and it will clone the repository. Whether it's got pipelines, whether it's got data, or maybe it's just notebooks, you can clone that directly into Studios. And you can even give it raw container files, like a Dockerfile, and it will build that for you. Prior to this, you always had to build your own container and push it up to a container registry, and now it does that on the fly for you. For people who want to bootstrap analysis environments really quickly, it's been a big efficiency improvement.

Phil Ewels: That thing with the custom images, where you can just drop something in and it just works, is beautiful. I love it. And it only really struck me in the second half of this year, when I started using it more, just how much complexity that takes away.

Like, when you run something in Studios, you have user authentication set up, you have firewalls, everything is secure, and you didn't do anything. But it's on the internet, where the right people can access it from anywhere and start it and stop it. And multiple people can collaborate in a single session. There's a ton of stuff there, and yeah, it might just be a notebook, but actually it's a really powerful thing.

And I feel like Studios has benefited from many small incremental improvements this past year. It's much more stable than it used to be.

Studios don't crash nearly as often, and the interface for launching Studios is easier to use. I feel like a lot of paper cuts have been removed in 2025.

Rob Newman: Yeah. And we did a fair amount of user research. I think one of the big things for me towards the second half of the year was that sort of light bulb coming on for a lot of people, being like: oh yeah, this works, and it sits adjacent to all of our pipeline runs, and it's in our own infrastructure. We're not having to provision external systems or third-party proprietary environments. Everything just sits in our own infrastructure.

Seqera company updates

Rob Newman: And I think the last thing I wanted to comment on was kind of business related: we got our Series B this year. So that's a big shout out to everybody; it allows Nextflow to continue to be developed with the attention paid to it, which is great. And we got SOC 2 Type 2 certified as well, which is really great for a lot of our customers.

Phil Ewels: And it's good news for everybody, right? A healthy Seqera means a healthy Nextflow and a healthy community. It's also good news for everyone out there listening to the podcast, I hope.

nf-core/tools updates

Phil Ewels: Marcel, do you wanna start taking us into what's new?

Marcel Ribeiro Dantas, Ph.D: Yeah, so this last year we had lots of nice things in multiple different ways in terms of technology. For example, we had three releases of nf-core/tools: in June, October, and November. And in each one of them there's something I love.

In 3.3, for example, we had the nf-core test-datasets command. Whenever I needed to find a test dataset to run my nf-core pipeline, or to create a new one, it wasn't very pleasant to go to GitHub and look through branches and so on. This command makes it so much easier. So: nf-core test-datasets. Test it, you're gonna love it.

In 3.4 we had an overhaul of the nf-core download command. You can now download Docker image tar archives, for example for infrastructure where you don't have an internet connection, so you can have the container images locally. So many nice things.

And 3.5 is one that I'm very fond of, 'cause a big chunk of it is about topics: using topic channels for versions. That's something we worked on at the hackathon in Barcelona.

So I think there were a lot of nice things in nf-core/tools.

Topics

Phil Ewels: I mean, topics are a fairly new thing, right? That's maybe a 2025 recap thing which we missed: topics coming out in Nextflow itself. Could you briefly tell us what you mean by that?

Marcel Ribeiro Dantas, Ph.D: Technically, topics are not a big change, but in my opinion they have a big impact. The idea of topics is that sometimes you want to collect information from multiple different steps of your pipeline, like versions. We have the versions of the tools used in each module, and we want to save them somewhere so that at the end we know the version of all the software that we're using in our pipeline.

So how do we do that? Before, we would have a big chunk of code in every process to write the versions out, then gather those channels together, then write them out again. I mean, it worked, but it was a lot of code, right?

With topics, you can just create a topic, let's call it versions, and every process of the pipeline can easily write to this topic. You don't have to care about merging information or moving it from one place to another; it's all going to be there in the topic channel. So it's a much more organized, clean way to report versions. Versions were the obvious use case for topics, but whenever you want to collect information from multiple different processes, topics are the easiest way. And I think it's going to make things much more fluid: you just plug it together, pull in the input, run the pipeline, and it's ready.
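What Marcel describes can be sketched in a few lines. This is a minimal, illustrative example for recent Nextflow versions (the process name and version string are made up; the `topic:` output option and the `channel.topic` factory are the documented mechanisms for topic channels):

```nextflow
process GREET {
    output:
    path 'hello.txt'
    // Emit this tuple to the 'versions' topic instead of wiring up a channel
    tuple val("${task.process}"), val('1.0.0'), topic: versions

    script:
    """
    echo hello > hello.txt
    """
}

workflow {
    GREET()
    // Collect everything any process emitted to the topic; no merging code needed
    channel.topic('versions').view()
}
```

Every process that emits to `topic: versions` contributes to the same channel, so the per-process version-gathering boilerplate disappears.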

Syntax updates in nf-core

Marcel Ribeiro Dantas, Ph.D: Yeah, we've talked a lot about the syntax updates in this episode today, and also about nf-core. So I think something some people may be asking is: what's the roadmap for adopting this new syntax in nf-core?

So we have a blog post where you can see it in more detail. There's an image there where you can see when each thing is going to be adopted. Topics were already adopted in Q4 last year, and then you're going to have other things like the strict syntax, workflow outputs, static types and records, and the new process syntax. All of these things are going to be gradually adopted until early 2027.

But a big chunk of that is in 2026, so you will see a lot of these amazing things by default in nf-core pretty soon.

Community growth

Marcel Ribeiro Dantas, Ph.D: And talking about the community: it's growing in all different ways, not only in numbers but in quality. We see people who one day appear asking basic questions about Nextflow, and the next they're pipeline maintainers, they're creating plugins, they're answering questions on the forum.

It always makes me so happy when I see, like, this person was asking basic questions last year and now they're a rock star, they're a Nextflow ambassador, right? So it's very nice to see how the community is growing, not only in size but also in how integrated it is and how happy people are to contribute. So many things: like the weekly helpdesk, where we're always answering people's questions live, which was expanded to include Asia-Pacific time zones. And when it comes to infrastructure, we now have the nf-core advisories, and so many different things.

Phil Ewels: I'd love to touch on advisories, 'cause I really like this. This is another slightly unsung hero feature, 'cause it's the kind of thing you hope you never need and you hope you never notice. But if you do, it's fantastic. And kudos to Mathias from SciLifeLab for putting this together.

But basically, what it does is this: say you're using a particular version of a pipeline. And we know, right, not everyone can update to the latest version of every pipeline every time they run Nextflow. If you're running in production, you need to be fixed at certain versions; you need to validate everything.

So now on the nf-core website, if you go to a specific version of a specific pipeline and there's a bug that we know affected that version, it shows a big red banner at the top. And that's huge, 'cause it means you know: okay, you really should update, because we know there's a bug here which is giving you bad data or whatever.

And it also helps if you have a problem, because if we know that between this version and that version of Nextflow these pipelines are affected by a given issue, then what we did about it and how we fixed it is all documented. They're kind of like mini blog posts, but specifically about problems with nf-core pipelines.

We call them advisories, and hopefully you never see them, but they're there when you need them, and they pop up in all the right places. So it's a simple thing, but it's a really nice feature.

And I like it as well because it's a bit of a mark of maturity for our community. We know that these pipelines are really critical, and we try to handle that responsibility.

Marcel Ribeiro Dantas, Ph.D: I like the way that the nf-core project as a whole is evolving, right? The way we do proposals now, the advisories, the way the core team works, all the teams, the maintainers. It's so nice to see such a mature open source project. The advisories really are great; thanks, Mathias.

Nextflow Summits and events

Marcel Ribeiro Dantas, Ph.D: Going on with the community things: we had the Nextflow Summit in Boston early last year and in Barcelona late last year, and this time the Barcelona summit was online. The hackathon and training were still in person, and a lot of fun for those who came, while the online summit was able to host so many different talks.

A lot of people always wanted to present nice things at the Summit, but they couldn't come, right? With the online version, now we really can have the best talks and the best people, regardless of geographical location. So I think it was a very good, smart choice to make the Barcelona one online.

Phil Ewels: All of these talks, by the way, are on YouTube, so dig them out. A podcast episode I wanted to do, and we didn't, was a recap of our favorite talks, 'cause there are some really, really juicy ones in there. And like you say, people coming from all over the planet as well.

We had a talk from NASA this time, which has been on my bucket list for ages, 'cause I knew they'd been writing Nextflow pipelines for quite a few years, and now we got them in to give a talk. There are so many good talks there. It was a fantastic event, and also, because it was online...

Something else we did for Barcelona was running talks in multiple tracks, which meant we managed to fit a lot of talks in.

Marcel Ribeiro Dantas, Ph.D: yeah.

Phil Ewels: It really was a cool event.

Nextflow Ambassadors

Marcel Ribeiro Dantas, Ph.D: Yeah. Actually, today was the kickoff call for the fifth cohort of our ambassador program. It started in late 2023, with the first cohort beginning in early 2024, and it's been such a great adventure in these two years and a few months.

I was looking at the statistics today. In the past 24 months we have had over 600 activities. That's a lot, and these were conducted by ambassadors, not just attended by an ambassador: conducted by ambassadors. A huge number.

We started with 45 people in 16 countries, and it kept growing. For the last three cohorts the number hasn't changed a lot, somewhere between 128 and 134, so in size it didn't grow much, but this time we've got 40 countries. We now have Nextflow ambassadors residing in 40 countries, which means the impact is actually larger, because, for example, we have people who are in the UK but are from Iraq and go there regularly to give Nextflow trainings.

We have Kübra in Germany, who is from Turkey, and she goes there often to give training. So if we talk about the number of countries, the impact easily reaches way more than 40. Like Juan Vito from Brazil, who went to Chile to give a training. So these are amazing numbers for the program, and so far people seem happy. We've done lots of trainings and lots of in-person activities: last year, of the 300-and-something activities, 170 were in person, and most of the training sessions were in person. There are so many nice things we're getting from the ambassador program, and people are happy about being in the community and learning more Nextflow. Lots of ambassadors are not experts when they join the program; they know some basic Nextflow and they develop the skills during the program. They learn how to write better, how to give talks, how to give training, and so on. So it's a great program to join, and also to learn from.

So, so many nice things. And if you want to be an ambassador, you can just go to the nextflow.io website, go to resources, Nextflow ambassadors, and apply. At the end of every semester we evaluate the applications and you start in the next cohort. We just started the fifth cohort.

Training Weeks

Marcel Ribeiro Dantas, Ph.D: There's something new that we started doing last year which I also like a lot: the training weeks. I mean, you can always learn Nextflow: we have the training.nextflow.io website, you can just go there and learn. You have the GitHub Codespaces environment with everything set up and configured for you. You just go there, read, and do it.

Well, when things are too easy and always there for us, sometimes we postpone, right? I have some friends who say, oh yeah, I love that specific Nextflow training material, I'm gonna do it someday. And they never do, because, well, life goes on, right?

So we had this idea: it's free for everyone and already available, but what if we create a Nextflow training week every quarter? You can organize yourself to focus on the training that week. We have quizzes, and if you get a good score, you get a certificate of accomplishment. We're live on Zoom every day to answer all your questions; we cannot do that the whole year, but for the training week, we can. And on the community forum, all the ambassadors are there to make sure your questions get answered as well and as quickly as possible.

Last year, in Q4 alone, we had about 500 people register for the training week, just for the Q4 training.

That doesn't include Q3, Q2, and Q1, right? So over a thousand people registered for these trainings, a lot of them getting the certificates and learning Nextflow. And sometimes it's just the push you need, the poke you need: okay, there's a training week, I'm gonna learn now.

And lots of people have taken advantage of that. So it was very, very nice.

Phil Ewels: I think I need to pull out my green screen again pretty soon, because we're talking about all these language updates, which is fantastic for Nextflow. But anyone who's working on the Nextflow training material, which is actually several of us on this call, you can see us quietly pulling our hair out, 'cause we know we're gonna have to update all the training material.

Marcel Ribeiro Dantas, Ph.D: Which is why all the material will be updated. At least the introductory material will be updated to all the new syntax. So we're gonna learn the newest, best way.

Wrap up

Phil Ewels: Guys, I think we did it.

Rob Syme: Congratulations everyone for sticking with us for so long.

Rob Newman: It was a big year.

Rob Syme: I think this is a good impetus not to let it go six months without a podcast.

Phil Ewels: Exactly.

Rob Syme: That's the lesson from today.

Phil Ewels: What a year, what an amazing year, when you put it all in context and see it all written out. It's gonna take me a while, but I'm gonna stick loads of links about all the stuff we've talked about in the show notes for this episode, so you can go and find the YouTube talks and the blog posts and everything.

There's just so much stuff, and I'm sure there's loads of stuff we haven't talked about as well, which I will remember in about half an hour when we stop recording.

Rob Syme: And so much of it is building towards setting the foundations for the future as well. There's a real sense of acceleration: so much new stuff, and new stuff to build on top of it.

Phil Ewels: yeah.

Rob Syme: It’s great.

Phil Ewels: Yeah.

Rob Newman: It is. They say when you first become a parent that the days are long but the years are short. It kind of feels a little bit like that working at Seqera.

Phil Ewels: In a good way, I hope. The sleep deprivation and anxiety and...

Rob Newman: No, I mean, yeah, we're all invested and all wanting to see the best of things. So yeah.

Phil Ewels: And a lot of this stuff we've been talking about for years as well, like all this work on Nextflow error messages. This has not come out of nowhere. I find it so gratifying to finally, after years, be able to say: yes, we have a solution for you, and it's getting better. And seeing how well that's been received by the community as well is extremely satisfying.

Right, on that note, thank you everyone. If you made it this far, congratulations; send me an email and I'll send you a certificate to put on your wall. And stay tuned: I'm hoping to get many more episodes out in 2026, so we're gonna have much more specific deep dives on different topics in the near future. So get that like and subscribe, get that Spotify subscription running, and whatever else it is that you do. And we will hear from you soon. Thanks very much everyone.

Rob Newman: Thanks.

Phil Ewels: Cheers.