Nextflow Summit 2023 recap
Five days of Nextflow awesomeness in Barcelona
On Friday, Oct 20, we wrapped up our hackathon and Nextflow Summit in Barcelona, Spain. By any measure, this year’s Summit was our best community event ever, drawing roughly 900 attendees across multiple channels, including in-person attendees, participants in our #summit-2023 Slack channel, and Summit Livestream viewers on YouTube.
The Summit drew attendees, speakers, and sponsors from around the world. Over the course of the three-day event, we heard from dozens of impressive speakers working at the cutting edge of life sciences from academia, research, healthcare providers, biotechs, and cloud providers, including:
- Australian BioCommons
- Genomics England
- Pixelgen Technologies
- University of Tennessee Health Science Center
- Amazon Web Services
- Quantitative Biology Center - University of Tübingen
- Biomodal
- Matterhorn Studio
- Centre for Genomic Regulation (CRG)
- Heidelberg University Hospital
- MemVerge
- University of Cambridge
- Oxford Nanopore Technologies
- Medical University of Innsbruck
- Sano Genetics
- Institute of Genetics and Development of Rennes, University of Rennes
- Ardigen
- ZS
- Wellcome Sanger Institute
- SciLifeLab
- AstraZeneca UK Ltd
- University of Texas at Dallas
- Seqera
The hackathon – advancing the Nextflow ecosystem
The week began with a three-day in-person and virtual nf-core hackathon event. With roughly 100 in-person developers, this was twice the size of our largest Hackathon to date. As with previous Hackathons, participants were divided into project groups, with activities coordinated via a single GitHub project board focusing on different aspects of nf-core and Nextflow, including:
- Pipelines
- Modules & subworkflows
- Infrastructure
- Nextflow & plugins development
This year, the focus of the hackathon was nf-test, an open-source testing framework for Nextflow pipelines. The team made considerable progress applying nf-test consistently across various nf-core pipelines and modules — and of course, no Hackathon would be complete without a community cooking class, quiz, bingo, a sock hunt, and a scavenger hunt!
For an overview of the tremendous progress made advancing the state of Nextflow and nf-core in three short days, view Chris Hakkaart’s talk on highlights from the nf-core hackathon.
The Summit kicks off
The Summit began on Wednesday Oct 18 with excellent talks from Australian BioCommons and Genomics England. This was followed by a presentation where Pixelgen Technologies described their unique Molecular Pixelation (MPX) technologies and unveiled their new nf-core/pixelator community pipeline for molecular pixelation assays.
Next, Seqera’s Phil Ewels took the stage to provide a series of community updates, including the announcement of a new Nextflow Ambassador program, a new community forum at community.seqera.io, and the exciting appointment of Geraldine Van der Auwera as lead developer advocate for Nextflow. Geraldine is well known for her work on GATK, WDL, and Terra.bio and is the co-author of the book Genomics on the Cloud. As Geraldine assumes leadership of the developer advocacy team, Phil will spend more time focusing on open-source development, as product manager of open source at Seqera.
Seqera’s Evan Floden shared his vision of the modern biotech stack for open science, highlighting recent developments at Seqera, including a revamped Seqera Platform and new Data Explorer functionality, and offering an exciting glimpse of the new Data Studios feature now in private preview. You can view Evan’s full talk here.
A highlight was the keynote delivered by Erik Garrison of the University of Tennessee Health Science Center. In his talk, Biological revelations at the frontiers of a draft human pangenome reference, Erik shared how his team's cutting-edge work applying new computational methods in the context of the Human Pangenome Project has yielded the most complete picture of human sequence evolution available to date.
Day one wrapped up with a surprise announcement that Seqera has been confirmed as the official High-Performance Computing Supplier for Alinghi Red Bull Racing at the 37th America’s Cup in Barcelona. This was followed by an evening reception hosted by Alinghi Red Bull Racing.
Day two starts off on the right foot
Day two kicked off with a brisk sunrise run along the iconic Barcelona Waterfront attended by a team of hardy Summit participants. After that, things kicked into high gear for the morning session with talks on everything from using Nextflow to power Machine Learning pipelines for materials science to standardized frameworks for protein structure prediction to discussions on how to estimate the CO2 footprint of pipeline runs.
Nextflow creator and Seqera CTO and co-founder Paolo Di Tommaso provided an update on some of the technologies he and his team have been working on including a deep dive on the Fusion file system. Paolo also delved into Wave containers, discussing the dynamic assembly of containers using the Spack package manager, echoing a similar theme from AWS’s Brendan Bouffler earlier in the day. During the conference, Seqera announced Wave Containers as our latest open-source contribution to the bioinformatics community — a huge contribution to the open science movement.
Paolo also provided an impressive command-line focused demo of Wave, echoing Harshil Patel’s equally impressive demo earlier in the day focused on seqerakit and automation on the Seqera Platform. Both Harshil and Paolo showed themselves to be "kings of the live demo" for their command line mastery under pressure! You can view Paolo’s talk and demos here and Harshil’s talk here.
Talks during day two included bringing spatial omics to nf-core, a discussion of nf-validation, and a talk on the development of an integrated DNA and RNA variant calling pipeline.
Unfortunately, there were too many brilliant speakers and topics to mention them all here, so we’ve provided a handy summary of talks at the end of this post so you can look up topics of interest.
The Summit also featured an exhibition area, and attendees visited booths hosted by event sponsors between talks and viewed the many excellent scientific posters contributed for the event. Following a packed day of sessions that went into the evening, attendees relaxed and socialized with colleagues over dinner.
Wrapping up
As things wound to a close on day three, there were additional talks on topics ranging from ZS’s contributions to nf-core through client collaboration, to decoding the Tree of Life at the Wellcome Sanger Institute, to performing large and reproducible GWAS analysis on biobank-scale data at the Medical University of Innsbruck.
Phil Ewels discussed future plans for MultiQC, and Edmund Miller shared his experience working on nf-test and how it is empowering scalable and streamlined testing for nf-core projects.
To close the event, Evan took the stage a final time, thanking the many Summit organizers and contributors, and announcing the next Nextflow Summit Barcelona, scheduled for October 21-25, 2024. He also reminded attendees of the upcoming North American Hackathon and Nextflow Summit in Boston beginning on November 28, 2023.
On behalf of the Seqera team, thank you to our fellow sponsors who helped make the Nextflow Summit a resounding success. This year’s sponsors included:
- AWS
- ZS
- Element Biosciences
- Microsoft
- MemVerge
- Pixelgen Technologies
- Oxford Nanopore
- Quilt
- TileDB
In case you missed it
If you were unable to attend in person, or missed a talk, you can watch all three days of the Summit on our YouTube channel.
For information about additional upcoming events including bytesize talks, hackathons, webinars, and training events, you can visit https://nf-co.re/events or https://seqera.io/events/seqera/.
For your convenience, the talks from Nextflow Summit 2023 are listed below.
Day one (Wednesday Oct 18):
- The National Nextflow Tower Service for Australian researchers – Steven Manos
- Analysing ONT long read data for cancer with Nextflow – Arthur Gymer
- Community updates – Phil Ewels
- Pixelgen Technologies ❤︎ Nextflow – John Dahlberg
- The modern biotech stack – Evan Floden
- Biological revelations at the frontiers of a draft human pangenome reference – Erik Garrison
Day two (Thursday Oct 19):
- It’s been quite a year for research technology in the cloud: we’ve been busy – Brendan Bouffler
- nf-validation: a Nextflow plugin to validate pipeline parameters and input files - Júlia Mir Pedrol
- Computational methods for allele-specific methylation with biomodal Duet – Michael Wilson
- How to use data pipelines in Machine Learning for Material Science – Jakob Zeitler
- nf-core/proteinfold: a standardized workflow framework for protein structure prediction tools - Jose Espinosa-Carrasco
- Automation on the Seqera Platform - Harshil Patel
- nf-co2footprint: a Nextflow plugin to estimate the CO2 footprint of pipeline runs - Sabrina Krakau
- Bringing spatial omics to nf-core - Victor Perez
- Bioinformatics at the speed of cloud: revolutionizing genomics with Nextflow and MMCloud - Sateesh Peri
- Enabling converged computing with the Nextflow ecosystem - Paolo Di Tommaso
- Cluster scalable pangenome graph construction with nf-core/pangenome - Simon Heumos
- Development of an integrated DNA and RNA variant calling pipeline - Raquel Manzano
- Annotation cache: using nf-core/modules and Seqera Platform to build an AWS open data resource - Maxime Garcia
- Real-time sequencing analysis with Nextflow - Chris Wright
- nf-core/sarek: a comprehensive & efficient somatic & germline variant calling workflow - Friederike Hanssen
- nf-test: a simple but powerful testing framework for Nextflow pipelines - Lukas Forer
- Empowering distributed precision medicine: scalable genomic analysis in clinical trial recruitment - Heath Obrien
- nf-core pipeline for genomic imputation: from phasing to imputation to validation - Louis Le Nézet
- Porting workflow managers to Nextflow at a national diagnostic genomics medical service – strategy and learnings - Several Speakers
Day three (Friday Oct 20):
- Driving discovery: contributing to the nf-core project through client collaboration - Felipe Almeida & Juliet Frederiksen
- Automated production engine to decode the Tree of Life - Guoying Qi
- Building a community: experiences from one year as a developer advocate - Marcel Ribeiro-Dantas
- nf-core/raredisease: a workflow to analyse data from patients with rare diseases - Ramprasad Neethiraj
- Enabling AZ bioinformatics with Nextflow/Nextflow Tower - Manasa Surakala
- Bringing MultiQC into a new era - Phil Ewels
- nf-test at nf-core: empowering scalable and streamlined testing - Edmund Miller
- Performing large and reproducible GWAS analysis on biobank-scale data - Sebastian Schönherr
- Highlights from the nf-core hackathon - Chris Hakkaart
In this event, we're showcasing some of the results of the project 'Optimization of computational resources for HPC workloads in the cloud using ML/AI' by Seqera Labs S.L. This project has been funded by the European Regional Development Fund (ERDF) of the European Union, coordinated and managed by RED.es, with the aim of carrying out the development of technological entrepreneurship and technological demand, within the framework of the Strategic Action on Digital Economy and Society of the State Program for R&D&I oriented towards societal challenges.