The State of the Workflow 2023: Community Survey Results
In March, we ran our annual State of the Workflow Community survey to get a sense of activities and preferences in the Nextflow user community.
In our third annual survey, participation was up by nearly 31% from our 2022 survey and over double that of 2021, with 500+ Nextflow users taking time to share their thoughts. We are grateful to all who took the time to participate and help us evolve Nextflow and the Nextflow community. Congratulations also to Anne Bertolini of ETH Zürich, the winner of the MacBook Air draw, and to the ten additional winners of our Nextflow/Seqera swag packs!
While some trends were surprisingly consistent, there were also some fascinating nuggets buried in this year’s data pointing to a changing landscape for scientific workflows. So buckle up, and read on to learn how the world of workflows is evolving!
Users and demographics
We are grateful to the Chan Zuckerberg Initiative (CZI) for their continued support of essential open-source software projects such as Nextflow and nf-core. As part of our continuing efforts related to CZI diversity and inclusion initiatives, we have continued to collect demographic information on the Nextflow community to track progress year-over-year.
2023 saw continued rapid adoption of Nextflow. 42.4% of survey respondents have been using Nextflow for one year or less, highlighting the community's impressive growth.
Some interesting findings about the Nextflow user community are provided below:
- Bioinformatics still rules — The composition of survey respondents is very similar to 2022, with Bioinformaticians, Principal Investigators, and Software Engineers accounting for roughly 84% of Nextflow users. Main areas of research among Nextflow users included genomics, transcriptomics, and metagenomics, and 71.7% viewed Nextflow as fundamental to their research.
- An expanding user base — This year’s survey included respondents from 47 countries. New entries in our “top 16” list included India, Belgium, and Serbia. We also saw growth in Nextflow users in Singapore, China, Nigeria, Estonia, Greece, Kenya, and several other countries, reflecting Nextflow’s international reach.
- Our community is trending younger — While most of us are a year older in 2023, our user community is actually younger! This year, 79% of Nextflow users reported being between 20 and 40 years old. Moreover, 37% were in their twenties vs. only 26% in 2022 — a 42% increase, suggesting a younger cohort of scientists are adopting Nextflow.
- Changing market segmentation — While the mix of Nextflow users by industry remains largely unchanged, we saw a slight uptick in government (6.0% vs. 3.6% in 2022) and users in pure research (24.5% vs. 19.7% in 2022).
- Good news on diversity — Gender diversity is heading in the right direction. While bioinformatics remains a male-dominated field, the percentage of women using Nextflow is now 27%, up from 23% in 2022 — a healthy 17% increase.
- High levels of satisfaction — Finally, Nextflow users continue to be a happy bunch, with an impressive 99% of respondents indicating that they are satisfied with the platform vs. 98% in 2022.
The march to the cloud continues
While on-premises clusters aren’t going away anytime soon, this year’s survey data continues to point to a gradual shift to the cloud.
On-premises clusters remain the most popular platform for running Nextflow pipelines, with 62.8% using traditional HPC clusters. This figure is down from 69% in 2022 — a 9% drop. Also supporting this trend, 75% of respondents report using an HPC workload manager, down from 83% in 2022.
- Slurm rules the on-premises roost — Among sites operating on-prem clusters, the use of open source software continues to expand, with 66% using Slurm vs. 53% in 2022, suggesting that Slurm’s market share is growing at the expense of commercial workload managers.
- Use of public cloud is up 20% — Use of cloud among Nextflow users continues to trend up, with 43% of respondents reporting that they use public cloud vs. 35.7% last year — a 20% year-on-year increase.
- The private sector is leading the charge — Among private sector organizations, this figure is even higher, with roughly 80% of organizations using public cloud. [1]
- A consolidating market — While use of cloud is growing, we see evidence that the cloud market is consolidating. In 2022, the top three cloud providers accounted for 63.5% of cloud deployments. Today, they represent 69% of installations as leaders such as Amazon, Microsoft, and Google take market share from smaller competitors.
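Several of the year-over-year figures above are relative changes rather than percentage-point differences. As a rough check, the short Python sketch below recomputes them from the percentages quoted in this post. The helper name `relative_change` and the labels are illustrative only and are not part of any survey tooling.

```python
def relative_change(old: float, new: float) -> float:
    """Return the relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# (label, 2022 share, 2023 share), in percent of respondents, as quoted above
figures = [
    ("Users in their twenties", 26.0, 37.0),
    ("Women using Nextflow", 23.0, 27.0),
    ("Traditional HPC clusters", 69.0, 62.8),
    ("Public cloud", 35.7, 43.0),
]

for label, old, new in figures:
    points = new - old  # difference in percentage points
    print(f"{label}: {points:+.1f} points, {relative_change(old, new):+.1f}% relative")
```

Rounded to whole numbers, the relative changes come out to about 42%, 17%, -9%, and 20%, in line with the figures quoted in the text. The percentage-point differences are smaller (11, 4, -6.2, and 7.3 points respectively).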
An interesting nugget in our data is the surprising strength of Google Cloud Batch. Despite being supported in Nextflow for less than one year, Google Cloud Batch is used by 5.4% of survey respondents. By contrast, 35.1% use AWS Batch, with 6.6% using Google Life Sciences and 5.2% using Azure Batch.
Most Nextflow users run multiple pipeline managers. The most popular companions to Nextflow are Snakemake (35.9%), Galaxy (14.3%), and Cromwell/WDL, followed by others such as CWL, Apache Airflow, and Prefect.
Interestingly, while Kubernetes is commonly used as a deployment platform for enterprise applications (including Nextflow Tower), it is not making inroads as a Nextflow compute environment. Kubernetes is relatively difficult to configure and manage compared to cloud batch services, although we’ve recently simplified the integration considerably with a new Fusion file system integration. In 2023, 6.2% of Nextflow users reported using Kubernetes, down slightly from 7.2% in 2022.
Nextflow users are community minded
Judging from the responses, participants in our 2023 survey were generally well informed and active in the Nextflow community. We saw a few trends that differentiate this year’s results from last year’s:
- nf-core has now emerged as the #1 source of Nextflow pipelines.
- Community members are moving to Slack as their primary means of support.
- Tower adoption is growing — especially among private sector users.
24% of survey respondents actively contribute to nf-core community pipelines, and a surprising 61.8% use nf-core pipelines in their day-to-day work. This year, nf-core pipelines became the most common source of pipelines, closely followed by in-house developed pipelines at 60.8%.
The nf-core and Nextflow Slack channels have emerged as the most popular avenues for support. In our most recent survey, 59% of respondents cite the nf-core channel as their primary means of support.
Our survey data also surfaced these additional observations:
- A strong developer orientation — Not surprisingly, Nextflow users tend to have software development skills, with 49% using the Nextflow GitHub repo as a source of support.
- A technically proficient community — Most Nextflow users still favor the command line, with 79% using the Nextflow CLI to launch jobs and 30% using Tower “sometimes” or “often”. Nextflow Tower is more popular among users in commercial settings, with 39.4% using Tower “sometimes” or “often”. [2]
- Important workflow management features — What users value in a workflow manager is remarkably consistent year over year, with most answers being within 1% of the 2022 result. The most important features are high-quality documentation, performance at scale, and ease of installation. Ensuring data provenance and pipeline portability are also important, and users prefer workflow managers with strong community adoption.
- We also asked users to specify the most important features in Nextflow Tower. The most frequently cited features included a Web UI to monitor workflows, automated resource recommendations to reduce costs, and the ability to control workflows via a CLI or API.
- Tower resource optimization was introduced in Nextflow Tower in part because this feature ranked as important in prior surveys, a sign that community feedback matters.
You can find a PDF version of these infographics for easy sharing here:
Download Infographic
We were delighted with the high quality of this year’s survey feedback. Please stay tuned for future Nextflow and nf-core community surveys to make sure your voice is heard.
[1] For ease of classification, we consider pharmaceutical companies, biotech firms, and healthcare diagnostic and clinical settings to be predominantly private sector and academic research and government environments to be predominantly public sector.
[2] Among our sample of 502 respondents, 67 of 170 users (39.4%) in biotech startups, pharmaceutical, or healthcare/diagnostic and clinical applications indicated that they use Tower “sometimes” or “often.”