The Italian Council for Agricultural Research and Economics (CREA)
Aim
Conduct large-scale, reproducible bioinformatics analyses of genome data. Develop new crop varieties that can keep pace with rapid climate change.
Challenges
The Italian Council for Agricultural Research and Economics (CREA) operates multiple clusters and desktop computing environments across its research divisions. With 600 researchers across 12 research centers, and petabytes of genomic data being generated, centrally coordinating and managing access to infrastructure and software for bioinformatics analysis was a significant challenge. Because this infrastructure could not be shared, analyses suffered from long turnaround times, inefficient resource utilization, and a high total cost of ownership.
Software tools were a problem as well. Different research teams used different tools, including custom scripts, to automate multi-step processes. These scripts were sometimes insufficiently commented, making them hard to maintain and their results hard to reproduce. This lack of standardization made it difficult to share research, tools, and workflows across teams, which hindered collaboration between divisions and slowed research progress.

Solution
- Nextflow - Enabled highly efficient parallel execution and scaling of genomics analyses.
- nf-core pipelines - Ensured reproducibility through peer-reviewed, standardized omics Nextflow pipelines.
- Microsoft Azure - Provided easy access to scalable cloud resources without the need to manage on-premises infrastructure.
Results
CREA, through its Research Centre for Genomics and Bioinformatics (CREA-GB), has been leading an effort to standardize genomics analysis by running Nextflow pipelines in Microsoft Azure. By using peer-reviewed, community-curated nf-core pipelines and Microsoft Azure, CREA has improved the efficiency, collaboration, and reproducibility of its research. It has also established a shared infrastructure that serves as a model for all CREA research teams.