AI Summaries now available in MultiQC

MultiQC was written to summarize analysis results. It takes directories full of bioinformatics tool outputs, finds files that it recognizes and parses out the key metrics into a human-readable report that you can use to get a feel for your experiment and spot outliers and patterns.

Over the past few years, large language models (LLMs) have become all the rage. In particular, they excel at consuming large quantities of information and summarizing the contents. So we wondered to ourselves: what happens if we throw data from a MultiQC run at an LLM? Turns out that the results can be quite useful…

Introducing AI Summaries

AI Summaries allow MultiQC users to quickly generate summaries of their reports, offering insights without needing to comb through every detail manually. We previewed the feature at the Nextflow Summit 2024 in Barcelona, and we believe that it marks a significant step forward in integrating AI into bioinformatics workflows, moving beyond chatbots into highly specialized tools that can streamline your daily work. MultiQC AI Summaries can be created in two ways:

During report generation: Embedded directly into the HTML file for easy sharing and reference.
Dynamically in-browser: Generated on demand while viewing a report, ideal for ad hoc exploration.

These summaries aim to address the challenge of a “first-pass” interpretation. Whilst they cannot (and should not) replace expert evaluation, they provide a strong starting point for spotting outlier samples and understanding the data more efficiently.

The integration of AI into MultiQC demonstrates how LLMs can be applied to structured data analysis, a use case that extends far beyond the familiar chat interfaces. AI in bioinformatics isn’t just about answering questions—it’s about enhancing productivity and decision-making at every stage of the analysis pipeline.

Avoiding walled gardens

MultiQC is free and open-source software, and it’s important to us that it remains open. MultiQC ships with direct integration with Seqera AI, which is free to use and offers the smoothest experience. But we’re also releasing it with integrations for OpenAI and Anthropic. MultiQC can also copy the LLM prompts to your clipboard, so that you can use whichever other LLM provider you wish.

This flexibility in provider choice means that you can use this new functionality with whatever tools you prefer, and within whatever data privacy requirements you may have.

How it works

To enable AI Summaries, users need an API key from a supported provider. Seqera AI offers a free option¹, while OpenAI and Anthropic provide access on a pay-as-you-go basis. Once configured, summaries can be fetched at report generation time using simple MultiQC command-line flags:

--ai / --ai-summary: Generate concise summaries.
--ai-summary-full: Generate detailed summaries with analysis and recommendations.

All MultiQC reports now also contain a new AI toolbox section and buttons to summarize report sections on demand. These in-browser summaries are not stored in the HTML. If you’re using the Seqera AI integration, you’ll also find a Chat with Seqera AI option, which opens a new chat session that remembers the report history, so that you can interactively ask follow-up questions about your data.

Chat with Seqera AI, which remembers report history

If you’d rather not see any mention of AI in your MultiQC reports, you can run using the --no-ai flag, which disables all AI features. You can also hide the buttons from reports by selecting Remove AI buttons in the report toolbox.
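As a rough illustration of how the flags in this section fit together, here is a minimal Python sketch that assembles a MultiQC command line for each AI mode. The flags themselves (--ai-summary, --ai-summary-full, --no-ai) come from the text above; the helper name build_multiqc_command and the mode labels are made up for this example and are not part of MultiQC.

```python
import shlex

# Flags are the ones named in this post; the mode labels are hypothetical.
AI_FLAGS = {
    "concise": ["--ai-summary"],     # equivalent to the shorter --ai alias
    "full": ["--ai-summary-full"],   # detailed summary with recommendations
    "off": ["--no-ai"],              # disable all AI features in the report
}


def build_multiqc_command(results_dir, mode="concise"):
    """Return the argument list for a MultiQC run with the chosen AI mode."""
    if mode not in AI_FLAGS:
        raise ValueError(f"unknown AI mode: {mode!r}")
    return ["multiqc", results_dir, *AI_FLAGS[mode]]


if __name__ == "__main__":
    # Print the commands rather than running them, so the sketch is safe
    # to execute on a machine without MultiQC or an API key.
    for mode in AI_FLAGS:
        print(shlex.join(build_multiqc_command(".", mode)))
```

The printed commands can be pasted into a shell, or the argument list can be handed to subprocess.run once an API key is configured.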

For those working within Nextflow pipelines, configuration is straightforward, with environment variables used to securely manage API keys. This ensures seamless integration without altering pipeline code.
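A pre-flight check along these lines can catch a missing key before a pipeline run starts. This is only a sketch: the environment variable names below are assumptions for illustration (OPENAI_API_KEY and ANTHROPIC_API_KEY follow those providers' usual conventions, and SEQERA_ACCESS_TOKEN is a guess), so please check the MultiQC documentation for the names it actually reads. The helper missing_key_var is hypothetical.

```python
import os

# ASSUMPTION: illustrative variable names, not confirmed by this post.
ASSUMED_KEY_VARS = {
    "seqera": "SEQERA_ACCESS_TOKEN",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def missing_key_var(provider, env=None):
    """Return the name of the unset variable for a provider, or None if it is set.

    An empty string counts as unset.
    """
    env = os.environ if env is None else env
    var = ASSUMED_KEY_VARS[provider]
    return None if env.get(var) else var


if __name__ == "__main__":
    for provider in ASSUMED_KEY_VARS:
        var = missing_key_var(provider)
        status = "configured" if var is None else f"missing {var}"
        print(f"{provider}: {status}")
```

Keeping the key in the environment rather than in the pipeline script means that the same pipeline code can run with different providers, or with none.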

Please summarize responsibly

As exciting as AI Summaries are, it’s important to recognize their limitations. LLMs have token limits, which can make summarizing very large reports challenging. MultiQC addresses this by prioritizing the most important data, but users must still evaluate the summaries critically.
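To make the idea of prioritizing data under a token limit more concrete, here is a small Python sketch of one possible approach: rank report sections by importance and include them greedily until a token budget is used up. This is not MultiQC's actual algorithm, which this post does not describe; the function name, the priority scores, and the token counts are all invented for illustration.

```python
def select_sections(sections, token_budget):
    """Greedily pick sections in priority order until the budget runs out.

    sections: list of (name, priority, token_count) tuples, where a
    higher priority means more important. Sections that do not fit in
    the remaining budget are skipped.
    """
    chosen = []
    remaining = token_budget
    for name, _priority, tokens in sorted(sections, key=lambda s: -s[1]):
        if tokens <= remaining:
            chosen.append(name)
            remaining -= tokens
    return chosen


if __name__ == "__main__":
    # Hypothetical sections with made-up priorities and token counts.
    report = [
        ("general_stats", 10, 400),
        ("fastqc", 5, 700),
        ("samtools", 7, 300),
    ]
    print(select_sections(report, token_budget=800))
```

With these made-up numbers, the two highest-priority sections fit within the budget and the largest, lowest-priority one is left out. That is exactly the kind of omission a reader should keep in mind when evaluating a summary.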

Privacy is another key consideration. Report data is sent to the chosen LLM provider for processing, so understanding what data is being shared—and with whom—is crucial. Seqera AI provides transparency around data use, ensuring inputs are not retained for model training or fine-tuning. See https://seqera.io/ai-trust/ for more information.

The bigger picture

We think that AI Summaries in MultiQC are a glimpse into the future of bioinformatics tools. They showcase how AI can move beyond generic applications to address niche, domain-specific challenges. By integrating directly into workflows, features like this help bioinformaticians work smarter, not harder.

We’re excited to see how the community embraces this new capability. Whether you’re generating summaries to speed up your analysis or experimenting with custom configurations, we’d love to hear your feedback. AI Summaries are just the beginning, and with your input, we’ll continue to make MultiQC even more powerful.

Upgrade to MultiQC v1.27 today to try out AI Summaries for yourself. For detailed instructions on setup and configuration, visit the MultiQC documentation.

¹ Seqera Cloud Basic is free for small teams. It includes access to Seqera AI, with a usage cap of 100 messages per calendar month. Seqera AI usage is unlimited for Seqera Cloud Pro users. Researchers at qualifying academic institutions can apply for free access to Seqera Cloud Pro. See Seqera Pricing for more details.