In Episode 50 of the Nextflow podcast, Phil Ewels is joined by Krešimir Beštak, a rising star in the nf-core community and a PhD student at the Institute for Computational Biomedicine in Heidelberg. This special episode shines a spotlight on microscopy and spatial omics, an exciting departure from the usual genomics focus.
Krešimir shares his journey from initial exposure to Nextflow via the MCMICRO pipeline to now contributing actively to the nf-core project. With a background in computational biomedicine and a passion for image processing, Krešimir’s story is one of rapid progression through hands-on development and community engagement.
The conversation delves into the growing role of spatial omics in understanding diseases, with Krešimir’s research focusing on cardiovascular disease and myocardial infarction using spatial transcriptomics and proteomics. He also discusses his group’s work on translational spatial profiling in oncology, bridging the gap between academic research and real-world diagnostics. This episode offers a fresh perspective on how Nextflow is enabling reproducible, scalable workflows well beyond genomics, into the world of imaging.
Key links
Here are some of the links that were mentioned during the podcast:
- MCMICRO and nf-core/mcmicro, the multiple-choice microscopy pipeline.
- Napari, an interactive viewer for multi-dimensional images in Python.
- Minerva, a suite of light-weight software tools for interactive viewing and fast sharing of large image data.
- image.sc, a discussion forum for people working in imaging analysis.
- Ken Brewer’s Nextflow Summit 2024 talk: “How to NOT contribute to nf-core”
- Join the nf-core Slack and find the #microscopy channel.
- Current Nextflow Ambassadors and the program overview.
Podcast overview
In this special 50th episode of the Nextflow podcast, host Phil Ewels chats with Krešimir, a PhD student and active member of the nf-core community. The conversation shifts away from the familiar world of genomics and explores the emerging intersection between microscopy, spatial omics, and Nextflow pipelines. Krešimir shares his journey into the world of reproducible bioinformatics, offering insights into how Nextflow is being used for advanced image analysis and translational research.
Meet Krešimir
Krešimir is currently based at the Institute for Computational Biomedicine at the University Hospital Heidelberg, working as a second-year PhD student. His background is in spatial omics, and he’s been contributing to nf-core for two years, using Nextflow for even longer. He first encountered the workflow manager through the MCMICRO pipeline while working in Denis Schapiro’s group and has since been hooked, developing modules, contributing to nf-core, and continuing this work through his master’s and now his PhD.
From MCMICRO to nf-core
Krešimir’s Nextflow journey began using and contributing to MCMICRO, a Nextflow pipeline designed for multiplexed microscopy image analysis. What started as a routine onboarding task soon evolved into full-blown development. He began writing and contributing modules and became deeply embedded in the nf-core community. Under the guidance of Florian Wuennemann during his master’s placement, Krešimir developed tools that are now in active use within the broader Nextflow community: proof of how accessible and empowering community-driven development can be.
Spatial Omics and Translational Research
The heart of Krešimir’s current work lies in applying spatial transcriptomic and proteomic techniques to mouse cardiac tissue, particularly within the context of myocardial infarction. His group also works on cancer research, focusing on the diagnostic potential of spatial data. These projects are connected through a newly launched Translational Spatial Profiling Centre, where research directly supports diagnostic applications, blending academic rigour with real-world impact.
Beyond Genomics: Microscopy in Nextflow
While Nextflow has its roots in genomics, this episode highlights its growing role in other domains. Krešimir’s work with imaging and spatial data shows that workflow managers like Nextflow are versatile enough to support complex pipelines in microscopy and histology. As interest grows in these fields, the nf-core community is evolving to meet new demands, developing standards, modules, and pipelines that go well beyond traditional omics.
Looking Ahead
Krešimir’s enthusiasm for image analysis and reproducibility is palpable. He’s part of a new wave of researchers applying modern workflow tools to previously niche domains, opening the door to new collaborations and use cases. With more imaging-focused pipelines emerging in nf-core, and as spatial omics continues to mature, this episode is a compelling glimpse into where the Nextflow ecosystem is headed.
Conclusion
Episode 50 of the Nextflow podcast celebrates not only a community milestone, but also a broadening of horizons. Through Krešimir’s story, we see how Nextflow and nf-core are growing to support new scientific fields, enabling powerful reproducible research across domains. Whether you’re a workflow developer, a computational biologist, or simply curious about what’s next, this episode is well worth a listen.
Full transcript
Welcome and introductions
Phil: Hi, welcome to the Nextflow podcast. My name is Phil Ewels, and this is Episode 50, going out in April, 2025.
Today I’m really excited to invite Krešimir on from the nf-core community. And we’re gonna talk all about microscopy in Nextflow pipelines.
This is a little bit different to the normal stuff we talk about with genomics, which is really interesting, and it’s something we’re seeing emerging more and more into nf-core and the wider Nextflow community.
So welcome. Tell us a little bit about yourself.
Krešimir: Hi Phil. Thank you so much for having me.
I’m a second year PhD student, working in spatial omics. I’ve been working within the nf-core community I would say for two years now, and have been using Nextflow for three years.
So I’m based in Heidelberg at the Institute for Computational Biomedicine as part of the University Hospital [00:01:00] Heidelberg.
Phil: And how did you find out about nf-core and how did you get involved with Nextflow at the start?
Krešimir: I’m working in Denis Schapiro’s group and Denis, while in his time at the Laboratory Systems Pharmacology at Harvard developed MCMICRO, which is a multiplex imaging, multiple choice microscopy pipeline.
Basically the first task that anyone who entered the group got was to set up MCMICRO and run it on test data. And that was my first experience with Nextflow.
Then during the following year, I developed a module which we had to add to MCMICRO to be able to process some data. That was my first hands-on experience developing in Nextflow.
And then down the line, I added the module, for example, to nf-core, and have been developing alongside Florian Wuennemann, who is now at Seqera and who supervised me while I was an intern doing my master thesis there.
Now I just stayed there doing the PhD because I just fell in love with image processing and spatial omics in [00:02:00] general.
Phil: That’s the perfect setup when you do a master’s course and you really like the project and then you can just roll that straight into a PhD.
Krešimir: Exactly.
Phil: A dream situation. So tell us a little bit about your research. Tell us a little bit about what you’re working on.
Krešimir: I’m primarily working on cardiology in the context of spatial omics. So we apply different spatial methodologies, for example, transcriptomic and proteomic on mouse cardiac tissue, especially in the context of cardiovascular diseases, and primarily focusing on myocardial infarction.
However, as a group, we also focus a lot on cancer research. That is, I would say, the main topic, again, always in the context of spatial technologies.
And we are also going into a translational setting. So, Denis and half of our group actually, established the translational spatial profiling center, combining research and a [00:03:00] translational setting. So diagnostics in a cancer research setting utilizing spatial proteomics.
Phil: Amazing. It’s cool and you can see the research you’re doing directly feeding into something that affects people’s daily lives.
Introduction to Spatial Omics
Phil: So you mentioned a term there a few times about spatial omics. Give us an overview for anyone listening who’s never heard of spatial omics before. What does that cover?
Krešimir: Spatial omics, broadly, involves capturing information about different molecules, be it RNA, proteins, or metabolites, and also capturing the spatial location of those molecules.
So basically for proteins, you measure intensity with high resolution microscopy. For RNAs you can detect RNAs as spots, for example. It gives you ideally subcellular resolution and the information about your target molecule.
However, there are also untargeted approaches. [00:04:00] Visium and Visium HD are now being used quite commonly. And these are untargeted approaches that give you almost subcellular resolution. If you can have like eight by eight microns, that is already very close, and the field is moving so fast that I would say within the next couple of years we will have publicly available subcellular-resolution, whole-transcriptome readouts.
Multiplexing markers
Phil: Amazing. This is maybe not that well known, but my PhD was on, um, chromatin organization within the nucleus. So I spent many, many hours at a microscope doing DNA FISH and RNA FISH.
And it was so laborious. I’d spend like days and days with all these little foil covered pots, doing all this different kind of staining with all the antibodies, trying to get just two probes, just two different RNA probes within the nucleus, and then manually by hand measuring how many were co-localizing.
It was, it was super manual and [00:05:00] super slow. And now I feel like an absolute dinosaur because, because I mean, what, what kind of throughput are we talking about with the techniques you’re describing here?
Krešimir: I mean, for example, the platform that we’re using, the Lunaphore Comet™, it can provide like 40 plex antibodies on two samples per day, for example. So you can just have it running continuously.
Phil: So, so that’s like two years of my PhD in two days. Nice. Yeah.
Krešimir: it’s, very exciting actually. This platform was what got me to stay, to do a PhD. ‘cause in my master thesis I was also in the lab doing the antibody panel optimization validation and then acquiring, the data set.
Phil: And when you say multiplex, that’s 40 different antibodies, which are all staining. How do you stop the overlap, and how do you stop those colors bleeding into one another?
Krešimir: There are many different ways to [00:06:00] stop that overlap. So for example, some platforms can have a cyclic approach where you add the antibody, you image it, also add buffers along the way, and then wash away the antibody.
You add the next antibody, image it, then wash that away, cycle after cycle.
Some have antibodies that have barcodes, and then you actually target the barcodes, then also cyclically.
You can image without a problem three markers in one cycle. And then if you image across multiple cycles, then you can register the cycles and have one clear stack of channels.
Metabolite microscopy
Phil: Fantastic. So that covers microscopy analysis of DNA, RNA and protein. I think you mentioned metabolites as well. How does that work?
Krešimir: This is out of my expertise, but you could also theoretically target metabolites. So for example, there’s been work at the DKFZ where they use a mass spec based [00:07:00] approach, also giving you a spatial readout for metabolites.
So not only do you have immunofluorescence. You also have mass spectrometry based approaches, such as MIBI and imaging mass cytometry, which have antibodies that are coupled to metals. And then you actually find the metals at a very high sensitivity.
Phil: It’s a bit like electron microscopy as well, I think where you can use antibodies with metal tags.
Myocardial multiomics
Phil: So to recap and make sure I understand this correctly: you’re taking slides with tissue slices from mouse biopsies of myocardial infarctions, and then using these microscopy techniques to look at the proteomic landscape. And are you doing all of these in one set of experiments, or are these separate projects for the different omic techniques?
Krešimir: So usually you would have, one set of experiments with one methodology. But, for example, in the myocardial infarction story led by Florian Wuennemann, we’ve used three different spatial technologies on that data.
We applied targeted [00:08:00] spatial transcriptomics with molecular cartography. Then the Lunaphore Comet™, which I explained was my master thesis. And then we also applied deep visual proteomics where this was not subcellular, but we selected regions and then sent that to a mass spec and got all the proteins in those regions, which then helped to highlight a novel immune infiltration route into the myocardial infarction site. Also finding a potential protein responsible.
Phil: So by overlaying the different techniques, you can get a more complete view of what’s happening in this tissue.
Krešimir: Absolutely. So in our case, it was diagonal, so we did not do it on the same slice, on the same samples.
So our readouts were mostly separate and then biologically we connected the story. The integration of different modalities, for example, on the same sample or more likely consecutive sections is a big challenge. But something I find very exciting.
Microscopy data analysis
Phil: Okay, so [00:09:00] that’s the kind of biology side. Take us through what happens next. You take these microscopy images, you end up, I guess, with a directory full of TIFF files. Or is that old fashioned as well?
Krešimir: No, it’s possible. Yes.
Phil: Okay, so you end up with some data of some description. What happens next? How do you process this?
Krešimir: So I will go through the proteomic example here, so antibody based.
So usually you would get, as you said, either a lot of TIFF files, each one being one tile, one marker, for example. Or you would have one TIFF file for the whole tile, but with a stack of markers for that cycle. Then you have to somehow stitch and register those files into one big image that you can then check for spatial organization.
For processing such data there are existing pipelines. So for example, as I already mentioned briefly, MCMICRO, the multiple choice microscopy pipeline, is perfectly suited for multiplexed protein-based data.[00:10:00]
So the steps there: you take those TIFF files, then you have to stitch them, register them into one big image, but also apply illumination correction, because sometimes you have lens-based periodic artifacts that happen in every tile. So you have to detect that, and then while combining the full image you can correct for it.
Then for example, with the data I mentioned, you also have autofluorescence. So you can subtract the autofluorescence to get a cleaner, at least visually cleaner, image.
Then you have to segment that image into cells. You have to get a label mask. Each label is one cell. So this is also a very big undertaking and there are so many tools out there, and this is something I feel like every second day there’s a new tool and it’s very exciting to, to see.
So then once you know where your cells are, then you can measure the intensities. So usually you measure the mean intensity across the [00:11:00] cell and have that as a readout, per cell, per marker. And you also measure the cell geometry. And this you can then use downstream. So for example, this is where I would stop an MCMICRO run and then go into a more interactive approach depending on the biology.
So you, you have to definitely call the phenotypes. So from the intensities, you need to figure out what you’re looking at, which cell is which, and you can also do spatial methods finding different niches and defining interactions between the cells.
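To make that flow concrete, here is a minimal, hypothetical Nextflow sketch of the steps Krešimir describes. The process names and the tool commands inside them are placeholders for illustration only, not the actual MCMICRO modules.

```nextflow
// Minimal, hypothetical sketch of a multiplex imaging flow as chained Nextflow processes.
// Process names and the commands inside them are placeholders, not the real MCMICRO modules.

process STITCH_REGISTER {
    input:
    path tiles

    output:
    path 'registered.ome.tif', emit: image

    script:
    """
    stitch_and_register.py --tiles ${tiles} --out registered.ome.tif   # placeholder tool
    """
}

process SEGMENT {
    input:
    path image

    output:
    path 'cell_mask.tif', emit: mask

    script:
    """
    segment_cells.py --image ${image} --out cell_mask.tif              # placeholder tool
    """
}

process QUANTIFY {
    input:
    path image
    path mask

    output:
    path 'quantification.csv', emit: table

    script:
    """
    quantify_cells.py --image ${image} --mask ${mask} --out quantification.csv   # placeholder tool
    """
}

workflow {
    tiles = Channel.fromPath(params.tiles).collect()        // per-cycle TIFF tiles
    STITCH_REGISTER(tiles)                                   // stitch + register (plus illumination correction)
    SEGMENT(STITCH_REGISTER.out.image)                        // label mask: one label per cell
    QUANTIFY(STITCH_REGISTER.out.image, SEGMENT.out.mask)     // mean intensity + geometry per cell
    // downstream: phenotype calling and spatial analysis, typically done interactively
}
```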
nf-core/mcmicro
Phil: And the MCMICRO pipeline that has been brought into nf-core relatively recently, right?
Krešimir: Yeah, so MCMICRO was first developed just in Nextflow, not using nf-core tooling. Yeah, it’s a very versatile pipeline. You can select which modules you want to run. However, we decided to push the nf-core implementation of MCMICRO to bring all of those modules to the community in a structured way, so that people can reuse those modules.[00:12:00]
Because it was already modularized, it made perfect sense that the next step would be to go into nf-core and just bring everything into the community, but also to show, as we’re doing here, that imaging pipelines exist, and in a way it’s a call to action: let’s bring the community to nf-core broadly.
Phil: I love that and I love how, because the analysis is very modular, you’ve enabled other people to create bespoke pipelines with those kind of Lego bricks so they can just install those modules into their own pipelines if they have something similar.
Krešimir: Exactly. So for example, we recently had a published paper where I needed to process the data fast, and it was a data type that I had not yet processed. Just having a branch of nf-core/mcmicro (and the pipeline is still in development), even then I could take that and have something that works on my data.
I feel like that’s the biggest power of bringing the modules and the pipeline to [00:13:00] nf-core because you can really adapt it to your own data set.
2D vs 3D microscopy
Phil: And quick question, this is all 2D microscopy, right? We’re not talking about three dimensional reconstruction here.
Krešimir: Yes. So in my case, I only work with 2D data, however, there have been advances with 3D and also a push in nf-core, for example, with light sheet microscopy. Yeah, 3D data is definitely coming and it’s exciting, but it’s a different scope of even image size, which brings about different challenges.
Phil: I guess voxels take up a lot more disk space.
Computational bottlenecks within analysis
Phil: If we start to think about the different steps that you described in mcmicro, what do they look like computationally? What are the challenges?
Krešimir: I think it really depends on what you call a bottleneck. Because yes, the image size can be a bottleneck, especially because it’s easy to duplicate data, for example. And this is something that we need to be really [00:14:00] careful about going forward.
So in our case, we don’t have that many files. Because we use omni-TIFF files right now, at least, as the intermediary and final file output. However, some are using spatial data, which has a lot of files in the file structure.
Phil: When you say duplicate files, you’re talking about, for example, removing autofluorescence. You could have two sets of data files then, before and after that step.
Krešimir: That’s precisely the example I was thinking about. So in a way, if you have a big image to start with, if you subtract the autofluorescence, you would have a new image with the same size.
Computational resources are a big one. Absolutely. Segmentation is a very resource heavy approach. People have done work to optimize this. Then it’s kind of, is it on the tool developers to make it as optimized as possible, or is it on the pipeline developers to make any tool as optimized as possible? That’s an ongoing challenge I would say. But there has been amazing work done in that [00:15:00] regard.
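On the pipeline side, this kind of resource tuning typically lives in the Nextflow configuration. A minimal, hypothetical example follows; the process name and the numbers are illustrative only, not taken from any particular pipeline.

```nextflow
// nextflow.config: per-process resources for a heavy segmentation step.
// 'SEGMENT' is a placeholder process name; the values are illustrative.
process {
    withName: 'SEGMENT' {
        cpus   = 8
        memory = 64.GB
        time   = 12.h
        // On GPU-equipped clusters, a scheduler option like this is common (SLURM shown):
        // clusterOptions = '--gres=gpu:1'
    }
}
```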
Phil: I remember a talk about microscopy in Nextflow back at the 2022 Nextflow Summit in Barcelona. In that talk he was talking about using a Spark cluster, within the Nextflow pipeline. Is that something that MCMICRO is doing or something you’ve come across?
Krešimir: No, that is not something we’re working with. I imagine the talk was by Konrad Rokicki, because I’ve also heard a talk from him using that, and I was amazed at the work being done at Janelia.
It’s exciting to have different people doing different things. For example, people there, people at the Sanger Institute, now even here in Heidelberg, at EMBL and DKFZ people are doing amazing things already with Nextflow in this field.
Other nf-core imaging pipelines
Phil: We’ve talked about MCMICRO, that’s not the only pipeline in nf-core that you’re involved in. Can give us a, an overview of all the different microscopy pipelines we have at the moment,
Krešimir: I am, also with Florian Wuennemann, lead developer for molkart, which, here it is, yes.[00:16:00]
Phil: Hooray! nf-co.re/shop, get your own.
Krešimir: So that’s a pipeline for targeted spatial transcriptomics, primarily for molecular cartography. It has some steps that are molecular cartography specific. Then again, it’s segmentation and we offer a variety of segmentation options.
And then it also filters the spots and gives you a cell by gene quantification file that then you can use for downstream processing.
There are also the spatialvi and spatialxe pipelines, for example, handling Visium and Xenium data, respectively. And there are, I think, several more in the works.
This is something that I only found out about at the Summit this year: imcyto. I did not know that the pipeline existed. So it’s exciting to also see that there was a push for imaging back in 2020, if I saw the dates correctly.
Phil: Yeah, I’d forgotten about imcyto. You’re right, that’s [00:17:00] one of the, one of the old class of nf-core pipelines.
Krešimir: DSL1 still, right. So I can still say that molkart is the first imaging pipeline released in nf-core using DSL2.
Phil: And does molkart run after MCMICRO or is it like an alternative pipeline, which is totally separate?
Krešimir: So it’s totally separate. So mcmicro only supports protein data. Whereas molkart is for molecular cartography. That’s the platform from Resolve Biosciences. They offer high sensitivity up to a hundred target genes, and very high resolution readouts.
So for those a hundred targets, you get the spatial location of each RNA molecule, in your sample.
Phil: Then you also mentioned spatialvi and spatialxe. And those are for data coming out of a 10X platform. Is that right?
Krešimir: Exactly. So spatialvi is for visium data, that’s untargeted spatial transcriptomics. And spatialxe is for Xenium, which is [00:18:00] also a targeted transcriptomic approach. The readouts are similar to molkart, but the processing is a bit different.
Phil: So all of these techniques are looking at spatial RNA data, looking for where different genes are being transcribed. What’s the difference between them and when would you choose one over the other?
Krešimir: That’s a really good question. It would depend on, the scope that you want, the money you have, the biological question at hand, because in the end, the biological question should always drive the experiments, right?
Downstream analysis after molkart
Phil: So you’ve run your molkart pipeline. What kind of files are you generating at the end of these pipelines and where do you go next?
Krešimir: So at the end you get TIFF files for the segmentation masks, highlighting where each cell is and a CSV quantification table.
Then you would go to, for example, Seurat, to do normalization [00:19:00] and clustering, and then interactive annotation of clusters, and, for example, also integration with other publicly available data.
Phil: So you’d run this kind of analysis in a Jupyter Notebook or something ? Or is this more kind of another pipeline?
Krešimir: No, so in this case, that would be an R Markdown.
Phil: And so after that step of normalization and integration, then you start to be able to get out publication figures comparing localization and things like that.
Krešimir: Precisely. What also should be mentioned is that with the spatial readout, you can then use different spatial methods on top. You can check, just looking at the cell labels and their x-y centroid coordinates, whether some cell types often co-localize.
Then you could also check from the raw data itself whether some transcripts often co-localize. You don’t necessarily even need the segmentation in all cases.
Manual interventions
Phil: And is there any manual inspection step or is this all script based?
Krešimir: So I don’t have experience [00:20:00] with the genomics pipelines, but I would say that for imaging pipelines, there is a layer of interactivity needed at least when it’s your first time encountering a tissue and your panel, you need to tune a lot of parameters to get proper segmentation.
So, for example, in one dataset, we had to do local contrast enhancement to get the membrane signal actually usable. Tuning the parameters is a very important step here.
Phil: And that has to be done manually.
Krešimir: Yes, in a way. So there are also efforts to automate parameter selection. But usually you need ground truth and that is very hard to get without manual work.
Phil: I think in genomics, that kind of manual inspection is done less and less, but I think everyone should be doing it. Back when I was doing this kind of thing every day, we used to use a graphical tool developed at the Babraham Institute in Cambridge called SeqMonk. It just looks at alignments and you can do lots of quantification things off it, but you always use it [00:21:00] after the alignment step.
And just eyeballing the data, so often you find things which are really weird, which you might never have noticed if you were only using a CLI and only doing genome-wide techniques.
One of my favorites was I downloaded some public data from a sequence read archive, and for some reason it had been corrupted at some point. Just over half of each chromosome was present.
So all the stats looked fine genome-wide, because the stats were per chromosome and every chromosome was present. But actually about half the genome was missing. And that’s one of those things that would actually be really quite difficult to spot with QC tools. But when you just eyeball the data or the alignments, you can spot things very quickly, and I imagine it would be the same with microscopy.
You can see where someone’s dropped a big glob of antibody staining or something.
Krešimir: Absolutely. Even just looking at the expression of markers in a plot format does not give you the insight, because there are so many things that need to be taken into account. Like the [00:22:00] segmentation quality needs to be good. There can be spillovers, so signal from one cell going into the adjacent cell.
It’s an ongoing challenge and it’s very difficult to get fully right. Which is why also for downstream from the quantification. Okay, so let’s say we, we optimize the segmentation. We train our own model. We’re happy with how it looks. Downstream, you still need to take that into account.
So what’s oftentimes done is just manual thresholding, for example, because you don’t have the dimensionality to cluster efficiently, and then you have to manually threshold your markers per image because that also then reduces the batch effect or sample to sample variation.
Phil: I was thinking that, if you’re manually tweaking all the parameters between different samples, I imagine that makes it very prone to batch effects.
Krešimir: I would say it’s the other way around. By changing those parameters taking into account your biological knowledge of that tissue, you [00:23:00] reduce the batch effect because you do not use the quantifications in the end. The goal is often just the phenotypes.
This is something people are also working on trying to use the quantifications in the end. But if you simplify, you get the phenotypes from the quantification then you do your spatial analysis on the phenotypes. I, that’s kind of the, the typical way I would say to do it.
Minerva
Krešimir: What I’m sharing right now is a Minerva story. It’s a manually curated walkthrough.
This is an amazing addition when you’re done with your story and you just want to show the world your data, because people just click on a link and open it in their browser, without needing to download the data or process the data themselves.
Phil: What kind of things can you do with Minerva? What tissue are we looking at here on your screen share?
Krešimir: So here as an example of antibody-based data, I have one sample of myocardial infarction, 24 hours after [00:24:00] induction.
We’re now looking at a cross-section of a mouse heart that had myocardial infarction induced.
So this data was produced as part of an ongoing story that’s right now a preprint, led by Florian Wuennemann. And, this was data acquired during my master thesis.
So what we’re looking at here right now are seven markers, which you can see on the right, showing different proteins in different colors.
So for example, this is the whole cross section. Then if I zoom in, you can see that we have subcellular resolution. We have different channels, so for example, in cyan we see the nuclei stained with DAPI. Then in blue are cardiomyocytes. In orange, we have really nicely stained the whole infarct region. So everything that’s orange here is injured. Then we have an endothelial marker, blood vessels. And immune cell markers in red and magenta.
With data such as this, we [00:25:00] can really appreciate differences in cell shapes. We can see differences in cell types, how there are different structures happening here.
So this is the left ventricle. This would be the right ventricle here. This is the lumen of the left ventricle. And what our story shows is that, if there is an infarct, the immune cell infiltration does not only happen through the vasculature near the infarct, but also through the endocardial layer here. As you can really nicely see, these are all immune cells that are attaching to the endocardial layer and then additionally infiltrating into the subendocardial infarct zone.
Google Maps for cells
Phil: That’s beautiful. There’s no disputing that with microscopy data, is there. There’s something very nice that you can just look at an image and be like, look, we can see the immune cells. It’s like Google Maps, but on a tiny scale.
Krešimir: Precisely like that. And also, yeah, I mean the file structure is very similar, because you have a big image, the raw data for this would [00:26:00] be 60 gigabytes, but then you actually have the data saved as different pyramidal layers. And then, based on how zoomed in you are, the visualizer software shows you a different view of the data.
If you zoom in, then you would see theoretically the highest resolution possible. But if you zoom out, it’s still possible to open images on your laptop.
Phil: It’s actually quite interesting to compare microscopy analysis with other scientific fields, which are totally different but work with similar data shapes.
‘Cause we also have the nf-core pipeline called rangeland, which is for doing satellite image data analysis. And I think it’s very, very similar. It’s registration of images, and normalization, and many of the same kind of steps, but just done on a totally different scale, you know, satellite images instead of microscopy images.
Krešimir: Absolutely. It was recently released, right? I was very happy to see it, because they have such a big scope of data sizes. Spatial omics has also taken a bit [00:27:00] of inspiration from geographical systems.
Phil: Yeah, makes sense. What a beautiful piece of art to finish your project with as well. You can print it out and put it on your wall.
Krešimir: I will say I printed this sample as a sticker, and it’s always on my laptop.
Microscopy community around Nextflow
Phil: So we’ve talked a little bit about how there’s more and more microscopy pipelines coming to nf-core, and it feels like there’s growing momentum in this field.
Can you tell us a little bit about how this sub-community is forming within nf-core? And I think there’s a microscopy special interest group, which sort of started not that long ago.
Krešimir: Precisely. I feel like the nf-core Slack has become a really good talking point. The imaging community is mostly using image.sc for communication, a forum platform for discussing imaging-based issues. But the nf-core Slack is definitely becoming more and more used.
I would say that within nf-core it is very important that the communication happens as [00:28:00] transparently as possible because there are so many pipelines and they do share aspects that ideally would be sub workflows, for example. And it’s just about finding people who can help you.
Krešimir: But I definitely think that having the nf-core modules and the nf-core tooling helps. Ken presented this at the Summit: how to not contribute to nf-core. I really like that, because of how adaptable you have to be to process your data. Oftentimes it’s not enough to just use the original tool. You have to make a slight change, you have to add a package, and that is much easier when you don’t put tools on nf-core, but use nf-core tooling and then make modules like that.
And I feel like in the imaging community, a lot of work has been done in this way, amazing work that people are using in pipelines.
Phil: Sometimes that kind of gets lost, but one of the original founding principles of nf-core is that the pipelines [00:29:00] would both work out of the box, but also form a solid base for people to customize for their own cases as well.
So I think this is a perfect example of that.
Krešimir: And also think it’s a very important question to ask, when should something be put on nf-core in general? Because for smaller pipelines that not a lot of people will use, having that overhead of nf-core related maintenance might be a bit too much just because there is really no point in spending so much time when would just work with the template 2.0. What’s the scope? Because if you’re making a big pipeline that you envision a lot of people would use, then absolutely get as many people in, in nf-core, to give you feedback.
I would say that recently there have been many events where the topic was: how do we do bioimage processing in Nextflow? So there was one organized at [00:30:00] QBiC in Tübingen, and one at VIB in Leuven, both of which were, I think, really amazing showcases of how people use Nextflow in imaging.
And a big talking point at both events was: how do we use nf-core? How can I use an nf-core module in my own standalone pipeline? So, for example, that was a tutorial I gave in Leuven. I would say that both events were an amazing success for networking and just seeing what’s out there, with palpable results and discussions too.
How to get involved
Phil: And so for anyone who’s listening who hasn’t been involved in nf-core and in the microscopy community, do you have any advice?
Krešimir: So for anyone who wants to get involved, I would say: join the nf-core Slack, join the #microscopy channel, and pick out the pipelines that exist. And just say what you would want, what you would want to see, and how you would want them improved.
Phil: And as we look into the oncoming [00:31:00] year, 2025, is there anything coming up which you’re particularly excited about?
Krešimir: So there’s the nf-core hackathon happening, or that will have happened last week.
Phil: So we’re recording before this has happened, but have you tell us a little bit about what you have planned.
Krešimir: So last year in Heidelberg we had a big local site event of over 50 people. This year we’re scaling back a bit and it’ll be a bit more topic-focused. So I think more than half of us will be working on spatial omics pipelines. We have the push from DKFZ with the spatialxe pipeline, and also with molkart and mcmicro.
Changes in Nextflow for microscopy
Phil: Is there anything that you would like to see changed in Nextflow itself or in the nf-core template or anything which would make life easier for people developing microscopy pipelines?
Krešimir: So this is something that I think is a very big talking point that I did not really find mentioned: the module documentation.
So [00:32:00] this was actually at the hackathon I mentioned in Leuven: we proposed two options on how this could be done. If nothing else, hey, here are some commented-out lines showing the parameters you can use. So either in the Nextflow config, as commented-out lines: ah, here is how you could pass parameters with ext.args. Or adding different sections in the meta.yaml.
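For context, the ext.args pattern Krešimir refers to usually looks something like the sketch below, where commented-out lines double as documentation of further tool options. The module name and the flags are illustrative only, not a definitive list from any module’s documentation.

```nextflow
// conf/modules.config: passing tool options to a module via ext.args,
// with commented-out lines documenting further parameters a user could enable.
// Module name and flags are illustrative examples.
process {
    withName: 'CELLPOSE' {
        ext.args = [
            '--diameter 30',            // expected cell diameter in pixels
            '--flow_threshold 0.4'      // stricter values reject poorly shaped masks
            // '--cellprob_threshold 0.0',   // further options can be documented here, commented out
        ].join(' ')
    }
}
```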
Phil: I wonder if we could also just add a markdown file in the module directory where people can write whatever kind of prose documentation they’d like. Really super easy.
Krešimir: I mean that’s, that would be possible. Right? Just having a read me for the module specifically.
Phil: It would also render nicely on GitHub, so I like it. It’s a good idea. We should bring it up.
Nextflow Ambassador program
Krešimir: One thing I wanted to say is just that I want to highlight the Ambassador program from Marcel and the team, because it’s been a really good platform to communicate with other ambassadors and just have a bigger outreach in general.
With their [00:33:00] support, I was able to go to the two events I mentioned, the community-based events. And it’s really exciting that that initiative happened.
Phil: I’ll put the link in onto the podcast notes for the Nextflow Ambassadors webpage, where anyone who’s interested can find out more and see when the next application window is.
Conclusion
Phil: All right. Thank you so much for joining me today. It’s been an absolute pleasure to talk about your work and talk about this field, which is, yeah, a little bit different to what I’m used to talking about, which makes it even that bit more interesting.
And really exciting to hear how this is developing in nf-core, I feel we’re now gaining that critical mass where it’s becoming a sort of self-sustaining sub-community within nf-core.
Krešimir: Imaging is such a wide topic. There are so many different data types. So you have immunofluorescence that I mentioned, and you also have like light sheet microscopy, then you have x-ray, you have electron, you have live cell imaging as well. And all of these require [00:34:00] different processing approaches, but all of them are imaging.
Phil: Plenty of space for more pipelines to be added to nf-core in the future.
Great. I’ve really enjoyed chatting to you. Thank you so much for joining me.
Krešimir: Thank you so much Phil for inviting me.
Phil: And yeah, we’ll speak to you soon.
Thanks very much.
Krešimir: Thank you.