Breakthrough Science - Machine Learning and Imaging Pipelines Using Nextflow
This week, I had the pleasure of speaking with Felix Morency, the CTO of Imeka, a customer that uses Nextflow to manage imaging data analysis pipelines and machine learning. It is always fun to hear about the breakthrough work our customers are doing.
Imeka is based in Sherbrooke, Quebec, Canada, and has a brain imaging technology platform specializing in microstructure and white matter connectivity. The platform lets pharmaceutical and biotech companies combine diffusion MRI and AI to map white matter integrity and understand the morphology of patients' central nervous systems. It can also reconstruct the brain's fiber network, providing insights into the white matter and the central nervous system, including the spinal cord.
Side note: An excellent analogy for white matter and grey matter is that grey matter is the processing core of the brain and spinal cord, whereas white matter is very much like the network cabling. Together they make up a large part of the central nervous system. (That was at least helpful for me as a computer scientist.)
We can now see elements of the brain and central nervous system that could not be seen with previous technologies.
The solution starts with acquiring diffusion MRI images, which represent the diffusion of water molecules along the brain’s nerve bundles. Those images are then run through Nextflow pipelines that launch quality checks, main image processing, and statistical processing. Imeka deploys its Nextflow pipelines in Singularity containers running on an on-premises compute cluster. The pipelines are pulled from a Git repository and often have more than 50 compute steps with a lot of parallel processing.
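To make the shape of such a workflow concrete, here is a minimal Python sketch of the three stages named above (quality check, main processing, statistics) applied to several subjects in parallel. It is not Nextflow code and not Imeka's pipeline: Nextflow expresses this in its own DSL and handles containers and cluster scheduling. Every function name, the dictionary layout, and the toy "normalize by the peak value" processing step are hypothetical stand-ins chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def quality_check(scan):
    # Hypothetical QC rule: a scan passes if it has at least one positive value.
    return {**scan, "passed_qc": any(v > 0 for v in scan["signal"])}


def process_image(scan):
    # Placeholder for the main image processing: scale values by the peak.
    peak = max(scan["signal"])
    return {**scan, "normalized": [v / peak for v in scan["signal"]]}


def summarize(scan):
    # Placeholder for statistical processing: one summary number per subject.
    return {"subject": scan["subject"], "mean_signal": mean(scan["normalized"])}


def run_subject(scan):
    # Chain the stages for one subject; scans that fail QC are dropped.
    checked = quality_check(scan)
    if not checked["passed_qc"]:
        return None
    return summarize(process_image(checked))


def run_pipeline(scans, max_workers=4):
    # Subjects are independent of each other, so they can run in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_subject, scans))
    return [r for r in results if r is not None]


if __name__ == "__main__":
    scans = [
        {"subject": "sub-01", "signal": [2.0, 4.0]},
        {"subject": "sub-02", "signal": []},  # fails the QC rule
        {"subject": "sub-03", "signal": [1.0, 1.0, 2.0]},
    ]
    for row in run_pipeline(scans):
        print(row)
```

The point of the sketch is the structure: each stage consumes the previous stage's output, and work fans out across subjects. A workflow manager adds what this toy version lacks, such as running each step in a container, resuming after failures, and distributing jobs across a cluster.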
The next part of the compute process is to perform machine learning on the processed images. Nextflow is used to train the ML models on GPUs, with inference done on CPUs. Once a model is trained, Nextflow pipes it to PyTorch, which runs the ML on the back end. ClearML is also used to track training and to visualize whether it is progressing well. Imeka has developed its own set of machine learning models, techniques, and tools to get finer imaging-based results. The final results are then interpreted, yielding an improved understanding of the white matter structure and its state. This information is used by pharmaceutical and biotech companies to select drug treatments.
“Nextflow elegantly solves problems and addresses the multiple challenges that we faced. We tested a lot of tools to orchestrate and launch containerized pipelines on our cluster. Nextflow just works, and provides reproducibility, which is so important in a regulated industry.”
Imeka’s non-invasive technology minimizes pharmaceutical and biotech clients' risks in the development of treatments for neurodegenerative diseases such as Multiple Sclerosis, Alzheimer's, and Parkinson's disease, as well as traumatic brain injury. The benefits will manifest in new drug development, early detection and better diagnosis, progression monitoring, and improved outcomes for diseases where treatments are presently limited. Stories like Imeka’s epitomise what I love most about the work I do: speaking to our brilliant customers who are working to change the world.
Rob Lalonde rob.lalonde@seqera.io @hpc_cloud_rob