Fusion file system

A distributed, lightweight file system for cloud-native data pipelines

Data Management Simplified

Supercharge your cloud file system performance

Cloud object stores such as Amazon S3 are scalable and cost-effective, but they don't present a POSIX interface. As a result, containerized applications must copy data to and from S3 for every task, which is slow and inefficient.

Fusion file system is a virtual, lightweight, distributed file system that bridges the gap between pipelines and cloud-native storage. Fusion enables seamless file system I/O to cloud object stores, resulting in simpler pipeline logic and over twice the throughput of cloud object storage.
  • Simplify pipeline development and deployment
  • Reduce redundant file I/O and maximize resource efficiency
  • Avoid the need to pre-install cloud tools in containers or cloud instances
  • Accelerate task and overall pipeline execution
  • Eliminate the need for expensive and complex shared file systems
  • Boost productivity and reduce time to results

Features

Transparent, automated installation

Traditionally, pipeline developers needed to bundle utilities in containers to copy data in and out of S3 storage.

With Fusion file system, there is nothing to install or manage. The Fusion thin client is automatically installed and configured for your pipeline, enabling containerized applications to read and write to S3 buckets, Google Cloud Storage, Azure Blob Storage, and other object stores as if they were local storage.

Dramatically reduce data movement

When pipelines run with object storage, tasks typically read data from a bucket, copy it to local block storage for processing, and copy results back to the object store.

The result is significant overhead for every task. Fusion file system enables direct file access to object storage, eliminating unnecessary I/O and dramatically reducing data movement and overall runtime.
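
To make the difference concrete, here is a minimal sketch that contrasts the two access patterns. It is an illustration only, not how Fusion is implemented: a temporary directory stands in for the bucket, another for local block storage, and the function names `run_staged`, `run_direct`, and `demo` are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_staged(bucket: Path, scratch: Path, name: str) -> int:
    """Stage in, process, stage out. Returns bytes moved by copying alone."""
    copied = 0
    local_in = scratch / name
    shutil.copy(bucket / name, local_in)                 # stage in
    copied += local_in.stat().st_size
    local_out = scratch / (name + ".out")
    local_out.write_text(local_in.read_text().upper())   # the actual task
    shutil.copy(local_out, bucket / local_out.name)      # stage out
    copied += local_out.stat().st_size
    return copied


def run_direct(bucket: Path, name: str) -> int:
    """Read and write through a file path in place; no staging copies."""
    out = bucket / (name + ".out")
    out.write_text((bucket / name).read_text().upper())  # the actual task
    return 0


def demo() -> tuple[int, int]:
    # Both directories are stand-ins (assumptions), not real cloud storage.
    with tempfile.TemporaryDirectory() as b, tempfile.TemporaryDirectory() as s:
        bucket, scratch = Path(b), Path(s)
        (bucket / "reads.txt").write_text("acgt" * 1000)
        staged = run_staged(bucket, scratch, "reads.txt")
        direct = run_direct(bucket, "reads.txt")
        return staged, direct


if __name__ == "__main__":
    staged_bytes, direct_bytes = demo()
    print(f"bytes copied for staging: {staged_bytes}, direct: {direct_bytes}")
```

In the staged version, every task pays for two extra copies on top of its real work; in the direct version, the task only performs the reads and writes it actually needs.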

No shared file system required

To share data among pipeline tasks, organizations often turn to shared file systems such as Amazon EFS, Amazon FSx for Lustre, or NFS.

Fusion file system avoids the need to deploy, manage, and mount shared file systems on every cloud instance by providing the same functionality over cloud object stores such as S3 — significantly reducing cost and complexity.

Seamless access to cloud object storage

While some open-source projects provide a POSIX interface over object storage, they require developers to install and configure additional software and package it in containers or VMs.

Unlike third-party solutions, Fusion is optimized for Nextflow and handles these tasks automatically, delivering fast, seamless access to cloud object storage.

Maximize pipeline performance and efficiency

Copying data to and from object storage adds latency for every task, lengthening the time containers and cloud instances are deployed. This translates into longer runtimes and significantly higher costs for pipelines with thousands of tasks. Fusion file system eliminates these bottlenecks and delays, reducing execution time and cloud spending and using compute instances more efficiently.
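
As a rough back-of-the-envelope illustration of how per-task staging time adds up, the sketch below multiplies it out. All inputs are hypothetical placeholders rather than measured Fusion results, and the function name `staging_overhead` is an assumption made for this example.

```python
def staging_overhead(
    tasks: int,
    staging_seconds_per_task: float,
    price_per_instance_hour: float,
) -> tuple[float, float]:
    """Return (instance-hours, cost) spent purely on copying data."""
    hours = tasks * staging_seconds_per_task / 3600
    return hours, hours * price_per_instance_hour


if __name__ == "__main__":
    # Hypothetical inputs: 10,000 tasks, 90 s of copying each, $0.40 per hour.
    hours, cost = staging_overhead(10_000, 90, 0.40)
    print(f"{hours:.0f} instance-hours, ${cost:.2f} spent on staging")
```

With these placeholder numbers, staging alone accounts for about 250 instance-hours, or roughly $100, before any useful computation happens. Substitute your own task counts, copy times, and instance prices to estimate the effect for a given pipeline.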

Boost performance and efficiency with Fusion file system

Fusion file system is a simple, scalable file system solution that works optimally in cloud-native compute environments. Fusion 2.0 outperforms existing object stores and file systems by avoiding the need for intermediate data copies and leveraging the high-performance Fusion driver.

How Fusion file system works
Fusion file system removes the need to stage data to local block storage for each pipeline task. Tasks instead access object storage directly through a POSIX interface, which increases efficiency and results in faster pipeline execution and lower costs.