Why Git Integration Matters
Studios has simplified the way bioinformaticians transition between pipeline execution and interactive analysis. Whether you're analyzing data in Jupyter or R notebooks, writing Nextflow pipelines in VS Code, or defining your own custom environments, Studios automatically provisions the computational resources and tools you need. But as teams scale, managing custom environments and ensuring reproducibility across projects becomes increasingly complex.
That's where the new Git integration comes in – by connecting Studios directly to your Git repositories, you can now define, version, and share your entire Studio configuration as code. No more manual setup. No more “it works on my machine” moments. Just pull your configuration from Git, and you're ready to go.
The Power of Schema-Driven Configuration
At the heart of this integration is a flexible schema defined in a simple YAML file. The studio-config.yaml file lives in a dedicated .seqera directory within your Git repository and gives you complete control over your Studio environment. Here's what makes this schema powerful:
Container flexibility
Choose between a pre-built Seqera-managed registry image, a custom container build file, or your own container image. The schema supports all three approaches, letting you match your workflow to your team's needs.
Build from Dockerfiles
Instead of building and pushing your container image to a registry manually, you can now simply include a raw Dockerfile in your .seqera directory, reference it in your schema, and Seqera handles the build process automatically using Wave. When you launch your Studio, Seqera builds your container on the fly, pushes it to your configured workspace container repository, and uses it for your Studio session. This eliminates the friction of managing container registries and maintaining build pipelines yourself, letting you iterate quickly on your environment without leaving the platform. Your workspace administrator just needs to configure a target container repository once and add the necessary credentials, and your team is ready to go.
Compute requirements
Define CPU, GPU, and memory requirements right in your configuration (AWS Batch compute environments only). Whether you're running lightweight data exploration or compute-intensive model training, you can specify exactly what you need.
Environment management
Include an optional conda environment file to define additional libraries and dependencies. The schema automatically handles package installation with Wave, so your environment is consistent every time. Note that Wave v1.31.0 now supports multi-stage build templates (Pixi and Micromamba v2) for smaller, more secure images with Pixi-specific options for container builds.
Environment variables
Set session-specific variables that take precedence over compute environment defaults. Perfect for configuring API keys, paths, hosted datasets, or other runtime settings.
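To make the precedence rule concrete, here is a minimal Python sketch of how session-specific variables could override compute environment defaults. It illustrates the rule only and is not Seqera's implementation; the function name and the example variable names are assumptions made for this sketch.

```python
def resolve_environment(compute_env_defaults, session_variables):
    """Merge two variable mappings; session-specific values win on conflict."""
    merged = dict(compute_env_defaults)   # start from the inherited defaults
    merged.update(session_variables)      # session values overwrite duplicates
    return merged


# Hypothetical example values, for illustration only.
compute_env_defaults = {"DATA_DIR": "/data/shared", "LOG_LEVEL": "info"}
session_variables = {"LOG_LEVEL": "debug", "PROJECT": "demo"}

resolved = resolve_environment(compute_env_defaults, session_variables)
print(resolved)
```

Here `LOG_LEVEL` is defined in both places, so the session value is the one that survives, while `DATA_DIR` is inherited unchanged.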
Session management
Configure lifecycle settings like session lifespan and privacy controls directly in your schema. Optionally add resource labels for cost-allocation reporting.
The schema is intuitive and well-documented. To help teams get started quickly, we've published a GitHub repository with multiple branches showcasing different use cases and configuration patterns. Adapt them to your needs!
Repository Contents, Cloned and Ready
When you launch a Studio from a Git repository, the repository contents are cloned directly into your session. No manual file transfers. No complex mounting procedures. Your code, scripts, and data files are simply there, ready to use.
By default, repositories clone to /workspace, making your files immediately accessible. You can customize the clone path in your schema as needed, and you even have the option to disable cloning entirely if you're using the repository purely for template sharing.
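The cloning behavior described above boils down to three cases: the default path, a custom path, or no clone at all. The Python sketch below models that decision. The setting names `clone_enabled` and `clone_path` are hypothetical and chosen only for illustration; consult the schema reference for the actual field names.

```python
DEFAULT_CLONE_PATH = "/workspace"  # default location stated in this post


def resolve_clone_target(settings):
    """Return the directory the repository would be cloned into,
    or None when cloning is disabled.

    `settings` is a plain dict using hypothetical keys `clone_enabled`
    and `clone_path`; these are NOT the real studio-config.yaml fields.
    """
    if not settings.get("clone_enabled", True):
        return None
    # Fall back to the default when no custom path is given.
    return settings.get("clone_path") or DEFAULT_CLONE_PATH


print(resolve_clone_target({}))
print(resolve_clone_target({"clone_path": "/home/analysis"}))
print(resolve_clone_target({"clone_enabled": False}))
```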
This cloning capability means you can version control not just your environment configuration, but also your analysis scripts, notebooks, and project documentation. Everything lives together in Git, and everything comes together in your Studio session.
Simplified Form Layout: Listening to Your Feedback
Based on your feedback, we completely redesigned the Studio creation form to make Git integration intuitive and efficient. The new form layout features:
- Smart field population: When you enter a Git repository URL, the form dynamically updates. Branch, tag, and commit options populate automatically, and relevant configuration fields adjust based on your schema.
- Clear visual hierarchy: Related fields are grouped logically, making it easy to understand what you're configuring at each step.
- Inherited settings display: Environment variables and resource labels from your compute environment are displayed transparently, so you always know what's inherited and what's custom.
- Progressive disclosure: Advanced options are available when you need them but don't clutter the initial view.
The result? A form that feels fast and responsive, guiding you through the configuration process without overwhelming you with options.
How It Works: From Repository to Running Session
Getting started with Git-integrated Studios is straightforward:
- Create your configuration: Add a .seqera directory to any Git repository with a studio-config.yaml file. Define your container, dependencies, compute requirements, and any other settings you need.
- Generate a Git-provider PAT: In your Git hosting service, generate a personal access token (PAT), then add it to Seqera as a workspace credential.
- Add your Studio: In Seqera, navigate to Studios and select the Git repository option. Enter your repository URL, select a branch or tag, and watch as the form populates with your configuration.
- Review and customize: The form displays all settings from your schema. You can add conda packages or optionally override any settings defined in your schema.
- Mount data: Data-links still need to be manually defined.
- Launch: Click "Add and start" to save your configuration and immediately launch your session, or use "Add only" to save it and launch later.
Within minutes, your Studio session is running with your repository contents cloned and ready, your environment configured exactly as specified, and your compute resources provisioned to match your requirements.
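Before pointing Seqera at a repository, it can help to confirm that the configuration file sits where the first step expects it. The following Python sketch is a local pre-flight check under that assumption: it only looks for `.seqera/studio-config.yaml` (the path named in this post) and does not validate the file's contents. The function name is made up for this example.

```python
from pathlib import Path

# Location described in this post: a .seqera directory at the repository root.
CONFIG_RELATIVE_PATH = Path(".seqera") / "studio-config.yaml"


def find_studio_config(repo_root):
    """Return the path to the Studio config inside repo_root, or None if absent."""
    candidate = Path(repo_root) / CONFIG_RELATIVE_PATH
    return candidate if candidate.is_file() else None


# Check the current working directory as an example.
config_path = find_studio_config(".")
if config_path is None:
    print("No .seqera/studio-config.yaml found in this directory")
else:
    print(f"Found Studio configuration at {config_path}")
```

Running this from the root of a local checkout gives a quick yes-or-no answer before you paste the repository URL into the Studio form.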
Real-World Applications
Teams are already using Git-integrated Studios in powerful ways:
- Research teams maintain separate branches for different projects, each with its own Studio configuration. Switching between projects is as simple as selecting a different branch.
- Training programs share Studio configurations publicly, allowing students to launch identical environments with a single click. No installation required, no configuration headaches.
- Data science teams version control their analysis environments alongside their code, ensuring complete reproducibility from data ingestion through final reporting.
- Infrastructure teams manage custom Dockerfiles in Git, allowing rapid collaboration and iteration to build bespoke analysis solutions.
Looking Forward
The Studios Git-provider integration represents a fundamental shift in how teams can manage their interactive analysis environments. By treating infrastructure as code and leveraging the same version control workflows you already use for your pipelines and analysis code, we're making it easier than ever to create reproducible, shareable, and maintainable Studios.
💡Note: Git-integrated Studios is available now in Seqera (Cloud and Enterprise). Check out our comprehensive documentation for detailed instructions, schema references, and best practices. Browse our example repository to see different configuration patterns in action.
