Pipeline Secrets: Secure Handling of Sensitive Information in Tower
Access to data resources increasingly depends on credentials, so Nextflow users frequently need to pass sensitive information to their workflows safely. Passwords, access keys, and API tokens are all commonly used by tasks during workflow execution. Today, we are delighted to offer Nextflow users a secure solution for handling sensitive information inside Tower: Pipeline Secrets. This new capability was developed in conjunction with our friends at Sage Bionetworks.
In the past, users relied on workarounds to supply such information, such as hardcoding the values in pipeline parameters or using environment variables. These approaches are prone to leaking values into logs or printed environment listings or, in the worst case, letting sensitive information make its way into public code repositories.
To address this requirement, Nextflow recently introduced a new directive to handle secrets inside a pipeline. Nextflow Secrets ensure secure transmission of sensitive content to workflow tasks and avoid accidentally exposing their content anywhere in either code or Nextflow logs. This dramatically enhances the runtime security of Nextflow workflows.
We leveraged the new Nextflow Secrets directive and built a native integration within Tower in the form of Pipeline Secrets that enables users to store keys and tokens needed by Nextflow tasks when executed in external platforms.
We have added a new section inside each workspace for storing Pipeline Secrets. Due to the sensitive nature of secrets, only workspace admins and owners are allowed to create, edit, and delete secrets.
Once a secret is created, its value is encrypted and is no longer retrievable by users. Pipeline Secrets defined inside a workspace can be used by all workflow executions launched inside that workspace, enabling secure sharing of sensitive data with a team without exposing the content unnecessarily.
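The access model described above (only admins and owners manage secrets, and values cannot be read back once stored) can be illustrated with a small sketch. This is not Tower's implementation: the class `WorkspaceSecrets`, its method names, and the role string `"launch"` used in the usage note are hypothetical, and the in-memory dictionary merely stands in for encrypted storage.

```python
# Roles allowed to manage secrets, per the text above.
MANAGER_ROLES = {"admin", "owner"}


class WorkspaceSecrets:
    """Hypothetical write-only secret store for a single workspace."""

    def __init__(self):
        # Assumption: a real service would encrypt values at rest;
        # a plain dict is used here only to keep the sketch runnable.
        self._values = {}

    def _require_manager(self, role):
        if role not in MANAGER_ROLES:
            raise PermissionError(f"role {role!r} cannot manage secrets")

    def put(self, role, name, value):
        """Create or update a secret (admins and owners only)."""
        self._require_manager(role)
        self._values[name] = value

    def delete(self, role, name):
        """Delete a secret (admins and owners only)."""
        self._require_manager(role)
        self._values.pop(name, None)

    def names(self):
        """List secret names; there is deliberately no method returning values."""
        return sorted(self._values)
```

In this sketch, `WorkspaceSecrets().put("launch", "NCBI_API_KEY", "...")` raises `PermissionError`, while the same call with `"admin"` succeeds and only the name is visible afterwards through `names()`.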
When a new workflow is launched, secrets are sent to the corresponding secret manager for the compute platform. For example, in the case of AWS-based Compute Environments, the secrets are sent to AWS Secrets Manager. Nextflow will then download these secrets internally and use them when they are referenced in the workflow task as described in the Nextflow Secrets documentation.
Finally, secrets are removed from the secrets manager when the pipeline completes (successfully or not) to avoid incurring additional costs with unused secrets.
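The lifecycle described above (secrets pushed to a secrets manager at launch, then removed when the run ends, whether it succeeded or not) maps naturally onto a try/finally pattern. The following is a minimal sketch under stated assumptions: `InMemorySecretsManager`, `provisioned_secrets`, and `run_workflow` are hypothetical names, and the in-memory class is a stand-in, not the AWS Secrets Manager API.

```python
from contextlib import contextmanager


class InMemorySecretsManager:
    """Stand-in for an external secrets manager (hypothetical, not a real API)."""

    def __init__(self):
        self.entries = {}

    def put(self, name, value):
        self.entries[name] = value

    def delete(self, name):
        self.entries.pop(name, None)


@contextmanager
def provisioned_secrets(manager, secrets):
    """Push secrets before the run and remove them afterwards, even on failure."""
    for name, value in secrets.items():
        manager.put(name, value)
    try:
        yield manager
    finally:
        # Cleanup runs whether the workflow completed or raised.
        for name in secrets:
            manager.delete(name)


def run_workflow(manager, secrets, task):
    """Run `task` with the secrets available for the duration of the run only."""
    with provisioned_secrets(manager, secrets) as available:
        return task(available.entries)
```

Because the cleanup sits in the `finally` clause, a task that raises still leaves the manager empty, which mirrors the "successfully or not" behavior described above.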
Tower adds a further layer of flexibility by letting each user define user-specific Pipeline Secrets. In this way, every user can create their required secrets without needing to share the information at the workspace level. This is relevant for personal secrets or when only one member has access to a specific resource, license, or token.
Pipeline secrets defined at the user level can be sent during pipeline execution alongside those defined in the workspace, following the precedence rules outlined in the documentation.
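Combining the two levels amounts to merging two name-to-value mappings under a precedence rule. The rule itself is defined in the documentation and is not restated in this post, so the sketch below simply assumes that a user-level secret overrides a workspace-level secret of the same name. The function `resolve_secrets` and the secret names are hypothetical.

```python
def resolve_secrets(workspace_secrets, user_secrets):
    """Merge workspace-level and user-level secrets into one mapping.

    ASSUMPTION: on a name clash the user-level value wins. Check the
    documentation for the actual precedence rules before relying on this.
    """
    merged = dict(workspace_secrets)  # start from the shared secrets
    merged.update(user_secrets)       # user-level entries override on clash
    return merged


workspace = {"SENTIEON_LICENSE": "workspace-license", "NCBI_API_KEY": "team-key"}
user = {"NCBI_API_KEY": "personal-key"}
resolved = resolve_secrets(workspace, user)
```

Here `resolved` keeps the shared `SENTIEON_LICENSE` and, under the stated assumption, uses the personal `NCBI_API_KEY`. The inputs are copied rather than modified, so both original mappings remain unchanged.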
The most immediate use case for Pipeline Secrets is the safe transmission of private keys without exposing their value in the logs or pipeline parameters.
At Seqera, we use Secrets to store and transmit software license keys for the Sentieon suite of bioinformatics secondary analysis tools. This enables our team to use the software safely via the nf-sentieon pipeline without having to worry about the setup or safe configuration of the license key.
Secrets can also be used to store private API keys, such as the NCBI key for E-utilities, without exposing their content in either the pipeline code or the environment variable setup.
Tower relies on third-party secret manager services to maintain security between the workflow execution context and the secret container. This means that no sensitive data is transmitted insecurely from Tower to the Compute Environment.
We have implemented a strict role-based permissions policy to access and update secret definitions, enhancing the security posture of organizations and reducing the possibility of human error.
Pipeline Secrets can currently be used in AWS Batch and HPC batch schedulers, expanding the scope of Nextflow secrets. We plan to add support for other platforms in the future.
We understand how vital data security is for users. Providing a way to ensure secure transmission of sensitive data is of utmost importance to us, be it sensitive data insulated between members of a team, patient data, or personally identifiable information.