Sunbeam 4.7.0 release: Updates for remote filesystems and partial workflows

By Charlie Bushman

We're excited to announce the release of Sunbeam v4.7.0! Sunbeam is a pipeline written in Snakemake that simplifies and automates many of the steps in metagenomic sequencing analysis. It uses conda to manage dependencies, so it doesn't require pre-installed dependencies or admin privileges, and it can be deployed on most workstations and clusters. Sunbeam was designed to be modular and extensible, allowing anyone to build on the core functionality.

This release primarily focuses on giving Sunbeam users more flexibility with their filesystems. What started as a simple attempt to use the AWS S3 Snakemake plugin quickly turned into a bigger project. For starters, sunbeamlib, the Python library that wraps our Snakemake workflows and provides commands such as sunbeam run, used to strictly check that sample and config file paths were present before running the workflow. If those resources lived on a remote filesystem like S3, sunbeamlib would error out. The solution was to relax these checks: sunbeamlib now warns when required resources appear to be missing but lets the run go ahead, so the workflow only fails further down the line if the files are truly inaccessible.
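As a rough illustration of the relaxed behavior, here is a minimal Python sketch of a check that warns instead of failing. It is not sunbeamlib's actual code: the function name check_inputs and the strict parameter are hypothetical and chosen only for this example.

```python
import warnings
from pathlib import Path


def check_inputs(paths, strict=False):
    """Return the paths that cannot be found on the local filesystem.

    Hypothetical helper, not part of sunbeamlib. With strict=True it
    mimics the old behavior and raises; otherwise it only warns, so
    files that live on remote storage (e.g. S3) do not block the run.
    """
    missing = [str(p) for p in paths if not Path(p).exists()]
    if missing and strict:
        raise FileNotFoundError(f"Missing required inputs: {missing}")
    for p in missing:
        warnings.warn(f"Input not found locally, continuing anyway: {p}")
    return missing
```

With strict left at its default, a missing path produces a warning and the caller decides what to do next; the workflow itself remains responsible for failing if the file really cannot be reached.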

The ability to start a run with initial files missing opened the door for a much-requested feature, the --skip flag. With this flag, users can now skip either the QC step or both the QC and decontamination steps of Sunbeam. Previously, users would need to create directories full of dummy files to do this. In addition, it was discovered that trimmomatic would quietly warn about missing Nextera adapter files instead of erroring out, so a check has been added to the workflow that requires those files to exist (unless the SUNBEAM_NO_ADAPTER environment variable is set). Some internals have changed as well (no more pin files, and a universal extension path function!), though these likely won't concern the end user.
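The adapter check described above can be sketched in a few lines of Python. This is an assumption-laden illustration, not the code in the Sunbeam workflow: the function name require_adapters and its env parameter are hypothetical, and the sketch treats the variable as "set" whenever it is present in the environment, whatever its value.

```python
import os
from pathlib import Path


def require_adapters(adapter_path, env=None):
    """Fail loudly if the adapter file is missing.

    Hypothetical helper, not Sunbeam's implementation. Returns False
    when the check is bypassed via SUNBEAM_NO_ADAPTER, True when the
    adapter file exists, and raises FileNotFoundError otherwise.
    """
    env = os.environ if env is None else env
    # Assumption: any value counts as "set", including an empty string.
    if "SUNBEAM_NO_ADAPTER" in env:
        return False
    if not Path(adapter_path).is_file():
        raise FileNotFoundError(f"Adapter file not found: {adapter_path}")
    return True
```

Raising here turns what used to be a quiet trimmomatic warning into an immediate, visible failure, while the environment variable keeps an escape hatch for runs that intentionally have no adapter file.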

That's all for this release. Happy Sunbeaming!