Snakemake pipelines in the High-Throughput Sequencing Center of AU-ENVS

Published

June 25, 2023

Modified

October 18, 2023

Workflow management tools like Snakemake or Nextflow have proven incredibly valuable in large-scale bioinformatics analysis. While I don’t have a particular preference between them, I have gained extensive experience with Snakemake as a Student Assistant at the Environmental Sciences Department of Aarhus University.

Snakemake files are Python files extended with declarative, Makefile-like syntactic sugar. Unlike traditional Makefiles, Snakemake is designed explicitly for scientific workflows.
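To make the Makefile-like idea concrete, here is a minimal sketch in plain Python. It is not Snakemake's implementation or API: the `Rule` type, the `plan` function, and the file names are all hypothetical, made up for illustration. It only shows the core idea of declaring rules by their inputs and outputs and letting the tool work backwards from the file you ask for.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Rule(NamedTuple):
    """Hypothetical, simplified stand-in for a workflow rule."""
    name: str
    inputs: List[str]
    outputs: List[str]


def plan(target: str, rules: List[Rule], existing: Set[str]) -> List[str]:
    """Return the rule names, in execution order, needed to build `target`."""
    # Map every output file to the rule that produces it.
    producers: Dict[str, Rule] = {out: r for r in rules for out in r.outputs}
    order: List[str] = []

    def visit(path: str, stack: Tuple[str, ...]) -> None:
        if path in existing:
            return  # the file is already there, nothing to run
        if path not in producers:
            raise ValueError(f"no rule produces {path!r}")
        rule = producers[path]
        if rule.name in stack:
            raise ValueError(f"cyclic dependency at rule {rule.name!r}")
        if rule.name in order:
            return  # already scheduled
        for dep in rule.inputs:
            visit(dep, stack + (rule.name,))
        order.append(rule.name)  # schedule only after all inputs are planned

    visit(target, ())
    return order


# Illustrative rules and file names (assumptions, not a real pipeline).
rules = [
    Rule("trim", ["reads.fastq"], ["trimmed.fastq"]),
    Rule("assemble", ["trimmed.fastq"], ["assembly.fasta"]),
    Rule("annotate", ["assembly.fasta"], ["annotation.gff"]),
]

print(plan("annotation.gff", rules, existing={"reads.fastq"}))
```

Asking for `annotation.gff` schedules the three rules from raw reads to annotation without me ever writing the order down. A real workflow manager does far more than this sketch (wildcards, timestamps, cluster submission), but the backwards resolution from the requested target is the part that resembles Make.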

In my role, I have focused on automating pipelines for the High-Throughput Sequencing Center of Aarhus University (Roskilde). This has allowed me to work with HPC clusters, use Singularity containers, and gain in-depth knowledge of the bioinformatics pipelines themselves.

To date, I have developed three Snakemake pipelines: a Whole-Genome Sequencing (WGS) pipeline for prokaryotes, a TotalRNA metatranscriptomics pipeline, and a long-amplicon sequencing pipeline for Oxford Nanopore. To see them in use, check the recently published papers by Jaarsma et al. (2023) and Scheel et al. (2023).

Snakemake wrappers, small reusable scripts that make it easier to run popular bioinformatics programs, have made my life much easier. When I had the time and saw the opportunity, I contributed PRs to the snakemake-wrappers repository, improving existing wrappers or adding new features :)

I have made all of my code publicly available on GitHub. I am very grateful to be in a work environment that encourages free software, and I continue to be impressed by the open contributions of the scientific community to bioinformatics.

If you would like to know more, I invite you to look at the organization’s repositories on GitHub (https://github.com/AU-ENVS-Bioinformatics/).

References

Jaarsma, Ate H, Katie Sipes, Athanasios Zervas, Francisco Campuzano Jiménez, Lea Ellegaard-Jensen, Mariane S Thøgersen, Peter Stougaard, Liane G Benning, Martyn Tranter, and Alexandre M Anesio. 2023. “Exploring Microbial Diversity in Greenland Ice Sheet Supraglacial Habitats Through Culturing-Dependent and -Independent Approaches.” FEMS Microbiology Ecology 99 (11). https://doi.org/10.1093/femsec/fiad119.
Scheel, Maria, Athanasios Zervas, Ruud Rijkers, Alexander Tøsdal Tveit, Flemming Ekelund, Francisco Campuzano Jiménez, Torben Røjle Christensen, and Carsten Suhr Jacobsen. 2023. “Abrupt Permafrost Thaw Triggers Activity of Copiotrophs and Microbiome Predators.” FEMS Microbiology Ecology, October. https://doi.org/10.1093/femsec/fiad123.