WILDS Docker Library

Curated collection of Docker images for reproducible bioinformatics workflows in the WILDS.

Project Status: Prototype – Useable, some support, open to feedback, unstable API. License: MIT

Overview

The WILDS Docker Library is a centralized repository of Docker images maintained by the Fred Hutch Data Science Lab (DaSL) for use in bioinformatics and computational workflows. This repository hosts Dockerfiles for a wide range of bioinformatics tools, ensuring consistent, reproducible, and secure environments for research workflows.

Each tool has its own directory with:

  • Dockerfiles for specific versions
  • Comprehensive documentation
  • Security vulnerability reports
  • Usage examples

Available Tools

The repository includes Docker images for popular bioinformatics tools including (but not limited to):

| Tool | Description | Available Versions |
|------|-------------|--------------------|
| STAR | RNA-seq aligner | 2.7.6a, 2.7.4a, latest |
| BWA | Burrows-Wheeler Aligner | 0.7.17, latest |
| Samtools | SAM/BAM manipulation | 1.11, 1.10, latest |
| BCFtools | VCF/BCF manipulation | 1.19, 1.11, latest |
| Picard | Sequence data manipulation | 3.1.1, latest |
| HISAT2 | Graph-based alignment | 2.2.1, latest |
| Scanpy | Single-cell analysis | 1.10.2, latest |
| scvi-tools | Deep learning for single-cell | 1.1.6, latest |
| Cell Ranger | 10x Genomics analysis | 6.0.2, latest |
| GATK | Genome analysis toolkit | 4.3.0.0, latest |

For a full list of available tools and versions, browse the repository directories.

Usage

Docker

# Pull from Docker Hub
docker pull getwilds/TOOLNAME:VERSION

# Pull from GitHub Container Registry
docker pull ghcr.io/getwilds/TOOLNAME:VERSION

# Example: Run a STAR alignment
docker run --rm -v /path/to/data:/data getwilds/star:latest STAR --runThreadN 4 --genomeDir /data/genome --readFilesIn /data/reads_1.fq /data/reads_2.fq --outFileNamePrefix /data/output/
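
Because the `latest` tag moves whenever a newer image is published, pinning an explicit version keeps a workflow reproducible. The following is a minimal sketch of a helper that assembles the image reference for either registry. The function name `wilds_image` and its `dockerhub`/`ghcr` argument are hypothetical and not part of this repository.

```shell
# Hypothetical helper: print the image reference for a tool and version.
# Usage: wilds_image TOOLNAME VERSION [dockerhub|ghcr]
wilds_image() {
    case "${3:-dockerhub}" in
        dockerhub) printf '%s\n' "getwilds/$1:$2" ;;
        ghcr)      printf '%s\n' "ghcr.io/getwilds/$1:$2" ;;
        *)         echo "unknown registry: $3" >&2; return 1 ;;
    esac
}

# Example (requires Docker): pull a pinned STAR version from GHCR
# docker pull "$(wilds_image star 2.7.6a ghcr)"
```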

Singularity/Apptainer

# Pull from Docker Hub
apptainer pull docker://getwilds/TOOLNAME:VERSION

# Pull from GitHub Container Registry
apptainer pull docker://ghcr.io/getwilds/TOOLNAME:VERSION

# Example: Run a STAR alignment
apptainer run --bind /path/to/data:/data docker://getwilds/star:latest STAR --runThreadN 4 --genomeDir /data/genome --readFilesIn /data/reads_1.fq /data/reads_2.fq --outFileNamePrefix /data/output/
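
Running directly from a `docker://` URI converts the image on each invocation, so on shared systems it can be convenient to pull once into a local SIF file and reuse it. This is a sketch under stated assumptions: `pull_sif` is a hypothetical helper, the `TOOL_VERSION.sif` file name is chosen here for illustration, and the `APPTAINER` variable is only a convenience for systems where the binary is still called `singularity`.

```shell
# Hypothetical helper: pull an image into a local SIF file once, then reuse it.
# Usage: pull_sif TOOLNAME VERSION   (prints the SIF file name on stdout)
pull_sif() {
    sif="$1_$2.sif"
    if [ ! -f "$sif" ]; then
        # Send any pull output to stderr so stdout carries only the file name.
        "${APPTAINER:-apptainer}" pull "$sif" "docker://getwilds/$1:$2" >&2 || return 1
    fi
    printf '%s\n' "$sif"
}

# Example (requires Apptainer):
# sif=$(pull_sif star 2.7.6a)
# apptainer exec --bind /path/to/data:/data "$sif" STAR --version
```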

Security and Vulnerability Monitoring

Automated Security Scanning

All Docker images in this repository undergo regular security scanning:

  1. Build-time scanning: Each image is automatically scanned for vulnerabilities when built
  2. Monthly scanning: A scheduled workflow scans all images on the first day of each month
  3. On-demand scanning: Repository maintainers can trigger scans for specific images at any time

Vulnerability Reports

Each tool directory contains vulnerability reports (CVEs_*.md) detailing:

  • Detected vulnerabilities with severity ratings
  • Affected components and versions
  • Available fixes or mitigations

If a particular tool's vulnerabilities concern you, please file an issue in the GitHub repository and cite the part of the vulnerability report you would like addressed.
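
Before filing an issue, it can help to see which reports mention a given term. The sketch below counts matching lines in each `CVEs_*.md` file of a tool directory. `cve_mentions` is a hypothetical helper, and the search keyword is up to you, since the layout of the reports is not assumed here.

```shell
# Hypothetical helper: for each CVEs_*.md report in a tool directory,
# print how many lines mention a keyword (case-insensitive).
# Usage: cve_mentions TOOLDIR KEYWORD
cve_mentions() {
    for report in "$1"/CVEs_*.md; do
        [ -f "$report" ] || continue          # no reports: the glob stays literal
        count=$(grep -ci -- "$2" "$report" || true)  # grep exits 1 on zero matches
        printf '%s: %s\n' "$report" "$count"
    done
}

# Example, run from a clone of the repository:
# cve_mentions star critical
```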

Automated Workflows

The repository uses GitHub Actions to automate several processes:

Docker Build and Publishing Workflow

Defined in .github/workflows/docker-update.yml, this workflow:

  • Builds and publishes Docker images when Dockerfiles are modified
  • Pushes images to both GitHub Container Registry and Docker Hub
  • Updates repository descriptions based on README content
  • Generates vulnerability reports for each newly built image

Security Monitoring Workflow

Defined in .github/workflows/docker-scout.yaml, this workflow:

  • Scans all Docker images monthly for security vulnerabilities using Docker Scout
  • Generates comprehensive vulnerability reports in each tool's respective subdirectory

Contributing

We welcome contributions to improve and expand the WILDS Docker Library. Please review the following guidelines:

Directory Structure

Each tool should have its own directory with:

  • Dockerfile(s) following the naming convention Dockerfile_VERSION
  • A comprehensive README.md describing the tool, versions, and usage examples
  • Vulnerability reports generated by our workflows
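
The naming convention makes a tool's versions discoverable from the file names alone. A minimal sketch, assuming the suffix after `Dockerfile_` is exactly the version string; `list_versions` is a hypothetical helper, not part of this repository.

```shell
# Hypothetical helper: print the version encoded in each Dockerfile_VERSION
# name found in a tool directory.
# Usage: list_versions TOOLDIR
list_versions() {
    for dockerfile in "$1"/Dockerfile_*; do
        [ -f "$dockerfile" ] || continue      # no Dockerfiles: the glob stays literal
        name=${dockerfile##*/}                # strip the directory part
        printf '%s\n' "${name#Dockerfile_}"   # strip the fixed prefix
    done
}

# Example, run from a clone of the repository:
# list_versions star
```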

Dockerfile Guidelines

  • Start from minimal base images (e.g., ubuntu, python, r-base)
  • Pin all software versions for reproducibility
  • Include all required labels (see template below)
  • Keep images small (ideally a few hundred MB, 2 GB at most)
  • Focus on a single tool per image (1-2 tools max)
  • Document all installed components

Required Labels

Each Dockerfile must include these labels:

LABEL org.opencontainers.image.title="toolname"
LABEL org.opencontainers.image.description="Short description of the tool and its purpose"
LABEL org.opencontainers.image.version="1.0"
LABEL org.opencontainers.image.authors="youremail@fredhutch.org"
LABEL org.opencontainers.image.url=https://hutchdatascience.org/
LABEL org.opencontainers.image.documentation=https://getwilds.org/
LABEL org.opencontainers.image.source=https://github.com/getwilds/wilds-docker-library
LABEL org.opencontainers.image.licenses=MIT
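
A quick local check can catch a missing label before review. The sketch below looks for each of the eight label keys listed above. `check_labels` is a hypothetical helper, and it assumes one `LABEL` instruction per line as in the template, so it will not recognize several labels combined in a single multi-line `LABEL` instruction.

```shell
# Hypothetical helper: report required labels missing from a Dockerfile.
# Returns non-zero if any label is missing.
# Usage: check_labels PATH/TO/Dockerfile_VERSION
check_labels() {
    missing=0
    for key in title description version authors url documentation source licenses; do
        if ! grep -q "^LABEL org\.opencontainers\.image\.$key=" "$1"; then
            echo "missing label: org.opencontainers.image.$key" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example:
# check_labels TOOLNAME/Dockerfile_VERSION && echo "all required labels present"
```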

Testing Your Changes

Before submitting a PR:

  1. Test your Dockerfile locally:

    docker build -t test-image -f TOOLNAME/Dockerfile_VERSION .

  2. Verify functionality:

    docker run --rm test-image [command-to-test-functionality]

  3. Run a local security scan (if Docker Scout is available):

    docker scout cves test-image
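
The three steps can be chained so that a failure in one stops the rest. This is a sketch, not a script shipped with the repository: `test_image` is a hypothetical wrapper, the `DOCKER` variable exists only so the commands can be previewed with `DOCKER=echo`, and the Scout step is skipped when the plugin is not installed.

```shell
# Hypothetical wrapper around the build, run, and scan steps.
# Run from the repository root.
# Usage: test_image TOOLNAME VERSION COMMAND [ARGS...]
test_image() {
    tool=$1 version=$2
    shift 2
    docker=${DOCKER:-docker}
    "$docker" build -t test-image -f "$tool/Dockerfile_$version" . || return 1
    "$docker" run --rm test-image "$@" || return 1
    # Docker Scout is optional: scan only when the plugin responds.
    if "$docker" scout version >/dev/null 2>&1; then
        "$docker" scout cves test-image || return 1
    fi
}

# Preview the commands without running Docker:
# DOCKER=echo test_image star 2.7.6a STAR --version
```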

Pull Request Process

  1. Fork the repository and create a feature branch
  2. Make changes following our guidelines
  3. Submit a pull request against the main branch
  4. Our CI/CD pipeline will automatically build test images and generate security reports
  5. Reviewers will provide feedback and approve changes

Support

For questions, bug reports, or feature requests:

License

Distributed under the MIT License. See LICENSE for details.
