Getting Started
This guide walks through setting up a container image, running the test analysis, and processing your own data.
Prerequisites
- Docker (install) or Singularity/Apptainer (available on most HPC systems)
- At least 4 GB RAM allocated to the container
- Paired-end sequencing data as .fastq.gz files
Step 1: Obtain the Container Image
Pull from Docker Hub (Recommended)
Launch Docker, then run:
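A minimal pull command, using the public image name given in the test instructions later in this guide (the tag may differ for newer releases):

```shell
docker pull rlporter24/dualindex-demux:1.0
```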
This downloads the pre-built image containing all code, dependencies, and test data.
Advanced: Build from Dockerfile
If you prefer to build the image yourself, download docker-demux.zip from the Zenodo repository and unzip it. The file structure should look like:
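Based on the files described below, the unzipped layout is roughly as follows (the exact nesting is an assumption; check your download):

```
docker-demux/
├── Dockerfile
├── requirements.txt
├── fastq_data/
└── 16s-demux/
```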
Dockerfile and requirements.txt are needed for the build. The fastq_data directory contains test data, and all analysis code lives in 16s-demux/.
Build the image:
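Assuming you run this from the unzipped directory containing the Dockerfile, a typical build command looks like:

```shell
docker build -t {name}:{version} .
```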
Replace {name}:{version} with your desired image name and version tag. The build takes a few minutes. A successful build ends with output like:
Warning
Building requires more resources than a typical HPC login node provides. Use a compute node or allocate sufficient resources through a job manager.
Build from Definition File
Download demux.zip from the Zenodo repository and unzip it. The file structure should look like:
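Based on the files described below (including the Slurm template mentioned in the warning), the unzipped layout is roughly as follows (the exact nesting is an assumption; check your download):

```
demux/
├── 16s-demux.def
├── requirements.txt
├── slurmBuild.sh
├── fastq_data/
└── 16s-demux/
```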
16s-demux.def and requirements.txt are needed for the build. The fastq_data directory contains test data, and all code lives in 16s-demux/.
Navigate to the directory containing 16s-demux.def and build:
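A typical build command (use singularity in place of apptainer on systems with older installs); the output filename matches the demux-image.sif file described below:

```shell
apptainer build demux-image.sif 16s-demux.def
```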
The build takes less than 10 minutes. A successful build ends with output like:
The file demux-image.sif will be created in your current working directory.
Warning
Building requires more resources than a typical HPC login node provides. Use a compute node or a job manager. A template Slurm script (slurmBuild.sh) is included in demux.zip — edit the SBATCH parameters for your system.
Step 2: Run the Test Analysis
Run the included test data to verify your setup works correctly.
Start an interactive container:
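A minimal interactive start, assuming the public image from Step 1:

```shell
docker run -it rlporter24/dualindex-demux:1.0
```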
Note
If you built your own image, replace rlporter24/dualindex-demux:1.0 with the {name}:{version} you used.
All test files are included in the container. Run the test:
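A sketch of the test invocation, assuming the container shell starts in (or you first cd into) the 16s-demux directory and that the workflow's default target runs the test; check the repository documentation for release-specific flags:

```shell
snakemake --cores 1
```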
Replace 1 with the desired number of cores. This should complete in under 5 minutes.
Interactive mode:
Open a shell in the container:
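A minimal example, assuming demux-image.sif is in your current directory (use singularity in place of apptainer if needed):

```shell
apptainer shell demux-image.sif
```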
Run the test from the 16s-demux directory:
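A sketch, assuming the 16s-demux directory is reachable from the container shell and the default target runs the test:

```shell
cd 16s-demux
snakemake --cores 1
```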
Using a job manager (e.g., Slurm):
Edit the included submit_test_snakemake.sh script for your system, then submit it. The script runs:
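The shape of the invocation is presumably as follows (the exec path and flags are assumptions; check the real script before submitting):

```shell
# Core command inside submit_test_snakemake.sh (sketch):
apptainer exec demux-image.sif snakemake --cores 1

# Submit the job:
sbatch submit_test_snakemake.sh
```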
Replace 1 with the desired number of cores. This should complete in about 10 minutes.
Verifying Test Output
A successful test run produces output like:
Within workflow/test_out/, you should see demux and trimmed directories:
- demux/ — four sets of 3 files (1 .extract.log + 2 .fastq.gz each), plus R1/ and R2/ directories. Each sample should have 8 files ending with -L*.fastq.gz (phases 0–7).
- trimmed/ — subdirectories group1 and group2, each containing R1/, R2/, and removed/, plus lowReadsSummary.txt and summary.txt.
Success
If all these files are present, the test run was successful.
Step 3: Run Your Analysis
1. Start a container:
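A minimal start, assuming the public image from Step 1 (the --name value demux is an arbitrary choice that makes the later docker cp and docker rm steps easier; substitute your own image name if you built one):

```shell
docker run -it --name demux rlporter24/dualindex-demux:1.0
```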
2. Copy input files into the container:
From a separate terminal, use docker cp to transfer your data:
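The general form, using the placeholders explained below:

```shell
docker cp {local_path} {CONTAINER}:{container_path}
```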
- {local_path} — path to a file or directory on your machine
- {CONTAINER} — the container name (find it with docker container ls)
- {container_path} — destination path relative to the 16s-demux directory
Tip
Find your container name by running docker container ls or checking the Docker Desktop GUI.
3. Update configuration:
Edit config/config.yaml to set the paths for your samplesheet, fastqlist, and (if needed) indices file. See the Configuration page for details.
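A sketch of the relevant entries (key names and paths are illustrative assumptions; the Configuration page is authoritative):

```yaml
samplesheet: path/to/samplesheet.tsv
fastqlist: path/to/fastqlist.txt
indices: path/to/indices.txt   # only if needed
```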
4. Dry run (recommended):
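From the 16s-demux directory, Snakemake's standard dry-run flag does this:

```shell
snakemake -n
```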
This validates inputs and shows planned jobs without executing them.
5. Execute:
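A sketch, assuming the workflow's default target processes the inputs set in config/config.yaml:

```shell
snakemake --cores 1
```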
Replace 1 with the desired number of cores. A successful run ends with:
6. Transfer outputs:
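From a separate terminal, reverse the direction of the docker cp command from step 2 (the in-container output location is an assumption based on the workflow/out directory mentioned later in this guide):

```shell
docker cp {CONTAINER}:16s-demux/workflow/out {local_path}
```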
7. Clean up:
Exit the container (exit), then remove it:
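Using the same container name as in the earlier docker cp step:

```shell
docker rm {CONTAINER}
```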
Warning
This removes the container and all data inside it. Make sure all outputs are transferred before removing.
1. Prepare inputs:
Ensure your input files (fastq data, fastqlist, samplesheet) are accessible on the file system. Update config/config.yaml with the correct paths. See the Configuration page for details.
2. Dry run (recommended):
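As in the Docker workflow, run Snakemake's dry-run flag from the 16s-demux directory:

```shell
snakemake -n
```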
3. Execute:
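A sketch, assuming the workflow's default target processes the inputs set in config/config.yaml:

```shell
snakemake --cores 1
```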
Replace 1 with the desired number of cores. A successful run ends with:
Outputs are generated in workflow/out/. To exit an interactive shell, type exit.
Note
Running interactively is only recommended for testing or troubleshooting. Login nodes typically lack sufficient resources — use a compute node or job manager for real analyses.