Designed for computational biologists and bioinformatics cores, this pipeline takes you from raw FASTQ files to fully quantified and quality-controlled expression matrices.
Every stage is encapsulated in a Viash component, so you can run the entire pipeline or pick and choose sub-workflows as needed.
The Bulk RNA-seq workflow consists of five sub-workflows that can be executed independently:
Prepare genome -- uncompress reference data and generate index files for aligners and quantifiers.
Pre-processing -- perform quality control on reads, trim adapters and remove contaminant sequences using FastQC, Trim Galore! and fastp.
Genome alignment and quantification -- align reads with STAR and quantify expression with Salmon or RSEM. UMI deduplication and BAM statistics are handled automatically.
Post-processing -- mark duplicate reads, assemble transcripts with StringTie and generate bigWig coverage files.
Quality control -- collate QC metrics from RSeQC, dupRadar, Qualimap, Preseq, DESeq2 and featureCounts, and produce an aggregate report with MultiQC.
Because each sub-workflow is a Viash component, you can also swap in new tools as they emerge.
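As a rough illustration of running only selected sub-workflows, the Python sketch below models the five sub-workflows as an ordered list and picks a subset while preserving pipeline order. The identifiers (`prepare_genome`, `select` and so on) are hypothetical names chosen for this example; they are not the pipeline's actual component names or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubWorkflow:
    name: str     # hypothetical identifier, not an actual component name
    tools: tuple  # tools named for this sub-workflow in the description above


# Listed in the same order as the description above.
SUB_WORKFLOWS = (
    SubWorkflow("prepare_genome", ()),
    SubWorkflow("pre_processing", ("FastQC", "Trim Galore!", "fastp")),
    SubWorkflow("genome_alignment_and_quantification", ("STAR", "Salmon", "RSEM")),
    SubWorkflow("post_processing", ("StringTie",)),
    SubWorkflow(
        "quality_control",
        ("RSeQC", "dupRadar", "Qualimap", "Preseq", "DESeq2", "featureCounts", "MultiQC"),
    ),
)


def select(requested=None):
    """Return the requested sub-workflows in pipeline order (all if None)."""
    if requested is None:
        return list(SUB_WORKFLOWS)
    wanted = set(requested)
    unknown = wanted - {wf.name for wf in SUB_WORKFLOWS}
    if unknown:
        raise ValueError(f"unknown sub-workflow(s): {sorted(unknown)}")
    # Iterate over the canonical tuple so the result keeps pipeline order,
    # regardless of the order in which names were requested.
    return [wf for wf in SUB_WORKFLOWS if wf.name in wanted]
```

For instance, `select(["quality_control", "pre_processing"])` returns the pre-processing entry first and the quality-control entry second, because the selection follows pipeline order and not request order.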
Every component in the Bulk RNA-seq pipeline is versioned and containerized.
Running the workflow produces a comprehensive audit trail including the exact container images and parameters used.
This record helps you meet regulatory requirements and reproduce results across platforms -- whether local servers, HPC clusters or cloud environments.
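The format of the pipeline's audit trail is not described here, so the following is only a minimal sketch of the idea: one record per component run that captures the version, the container image and the parameters, plus a hash of the parameters so that two runs can be compared quickly. The function name `audit_record` and all field names are assumptions made for illustration.

```python
import hashlib
import json


def audit_record(component, version, image, params):
    """Build one illustrative audit entry for a single component run."""
    # Serialize with sorted keys so identical parameters always produce
    # the same hash, whatever order they were supplied in.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return {
        "component": component,
        "version": version,
        "container_image": image,
        "parameters": params,
        "parameters_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }
```

Two records built from the same parameters share the same `parameters_sha256` value, which gives a simple check that a rerun on another platform used the same settings.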