rnaseq: A Modular, End-to-End Transcriptomics Workflow

Designed for computational biologists and bioinformatics cores, this pipeline takes you from raw FASTQ files to fully quantified and quality-controlled expression matrices.
Every stage is encapsulated in a Viash component, so you can run the entire pipeline or pick and choose sub-workflows as needed.

Pipeline Stages

The Bulk RNA-seq workflow consists of six sub-workflows that can be executed independently:

  • Prepare genome -- uncompress reference data and generate index files for aligners and quantifiers.
  • Pre-processing -- perform quality control on reads, trim adapters and remove contaminant sequences using FastQC, Trim Galore! and fastp.
  • Genome alignment and quantification -- align reads with STAR and quantify expression with Salmon or RSEM. UMI deduplication and BAM statistics are handled automatically.
  • Post-processing -- mark duplicate reads, assemble transcripts with StringTie and generate bigWig coverage files.
  • Pseudo-alignment and quantification -- run Salmon or Kallisto in pseudo-alignment mode as an alternative to full alignment.
  • Quality control -- collate QC metrics across steps (RSeQC, dupRadar, Qualimap, Preseq, DESeq2 and featureCounts) and produce an aggregate report via MultiQC.

Because each sub-workflow is a Viash component, individual tools can also be swapped for newer ones as they emerge.
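
As a rough illustration of that modularity, the Python sketch below models the six stages as named entries and builds a run plan for either the alignment route or the pseudo-alignment route. The stage identifiers, the `build_plan` helper, and the assumption that stages run in the order listed above are all hypothetical; this is not the pipeline's actual interface.

```python
# Hypothetical stage identifiers mirroring the list above; not real component names.
STAGES = [
    "prepare_genome",
    "pre_processing",
    "genome_alignment_and_quant",
    "post_processing",
    "pseudo_alignment_and_quant",
    "quality_control",
]

# Stages specific to one quantification route (an assumption for illustration).
ALIGNMENT_ONLY = {"genome_alignment_and_quant", "post_processing"}
PSEUDO_ONLY = {"pseudo_alignment_and_quant"}


def build_plan(route="alignment", skip=()):
    """Return the ordered stages to run for a route, minus any skipped ones."""
    if route == "alignment":
        excluded = set(PSEUDO_ONLY)
    elif route == "pseudo":
        excluded = set(ALIGNMENT_ONLY)
    else:
        raise ValueError(f"unknown route: {route!r}")
    excluded.update(skip)
    # Reject typos early instead of silently ignoring them.
    unknown = excluded - set(STAGES)
    if unknown:
        raise ValueError(f"unknown stage(s): {sorted(unknown)}")
    return [stage for stage in STAGES if stage not in excluded]


if __name__ == "__main__":
    print(build_plan("alignment"))
    print(build_plan("pseudo", skip=["quality_control"]))
```

Keeping the plan as plain data makes it easy to see which stages a given run would touch before anything is launched.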

Compliance & Reproducibility

Every component in the Bulk RNA-seq pipeline is versioned and containerized.
Running the workflow produces a comprehensive audit trail including the exact container images and parameters used.
This record supports regulatory compliance and lets you reproduce results across platforms -- whether local servers, HPC clusters or cloud environments.
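
To make the idea of an audit trail concrete, here is a minimal Python sketch of a run manifest that records container images and parameters and fingerprints the result. The `build_manifest` helper, the field names, and the example image reference are hypothetical and only illustrate the concept; the pipeline's real audit output may look quite different.

```python
import hashlib
import json


def build_manifest(components, params):
    """Collect container images and parameters into one JSON-serializable record.

    components: iterable of (name, version, image) tuples.
    params: mapping of parameter names to values.
    """
    manifest = {
        "components": [
            {"name": name, "version": version, "image": image}
            for name, version, image in components
        ],
        # Sort parameters so the record does not depend on input order.
        "params": dict(sorted(params.items())),
    }
    # Fingerprint the record so two runs can be compared at a glance.
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    manifest["sha256"] = hashlib.sha256(canonical).hexdigest()
    return manifest


if __name__ == "__main__":
    # Placeholder component and image names, not real registry paths.
    example = build_manifest(
        [("star_align", "0.0.0", "example.org/star:0.0.0")],
        {"paired": True, "min_length": 20},
    )
    print(json.dumps(example, indent=2))
```

Two runs with identical images and parameters yield the same fingerprint, which gives a quick way to check whether a result was produced under the same conditions.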