Modern sequencers can produce billions of reads in a single run, often across multiple lanes and barcodes.
For core facilities and pipeline developers, manually splitting and processing this data becomes a bottleneck.
The High-Throughput RNA-seq workflow extends the bulk RNA-seq pipeline to support demultiplexing and parallel processing of large sequencing runs.
Demultiplexing uses bcl-convert to convert base call (BCL) files to FASTQ and split reads by sample or index.
After demultiplexing, the resulting FASTQ files can be passed through the same sub-workflows used in the Bulk RNA-seq pipeline (pre-processing, alignment, quantification, etc.).
Shared components ensure consistency between low-throughput and high-throughput processing.
By leveraging Viash components, large-scale RNA-seq processing remains reproducible and auditable.
Each demultiplexing and processing step runs in a versioned container with a recorded software bill of materials (SBOM).
You can run the pipeline on HPC clusters or cloud platforms without modifying the workflow, ensuring that scaling up does not compromise governance.
This workflow is designed to process high-throughput RNA-seq data, where every well of a microarray plate is a sample. A FASTA file provided as input defines the mapping between sample barcodes and wells.
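As an illustration of the barcode-to-well mapping, the sketch below reads a barcode FASTA into a dictionary. It assumes that each FASTA header is a well ID and each sequence is that well's barcode; the exact layout the workflow expects is not specified here, so the function name `parse_barcode_fasta`, the well IDs, and the barcode sequences are all hypothetical.

```python
from typing import Dict, Iterable


def parse_barcode_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FASTA header (assumed here to be a well ID) to its barcode."""
    mapping: Dict[str, str] = {}
    well = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA header without an ID")
            well = fields[0]  # keep only the ID, drop any description
            if well in mapping:
                raise ValueError(f"duplicate well ID: {well}")
            mapping[well] = ""
        elif well is None:
            raise ValueError("sequence line found before any FASTA header")
        else:
            # Sequences may span several lines; concatenate them.
            mapping[well] += line.upper()
    return mapping


# Hypothetical two-well example; real barcode files will differ.
example = """>A1
ACGTACGTAC
>A2
TTGACCGTAA
""".splitlines()

barcodes = parse_barcode_fasta(example)
print(barcodes)
```

Rejecting duplicate well IDs early is a deliberate choice in this sketch: a repeated header would otherwise silently merge two barcodes into one sequence.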
The full workflow is split into two major subworkflows that can be run independently.
Input for the workflow has to be FASTQ files. For BCL or other formats, the demultiplex workflow needs to be run first.
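A quick pre-flight check can catch inputs that still need demultiplexing before a run is launched. The sketch below is not part of the workflow. It only inspects file names, and the helper names and the list of accepted suffixes are assumptions made for illustration.

```python
from pathlib import PurePath
from typing import Iterable, List

# Assumed set of FASTQ file name endings; adjust to local conventions.
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def is_fastq(path: str) -> bool:
    """Return True if the file name ends with a known FASTQ suffix."""
    return PurePath(path).name.lower().endswith(FASTQ_SUFFIXES)


def non_fastq_inputs(paths: Iterable[str]) -> List[str]:
    """List the inputs that would need the demultiplex workflow first."""
    return [p for p in paths if not is_fastq(p)]


# Hypothetical file names used only to exercise the check.
inputs = ["run1/sample_R1.fastq.gz", "run1/sample_R2.fq", "run1/L001/s_1_1101.bcl"]
pending = non_fastq_inputs(inputs)
print(pending)
```

An empty result means every input at least looks like FASTQ by name. The check does not validate file contents.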