Introduction to single-cell RNA-seq

Single-cell RNA-seq Technologies

Single-cell RNA-seq (scRNA-seq) technologies can be divided into two categories, tag-based and full-length, based on their capture methods and quantitative nature.

In tag-based scRNA-seq, cells are separated by emulsion/droplets, and individual cells are given a unique cell barcode prior to sequencing. An example of tag-based scRNA-seq is 10x Genomics (Zheng et al. 2017).

In full-length scRNA-seq, cells are physically separated into individual wells of a plate and are often also sorted by other means (e.g., fluorescence-activated cell sorting, FACS). With full-length scRNA-seq, each cell is sequenced individually and has its own fastq file. An example of full-length scRNA-seq is Smart-seq2 (Picelli et al. 2014).

For the purposes of this tutorial, we will focus on tag-based scRNA-seq, but it is important to keep in mind that the pre-processing steps and the biases to look out for in post-processing vary based on technology and how the cells are sorted.

For more extensive background on single-cell experimental methods, Predeus et al. have a very good tutorial for scRNA-seq. We will also refer extensively to the book Orchestrating Single-Cell Analysis with Bioconductor (Amezquita et al.).

Overall view of scRNA-seq tag-based workflow

Tag-based scRNA-seq

Example: 10x Genomics (Zheng et al. 2017)

Individual cells are separated by emulsion/droplets prior to cell lysis. Transcripts from each cell are then tagged with two barcodes: a cell-specific barcode and a Unique Molecular Identifier (UMI) (Islam et al. 2014). All transcripts from all cells are then pooled together and undergo PCR amplification and sequencing as if they are one sample.
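To make the two-barcode tagging concrete, here is a minimal sketch of how a barcode read could be split into its cell barcode and UMI. It assumes the 10x Genomics 3' v3 layout (a 16 bp cell barcode followed by a 12 bp UMI); other chemistries use different lengths, and the function name `split_r1` and the example sequence are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple

# Assumed read layout (10x Genomics 3' v3 chemistry): 16 bp cell barcode, then 12 bp UMI.
# Other kits and versions differ, so treat these as adjustable assumptions.
BARCODE_LEN = 16
UMI_LEN = 12


class ReadTag(NamedTuple):
    cell_barcode: str
    umi: str


def split_r1(r1_sequence: str, barcode_len: int = BARCODE_LEN, umi_len: int = UMI_LEN) -> ReadTag:
    """Split a barcode read into its cell barcode and UMI (hypothetical helper)."""
    needed = barcode_len + umi_len
    if len(r1_sequence) < needed:
        raise ValueError(f"read is {len(r1_sequence)} bp; expected at least {needed} bp")
    return ReadTag(
        cell_barcode=r1_sequence[:barcode_len],
        umi=r1_sequence[barcode_len:needed],
    )


# Made-up sequence: 16 bp barcode followed by a 12 bp UMI.
example_r1 = "AAACCCAAGAAACACT" + "GCTTAGGACCTA"
tag = split_r1(example_r1)
print(tag.cell_barcode, tag.umi)
```

In practice this splitting is handled by the alignment and quantification software rather than by hand; the sketch only shows what information the barcode read carries.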

Because each transcript is tagged with a different UMI before amplification, reads that share the same cell barcode, UMI, and gene can be identified as PCR duplicates and counted only once, which controls for PCR amplification errors and biases. Each sample has two fastq files: one containing the cell barcodes and UMIs (R1) and another containing the transcript sequence reads (R2).
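The idea behind UMI-based duplicate removal can be sketched in a few lines. This is a simplified illustration, not how any particular tool implements it: it assumes reads have already been assigned to a cell barcode, a UMI, and a gene, and it only collapses exact UMI matches (real quantification tools typically also correct for sequencing errors in barcodes and UMIs). The function name `count_unique_umis` and the toy reads are hypothetical.

```python
from collections import defaultdict


def count_unique_umis(reads):
    """Count distinct UMIs per (cell barcode, gene) pair.

    `reads` is an iterable of (cell_barcode, umi, gene) tuples.
    Reads sharing all three values are treated as PCR duplicates
    of one original molecule and counted once.
    """
    umis_seen = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        umis_seen[(cell_barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Toy data for illustration only.
reads = [
    ("CELL_A", "AACGT", "GeneX"),
    ("CELL_A", "AACGT", "GeneX"),  # same cell, UMI, and gene: a PCR duplicate
    ("CELL_A", "TTGCA", "GeneX"),  # different UMI: a second molecule
    ("CELL_B", "AACGT", "GeneX"),  # same UMI but a different cell: counted separately
    ("CELL_A", "AACGT", "GeneY"),  # same UMI but a different gene: counted separately
]

counts = count_unique_umis(reads)
for (cell_barcode, gene), n_molecules in sorted(counts.items()):
    print(cell_barcode, gene, n_molecules)
```

Here five reads collapse to four molecules, because the two identical CELL_A/GeneX reads are counted once.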

Pros

Can potentially run millions of cells at once.
Much less computationally demanding than full-length scRNA-seq.
Won’t take up all your computer’s storage.
Much cheaper per cell.

Cons

Sequencing is not bidirectional, so the data will likely have a more intense 3’ bias.
The sequencing depth per cell is generally lower than with full-length technologies.

Resources

Orchestrating Single-Cell Analysis with Bioconductor
Hemberg Lab scRNA-seq training course
ASAP: Automated Single-cell Analysis Pipeline is a web server that allows you to process scRNA-seq data (Gardeux et al. 2017).
Smith, “Unique Molecular Identifiers – the problem, the solution and the proof”, provides excellent background on UMIs.

CCDL for ALSF

2021
