Next-Generation Sequencing


Gene Expression Analysis using DESeq2

Calculate Expression Values

In RNA-Seq experiments, the expression of a gene or transcript is measured by counting the reads
mapped to its position in the genome. Expression values can be reported in different forms: raw
read counts, RPM (reads per million mapped reads), RPKM (reads per kilobase per million mapped
reads), FPKM (fragments per kilobase per million mapped fragments), and TPM (transcripts per
million). To measure the expression of a transcript relative to other transcripts in the same
sample, RPM, RPKM, FPKM, or TPM are used, because the counts need to be normalized. To compare
expression between two samples, on the other hand, read counts are the better option.
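The normalized measures above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the gene names, counts, and lengths are made-up values, the function names are hypothetical, and the total number of mapped reads is assumed to be the sum of the counts shown.

```python
def rpm(counts):
    """Reads per million: scale each count by the sample's total mapped reads."""
    total = sum(counts.values())
    return {gene: c / total * 1e6 for gene, c in counts.items()}


def rpkm(counts, lengths_bp):
    """Reads per kilobase per million: RPM additionally divided by gene length in kb."""
    total = sum(counts.values())
    return {
        gene: c / (lengths_bp[gene] / 1e3) / (total / 1e6)
        for gene, c in counts.items()
    }


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to sum to 1e6."""
    rate = {gene: c / lengths_bp[gene] for gene, c in counts.items()}
    denom = sum(rate.values())
    return {gene: r / denom * 1e6 for gene, r in rate.items()}


# Hypothetical toy data for a single sample (not from any real experiment).
counts = {"geneA": 500, "geneB": 1500, "geneC": 3000}
lengths_bp = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

rpm_values = rpm(counts)
rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
```

In this toy sample geneB has three times the reads of geneA but is also three times longer, so both end up with the same RPKM and TPM. TPM values always sum to one million within a sample, which is what makes them convenient for comparing transcripts within that sample.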

One frequently asked question is “Should we use total mapped reads or only uniquely mapped
reads to estimate expression levels?” Some genomes, especially those of plants, contain a high
proportion of repetitive regions, and many of these regions contain genes or pseudogenes. In such
genomes, half of the reads may map to multiple locations. In such cases, it is better to use
algorithms that can assign counts to multiple features. Similarly, if a read maps to two
overlapping features, it is worthwhile to assign a count to both features. For a better
understanding of their effect, one can always compare the expression results obtained with and
without multimapped and overlapping reads.
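The comparison suggested above can be sketched with a toy counter. This is only an illustration under stated assumptions: the read-to-feature assignments are invented, `count_reads` is a hypothetical helper rather than a function from any counting tool, and each multimapped or overlapping read simply contributes one count to every feature it hits.

```python
from collections import Counter


def count_reads(read_hits, include_multi=True):
    """Count reads per feature.

    read_hits maps a read ID to the list of features it was assigned to.
    With include_multi=False, reads hitting more than one feature are skipped.
    With include_multi=True, such reads add one count to every feature they hit.
    """
    counts = Counter()
    for features in read_hits.values():
        distinct = set(features)
        if len(distinct) > 1 and not include_multi:
            continue
        for feature in distinct:
            counts[feature] += 1
    return counts


# Hypothetical assignments: read2 and read4 each hit two features.
read_hits = {
    "read1": ["geneA"],
    "read2": ["geneA", "geneB"],
    "read3": ["geneB"],
    "read4": ["geneC", "geneD"],
}

with_multi = count_reads(read_hits, include_multi=True)
unique_only = count_reads(read_hits, include_multi=False)

# Features whose counts change between the two strategies.
affected = sorted(g for g in with_multi if with_multi[g] != unique_only[g])
```

Here geneC and geneD receive counts only when multi-feature reads are included, which is the kind of difference worth inspecting before settling on a counting strategy.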

Differential Expression

Various statistical methods are available for differential expression analysis of RNA-Seq data;
they differ in how they normalize the data and in the distribution model they assume. Algorithms
that use the negative binomial distribution, such as DESeq [25] and edgeR [26], are considered
sensitive and specific. However, running more than one differential expression method and
comparing the results can give more confidence that the detected genes are true positives.
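One simple way to combine several methods is to keep only the genes that more than one method reports. The sketch below assumes each method has already produced a list of significant genes; the method labels, gene names, and the `consensus_calls` helper are all hypothetical, and the example does not run DESeq or edgeR itself.

```python
from collections import Counter


def consensus_calls(calls_by_method, min_methods=2):
    """Return genes reported as differentially expressed by at least min_methods methods."""
    support = Counter()
    for genes in calls_by_method.values():
        # set() guards against a gene being listed twice by the same method.
        support.update(set(genes))
    return {gene for gene, n in support.items() if n >= min_methods}


# Hypothetical significant-gene lists from two separate analyses.
calls_by_method = {
    "method_1": ["geneA", "geneB", "geneC"],
    "method_2": ["geneB", "geneC", "geneD"],
}

consensus = consensus_calls(calls_by_method, min_methods=2)
either = consensus_calls(calls_by_method, min_methods=1)
```

Requiring agreement tends to trade some sensitivity for fewer false positives, so it can be useful to look at both the consensus set and the genes called by only one method.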
