Statistical methods for identifying differentially expressed genes in RNA-Seq experiments
1 Biostatistics Program, School of Public Health, LSU Health Sciences Center, 2020 Gravier Street, 3rd floor, New Orleans, LA, 70112, USA
2 Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
3 Department of Energy, Joint Genome Institute, Walnut Creek, CA, 94598, USA
4 Staff Scientist, Group Lead for Genome Analysis, DOE Joint Genome Institute, 2800 Mitchell Dr., MS100, Walnut Creek, CA, 94598, USA
Cell & Bioscience 2012, 2:26 doi:10.1186/2045-3701-2-26
Published: 31 July 2012
RNA sequencing (RNA-Seq) is rapidly replacing microarrays for profiling gene expression, offering much improved accuracy and sensitivity. One of the most common questions in a typical gene profiling experiment is how to identify a set of transcripts that are differentially expressed between experimental conditions. Some of the statistical methods developed for microarray data analysis can be applied to RNA-Seq data, with or without modification. Recently, several additional methods have been developed specifically for RNA-Seq data sets. This article reviews these statistical methods in depth, with the goal of providing a comprehensive guide for choosing appropriate metrics for RNA-Seq statistical analyses.