Review

Statistical methods for identifying differentially expressed genes in RNA-Seq experiments

Zhide Fang1*, Jeffrey Martin2,3 and Zhong Wang2,3,4

Author Affiliations

1 Biostatistics Program, School of Public Health, LSU Health Sciences Center, 2020 Gravier Street, 3rd floor, New Orleans, LA, 70112, USA

2 Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA

3 Department of Energy, Joint Genome Institute, Walnut Creek, CA, 94598, USA

4 Staff Scientist, Group Lead for Genome Analysis, DOE Joint Genome Institute, 2800 Mitchell Dr., MS100, Walnut Creek, CA, 94598, USA

Cell & Bioscience 2012, 2:26 doi:10.1186/2045-3701-2-26

Published: 31 July 2012

Abstract

RNA sequencing (RNA-Seq) is rapidly replacing microarrays for profiling gene expression, offering much improved accuracy and sensitivity. One of the most common questions in a typical gene profiling experiment is how to identify a set of transcripts that are differentially expressed between experimental conditions. Some of the statistical methods developed for microarray data analysis can be applied to RNA-Seq data, with or without modification. Recently, several additional methods have been developed specifically for RNA-Seq data sets. This article gives an in-depth review of these statistical methods, with the goal of providing a comprehensive guide for choosing appropriate metrics for RNA-Seq statistical analyses.