Differential expression analysis for sequence count data

S Anders, W Huber - Nature Precedings, 2010 - nature.com
Motivation
High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq), and cell counting. Statistical inference of differential signal in these data needs to take into account their natural variability throughout the dynamic range. When the number of replicates is small, error modeling is needed to achieve statistical power.
Results
We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power.
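As a rough illustration of the kind of error model described above, the following sketch parametrizes a negative binomial distribution directly by its mean and variance and evaluates its probability mass function. It is not the DESeq implementation: the function names `nb_params` and `nb_pmf` are hypothetical, the variance is passed in by hand rather than estimated by local regression, and the sketch assumes the variance exceeds the mean (overdispersion).

```python
import math


def nb_params(mean, variance):
    """Convert (mean, variance) to the (r, p) negative binomial parameters.

    Uses mean = r(1-p)/p and variance = r(1-p)/p^2, which requires
    variance > mean > 0. Hypothetical helper for illustration only.
    """
    if not (variance > mean > 0):
        raise ValueError("negative binomial requires variance > mean > 0")
    p = mean / variance
    r = mean * mean / (variance - mean)
    return r, p


def nb_pmf(k, mean, variance):
    """Probability of observing count k under NB(mean, variance)."""
    r, p = nb_params(mean, variance)
    # Work in log space so large counts do not overflow.
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


if __name__ == "__main__":
    # Assumed example values: a gene with mean count 10 and variance 30.
    for k in (0, 5, 10, 20):
        print(k, nb_pmf(k, 10.0, 30.0))
```

Parametrizing by mean and variance, rather than by the conventional (r, p) pair, makes it straightforward to plug in a variance that has been estimated as a function of the mean.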
Availability
A free open-source R/Bioconductor software package, called "DESeq", is available from http://www-huber.embl.de/users/anders/DESeq