Abstract
An important task in gene expression data analysis is to identify genes that are differentially expressed between two or more groups. Because biological experiments are often conducted with a relatively small number of samples, however, accurately estimating the variances of gene expression is challenging. To tackle this problem, we introduce a regularized t distribution and derive its statistical properties, including the probability density function and the moment generating function. We also introduce the noncentral regularized t distribution for computing the statistical power of hypothesis tests. For practical applications, we use the regularized t distribution to establish the null distribution of the regularized t statistic and then formulate a regularized t-test for detecting differentially expressed genes. Simulation studies and real data analysis show that our regularized t-test performs much better than the Bayesian t-test in the “limma” package, particularly when the sample sizes are small.
Original language | English |
---|---|
Pages (from-to) | 1884-1900 |
Number of pages | 17 |
Journal | Scandinavian Journal of Statistics |
Volume | 50 |
Issue number | 4 |
Early online date | 14 Apr 2023 |
Publication status | Published - Dec 2023 |
Scopus Subject Areas
- Statistics and Probability
- Statistics, Probability and Uncertainty
User-Defined Keywords
- Bayesian t-test
- hypothesis testing
- noncentral regularized t distribution
- regularized t distribution
- regularized t-test