Motivation: Researchers worldwide have generated a huge volume of genomic data, including thousands of genome-wide association studies (GWAS) and massive amounts of gene expression data from different tissues. Jointly analyzing these data to gain new biological insights has become a critical step in understanding the etiology of complex diseases. Because complex diseases have a polygenic architecture, identifying their risk genes remains challenging. Motivated by the risk genes shared among complex diseases and by tissue-specific gene expression patterns, we propose an Empirical Bayes approach to integrating Pleiotropy and Tissue-Specific information (EPS) for prioritizing risk genes.
Results: As demonstrated by extensive simulation studies, EPS greatly improves the power to identify disease-risk genes. EPS also enables rigorous hypothesis testing of pleiotropy and of tissue-specific expression patterns of risk genes. All model parameters can be adaptively estimated by the expectation-maximization (EM) algorithm we developed. We applied EPS to the bipolar disorder and schizophrenia GWAS from the Psychiatric Genomics Consortium, together with gene expression data for multiple tissues from the Genotype-Tissue Expression project. The results of the real data analysis demonstrate many advantages of EPS.
Scopus Subject Areas
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics