Introduction to Statistical Methods for Integrative Data Analysis in Genome-Wide Association Studies

Can YANG*, Xiang WAN, Jin Liu, Kwok Po NG

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Chapter › peer-review

2 Citations (Scopus)


Life scientists have long sought genetic variants associated with complex phenotypes in order to advance our understanding of complex genetic disorders. In the past decade, genome-wide association studies (GWASs) have identified many thousands of genetic variants, each associated with at least one complex phenotype. Despite these successes, one major challenge remains in fully characterizing the biological mechanisms of complex diseases. It has long been hypothesized that many complex diseases are driven by the combined effect of many genetic variants, each of which may have only a small effect, a property formally known as "polygenicity." Identifying these variants requires large sample sizes, a requirement that is usually beyond the capacity of a single GWAS. With the arrival of the big data era, many genomic consortia are generating enormous amounts of data to characterize the functional roles of genetic variants, and these data are widely available to the public. Integrating rich genomic data to deepen our understanding of genetic architecture calls for statistically rigorous methods for big genomic data analysis. In this book chapter, we present a brief introduction to recent progress in the development of statistical methodology for integrating genomic data. Our introduction begins with the discovery of polygenic genetic architecture and aims to provide a unified statistical framework for integrative analysis. In particular, we highlight the importance of integrative analysis of multiple GWASs and functional information. We believe that statistically rigorous integrative analysis can offer more biologically interpretable inference and drive new scientific insights.

Original language: English
Title of host publication: Big Data Analytics in Genomics
Editors: Ka-Chun Wong
Publisher: Springer Cham
Number of pages: 21
ISBN (Electronic): 9783319412795
ISBN (Print): 9783319412788, 9783319823126
Publication status: Published - 25 Oct 2016

Scopus Subject Areas

  • Computer Science(all)
  • Medicine(all)
  • Mathematics(all)

User-Defined Keywords

  • Genomic data
  • Methodology
  • Population genetics
  • SNP
  • Statistics

