Structure-Leveraged Methods in Breast Cancer Risk Prediction

Jun Fan, Yirong Wu, Ming Yuan, David Page, Jie Liu, Irene M. Ong, Peggy Peissig, Elizabeth Burnside

Research output: Contribution to journal › Journal article › peer-review


Predicting breast cancer risk has long been a goal of medical research in the pursuit of precision medicine. The goal of this study is to develop novel penalized methods to improve breast cancer risk prediction by leveraging structure information in electronic health records. We conducted a retrospective case-control study, garnering 49 mammography descriptors and 77 high-frequency/low-penetrance single-nucleotide polymorphisms (SNPs) from an existing personalized medicine data repository. Structured mammography reports and breast imaging features have long been part of a standard electronic health record (EHR), and genetic markers likely will be in the near future. Lasso and its variants are widely used approaches to integrated learning and feature selection, and our methodological contribution is to incorporate the dependence structure among the features into these approaches. More specifically, we propose a new methodology by combining group penalty and ℓ_p (1 ≤ p ≤ 2) fusion penalty to improve breast cancer risk prediction, taking into account structure information in mammography descriptors and SNPs. We demonstrate that our method provides benefits that are both statistically significant and potentially significant to people's lives.
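The abstract names the two ingredients of the penalty but does not give its exact form. The following is a minimal sketch of one plausible reading, not the authors' formulation: a group-lasso-style term (sum of Euclidean norms over feature groups) plus a fusion term that sums |beta_j - beta_k|^p over linked feature pairs. The function names, the pairwise form of the fusion term, the weights `lam_group` and `lam_fusion`, and the toy coefficients are all assumptions made for illustration.

```python
import numpy as np


def group_penalty(beta, groups):
    """Sum of Euclidean norms of the coefficient sub-vectors, one per group."""
    return float(sum(np.linalg.norm(beta[g]) for g in groups))


def fusion_penalty(beta, pairs, p):
    """Sum of |beta_j - beta_k|**p over linked feature pairs (assumed form)."""
    if not 1 <= p <= 2:
        raise ValueError("p must lie in [1, 2]")
    return float(sum(abs(beta[j] - beta[k]) ** p for j, k in pairs))


def combined_penalty(beta, groups, pairs, lam_group, lam_fusion, p):
    """Weighted sum of the group term and the fusion term (hypothetical weights)."""
    return (lam_group * group_penalty(beta, groups)
            + lam_fusion * fusion_penalty(beta, pairs, p))


# Toy example: four coefficients in two groups, with one linked pair per group.
beta = np.array([3.0, 4.0, 0.0, 1.0])
groups = [[0, 1], [2, 3]]
pairs = [(0, 1), (2, 3)]
value = combined_penalty(beta, groups, pairs, lam_group=1.0, lam_fusion=0.5, p=2)
```

Under this reading, p = 1 gives a fused-lasso-style term that encourages linked coefficients to be exactly equal, while p = 2 gives a smoother quadratic term that only shrinks their difference. In a full model such a penalty would be added to a loss such as the logistic log-likelihood, which is omitted here.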

Original language: English
Pages (from-to): 2956–2970
Number of pages: 15
Journal: Journal of Machine Learning Research
Issue number: 1
Publication status: Published - Jan 2016

User-Defined Keywords

  • structure information
  • breast cancer risk prediction
  • mammography descriptors
  • genetic variants
  • personalized medicine

