TY - JOUR
T1 - OmiEmbed
T2 - A Unified Multi-Task Deep Learning Framework for Multi-Omics Data
AU - Zhang, Xiaoyu
AU - Xing, Yuting
AU - Sun, Kai
AU - Guo, Yike
N1 - Funding Information:
This research was funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement 764281.
Publisher Copyright:
© 2021 by the authors.
PY - 2021/6/18
Y1 - 2021/6/18
N2 - High-dimensional omics data contain intrinsic biomedical information
that is crucial for personalised medicine. Nevertheless, it is
challenging to capture this information from genome-wide data, due to the large
number of molecular features and small number of available samples,
which is also called “the curse of dimensionality” in machine learning.
To tackle this problem and pave the way for machine learning-aided
precision medicine, we proposed a unified multi-task deep learning
framework named OmiEmbed to capture biomedical information from
high-dimensional omics data with the deep embedding and downstream task
modules. The deep embedding module learnt an omics embedding that mapped
multiple omics data types into a latent space with lower
dimensionality. Based on the new representation of multi-omics data,
different downstream task modules were trained simultaneously and
efficiently with the multi-task strategy to predict the comprehensive
phenotype profile of each sample. OmiEmbed supports multiple tasks for
omics data including dimensionality reduction, tumour type
classification, multi-omics integration, demographic and clinical
feature reconstruction, and survival prediction. The framework
outperformed other methods on all three types of downstream tasks and
achieved better performance with the multi-task strategy compared to
training them individually. OmiEmbed is a powerful and unified framework
that can be widely adapted to various applications of high-dimensional
omics data and has great potential to facilitate more accurate and
personalised clinical decision making.
KW - Cancer classification
KW - Deep learning
KW - Multi-omics data
KW - Multi-task learning
KW - Survival prediction
UR - http://www.scopus.com/inward/record.url?scp=85108064167&partnerID=8YFLogxK
U2 - 10.3390/cancers13123047
DO - 10.3390/cancers13123047
M3 - Journal article
AN - SCOPUS:85108064167
SN - 2072-6694
VL - 13
JO - Cancers
JF - Cancers
IS - 12
M1 - 3047
ER -