TY - JOUR
T1 - DeSide: A unified deep learning approach for cellular deconvolution of tumor microenvironment
AU - Xiong, Xin
AU - Liu, Yerong
AU - Pu, Dandan
AU - Yang, Zhu
AU - Bi, Zedong
AU - Tian, Liang
AU - Li, Xuefei
N1 - Funding Information:
We thank Prof. Leihan Tang, Dr. Law Ellie Yuen Yi, and Dr. Adam George Craig from Hong Kong Baptist University for their critical feedback and discussion on the manuscript, as well as Dr. Rongji Mu from Shanghai Jiao Tong University and Prof. Wei Liang from Xiamen University for their valuable discussion on the prognostic analyses. This work was supported by the National Key Research and Development Program of China (2021YFA0911100), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB0480000), the National Natural Science Foundation of China (32170672 and 32000886), the Guangdong Basic and Applied Basic Research Foundation (2021A1515012461), and the Shenzhen Science and Technology Program (ZDSYS20220606100606013) to X.L.; the Research Grants Council of Hong Kong (C2005-22Y and 12301723), the National Natural Science Foundation of China (12275229), the Hong Kong Chinese Medicine Development Fund (22B2/049A), and the Initiation Grant for Faculty Niche Research Areas of Hong Kong Baptist University (RC-FNRA-IG/23-24/SCI/05) to L.T.
Publisher Copyright:
© 2024 the Author(s). Published by PNAS.
PY - 2024/11/12
Y1 - 2024/11/12
N2 - Cellular deconvolution via bulk RNA sequencing (RNA-seq) presents a cost-effective and efficient alternative to experimental methods such as flow cytometry and single-cell RNA-seq (scRNA-seq) for analyzing the complex cellular composition of tumor microenvironments. Despite challenges due to heterogeneity within and among tumors, our innovative deep learning–based approach, DeSide, shows exceptional accuracy in estimating the proportions of 16 distinct cell types and subtypes within solid tumors. DeSide integrates biological pathways and assesses noncancerous cell types first, effectively sidestepping the issue of highly variable gene expression profiles (GEPs) associated with cancer cells. By leveraging scRNA-seq data from six cancer types and 185 cancer cell lines across 22 cancer types as references, our method introduces distinctive sampling and filtering techniques to generate a high-quality training set that closely replicates real tumor GEPs, based on The Cancer Genome Atlas (TCGA) bulk RNA-seq data. With this model and high-quality training set, DeSide outperforms existing methods in estimating tumor purity and the proportions of noncancerous cells within solid tumors. Our model precisely predicts cellular compositions across 19 cancer types from TCGA and proves its effectiveness with multiple additional external datasets. Crucially, DeSide enables the identification and analysis of combinatorial cell type pairs, facilitating the stratification of cancer patients into prognostically significant groups. This approach not only provides deeper insights into the dynamics of tumor biology but also highlights potential therapeutic targets by underscoring the importance of specific cell type or subtype interactions.
KW - bulk RNA sequencing
KW - cellular deconvolution
KW - deep learning
KW - single-cell RNA sequencing
KW - tumor microenvironment
UR - http://www.scopus.com/inward/record.url?scp=85208803315&partnerID=8YFLogxK
U2 - 10.1073/pnas.2407096121
DO - 10.1073/pnas.2407096121
M3 - Journal article
C2 - 39514318
AN - SCOPUS:85208803315
SN - 0027-8424
VL - 121
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 46
M1 - e2407096121
ER -