Accurate and interpretable gene expression imputation on scRNA-seq data using IGSimpute

Ke Xu, Chin Wang Cheong, Werner P Veldsman, Aiping Lyu*, William K Cheung*, Lu Zhang*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

2 Citations (Scopus)

Abstract

Single-cell ribonucleic acid sequencing (scRNA-seq) enables the quantification of gene expression at the transcriptomic level with single-cell resolution, enhancing our understanding of cellular heterogeneity. However, the excessive missing values present in scRNA-seq data hinder downstream analysis. While numerous imputation methods have been proposed to recover scRNA-seq data, high imputation performance often comes with low or no interpretability. Here, we present IGSimpute, an accurate and interpretable imputation method that recovers missing values in scRNA-seq data using an interpretable instance-wise gene selection layer (GSL). IGSimpute outperforms 12 other state-of-the-art imputation methods on 13 out of 17 datasets from different scRNA-seq technologies, achieving the lowest mean squared error, the chosen benchmark metric. We demonstrate that IGSimpute can give unbiased estimates of the missing values compared to other methods, regardless of whether the average gene expression values are small or large. Clustering results of imputed profiles show that IGSimpute offers statistically significant improvement over other imputation methods. Taking the heart-and-aorta and the limb muscle tissues as examples, we show that IGSimpute can also denoise gene expression profiles by removing outlier entries with unexpectedly high expression values via the instance-wise GSL. We also show that genes selected by the instance-wise GSL could indicate the age of B cells from bladder fat tissue of the Tabula Muris Senis atlas. IGSimpute can impute one million cells in 64 min and is thus applicable to large datasets.
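The abstract names mean squared error as the benchmark metric, but this record does not describe the evaluation protocol. As a rough illustration only, one common way to score an imputation method is to hide a random subset of observed entries, impute them, and compute the MSE on the hidden positions. The sketch below assumes that setup; the function names (`mask_entries`, `gene_mean_impute`, `masked_mse`), the toy matrix, and the per-gene mean baseline are all hypothetical stand-ins and are not IGSimpute or the authors' benchmark.

```python
import random


def mask_entries(matrix, fraction, seed=0):
    """Zero out a random fraction of the nonzero entries (assumed protocol).

    Returns a masked copy of the matrix and the list of hidden (cell, gene) positions.
    """
    rng = random.Random(seed)
    nonzero = [(i, j) for i, row in enumerate(matrix)
               for j, value in enumerate(row) if value != 0]
    n_hidden = max(1, int(len(nonzero) * fraction))
    hidden = rng.sample(nonzero, n_hidden)
    masked = [list(row) for row in matrix]
    for i, j in hidden:
        masked[i][j] = 0.0
    return masked, hidden


def gene_mean_impute(masked):
    """Hypothetical baseline: replace zeros with the mean of each gene's nonzero values."""
    n_genes = len(masked[0])
    means = []
    for j in range(n_genes):
        observed = [row[j] for row in masked if row[j] != 0]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[value if value != 0 else means[j] for j, value in enumerate(row)]
            for row in masked]


def masked_mse(truth, imputed, hidden):
    """Mean squared error restricted to the hidden positions."""
    return sum((truth[i][j] - imputed[i][j]) ** 2 for i, j in hidden) / len(hidden)


# Toy cells-by-genes expression matrix (made-up values, for illustration only).
expr = [
    [1.0, 0.0, 3.0],
    [2.0, 5.0, 0.0],
    [1.5, 4.0, 2.5],
    [0.0, 6.0, 3.5],
]

masked, hidden = mask_entries(expr, fraction=0.25)
imputed = gene_mean_impute(masked)
score = masked_mse(expr, imputed, hidden)
print(f"hidden entries: {len(hidden)}, masked MSE: {score:.4f}")
```

Under this assumed setup, a lower score means the imputed values sit closer to the held-out measurements; any real method would take the place of `gene_mean_impute`.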

Original language: English
Article number: bbad124
Number of pages: 12
Journal: Briefings in Bioinformatics
Volume: 24
Issue number: 3
Early online date: 10 Apr 2023
Publication status: Published - May 2023

Scopus Subject Areas

  • Information Systems
  • Molecular Biology

User-Defined Keywords

  • deep neural network
  • dropout imputation
  • model interpretability
  • single-cell RNA sequencing
