On improved EM algorithm and confidence interval construction for incomplete r × c tables

Man Lai TANG*, Kai Wang Ng, Guo Liang Tian, Ming Tan

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

19 Citations (Scopus)


Constructing confidence intervals (CIs) for functions of cell probabilities (e.g., rate difference, rate ratio and odds ratio) is a standard procedure in categorical data analysis for clinical trials and medical studies. In the presence of incomplete data, existing methods can be problematic. For example, the inverse of the observed information matrix may not exist, in which case asymptotic CIs based on the delta method are unavailable. Even when the inverse exists, large-sample delta methods are generally unreliable in small-sample studies. In addition, the existing expectation-maximization (EM) algorithm based on conventional data augmentation (DA) may converge slowly because it introduces too many latent variables. In this article, for r × c tables with incomplete data, we propose a novel DA scheme that requires fewer latent variables and consequently leads to a more efficient EM algorithm. We present two bootstrap-type CIs for parameters of interest via the new EM algorithm, with and without the normality assumption. For r × c tables with only one incomplete/supplementary margin, the improved EM algorithm converges in only one step, and the associated maximum likelihood estimates can hence be obtained in closed form. Theoretical and simulation results show that the proposed EM algorithm outperforms the existing EM algorithm. Three real data sets, from a neurological study, a rheumatoid arthritis study and a wheeze study, are used to illustrate the methodologies.
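To make the single-supplementary-margin case in the abstract concrete, here is a minimal Python sketch. It is not the authors' new DA scheme, which the abstract does not spell out. It runs a textbook EM in which each supplementary row count is split across columns, and compares the result with the standard closed-form MLE obtained by factoring the likelihood into row margins and within-row conditionals. The function names and the counts are hypothetical and chosen purely for illustration.

```python
def em_row_supplement(n, m, tol=1e-12, max_iter=10_000):
    """Textbook EM for an r x c table (an illustrative sketch).

    n[i][j]: fully classified counts (every row total assumed positive).
    m[i]:    supplementary counts classified by row only.
    Returns the estimated cell probabilities and the iterations used.
    """
    r, c = len(n), len(n[0])
    total = sum(map(sum, n)) + sum(m)
    p = [[1.0 / (r * c)] * c for _ in range(r)]  # uniform starting values
    for it in range(1, max_iter + 1):
        new_p = []
        for i in range(r):
            row_sum = sum(p[i])
            # E-step: allocate m[i] to columns in proportion to p[i][j] / p[i][+].
            # M-step: divide the completed counts by the total sample size.
            new_p.append([(n[i][j] + m[i] * p[i][j] / row_sum) / total
                          for j in range(c)])
        change = max(abs(new_p[i][j] - p[i][j])
                     for i in range(r) for j in range(c))
        p = new_p
        if change < tol:
            return p, it
    return p, max_iter


def closed_form_mle(n, m):
    """Closed-form MLE: p_ij = (n_ij / n_i+) * (n_i+ + m_i) / N."""
    total = sum(map(sum, n)) + sum(m)
    return [[(n_ij / sum(row)) * (sum(row) + m_i) / total for n_ij in row]
            for row, m_i in zip(n, m)]


# Hypothetical counts, not taken from the paper's data sets.
n = [[20, 10], [5, 15]]
m = [10, 6]
p_em, iters = em_row_supplement(n, m)
p_cf = closed_form_mle(n, m)
print(iters, p_em, p_cf)
```

Under these assumptions the iterative estimates agree with the closed-form ones, but the textbook EM needs several iterations to get there. A scheme that reaches the closed-form solution directly, as the abstract reports for the improved algorithm, avoids that iteration.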

Original language: English
Pages (from-to): 2919-2933
Number of pages: 15
Journal: Computational Statistics and Data Analysis
Issue number: 6
Publication status: Published - 1 Mar 2007

Scopus Subject Areas

  • Statistics and Probability
  • Computational Mathematics
  • Computational Theory and Mathematics
  • Applied Mathematics

User-Defined Keywords

  • Bootstrap
  • Confidence interval
  • Convergence rate
  • Data augmentation
  • EM algorithm
  • Incomplete data
  • Paired binary data
  • Small sample size

