Sample size determination for disease prevalence studies with partially validated data

Shi Fang Qiu*, Wai Yin Poon, Man Lai TANG

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Disease prevalence is an important topic in medical research, and its study is based on data that are obtained by classifying subjects according to whether a disease has been contracted. Classification can be conducted with high-cost gold-standard tests or low-cost screening tests, but the latter are subject to the misclassification of subjects. As a compromise between the two, many research studies use partially validated datasets in which all data points are classified by fallible tests, and some of the data points are validated in the sense that they are also classified by the completely accurate gold-standard test. In this article, we investigate the determination of sample sizes for disease prevalence studies with partially validated data. We use two approaches. The first is to find sample sizes that can achieve a pre-specified power of a statistical test at a chosen significance level, and the second is to find sample sizes that can control the width of a confidence interval with a pre-specified confidence level. Empirical studies have been conducted to demonstrate the performance of various testing procedures with the proposed sample sizes. The applicability of the proposed methods is illustrated by a real-data example.
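The two sample-size criteria named in the abstract (target power, and target confidence-interval width) can be illustrated in the simplest setting, where every subject is classified by an error-free test. The sketch below is not the authors' procedure for partially validated data; it uses only the textbook normal-approximation formulas for a single proportion, and the function names `n_for_power` and `n_for_ci_width` are hypothetical names chosen for this illustration.

```python
import math
from statistics import NormalDist

# ASSUMPTION: all subjects are classified without error (no double sampling),
# so these are the standard single-proportion formulas, not the paper's method.

def n_for_ci_width(p_guess, width, conf_level=0.95):
    """Smallest n for which the Wald interval for a proportion has
    total width at most `width`, given a planning value `p_guess`."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Wald width is 2 * z * sqrt(p(1-p)/n); solve for n and round up.
    return math.ceil(4 * z**2 * p_guess * (1 - p_guess) / width**2)

def n_for_power(p0, p1, alpha=0.05, power=0.80):
    """Smallest n for a two-sided one-sample z-test of H0: p = p0 to reach
    approximately the target power when the true prevalence is p1."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    num = z_a * math.sqrt(p0 * (1 - p0)) + z_b * math.sqrt(p1 * (1 - p1))
    return math.ceil((num / (p1 - p0)) ** 2)

if __name__ == "__main__":
    # Illustrative planning values only; they are not taken from the article.
    print(n_for_ci_width(p_guess=0.10, width=0.05))
    print(n_for_power(p0=0.10, p1=0.15))
```

Both functions show the same trade-off the abstract describes: a narrower interval or a smaller detectable difference requires a larger sample. With a fallible screening test the required sizes would differ, which is the case the article addresses.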

Original language: English
Pages (from-to): 37-63
Number of pages: 27
Journal: Statistical Methods in Medical Research
Volume: 25
Issue number: 1
Publication status: Published - 1 Feb 2016

Scopus Subject Areas

  • Epidemiology
  • Statistics and Probability
  • Health Information Management

User-Defined Keywords

  • Asymptotic inference
  • disease prevalence
  • double-sampling
  • partially validated data
  • sample size
