Abstract
Continuing technological advances and the democratization of digital tools such as computers and smartphones offer researchers new ways to collect large amounts of health data for clinical research. Such data, called real-world data, appears to be a perfect complement to traditional randomized clinical trials and has become increasingly important in health decisions. Because of its longitudinal nature, real-world data is subject to specific, well-known methodological issues, namely those arising in the analysis of cluster-correlated data, missing data and longitudinal data itself. These issues have been widely discussed in the literature, and many methods have been proposed to address them. For example, mixed and trajectory models have been developed to explore longitudinal data sets, imputation methods can address missing data, and multilevel models facilitate the treatment of cluster-correlated data. Nevertheless, the analysis of real-world longitudinal occupational health data remains difficult, especially when these methodological challenges overlap. The purpose of this article is to present various solutions developed in the literature for dealing with cluster-correlated data, missing data and longitudinal data, which sometimes overlap, in an occupational health context. The novelty and usefulness of our approach are supported by a step-by-step search strategy and an example from the Wittyfit database, an epidemiological database of occupational health data. We hope that this article will facilitate the work of researchers in the field and improve the accuracy of future studies.
Original language | English |
---|---|
Article number | 7023 |
Journal | International Journal of Environmental Research and Public Health |
Volume | 19 |
Issue number | 12 |
Publication status | Published - 2 Jun 2022 |
Scopus Subject Areas
- Public Health, Environmental and Occupational Health
- Pollution
- Health, Toxicology and Mutagenesis
User-Defined Keywords
- cluster-correlated data
- longitudinal data
- methodological issues
- missing data
- modeling
- occupational health
- real-world data