Checking the adequacy of regression models with complex data structure

  • Xu Guo

Student thesis: Doctoral Thesis

Abstract

In this thesis, we investigate the model checking problem for parametric regression models when the response is missing at random or is nonignorably missing. We also propose a hypothesis-adaptive procedure based on dimension reduction theory. Finally, to extend our methods to the missing response situation, we consider the dimension reduction problem with response missing at random.

The first part of the thesis introduces model checking for parametric models with response missing at random, which is a more general missing mechanism than missing completely at random. Unlike existing approaches, the two proposed tests have normal limiting null distributions whether the inverse probability weight is estimated parametrically or nonparametrically. Thus, p-values can be easily determined. This observation shows that the slow convergence rate of nonparametric estimation does not have a significant effect on the asymptotic behaviour of the tests, although it may have an impact in finite sample scenarios. The tests can detect alternatives distinct from the null hypothesis at a nonparametric rate, which is an optimal rate for locally smoothing-based methods in this area. A simulation study is carried out to examine the performance of the tests. The tests are also applied to a data set on monozygotic twins for illustration.

In the second part of the thesis, we consider model checking for the general linear regression model with non-ignorable missing response. Based on an exponential tilting model, we first propose three estimators for the unknown parameter in the general linear regression model. Three empirical process-based tests are then constructed. We discuss the asymptotic properties of the proposed tests under the null and local alternative hypotheses in different scenarios. We find that these three tests perform the same in the asymptotic sense. Simulation studies are also carried out to assess the performance of the proposed test procedures.

In the third part, we revisit traditional local smoothing model checking procedures. Noticing that the general nonparametric regression model can be considered as a special multi-index model, we propose an adaptive testing procedure based on dimension reduction theory. To our surprise, our method can detect local alternatives at a faster rate than the traditional optimal rate. The theory indicates that, in the model checking problem, dimensionality may not have a strong impact. Simulations are carried out to examine the performance of our methodology. A real data analysis is conducted for illustration.

In the last part, we study the dimension reduction problem with response missing at random. Based on the work in this part, we can extend the adaptive testing procedure introduced in the third part to the missing response situation. When there are many predictors, how to efficiently impute responses missing at random is an important problem for regression analysis, because this missing mechanism, unlike missing completely at random, is highly related to the high-dimensional predictor vector. In the sufficient dimension reduction framework, the fusion-refinement (FR) method in the literature is a promising approach. To make estimation more accurate and efficient, two methods are suggested in this part. One method uses the observed data to help with missing data generation, and the other is an ad hoc approach that mainly reduces the dimension in the nonparametric smoothing used in data generation. A data-adaptive synthesization of these two methods is also developed. Simulations are conducted to examine their performance, and an HIV clinical trial dataset is analysed for illustration.
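The inverse probability weighting idea mentioned in the first part can be illustrated with a minimal sketch. It is not the thesis's test statistic: it only shows why weighting observed responses by the inverse of the response probability corrects the bias of a complete-case analysis when the response is missing at random. The data-generating model, the logistic response probability, and the function names are all assumptions made for illustration, and the true response probability is treated as known rather than estimated.

```python
import math
import random

def simulate(n, seed=0):
    """Simulate (x, y, delta, pi); y counts as observed only when delta == 1.

    Hypothetical model: Y = 1 + 2X + noise, so E[Y] = 1.
    The response probability depends on x only (missing at random).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)
        pi = 1.0 / (1.0 + math.exp(-(0.5 + x)))  # P(delta = 1 | x), assumed known
        delta = 1 if rng.random() < pi else 0
        data.append((x, y, delta, pi))
    return data

def complete_case_mean(data):
    """Average of the observed responses only; biased under MAR."""
    observed = [y for _, y, delta, _ in data if delta == 1]
    return sum(observed) / len(observed)

def ipw_mean(data):
    """Normalized inverse probability weighted mean of the observed responses."""
    numerator = sum(delta * y / pi for _, y, delta, pi in data)
    denominator = sum(delta / pi for _, _, delta, pi in data)
    return numerator / denominator

data = simulate(20000)
cc = complete_case_mean(data)
ipw = ipw_mean(data)
print(f"true mean = 1.0, complete-case = {cc:.3f}, IPW = {ipw:.3f}")
```

In this sketch larger values of x are more likely to be observed, so the complete-case average overstates the mean, while the weighted average stays close to it. In practice the response probability would have to be estimated, parametrically or nonparametrically, which is the setting the abstract describes.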
Keywords: Model checking; Inverse probability weight; Non-ignorable missing response; Adaptive; Central subspace; Dimension reduction; Data-adaptive synthesization; Missing recovery; Missing response at random; Multiple imputation.
Date of Award: 29 Jul 2014
Original language: English
Supervisor: Lixing ZHU

User-Defined Keywords

  • Missing observations (Statistics)
  • Multiple imputation (Statistics)
  • Regression analysis
