Applications of Determining the Number of Mixture Components in Multiple Change-Point Detection and Mixture of the Experts System

Project: Research project

Project Details


The finite mixture model is a very flexible statistical model for investigating heterogeneity in collected data. However, when the number of components is unknown, the model suffers from identifiability problems and cannot be estimated accurately. This makes it difficult to extend or apply finite mixture models to more general settings. In this proposal, building on existing methods for determining the number of mixture components in finite mixture models, we investigate two applications or extensions of finite mixture models. We aim to provide insight into whether the finite mixture model can be estimated more accurately when it is combined with prior information about the observed data or with a more detailed model structure.

First, it is well known that multiple change-point detection has wide applications in industry and in financial econometric research. A multivariate stochastic series with multiple change points can be regarded as a series of observations generated from a finite mixture model. The number of mixture components is determined by the number of change points the observed series contains, and the position of an observation in the series determines the mixture component from which it is generated. From this insight, the problem of multiple change-point detection can be transformed into the problem of determining the number of components of a finite mixture model. Furthermore, the positions of the change points may be determined simultaneously by using the prior order information of the observed series. Based on the determination of the number of components of a finite mixture model, combined with prior information about the observations, we propose a flexible and stable method to detect multiple change points in multivariate stochastic series. The proposed method can simultaneously detect changes in location, scale, and other features of a multivariate stochastic series. We investigate the consistency of the change-point detection and study the asymptotic statistical properties and the estimation efficiency of the proposed method.
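The correspondence between change points and mixture components can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the proposed detection method: the series is univariate and Gaussian, every segment is assumed to have its own distribution (so K change points give K + 1 components), and the names `segment_labels` and `simulate_series` are hypothetical helpers introduced for illustration.

```python
import bisect
import random


def segment_labels(n, change_points):
    """Map each time index 0..n-1 to the segment (mixture component) it falls in.

    change_points is a sorted list of indices at which a new segment starts.
    """
    return [bisect.bisect_right(change_points, t) for t in range(n)]


def simulate_series(n, change_points, means, scales, seed=0):
    """Draw a Gaussian series whose mean and scale depend only on the segment."""
    rng = random.Random(seed)
    labels = segment_labels(n, change_points)
    series = [rng.gauss(means[k], scales[k]) for k in labels]
    return series, labels


# Illustrative values (assumptions): a location change at t = 100
# and a scale change at t = 220.
n = 300
change_points = [100, 220]
means = [0.0, 3.0, 3.0]
scales = [1.0, 1.0, 2.5]

series, labels = simulate_series(n, change_points, means, scales)

# The number of components equals the number of segments, and the implied
# mixing weights are the relative segment lengths.
num_components = len(set(labels))
weights = [labels.count(k) / n for k in range(num_components)]
print(num_components, weights)
```

Unlike an ordinary mixture sample, the labels here are non-decreasing in time. That ordering is the prior order information the paragraph refers to: once the number of components is known, the change-point positions are the indices at which the label increases.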

Second, the mixture of the experts system (MoS) is an essential extension of the finite mixture model. It assumes that the mixing weights in the finite mixture model for each observed random vector are determined by the corresponding covariates. The system can then adaptively follow a divide-and-conquer strategy to learn from data. MoS now has wide applications in different scientific fields. Although it is a generalized finite mixture model, its mixing weights change from observation to observation with the corresponding covariates. As a result, classical regularized methods for determining the number of components of a finite mixture model cannot be applied directly to the mixture of the experts system. In this proposal, regularized methods based on group selection techniques are proposed or extended to determine the number of experts in the mixture of the experts system. The consistency, robustness, and efficiency of the proposed methods are investigated. The proposed methods are also used to analyze heterogeneous data from different scientific fields. We further consider extending the proposed methods to the hierarchical mixtures of experts system (HME).
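To make the covariate-dependent mixing weights concrete, the sketch below evaluates the conditional density of a small mixture of experts. It rests on assumptions that the project description does not specify: a softmax (multinomial logistic) gating function and linear-Gaussian experts are one common choice, and all function names and parameter values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gating_weights(x, gate_params):
    """Mixing weights for covariate vector x.

    gate_params holds one (intercept, coefficient vector) pair per expert.
    """
    scores = [b + sum(w_j * x_j for w_j, x_j in zip(w, x))
              for b, w in gate_params]
    return softmax(scores)


def normal_pdf(y, mean, sd):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def moe_density(y, x, gate_params, expert_params):
    """Conditional density of y given x under the assumed model.

    expert_params holds one (intercept, coefficient vector, sd) triple per
    expert; each expert is a linear regression with Gaussian noise.
    """
    density = 0.0
    for pi_k, (a, beta, sd) in zip(gating_weights(x, gate_params), expert_params):
        mean = a + sum(b_j * x_j for b_j, x_j in zip(beta, x))
        density += pi_k * normal_pdf(y, mean, sd)
    return density


# Illustrative parameters (assumptions): two experts, one covariate.
gate_params = [(0.0, [2.0]), (0.0, [-2.0])]
expert_params = [(1.0, [0.5], 1.0), (-1.0, [-0.5], 1.0)]

print(gating_weights([1.0], gate_params))   # first expert dominates
print(gating_weights([-1.0], gate_params))  # second expert dominates
print(moe_density(1.5, [1.0], gate_params, expert_params))
```

If every gating coefficient vector is zero, the weights no longer depend on the covariates and the model reduces to an ordinary finite mixture with fixed mixing weights. Under this parameterization each expert owns a whole group of parameters (its gating and regression coefficients), which is why removing an expert is naturally a group selection problem.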

The novelty of this research is that it investigates the statistical efficiency and computational feasibility of finite mixture modeling with prior data information or detailed model structures. A framework could be developed to overcome the inefficiency of finite mixture modeling when the number of mixture components is unknown, so that this barrier to flexible finite mixture modeling in real applications could be removed.
Effective start/end date: 1/01/23 → …

