Outlier detection in traffic data based on the Dirichlet process mixture model

Henry Y. T. Ngan*, Nelson H. C. Yung, Anthony G. O. Yeh

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

33 Citations (Scopus)
30 Downloads (Pure)


Traffic data collections are exceedingly useful for road network management. Such collections are typically massive and are full of errors, noise and abnormal traffic behaviour. These abnormalities are regarded as outliers because they are inconsistent with the rest of the data. Hence, the problem of outlier detection (OD) is non-trivial. This paper presents a novel method for detecting outliers in large-scale traffic data by modelling the information as a Dirichlet process mixture model (DPMM). In essence, input traffic signals are truncated and mapped to a covariance signal descriptor, and the vector dimension is then further reduced by principal component analysis. This modified signal vector is then modelled by a DPMM. Traffic signals generally share heavy spatial-temporal similarities within signals or among various categories of traffic signals, and previous OD methods have proved incapable of properly discerning these similarities or differences. The contribution of this study is to represent real-world traffic data by a robust DPMM-based method and to perform an unsupervised OD to achieve a detection rate of 96.67% in a ten-fold cross validation.
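
The pipeline described in the abstract (truncate each signal, map it to a covariance descriptor, reduce it with principal component analysis, then fit a DPMM) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The features fed to the covariance descriptor (signal value plus first and second derivatives), the truncation length, the number of principal components, the 5% score threshold and the synthetic data are all assumptions made for this example. The DPMM is approximated with scikit-learn's variational BayesianGaussianMixture using a truncated Dirichlet process prior, which may differ from the inference scheme used in the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import BayesianGaussianMixture


def covariance_descriptor(signal, length=256):
    """Truncate a 1-D signal and summarise it by a feature covariance matrix.

    The feature choice (value, first and second derivative) is an assumption
    for illustration; the abstract does not specify the features.
    """
    x = np.asarray(signal, dtype=float)[:length]   # truncation step
    d1 = np.gradient(x)                            # first derivative
    d2 = np.gradient(d1)                           # second derivative
    cov = np.cov(np.vstack([x, d1, d2]))           # 3 x 3 covariance matrix
    return cov[np.triu_indices(3)]                 # 6 unique entries


def detect_outliers(signals, n_pcs=3, max_components=10, quantile=0.05, seed=0):
    """Score signals under a truncated DP mixture and flag the least likely."""
    desc = np.array([covariance_descriptor(s) for s in signals])
    desc = (desc - desc.mean(axis=0)) / (desc.std(axis=0) + 1e-12)
    reduced = PCA(n_components=n_pcs).fit_transform(desc)
    dpmm = BayesianGaussianMixture(
        n_components=max_components,  # truncation level, not a fixed cluster count
        weight_concentration_prior_type="dirichlet_process",
        max_iter=500,
        random_state=seed,
    ).fit(reduced)
    scores = dpmm.score_samples(reduced)           # per-signal log-likelihood
    # Hypothetical decision rule: flag the lowest-scoring 5% of signals.
    return scores, scores < np.quantile(scores, quantile)


# Synthetic stand-in for daily traffic profiles (not real data).
rng = np.random.default_rng(0)
t = np.linspace(0, 24, 288)
profile = 100 * np.exp(-((t - 8) ** 2) / 4) + 120 * np.exp(-((t - 18) ** 2) / 6)
normal = profile + rng.normal(0, 5, size=(95, t.size))
faulty = rng.normal(50, 40, size=(5, t.size))      # injected abnormal signals
signals = np.vstack([normal, faulty])

scores, flagged = detect_outliers(signals)
print("flagged signal indices:", np.flatnonzero(flagged))
```

Because the mixture is fitted without labels, the flagged set depends entirely on the chosen threshold; in practice the threshold would need to be validated against known abnormal signals, for example by cross validation as the abstract reports.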

Original language: English
Pages (from-to): 773-781
Number of pages: 9
Journal: IET Intelligent Transport Systems
Issue number: 7
Publication status: Published - Sept 2015

Scopus Subject Areas

  • Transportation
  • Environmental Science (all)
  • Mechanical Engineering
  • Law

