Improving POMDP tractability via belief compression and clustering

Xin Li*, Kwok Wai CHEUNG, Jiming LIU

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

10 Citations (Scopus)

Abstract

The partially observable Markov decision process (POMDP) is a commonly adopted mathematical framework for solving planning problems in stochastic environments. However, computing the optimal policy of a POMDP for large-scale problems is known to be intractable, and the high dimensionality of the underlying belief space is one of the major causes. In this paper, we propose a hybrid approach that integrates two different approaches for reducing the dimensionality of the belief space: 1) belief compression and 2) value-directed compression. In particular, a novel orthogonal nonnegative matrix factorization is derived for the belief compression, which is then integrated into a value-directed framework for computing the policy. In addition, with the conjecture that a properly partitioned belief space can have its per-cluster intrinsic dimension further reduced, we propose to apply a k-means-like clustering technique to partition the belief space to form a set of sub-POMDPs before applying the dimension reduction techniques to each of them. We have evaluated the proposed belief compression and clustering approaches on a set of benchmark problems and demonstrated their effectiveness in reducing the cost of computing policies while retaining policy quality.
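
The cluster-then-compress idea in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it assumes plain Euclidean k-means as a stand-in for the k-means-like clustering, and standard multiplicative-update NMF as a stand-in for the paper's orthogonal NMF; the value-directed step is omitted. The function names (`kmeans_beliefs`, `nmf_compress`, `cluster_then_compress`) and the synthetic Dirichlet-sampled beliefs are hypothetical, chosen only for illustration.

```python
import numpy as np

def kmeans_beliefs(B, k, iters=50, seed=0):
    """Plain k-means over belief vectors (rows of B). Stand-in, not the paper's variant."""
    rng = np.random.default_rng(seed)
    centroids = B[rng.choice(len(B), size=k, replace=False)].copy()
    labels = np.zeros(len(B), dtype=int)
    for _ in range(iters):
        # Squared Euclidean distance from every belief to every centroid.
        d = ((B[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = B[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return labels, centroids

def nmf_compress(B, rank, iters=200, seed=0, eps=1e-9):
    """Approximate B (n x |S|) as W @ H with W, H >= 0 (multiplicative updates).

    Assumption: ordinary NMF, without the orthogonality constraint of the paper.
    """
    rng = np.random.default_rng(seed)
    n, s = B.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, s)) + eps
    for _ in range(iters):
        H *= (W.T @ B) / (W.T @ W @ H + eps)
        W *= (B @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cluster_then_compress(B, k, rank):
    """Partition the beliefs, then fit one low-rank nonnegative model per cluster."""
    labels, _ = kmeans_beliefs(B, k)
    models = {}
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if len(idx):
            W, H = nmf_compress(B[idx], rank)
            models[j] = (idx, W, H)
    return labels, models

# Synthetic beliefs: 60 probability vectors over 8 hypothetical states.
rng = np.random.default_rng(1)
B = rng.dirichlet(np.ones(8), size=60)
labels, models = cluster_then_compress(B, k=3, rank=2)
for j, (idx, W, H) in models.items():
    err = np.linalg.norm(B[idx] - W @ H)
    print(f"cluster {j}: {len(idx)} beliefs, reconstruction error {err:.4f}")
```

Each cluster here gets its own basis `H`, so a belief is represented by a row of `W` with `rank` entries instead of one entry per state. Whether per-cluster compression actually lowers the required rank depends on the problem, which is the conjecture the paper evaluates.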

Original language: English
Pages (from-to): 125-136
Number of pages: 12
Journal: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Volume: 40
Issue number: 1
Publication status: Published - Feb 2010

Scopus Subject Areas

  • Control and Systems Engineering
  • Software
  • Information Systems
  • Human-Computer Interaction
  • Computer Science Applications
  • Electrical and Electronic Engineering

User-Defined Keywords

  • Belief clustering
  • Belief compression
  • Nonnegative matrix factorization (NMF)
  • Partially observable Markov decision process (POMDP)
