From Policy Comparison to Process Consistency and Beyond

  • Yifan Xu
  • Yujia Yin
  • Yiming Xing
  • Yifan Chen*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

Abstract

Statistical Policy Comparison (SPC) assesses the equivalence of two stochastic policies (policy consistency) and has received broad attention. However, the SPC framework implicitly assumes that the decision environment is invariant, and therefore fails to address many real-world data science applications. In this work, we refer to this overlooked issue as environment consistency; together with policy consistency, it extends to a generalized concept, process consistency, for systematically comparing policy trials under the Markov decision process (MDP) framework. To address process consistency, we propose a unified comparison framework that extends beyond traditional statistical policy comparison studies by incorporating both policy and environment comparisons. For policy consistency, existing statistical policy comparison methods can be integrated into our framework without modification. For environment consistency (the focus of this work), we devise fine-grained return tests to capture shifts of key elements in MDPs; notably, in special cases where trajectory likelihood information is available or can be estimated, we introduce a trajectory test based on the likelihood ratio test (LRT), offering increased testing power. Extensive experiments demonstrate that our proposed testing methods achieve higher statistical power than existing approaches in testing process consistency, establishing their effectiveness across diverse real-world scenarios. Our code is available at https://github.com/bcxyf123/MDP-Testing.git.
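
To give a rough sense of what a return-level test for environment consistency could look like, the sketch below runs a generic two-sample permutation test on episodic returns collected from two policy trials. It is an illustrative assumption, not the fine-grained return test or the LRT-based trajectory test proposed in the paper; the function name `permutation_return_test` and its parameters are hypothetical.

```python
import numpy as np


def permutation_return_test(returns_a, returns_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in mean episodic return.

    Hypothetical helper for illustration only; it is not taken from the
    paper or its code repository.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(returns_a, dtype=float)
    b = np.asarray(returns_b, dtype=float)

    # Observed test statistic: absolute difference of sample means.
    observed = abs(a.mean() - b.mean())

    # Under the null hypothesis (same return distribution in both trials),
    # trial labels are exchangeable, so we reshuffle the pooled returns.
    pooled = np.concatenate([a, b])
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(pooled)
        diff = abs(perm[: a.size].mean() - perm[a.size:].mean())
        if diff >= observed:
            exceed += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    return (exceed + 1) / (n_permutations + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Simulated returns from the same policy in two hypothetical environments.
    trial_1 = rng.normal(loc=10.0, scale=2.0, size=50)
    trial_2 = rng.normal(loc=11.5, scale=2.0, size=50)
    print("p-value:", permutation_return_test(trial_1, trial_2))
```

A small p-value would suggest that the two trials do not share a return distribution. With the policy held fixed, that points to a shift in the environment, although a test on mean returns alone cannot say which element of the MDP changed.
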
Original language: English
Title of host publication: CIKM 2025 - Proceedings of the 34th ACM International Conference on Information and Knowledge Management
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 3677–3687
Number of pages: 11
ISBN (Electronic): 9798400720406
ISBN (Print): 9798400720406
Publication status: Published - 10 Nov 2025

Publication series

Name: CIKM: Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 9 - Industry, Innovation, and Infrastructure

User-Defined Keywords

  • markov decision process
  • policy trial
  • process consistency
  • statistical policy comparison
