Exploring the Potential of LLMs for Serendipity Evaluation in Recommender Systems

Li Kang*, Yuhan Zhao, Li Chen

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

Abstract

Serendipity plays a pivotal role in enhancing user satisfaction within recommender systems, yet its evaluation poses significant challenges due to its inherently subjective nature and conceptual ambiguity. Current algorithmic approaches predominantly rely on proxy metrics for indirect assessment, which often fail to align with real user perceptions, leaving a gap between measured and experienced serendipity. With large language models (LLMs) increasingly revolutionizing evaluation methodologies across various human annotation tasks, we explore a core research question: can LLMs effectively simulate human users for serendipity evaluation?

To address this question, we conduct a meta-evaluation on two datasets derived from real user studies in the e-commerce and movie domains, focusing on three key aspects: the accuracy of LLMs compared to conventional proxy metrics, the influence of auxiliary data on LLM comprehension, and the efficacy of recently popular multi-LLM techniques. Our findings indicate that even the simplest zero-shot LLMs match or surpass the performance of conventional metrics, and that multi-LLM techniques and the incorporation of auxiliary data further improve alignment with human perspectives. The best-performing LLM configuration achieves a Pearson correlation coefficient of 21.5% with the user-study results. This research suggests that LLMs may serve as accurate and cost-effective evaluators, introducing a new paradigm for serendipity evaluation in recommender systems. Our code is publicly available at https://github.com/Leah-HKBU/SerenEva.
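The core of such a meta-evaluation is correlating LLM-assigned serendipity scores with ground-truth user ratings. A minimal sketch of that comparison is shown below; the score vectors are invented placeholders for illustration, not data from the study:

```python
import numpy as np

# Hypothetical per-item serendipity ratings from a user study (ground truth)
# and the corresponding scores assigned by an LLM evaluator. These numbers
# are illustrative only.
user_ratings = np.array([4, 2, 5, 3, 1, 4, 2, 5])
llm_scores = np.array([3, 2, 4, 3, 2, 5, 1, 4])

# Pearson correlation between the two score vectors: covariance divided by
# the product of the standard deviations. np.corrcoef returns the 2x2
# correlation matrix; the off-diagonal entry is the coefficient.
r = np.corrcoef(user_ratings, llm_scores)[0, 1]
print(f"Pearson r = {r:.3f}")
```

A higher r indicates that the LLM's judgments track human perception more closely; the abstract reports 21.5% for the best LLM configuration.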
Original language: English
Title of host publication: RecSys2025 - Proceedings of the 19th ACM Conference on Recommender Systems
Place of publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 746–754
Number of pages: 9
ISBN (Electronic): 9798400713644
ISBN (Print): 9798400713644
DOIs
Publication status: Published - 7 Sept 2025

Publication series

Name: RecSys: Proceedings of the ACM Conference on Recommender Systems
Publisher: Association for Computing Machinery

User-Defined Keywords

  • Large Language Models
  • Recommender Systems
  • Serendipity
