Rethinking Occlusion in FER: A Semantic-Aware Perspective and Go Beyond

Huiyu Zhai, Xingxing Yang, Yalan Ye*, Chenyang Li, Bin Fan, Changze Li

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

Abstract

Facial expression recognition (FER) is a challenging task due to pervasive occlusion and dataset biases. When facial information is partially occluded, existing FER models struggle to extract effective facial features, leading to inaccurate classifications. In response, we present ORSANet, which makes three key contributions. First, we introduce auxiliary multi-modal semantic guidance to disambiguate facial occlusion and learn high-level semantic knowledge. This guidance is two-fold: 1) semantic segmentation maps serve as a dense semantic prior to generate semantics-enhanced facial representations; 2) facial landmarks serve as a sparse geometric prior to mitigate intrinsic noise in FER, such as identity and gender biases. Second, to incorporate these two multi-modal priors effectively, we design a Multi-scale Cross-interaction Module (MCM) that adaptively fuses the landmark features and semantics-enhanced representations at different scales. Third, we design a Dynamic Adversarial Repulsion Enhancement Loss (DARELoss) that dynamically adjusts the margins of ambiguous classes, further enhancing the model's ability to distinguish similar expressions. We further construct Occlu-FER, the first occlusion-oriented FER dataset, to facilitate specialized robustness analysis under various real-world occlusion conditions. Extensive experiments on both public benchmarks and Occlu-FER demonstrate that the proposed ORSANet achieves state-of-the-art recognition performance. Code is publicly available at https://github.com/Wenyuzhy/ORSANet-master.
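
The abstract does not state the DARELoss formula. As a rough illustration of what "dynamically adjusting the margins of ambiguous classes" could look like, the Python sketch below computes a cross-entropy in which each wrong class receives a margin that grows with how confusable it currently is. The function names (`margin_cross_entropy`, `dynamic_margins`), the `base` parameter, and the margin rule itself are assumptions made for illustration only; they are not the authors' formulation, which is in the linked repository.

```python
import math


def margin_cross_entropy(logits, target, margins):
    """Cross-entropy in which every non-target logit j is raised by margins[j].

    Raising a wrong class's logit increases the loss unless the target logit
    beats that class by a correspondingly larger gap.
    """
    adjusted = [z if j == target else z + margins[j]
                for j, z in enumerate(logits)]
    peak = max(adjusted)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in adjusted))
    return log_sum - adjusted[target]


def dynamic_margins(logits, target, base=0.5):
    """Hypothetical rule (an assumption, not from the paper): the margin of a
    wrong class is proportional to the softmax probability the model
    currently assigns to it, so more ambiguous classes get larger margins."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [0.0 if j == target else base * e / total
            for j, e in enumerate(exps)]


# Example: class 1 is nearly tied with the true class 0, class 2 is not.
logits, target = [2.0, 1.8, -1.0], 0
margins = dynamic_margins(logits, target)
loss = margin_cross_entropy(logits, target, margins)
```

With all margins set to zero the function reduces to ordinary softmax cross-entropy, so the margins only add a penalty on top of the standard loss, and the penalty is largest for the class the model confuses most with the target.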
Original language: English
Title of host publication: Proceedings of the 33rd ACM International Conference on Multimedia
Place of Publication: New York
Publisher: Association for Computing Machinery (ACM)
Pages: 5567-5576
Number of pages: 10
ISBN (Print): 9798400720352
DOIs
Publication status: Published - 27 Oct 2025
Event: 33rd ACM International Conference on Multimedia, ACMMM25 - Dublin Royal Convention Centre, Dublin, Ireland
Duration: 27 Oct 2025 to 31 Oct 2025
https://whova.com/embedded/event/sa54pNCpHUFy1OTIEiEzceQu5kPuSm3dYlEnqAJdV4o%3D/?utc_source=ems (Conference program)
https://acmmm2025.org/ (Conference website)
https://dl.acm.org/doi/proceedings/10.1145/3746027 (Conference proceedings)

Publication series

Name: Proceedings of the ACM International Conference on Multimedia
Publisher: Association for Computing Machinery

Conference

Conference: 33rd ACM International Conference on Multimedia, ACMMM25
Country/Territory: Ireland
City: Dublin
Period: 27/10/25 to 31/10/25
Internet address

User-Defined Keywords

  • Facial Expression Recognition
  • Occlusion
  • Semantic Prior
  • Segmentation Map
  • Facial Landmark
  • Class Imbalance
