TY - JOUR
T1 - Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture
AU - Kong, Chenqi
AU - Luo, Anwei
AU - Bao, Peijun
AU - Li, Haoliang
AU - Wan, Renjie
AU - Zheng, Zengwei
AU - Rocha, Anderson
AU - Kot, Alex C.
N1 - This research is supported in part by the National Natural Science Foundation of China (NSFC) under Grant 62502187; in part by the Natural Science Foundation of Jiangxi Province of China under Grant 20252BAC240015; in part by the Sichuan Science and Technology Fund under Grant 2025ZNSFSC0511; in part by the Research Grants Council (RGC) of Hong Kong SAR under GRF Grant 12203124 and ECS Grant 22201125; in part by the São Paulo Research Foundation (FAPESP), Horus project, Grant #2023/12865-8; and in part by the Brazilian National Council for Scientific and Technological Development (CNPq), Aletheia project, Grant #442229/2024-0. This work was carried out at the Rapid-Rich Object Search (ROSE) Lab, School of Electrical & Electronic Engineering, Nanyang Technological University (NTU), Singapore. This research is supported in part by A*STAR under its OTS Research Programme (Award S24T2TS006). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of A*STAR.
Publisher Copyright:
© Copyright 2026 IEEE
Publisher Copyright:
© Copyright 2026 IEEE
PY - 2026/3/2
Y1 - 2026/3/2
AB - Open-set face forgery detection poses significant security threats and presents substantial challenges for existing detection models. These detectors primarily suffer from two limitations: they generalize poorly across unknown forgery domains, and they adapt inefficiently to new data. To address these issues, we introduce a general and parameter-efficient approach to face forgery detection. Our method builds on the assumption that different forgery source domains exhibit distinct style statistics. Specifically, we design a forgery-style-mixture formulation that augments the diversity of forgery source domains, enhancing the model’s generalizability to unseen domains. In addition, previous methods typically require fully fine-tuning pre-trained networks, consuming substantial time and computational resources. Drawing on recent advances in Vision Transformers (ViTs) for face forgery detection, we develop a parameter-efficient ViT-based detection model that includes lightweight forgery feature extraction modules, enabling the model to extract global and local forgery clues simultaneously. During training, we optimize only the inserted lightweight modules, keeping the original ViT structure and its pre-trained weights frozen. This training strategy preserves the informative pre-trained knowledge while flexibly adapting the model to the Deepfake detection task. Extensive experimental results demonstrate that the designed model achieves state-of-the-art generalizability with significantly fewer trainable parameters, representing an important step toward open-set Deepfake detection in the wild.
KW - Deepfakes
KW - face forgery detection
KW - generalization
KW - open-set
KW - parameter-efficient learning
KW - robustness
KW - style mixture
UR - https://www.scopus.com/pages/publications/105032257127
U2 - 10.1109/TCSVT.2026.3669190
DO - 10.1109/TCSVT.2026.3669190
M3 - Journal article
SN - 1558-2205
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
ER -