TY - JOUR
T1 - Unknown-Aware Bilateral Dependency Optimization for Defending Against Model Inversion Attacks
AU - Peng, Xiong
AU - Liu, Feng
AU - Wang, Nannan
AU - Lan, Long
AU - Liu, Tongliang
AU - Cheung, Yiu-ming
AU - Han, Bo
N1 - The work of Xiong Peng and Bo Han was supported in part by the NSFC General Program under Grant 62376235, in part by RGC Young Collaborative Research under Grant C2005-24Y, in part by Guangdong Basic and Applied Basic Research Foundation under Grant 2022A1515011652 and Grant 2024A1515012399, in part by the HKBU Faculty Niche Research Areas under Grant RC-FNRA-IG/22-23/SCI/04, and in part by the HKBU CSD Departmental Incentive Scheme. The work of Feng Liu was supported in part by the Australian Research Council (ARC) under Grant DE240101089, Grant LP240100101, and Grant DP230101540, and in part by the NSF&CSIRO Responsible AI program under Grant 2303037. The work of Nannan Wang was supported in part by the National Natural Science Foundation of China under Grant U22A2096. The work of Long Lan was supported in part by the National Natural Science Foundation of China under Grant 62376282. The work of Yiu-ming Cheung was supported in part by the NSFC/RGC Joint Research Scheme under Grant N_HKBU214/21, and in part by the RGC Senior Research Fellow Scheme under Grant SRFS2324-2S02.
Publisher copyright:
© 2025 IEEE.
PY - 2025/8
Y1 - 2025/8
N2 - By abusing access to a well-trained classifier, model inversion (MI) attacks pose a significant threat as they can recover the original training data, leading to privacy leakage. Previous studies mitigated MI attacks by imposing regularization to reduce the dependency between input features and outputs during classifier training, a strategy known as unilateral dependency optimization. However, this strategy contradicts the objective of minimizing the supervised classification loss, which inherently seeks to maximize the dependency between input features and outputs. Consequently, there is a trade-off between improving the model's robustness against MI attacks and maintaining its classification performance. To address this issue, we propose the bilateral dependency optimization strategy (BiDO), a dual-objective approach that minimizes the dependency between input features and latent representations, while simultaneously maximizing the dependency between latent representations and labels. BiDO is remarkable for its privacy-preserving capabilities. However, models trained with BiDO exhibit diminished capabilities in out-of-distribution (OOD) detection compared to models trained with standard classification supervision. Given the open-world nature of deep learning systems, this limitation could lead to significant security risks, as encountering OOD inputs—whose label spaces do not overlap with the in-distribution (ID) data used during training—is inevitable. To address this, we leverage readily available auxiliary OOD data to enhance the OOD detection performance of models trained with BiDO. This leads to the introduction of an upgraded framework, unknown-aware BiDO (BiDO+), which mitigates both privacy and security concerns. As a highlight, with comparable model utility, BiDO-HSIC+ reduces the FPR95 by 55.02% and enhances the AUCROC by 9.52% compared to BiDO-HSIC, while also providing superior MI robustness.
AB - By abusing access to a well-trained classifier, model inversion (MI) attacks pose a significant threat as they can recover the original training data, leading to privacy leakage. Previous studies mitigated MI attacks by imposing regularization to reduce the dependency between input features and outputs during classifier training, a strategy known as unilateral dependency optimization. However, this strategy contradicts the objective of minimizing the supervised classification loss, which inherently seeks to maximize the dependency between input features and outputs. Consequently, there is a trade-off between improving the model's robustness against MI attacks and maintaining its classification performance. To address this issue, we propose the bilateral dependency optimization strategy (BiDO), a dual-objective approach that minimizes the dependency between input features and latent representations, while simultaneously maximizing the dependency between latent representations and labels. BiDO is remarkable for its privacy-preserving capabilities. However, models trained with BiDO exhibit diminished capabilities in out-of-distribution (OOD) detection compared to models trained with standard classification supervision. Given the open-world nature of deep learning systems, this limitation could lead to significant security risks, as encountering OOD inputs—whose label spaces do not overlap with the in-distribution (ID) data used during training—is inevitable. To address this, we leverage readily available auxiliary OOD data to enhance the OOD detection performance of models trained with BiDO. This leads to the introduction of an upgraded framework, unknown-aware BiDO (BiDO+), which mitigates both privacy and security concerns. As a highlight, with comparable model utility, BiDO-HSIC+ reduces the FPR95 by 55.02% and enhances the AUCROC by 9.52% compared to BiDO-HSIC, while also providing superior MI robustness.
KW - Model inversion attacks
KW - dependency optimization
KW - out-of-distribution detection
UR - http://www.scopus.com/inward/record.url?scp=105002155575&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2025.3558267
DO - 10.1109/TPAMI.2025.3558267
M3 - Journal article
SN - 0162-8828
VL - 47
SP - 6382
EP - 6395
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 8
ER -