Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score

Shuhai Zhang, Feng Liu, Jiahao Yang, Yifan Yang, Changsheng Li*, Bo Han*, Mingkui Tan*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceedingConference proceedingpeer-review

Abstract

Adversarial detection aims to determine whether a given sample is an adversarial one based on the discrepancy between natural and adversarial distributions. Unfortunately, estimating or comparing two data distributions is extremely difficult, especially in high-dimension spaces. Recently, the gradient of log probability density (a.k.a., score) w.r.t. the sample is used as an alternative statistic to compute. However, we find that the score is sensitive in identifying adversarial samples due to insufficient information with one sample only. In this paper, we propose a new statistic called expected perturbation score (EPS), which is essentially the expected score of a sample after various perturbations. Specifically, to obtain adequate information regarding one sample, we perturb it by adding various noises to capture its multi-view observations. We theoretically prove that EPS is a proper statistic to compute the discrepancy between two samples under mild conditions. In practice, we can use a pre-trained diffusion model to estimate EPS for each sample. Last, we propose an EPS-based adversarial detection (EPS-AD) method, in which we develop EPS-based maximum mean discrepancy (MMD) as a metric to measure the discrepancy between the test sample and natural samples. We also prove that the EPS-based MMD between natural and adversarial samples is larger than that among natural samples. Extensive experiments show the superior adversarial detection performance of our EPS-AD.

Original languageEnglish
Title of host publicationProceedings of 40th International Conference on Machine Learning, ICML 2023
EditorsAndreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, Jonathan Scarlett
PublisherML Research Press
Pages41429-41451
Number of pages23
Publication statusPublished - Jul 2023
Event40th International Conference on Machine Learning, ICML 2023 - Honolulu, United States
Duration: 23 Jul 202329 Jul 2023
https://icml.cc/Conferences/2023
https://proceedings.mlr.press/v202/
https://openreview.net/group?id=ICML.cc/2023/Conference

Publication series

NameProceedings of Machine Learning Research
PublisherML Research Press
Volume202
ISSN (Print)2640-3498

Conference

Conference40th International Conference on Machine Learning, ICML 2023
Country/TerritoryUnited States
CityHonolulu
Period23/07/2329/07/23
Internet address

Scopus Subject Areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

Fingerprint

Dive into the research topics of 'Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score'. Together they form a unique fingerprint.

Cite this