TY - JOUR
T1 - DiffKGW
T2 - Stealthy and Robust Diffusion Model Watermaing
AU - Wei, Tianxin
AU - Qiu, Ruizhong
AU - Chen, Yifan
AU - Qi, Yunzhe
AU - Lin, Jiacheng
AU - Bao, Wenxuan
AU - Xu, Wenju
AU - Nag, Sreyashi
AU - Li, Ruirui
AU - Lu, Hanqing
AU - Wang, Zhengyang
AU - Luo, Chen
AU - Liu, Hui
AU - Wang, Suhang
AU - He, Jingrui
AU - He, Qi
AU - Tang, Xianfeng
N1 - This work is supported by National Science Foundation under Award No. IIS-2117902. The views and conclusions are those of the authors and should not be interpreted as representing the official policies of the funding agencies or the government.
Publisher Copyright:
© 2026, Transactions on Machine Learning Research. All rights reserved.
PY - 2026/2
Y1 - 2026/2
N2 - Diffusion models are known for their supreme capability to generate realistic images. However, ethical concerns, such as copyright protection and the generation of inappropriate content, pose significant challenges for the practical deployment of diffusion models. Recent work has proposed a flurry of watermarking techniques that inject artificial patterns into initial latent representations of diffusion models, offering a promising solution to these issues. However, enforcing a specific pattern on selected elements can disrupt the Gaussian distribution of the initial latent representation. Inspired by watermarks for large language models (LLMs), we generalize the LLM KGW watermark to image diffusion models and propose a stealthy probability adjustment approach DiffKGW that preserves the Gaussian distribution of initial latent representation. In addition, we dissect the design principles of state-of-the-art watermarking techniques and introduce a unified framework. We identify a set of dimensions that explain the manipulation enforced by watermarking methods, including the distribution of individual elements, the specification of watermark shapes within each channel, and the choice of channels for watermark embedding. Through the empirical studies on regular text-to-image applications and the first systematic attempt at watermarking image-to-image diffusion models, we thoroughly verify the effectiveness of our proposed watermark identification framework through comprehensive evaluations. On all the diffusion models, including Stable Diffusion, our approach induced from the proposed framework not only preserves image quality but also outperforms existing methods in robustness against a wide range of attacks.
AB - Diffusion models are known for their supreme capability to generate realistic images. However, ethical concerns, such as copyright protection and the generation of inappropriate content, pose significant challenges for the practical deployment of diffusion models. Recent work has proposed a flurry of watermarking techniques that inject artificial patterns into initial latent representations of diffusion models, offering a promising solution to these issues. However, enforcing a specific pattern on selected elements can disrupt the Gaussian distribution of the initial latent representation. Inspired by watermarks for large language models (LLMs), we generalize the LLM KGW watermark to image diffusion models and propose a stealthy probability adjustment approach DiffKGW that preserves the Gaussian distribution of initial latent representation. In addition, we dissect the design principles of state-of-the-art watermarking techniques and introduce a unified framework. We identify a set of dimensions that explain the manipulation enforced by watermarking methods, including the distribution of individual elements, the specification of watermark shapes within each channel, and the choice of channels for watermark embedding. Through the empirical studies on regular text-to-image applications and the first systematic attempt at watermarking image-to-image diffusion models, we thoroughly verify the effectiveness of our proposed watermark identification framework through comprehensive evaluations. On all the diffusion models, including Stable Diffusion, our approach induced from the proposed framework not only preserves image quality but also outperforms existing methods in robustness against a wide range of attacks.
UR - https://jmlr.org/tmlr/papers/
UR - https://www.scopus.com/pages/publications/105033921426
M3 - Journal article
AN - SCOPUS:105033921426
SN - 2835-8856
VL - 2026-February
JO - Transactions on Machine Learning Research
JF - Transactions on Machine Learning Research
ER -