MaXsive: High-Capacity and Robust Training-Free Generative Image Watermarking in Diffusion Models (ACM MM 2025)

📑 Authors: Po-Yuan Mao, Cheng-Chang Tsai, Chun-Shien Lu
📄 Paper: arXiv:2507.21195
💻 Code Repository: GitHub - MaXsive

🏆 Key Contributions

High Capacity Training-Free Algorithm: Leveraging Shannon entropy, MaXsive achieves much higher watermark capacity than previous training-free methods, reducing ID collusion risk and enabling real-world deployment without extra fine-tuning.
Robust Diffusion Watermarking: MaXsive surpasses existing training-free diffusion-based generative watermarking algorithms, providing superior robustness for both identification and verification.
Novel RST Attack Resistance: MaXsive introduces an X-shape template for diffusion model watermarking that is used to recover rotation, scaling, and translation (RST) distortions. Unlike prior methods, whose patterns are tightly coupled to the watermark, MaXsive’s template and watermark are decoupled, so adding the template does not reduce watermark capacity.
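
To make the two evaluation scenarios named above concrete, here is a minimal sketch of how verification (one-to-one check) and identification (one-to-many search) differ. It is not MaXsive's detector: the Pearson-correlation score, the 0.5 threshold, the 256-dimensional watermarks, and the function names `verify` and `identify` are all assumptions chosen for illustration.

```python
import numpy as np

def verify(recovered, watermark, threshold=0.5):
    """Verification: does the recovered vector match ONE claimed watermark?
    The correlation score and threshold are illustrative assumptions."""
    score = float(np.corrcoef(recovered, watermark)[0, 1])
    return score >= threshold

def identify(recovered, registry):
    """Identification: which registered watermark matches best?
    `registry` maps a user ID to that user's watermark vector."""
    scores = {uid: float(np.corrcoef(recovered, wm)[0, 1])
              for uid, wm in registry.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical setup: 100 users, each with a 256-dimensional Gaussian watermark.
rng = np.random.default_rng(0)
registry = {uid: rng.standard_normal(256) for uid in range(100)}

# Simulate a recovered watermark for user 42 that was degraded by distortions.
recovered = registry[42] + 0.5 * rng.standard_normal(256)

user, score = identify(recovered, registry)
claimed_ok = verify(recovered, registry[42])
other_ok = verify(recovered, registry[7])
```

Identification gets harder as the number of registered users grows, because more candidates compete for the best score, which is why capacity matters for this scenario.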

📝 Abstract

The success of diffusion models in image synthesis has led to the release of large commercial models, raising concerns about copyright protection and inappropriate content generation. Training-free diffusion watermarking offers a cost-effective solution, but prior works remain vulnerable to rotation, scaling, and translation (RST) attacks. Some methods use carefully designed patterns to mitigate this, but often at the expense of watermark capacity, increasing the risk of identity (ID) collusion. MaXsive addresses these issues with a training-free generative watermarking technique that is both high-capacity and robust. It utilizes initial noise for watermarking and introduces an X-shape template to recover RST distortions, significantly boosting robustness without sacrificing capacity. MaXsive’s effectiveness is validated on two benchmark datasets for verification and identification scenarios.

🌍 Application Scenario

Real-world applications of training-free diffusion watermarking algorithms.

🛠️ MaXsive Framework

MaXsive framework: The watermark is a small-dimension vector sampled from an ideal Gaussian distribution, duplicated and encrypted by private keys to form the input noise for the diffusion model.
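
The pipeline in the caption above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the released implementation: the "encryption" is stood in for by a key-seeded permutation plus random sign flips (both keep the result Gaussian-distributed), the latent shape `(4, 64, 64)` and watermark length 256 are example values, and the names `embed_watermark` and `recover_watermark` are hypothetical.

```python
import numpy as np

def _keyed_scramble(n, key):
    """Derive a permutation and sign pattern from an integer key.
    Assumption: this stands in for the private-key encryption step."""
    rng = np.random.default_rng(key)
    perm = rng.permutation(n)
    signs = rng.choice([-1.0, 1.0], size=n)
    return perm, signs

def embed_watermark(watermark, noise_shape, key):
    """Duplicate a short Gaussian watermark to fill the latent, then scramble it."""
    n = int(np.prod(noise_shape))
    reps = -(-n // watermark.size)            # ceiling division
    tiled = np.tile(watermark, reps)[:n]      # duplicated watermark
    perm, signs = _keyed_scramble(n, key)
    return (tiled[perm] * signs).reshape(noise_shape)

def recover_watermark(noise, wm_size, key):
    """Undo the scramble and average the duplicated copies."""
    flat = noise.reshape(-1)
    n = flat.size
    perm, signs = _keyed_scramble(n, key)
    tiled = np.empty(n)
    tiled[perm] = flat * signs                # inverse of tiled[perm] * signs
    idx = np.arange(n) % wm_size              # which watermark entry each slot copies
    sums = np.bincount(idx, weights=tiled, minlength=wm_size)
    counts = np.bincount(idx, minlength=wm_size)
    return sums / counts

# Example usage with assumed sizes.
wm = np.random.default_rng(0).standard_normal(256)
init_noise = embed_watermark(wm, (4, 64, 64), key=1234)
wm_back = recover_watermark(init_noise, 256, key=1234)
```

In a real detector the initial noise would first have to be estimated from the image (for example by inverting the diffusion process), so the recovered copies are noisy; averaging the duplicates is one simple way such redundancy can help.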

🧩 Template Robustness

Visualization of the template space under various distortions. Left of the dashed line: images generated by Stable Diffusion 2.1 with identical prompts and initial noise, with the template injected in the second column. Right of the dashed line: the corresponding template images under distortion.

📚 Citation

If you use MaXsive, please cite:

@inproceedings{mao2025maxsive,
  title={MaXsive: High-Capacity and Robust Training-Free Generative Image Watermarking in Diffusion Models},
  author={Mao, Po-Yuan and Tsai, Cheng-Chang and Lu, Chun-Shien},
  booktitle={ACM International Conference on Multimedia},
  year={2025}
}