Sampling from the posterior distribution poses a major computational challenge in solving inverse problems using latent diffusion models. Common methods rely on Tweedie's first-order moments, which are known to induce a quality-limiting bias. Existing second-order approximations are impractical due to prohibitive computational costs, making standard reverse diffusion processes intractable for posterior sampling.
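For reference, the first- and second-order Tweedie moments mentioned above can be written out as follows. This is a sketch under an assumed variance-preserving forward process; the symbols $x_0$, $x_t$, $\bar\alpha_t$, and $p_t$ are notation chosen here for illustration and do not come from this page.

```latex
\begin{align}
x_t &= \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,
  \qquad \epsilon \sim \mathcal{N}(0, I) \\
% First-order Tweedie: posterior mean from the score
\mathbb{E}[x_0 \mid x_t] &= \frac{1}{\sqrt{\bar\alpha_t}}
  \left( x_t + (1-\bar\alpha_t)\, \nabla_{x_t} \log p_t(x_t) \right) \\
% Second-order Tweedie: posterior covariance from the Hessian of the log-density
\operatorname{Cov}[x_0 \mid x_t] &= \frac{1-\bar\alpha_t}{\bar\alpha_t}
  \left( I + (1-\bar\alpha_t)\, \nabla^2_{x_t} \log p_t(x_t) \right)
\end{align}
```

The covariance involves the full Hessian of $\log p_t$, which is what makes a naive second-order correction expensive in high dimensions.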
This paper introduces the Second-order Tweedie sampler from Surrogate Loss (STSL), a novel sampler that offers efficiency comparable to first-order Tweedie while using a second-order approximation with a tractable reverse process. Our theoretical results show that the second-order approximation is lower-bounded by a surrogate loss that requires only O(1) compute using the trace of the Hessian; from this lower bound we derive a new drift term that makes the reverse process tractable. STSL surpasses the state-of-the-art solvers PSLD [3] and P2L [4], achieving 4X and 8X reductions in neural function evaluations, respectively, while notably enhancing sampling quality on the FFHQ, ImageNet, and COCO benchmarks. In addition, we show that STSL extends to text-guided image editing and addresses the residual distortions that leading text-guided image editing methods exhibit when the input images are corrupted.
To the best of our knowledge, this is the first work to offer an efficient second-order approximation for solving inverse problems using latent diffusion and for editing real-world images with corruptions.
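To illustrate why a Hessian trace can be much cheaper than the full Hessian, here is a minimal sketch of Hutchinson's stochastic trace estimator driven by finite differences of a gradient. This is a generic technique shown for intuition only, not necessarily the surrogate loss used in STSL. The toy quadratic objective and the names `grad_quadratic`, `hutchinson_trace`, `num_probes`, and `step` are all assumptions introduced for this example.

```python
import numpy as np


def grad_quadratic(x, A):
    """Gradient of the toy objective f(x) = 0.5 * x^T A x (A symmetric).

    Stands in for a score/gradient oracle; hypothetical, for illustration only.
    """
    return A @ x


def hutchinson_trace(grad_fn, x, num_probes=1, step=1e-3, rng=None):
    """Estimate tr(H), where H is the Jacobian of grad_fn at x.

    Uses Rademacher probes eps and a forward finite difference
    H @ eps ~= (grad_fn(x + step * eps) - grad_fn(x)) / step,
    so each probe costs one extra gradient evaluation regardless of dimension.
    """
    rng = np.random.default_rng() if rng is None else rng
    g0 = grad_fn(x)
    estimates = []
    for _ in range(num_probes):
        # Rademacher probe: entries are +1 or -1 with equal probability
        eps = rng.choice([-1.0, 1.0], size=x.shape)
        # Finite-difference Hessian-vector product
        hvp = (grad_fn(x + step * eps) - g0) / step
        # eps^T H eps is an unbiased estimate of tr(H)
        estimates.append(float(eps @ hvp))
    return float(np.mean(estimates))


if __name__ == "__main__":
    A = np.diag([1.0, 2.0, 3.0])
    x = np.array([0.5, -1.0, 2.0])
    est = hutchinson_trace(lambda v: grad_quadratic(v, A), x,
                           rng=np.random.default_rng(0))
    print("estimated trace:", est, "exact trace:", np.trace(A))
```

For a diagonal Hessian, as in this toy case, a single Rademacher probe recovers the trace up to floating-point error; for a general Hessian the estimate is unbiased but noisy, and averaging more probes reduces the variance.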
We introduce a new framework for high-fidelity image editing in real-world environments with corruptions. To the best of our knowledge, this is the first framework that can handle corruptions in image editing pipelines.
[Figure: text-guided editing of corrupted images. Prompt edits shown: "a high quality photo of a tiger face" → "a high quality photo of a leopard face"; "a high quality photo of a cat face" → "a high quality photo of a fox face". Corruptions shown: Motion Blur, Super-Resolution (8X), Gaussian Blur.]
[1] Amir Hertz, Ron Mokady, Jay Tenenbaum, Kfir Aberman, Yael Pritch, and Daniel Cohen-Or. Prompt-to-prompt image editing with cross attention control. ICLR, 2023.
This research has been partially supported by NSF Grant 2019844, Google Research, and the UT Austin Machine Learning Lab (MLL). Litu Rout has been supported by the Ju-Nam and Pearl Chew Endowed Presidential Fellowship in Engineering and the George J. Heuer, Jr. Ph.D. Endowed Graduate Fellowship.
@inproceedings{rout2023secondorder,
  title={Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion},
  author={Rout, L and Chen, Y and Kumar, A and Caramanis, C and Shakkottai, S and Chu, W},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}