ICLR 2026 - Reviews


Reviews

Summary Statistics

| EditLens Prediction | Count | Avg Rating | Avg Confidence | Avg Length (chars) |
|---|---|---|---|---|
| Fully AI-generated | 0 (0%) | N/A | N/A | N/A |
| Heavily AI-edited | 1 (25%) | 8.00 | 4.00 | 2608 |
| Moderately AI-edited | 0 (0%) | N/A | N/A | N/A |
| Lightly AI-edited | 1 (25%) | 4.00 | 4.00 | 3010 |
| Fully human-written | 2 (50%) | 4.00 | 3.00 | 3356 |
| Total | 4 (100%) | 5.00 | 3.50 | 3082 |
---

**A Geometric Unification of Generative AI with Manifold-Probabilistic Projection Models**

Soundness: 4: excellent
Presentation: 4: excellent
Contribution: 3: good
Rating: 8: accept, good paper
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.

**Summary:**

This paper introduces the Manifold-Probabilistic Projection Model (MPPM) and its latent variant (LMPPM), which unify geometric and probabilistic interpretations of generative modeling. The method interprets diffusion models as iterative projections onto the manifold of “good” images, defined through a distance function and an associated kernel-based probability density. The authors derive this formulation rigorously from geometric principles, introduce both ambient-space and latent-space implementations, and connect the model to classical autoencoder architectures. The paper is well-written, mathematically detailed, and offers a clear conceptual framework that bridges geometry and probability in generative modeling.

**Strengths:**

* Excellent clarity and presentation of the theoretical derivation.
* Well-organized narrative: the geometric intuition, probabilistic extension, and algorithmic details are all coherent and rigorous.
* The proposed framework provides an elegant deterministic alternative to diffusion sampling, supported by sound intuition.
* Clear and readable mathematical notation throughout.

**Weaknesses:**

**Experimental scope:**
* Despite the theoretical strength, experiments are limited to MNIST and SCUT-FBP5500, which are small and relatively trivial datasets. After such a solid theoretical development, this weak experimental section feels like a missed opportunity.
* Evaluations rely mainly on the Latent MPPM variant, and mostly in a reconstruction setting rather than true generation.
* While reconstruction is a valid demonstration, it is not the most relevant metric for generative models. The paper would be much stronger if generation quality were assessed on more challenging datasets such as ImageNet 64×64, CelebA-HQ, or CIFAR-10. Even a small-scale generation study (e.g., 32×32), if focused, would make the contribution more complete.

**Focus dilution:** The inclusion of reconstruction experiments makes the paper feel slightly misaligned with its main message. A more focused evaluation of generation performance would better highlight the model’s strengths.

**Minor technical comments:**
* Missing citations for the Eikonal equation (line 145) and kernel density estimation (line 185).
* Line 216: the term “normalized gradient” seems redundant since $\|\nabla D_M(x)\| = 1$ by the Eikonal equation.
* Equation (10): unclear why $G(z)$ appears outside the exponential.
* Line 55: the authors do not **propose** the manifold assumption but rather **assume** it.

**Questions:**

I have no questions other than asking the authors to perform more focused experiments, as indicated in the "Weaknesses" section.

**EditLens Prediction:** Heavily AI-edited
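To illustrate the reviewer's point about the "normalized gradient" (a minimal sketch, not taken from the paper): for the Euclidean distance to a finite sample set standing in for the manifold, the gradient of the distance already has unit norm away from the set, which is exactly the Eikonal property $\|\nabla D_M(x)\| = 1$. The sample set, distance function, and test point below are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for points on the "good image" manifold
manifold_samples = rng.normal(size=(100, 2))

def distance_to_manifold(x):
    """Euclidean distance from x to the nearest manifold sample."""
    return np.min(np.linalg.norm(manifold_samples - x, axis=1))

def numerical_gradient(f, x, eps=1e-5):
    """Central finite-difference gradient of a scalar function."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

x = np.array([3.0, -2.0])                      # a point off the sample set
grad = numerical_gradient(distance_to_manifold, x)
print(np.linalg.norm(grad))                    # ~1.0: the distance gradient is already unit-norm
```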
---

**A Geometric Unification of Generative AI with Manifold-Probabilistic Projection Models**

Soundness: 2: fair
Presentation: 3: good
Contribution: 1: poor
Rating: 4: marginally below the acceptance threshold
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.

### Summary

The paper introduces a geometric picture framing VAE, GAN, and diffusion models with respect to the data manifold, interpreting diffusion models as iterative manifold projection. The authors then derive a model/objective (LMPPM) based on this idea and show some improved performance at removing image degradation on a few datasets.

### Strength

- The conceptual discussion about the manifold geometry in diffusion, VAE, and GAN is interesting, and thinking about diffusion models in this geometric way is laudable.

### Weakness

- The new method in Sec. 4 is not very convincing and/or not well framed in the literature. Specifically, there are many connections to existing diffusion models, energy-based models, etc., obtained just by re-interpreting the entities.
  - e.g., the first term of the loss in Eq. 13 learns the distance or energy instead of the score vector itself.
  - The 5th term is very similar to the denoising score matching objective if we parametrize the score by denoisers [^3,^4], since $z_i^{shift}$ is basically the denoiser; it enforces the gradient of the distance to be nicely aligned with the score.
  - In this regard, it seems the main innovation is having a distance function without explicit conditioning on the noise scale. But [^5,^6] also discussed / discovered that time / noise-scale conditioning is not necessary.

  [^5] Sun, Q., Jiang, Z., Zhao, H., & He, K. (2025). Is Noise Conditioning Necessary for Denoising Generative Models?
  [^6] Kadkhodaie, Z., Guth, F., Simoncelli, E. P., & Mallat, S. (2024). Generalization in diffusion models arises from geometry-adaptive harmonic representations. ICLR

- From the algorithm or the method itself, I cannot see a clear reason why the proposed MPPM or LMPPM method is better than LDM. Is it the case that LDM needs a certain noise / time conditioning, so that if you input the wrong noise / time it will not correctly denoise the image, whereas LMPPM has no time conditioning and is therefore more robust in that regard?
  - Currently the FID in Table 1 is very high for LDM and DAE, which is a bit concerning. I feel something is wrong in the implementation of these baselines.
  - Elucidating why LMPPM is better via ablation / control experiments would largely improve the paper and increase my evaluation of it.
- In the abstract, why do the authors say “*The foundational premise of generative AI for images is the assumption that images are inherently low-dimensional objects embedded within a high dimensional space*”? It seems generative AI could still work if images were not low-dimensional objects. I agree with the assumption, but do not think it is a foundational premise of generative AI.
- I feel the geometric view of diffusion models (Sec. 3, Fig. 2) is definitely correct and worth noting, but it is also not entirely new. The authors could mention very similar figures, e.g. Fig. 1 of [^1] and Fig. 4 of [^2]. The quantity noted in Eq. 8 has a name in many papers, i.e. the ideal denoiser [^3,^2], and the relation between the score and the denoiser is known as Tweedie's formula [^4]:

  $\hat{x}_{\text{MMSE}} = \mathbb{E}[u \mid x] = x + \sigma^2 \nabla_x \log P(x)$

  [^1] Chen, D., Zhou, Z., Wang, C., Shen, C., & Lyu, S. (2024). On the trajectory regularity of ODE-based diffusion sampling. ICML. https://arxiv.org/abs/2405.11326
  [^2] Wang, & Vastola (2024). The unreasonable effectiveness of Gaussian score approximation for diffusion models and its applications. TMLR. https://arxiv.org/abs/2412.09726
  [^3] Karras, T., Aittala, M., Aila, T., & Laine, S. (2022). Elucidating the design space of diffusion-based generative models. NeurIPS
  [^4] Efron, B. (2011). Tweedie's formula and selection bias. Journal of the American Statistical Association
- Eq. 14 is also known as the ideal denoiser under a delta-mixture / empirical distribution [^2,^3].
- As the authors point out, the Riemannian geometry of the data manifold through the generator of GANs and VAEs has been studied for a while; some references could be added for this tradition [^7,^8,^9].

  [^7] Shao, H., Kumar, A., & Fletcher, P. T. (2017). The Riemannian geometry of deep generative models. CVPR Workshops
  [^8] Wang, B., & Ponce, C. R. (2021). The geometry of deep generative image models and its applications. ICLR
  [^9] Chadebec, C., & Allassonnière, S. (2022). A geometric perspective on variational autoencoders. NeurIPS

**EditLens Prediction:** Fully human-written
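For concreteness (a minimal numerical sketch, not code from the paper or the cited works): under a Gaussian-smoothed empirical (delta-mixture) distribution, the ideal denoiser is a softmax-weighted average of the training points, and Tweedie's formula $\hat{x} = x + \sigma^2 \nabla_x \log p_\sigma(x)$ recovers exactly that denoiser. The data, noise level, and function names below are hypothetical.

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)
data = rng.normal(size=(50, 4))    # hypothetical "clean" training points y_i
sigma = 0.5                        # noise level of the smoothed density p_sigma

def log_p_sigma(x):
    """Log of the Gaussian-smoothed empirical density, up to an additive constant."""
    return logsumexp(-np.sum((data - x) ** 2, axis=1) / (2 * sigma**2))

def ideal_denoiser(x):
    """Posterior mean E[y | x]: softmax-weighted average of the training points."""
    logits = -np.sum((data - x) ** 2, axis=1) / (2 * sigma**2)
    w = np.exp(logits - logsumexp(logits))
    return w @ data

x = rng.normal(size=4)             # a noisy query point

# Score via central finite differences of log p_sigma
score = np.zeros_like(x)
eps = 1e-5
for i in range(x.size):
    e = np.zeros_like(x); e[i] = eps
    score[i] = (log_p_sigma(x + e) - log_p_sigma(x - e)) / (2 * eps)

# Tweedie's formula matches the ideal denoiser
print(np.allclose(x + sigma**2 * score, ideal_denoiser(x), atol=1e-4))  # True
```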
---

**A Geometric Unification of Generative AI with Manifold-Probabilistic Projection Models**

Soundness: 2: fair
Presentation: 2: fair
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 2: You are willing to defend your assessment, but it is quite likely that you did not understand the central parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.

**Summary:**

The paper proposes a new model, the Manifold Probabilistic Projection Model (MPPM), and its latent version, which interprets diffusion models as geometric projections that iteratively move corrupted inputs toward the clean image manifold. Based on the manifold assumption that image data resides on a low-dimensional smooth manifold, the paper integrates a learned distance function into the probability vector fields to guide image reconstruction and generation. The method shows superior performance compared to the latent diffusion model on image restoration and generation tasks.

**Strengths:**

- The paper introduces a new view of the diffusion model as a projection onto the manifold.
- Using a distance-based geometric approach and a kernel-based probabilistic model, the paper tries to make an interpretable link between them and attempts to build a unified framework.
- Detailed definitions of the losses, architectures, and training settings are provided in the Appendix.

**Weaknesses:**

- While the idea of viewing the diffusion model as a projection onto the manifold is interesting, I cannot find a theoretical explanation or demonstration that the iterative process approximates a projection. Also, Equations 11 and 12 are heuristic updates without any guarantee of convergence.
- The formulation is overly complex without clear benefit. The distance function, kernels, and autoencoders introduce considerable complexity, but I do not see why it should be explicitly better than existing diffusion models. Empirical results cannot be the justification, as the datasets are too small and the baselines are too weak.
- The experiments are limited to simple datasets: MNIST and SCUT-FBP5500. They do not show general applicability or scalability. Experiments on datasets at the scale of CIFAR-10 or LSUN would be recommended. Also, the compared baselines are too weak and naive to justify using the complex formulation of the method.
- As the model introduces additional networks and iterative updates, a comparison of computational complexity against diffusion models is needed. Does learning the distance function and using it add significant cost?
- An ablation analysis of the main components, such as the distance function or the kernels, would make the claims of the paper stronger.

**Questions:**

Please address the questions raised in the weaknesses section.

**EditLens Prediction:** Fully human-written
---

**A Geometric Unification of Generative AI with Manifold-Probabilistic Projection Models**

Soundness: 3: good
Presentation: 2: fair
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.

**Summary:**

The article presents an integrated perspective that unifies geometric and probabilistic views by introducing a geometric framework and a kernel-based probabilistic method. Within this framework, the diffusion model is interpreted as a projection mechanism onto the manifold of “high-quality images,” providing new insight into its underlying nature. Building on this interpretation, the authors propose a deterministic model, the Manifold Probability Projection Model (MPPM), which operates coherently in both the representation (pixel) and latent spaces. Experimental results indicate that the Latent Space MPPM (LMPPM) surpasses the latent diffusion model (LDM) across multiple datasets, demonstrating superior performance on image restoration tasks.

**Strengths:**

1. The perspective of this article is very interesting. Unifying the geometric and probabilistic understandings is of great significance, in particular for modeling more complex data manifold distributions.
2. The theory in this article is very solid. As a work of great theoretical significance, it deserves attention.
3. The paper is well written and the motivation is very convincing.

**Weaknesses:**

1. I have some concerns about the theoretical assumptions. The article assumes the existence of Gaussian noise perturbations between points on the clean image manifold and the real images. However, if the task is not image restoration, or if the data are already sufficiently clean, this assumption, and consequently the proposed theory, may not hold effectively.
2. The experimental section lacks comparisons with several relevant baselines in image restoration and inverse problem research [1–3]. In addition, manifold-preserving approaches [4–6] should also be considered for a more comprehensive evaluation. It seems insufficient that the authors only compare with DAE and LDM.

   [1] A Unified Conditional Framework for Diffusion-based Image Restoration
   [2] DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
   [3] Refusion: Enabling Large-Size Realistic Image Restoration
   [4] Manifold Preserving Guided Diffusion
   [5] CFG++: Manifold-Constrained Classifier-Free Guidance for Diffusion Models
   [6] Improving Diffusion Models for Inverse Problems using Manifold Constraints
3. Why do the other methods work well on SSIM but worse on FID? Could this be because they only learned the distribution with noise added instead of the clean data distribution? The paper lacks a deeper analysis of this.

**Questions:**

1. Would the proposed method still be effective under the assumption of clean data? Can it be used directly for generating images rather than for image restoration tasks?
2. Referring to Weakness 2, how do the other related models perform?
3. In Figure 5, I noticed that the image generated by LMPPM does not contain white teeth. Could this be caused by the limitations of some manifold probability distributions? Or is it because of some probability assumptions that the model ignores the special manifold of the tooth part?

**EditLens Prediction:** Lightly AI-edited
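To make the "kernel-based probabilistic method" discussed in these reviews concrete (a minimal sketch of the general idea only; the paper's actual kernel, learned distance network, and normalization may differ), a distance function $D_M(x)$ can induce an unnormalized density via a Gaussian-type kernel, $p(x) \propto \exp(-D_M(x)^2 / (2h^2))$, whose log-gradient then points back toward the manifold, so gradient ascent acts like an iterative projection. All names, the bandwidth, and the step size below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
manifold_samples = rng.normal(size=(200, 2))   # hypothetical stand-in for clean data points
bandwidth = 1.0                                 # hypothetical kernel bandwidth h

def distance_to_manifold(x):
    """Proxy for a learned distance function D_M(x): distance to the nearest clean sample."""
    return np.min(np.linalg.norm(manifold_samples - x, axis=1))

def unnormalized_log_density(x):
    """Kernel-induced density: log p(x) = -D_M(x)^2 / (2 h^2) + const."""
    return -distance_to_manifold(x) ** 2 / (2 * bandwidth**2)

def projection_step(x, step=0.1, eps=1e-5):
    """One gradient-ascent step on log p(x), i.e. a small move toward the manifold."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = eps
        g[i] = (unnormalized_log_density(x + e) - unnormalized_log_density(x - e)) / (2 * eps)
    return x + step * g

x = np.array([4.0, 4.0])                        # a "corrupted" point far from the samples
for _ in range(50):
    x = projection_step(x)
print(distance_to_manifold(x))                  # shrinks as x is pulled toward the sample set
```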