ICLR 2026 - Reviews


Reviews

Summary Statistics

| EditLens Prediction | Count | Avg Rating | Avg Confidence | Avg Length (chars) |
|---|---|---|---|---|
| Fully AI-generated | 1 (20%) | 6.00 | 4.00 | 4120 |
| Heavily AI-edited | 0 (0%) | N/A | N/A | N/A |
| Moderately AI-edited | 0 (0%) | N/A | N/A | N/A |
| Lightly AI-edited | 1 (20%) | 6.00 | 4.00 | 3142 |
| Fully human-written | 3 (60%) | 7.33 | 4.33 | 2970 |
| **Total** | 5 (100%) | 6.80 | 4.20 | 3235 |
**Title:** PriorGuide: Test-Time Prior Adaptation for Simulation-Based Inference
**Ratings:** Soundness: 3 (good) · Presentation: 4 (excellent) · Contribution: 3 (good) · Rating: 6 (marginally above the acceptance threshold) · Confidence: 4 (confident, but not absolutely certain)
**EditLens Prediction:** Lightly AI-edited

**Summary.** **PriorGuide** targets a practical SBI need: adapting a trained diffusion-based posterior estimator to **new priors at test time** with **no retraining**. The paper derives a clean **score decomposition** in which the target posterior score equals the trained score plus a **prior-ratio guidance** term. To make this usable, the authors (i) use a *standard* Gaussian reverse-kernel approximation (via Tweedie's formula) and (ii) fit a Gaussian mixture to the prior ratio, yielding a *novel* closed-form guidance update that plugs directly into the diffusion sampler. Optional few-step Langevin corrections at low noise tighten asymptotics and expose a neat compute–accuracy knob alongside the diffusion steps. Empirically, PriorGuide improves both posterior and posterior-predictive metrics across several SBI tasks while keeping the implementation lightweight and the training unchanged.

**Strengths.**
- **Well written, easy to follow**: The derivation is easy to follow; the GMM prior ratio leads to implementable, closed-form guidance. Propositions are sound and proven in the appendix.
- **Good empirical results with clear trade-offs and ablations**: Consistent gains on posterior and predictive metrics; straightforward ablations over diffusion/Langevin steps make hyperparameter selection transparent. The paper is upfront about approximation choices and about where Langevin steps matter, which helps practitioners reason about when to use the method.

**Weaknesses.**
- **Prior family diversity**: The main experiments emphasize changes among Gaussian priors (often with **diagonal covariance**). This is required for fair ACE comparisons but does not demonstrate the method's claimed generality. A demonstration on **non-factorized** or more structured (complicated) priors would strengthen external validity. For example, one use case that requires flexible priors is i.i.d. data, which can also be handled by sequentially updating the "prior" with the "current posterior".

**Questions.**
1. **Non-factorized complex priors**: Have you tried dense-covariance Gaussians or generally complicated priors? Have you tried inference on i.i.d. data using PriorGuide?
2. **Better Gaussian reverse-kernel approximations**: There exist better reverse-kernel approximations (e.g., [1] and related). These are usually more costly, e.g., requiring Jacobians or other terms. It would be interesting to see whether these also help here, but at the least they should be discussed in the manuscript.
3. **Matrix calculations**: The authors state that a current limitation is that "matrix operations may pose a scalability issue". A straightforward way to avoid this is to constrain the involved covariances to be *diagonal* (or another approximation); see the sketch after this review. After all, a mixture of Gaussians with diagonal covariances is still a universal approximation family (it just may need more components). As the backward-kernel approximation is by design also diagonal, the whole guidance term reduces to the diagonal. This makes *PriorGuideDiag* an interesting alternative to the current approach.

[1] Boys, Benjamin, et al. "Tweedie moment projected diffusions for inverse problems." arXiv preprint arXiv:2310.06721 (2023).
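To make the *PriorGuideDiag* suggestion in question 3 concrete, here is a minimal sketch of how the guidance gradient becomes purely elementwise when both the GMM ratio approximation and the reverse-kernel approximation use diagonal covariances. It relies only on the standard identity that a Gaussian density integrated against a Gaussian is again a Gaussian; all names and shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.special import logsumexp

def diag_gaussian_logpdf(x, mean, var):
    # log N(x; mean, diag(var)), summed over dimensions; broadcasts over components.
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)

def priorguide_diag_guidance(m_t, s2_t, weights, mus, sigma2s):
    """Gradient w.r.t. m_t of log E[r(theta_0)] when r is approximated by a
    mixture of diagonal Gaussians and the reverse kernel by N(theta_0; m_t,
    diag(s2_t)).  The Gaussian-Gaussian integral is then a Gaussian in m_t
    with variance s2_t + sigma2_k, so no matrix inverses are needed.

    Shapes: m_t, s2_t: (D,); weights: (K,); mus, sigma2s: (K, D)."""
    var_k = s2_t + sigma2s                                    # (K, D) effective variances
    log_comp = np.log(weights) + diag_gaussian_logpdf(m_t, mus, var_k)   # (K,)
    resp = np.exp(log_comp - logsumexp(log_comp))             # component responsibilities
    grads = (mus - m_t) / var_k                               # d/dm log N(m; mu_k, v_k)
    return np.sum(resp[:, None] * grads, axis=0)              # (D,) guidance term
```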
**Title:** PriorGuide: Test-Time Prior Adaptation for Simulation-Based Inference
**Ratings:** Soundness: 3 (good) · Presentation: 4 (excellent) · Contribution: 3 (good) · Rating: 6 (marginally above the acceptance threshold) · Confidence: 4 (confident, but not absolutely certain)
**EditLens Prediction:** Fully AI-generated

**Summary.** The paper proposes PriorGuide, a method to adapt diffusion-based amortized simulation-based inference (SBI) models to new priors at test time without retraining. The key idea is to express the target-posterior score as the original score (under the training prior) plus a guidance term involving the prior ratio $r(\omega)=q(\omega)/p_{\text{train}}(\omega)$. To make the guidance tractable, the reverse transition kernel is approximated as Gaussian and the prior ratio is approximated by a Gaussian mixture model (GMM), which yields a closed-form guidance update. The method is evaluated on a suite of SBI problems for both posterior and posterior-predictive inference. Empirically, PriorGuide attains competitive performance relative to baselines, and the paper further studies a compute–accuracy trade-off using interleaved Langevin refinement.

**Strengths.**

**Quality.** The paper's theoretical foundation is solid: the score decomposition with a guidance term is standard and correctly instantiated for the SBI setting. The Gaussian reverse-kernel and GMM prior-ratio approximations are reasonable and enable a practical algorithm. The analysis of test-time compute via Langevin refinement is appropriate, and the extended experiments cover canonical SBI tasks. The ablation and sensitivity studies in the appendix further strengthen the empirical grounding.

**Clarity.** The manuscript is exceptionally well written, with well-structured exposition and clear intuition for the guidance term. Notation is introduced early and consistently; figures and tables are easy to interpret. The appendix integrates detailed methodological clarifications—GMM fitting, diagnostics, and runtime analyses—that improve reproducibility. The discussion of the Pareto front between diffusion and Langevin steps is particularly insightful.

**Significance and Contribution.** The work addresses a practically important problem—changing priors after training—where retraining amortized SBI models can be prohibitively expensive. Adapting diffusion guidance to encode new priors at inference is an appealing and useful contribution. The empirical scope is strong, especially given the additional baselines in the appendix (Rejection Sampling, Sequential Importance Resampling, NLE+MCMC), a sensitivity analysis, and a 20D test demonstrating scalability. The contribution remains primarily empirical but offers meaningful methodological value for the SBI community.

**Originality.** Applying diffusion guidance to test-time prior adaptation in amortized SBI is a natural but novel design choice. The GMM ratio approximation provides a practical mechanism to operationalize this guidance, and the extended analyses confirm that the method maintains competitive performance where simpler baselines fail.

**Weaknesses.**

**Scope and positioning.** While the paper clearly demonstrates PriorGuide's strengths, it could offer more explicit practical guidance for practitioners—specifically, when PriorGuide should be preferred over simpler baselines such as NLE+MCMC, RS, and SIR, or related methods like Simformer and ACE, and how to balance its additional test-time computation against retraining or posterior-correction methods. While this is discussed qualitatively in the appendix, a concise summary in the main text would make the contribution more actionable.

**Dimensionality constraints.** Although the new 20D Gaussian Linear results demonstrate scalability, these remain relatively simple benchmarks. A more complex or real-world high-dimensional example would strengthen the claim of general applicability.

**Approximation limits.** The Gaussian reverse-kernel approximation, while adequate for the reported tasks, remains a simplifying assumption. Even with Langevin refinement mitigating some of its effects, further diagnostic guidance for practitioners would be useful.

**Questions.**
1. Could the authors provide clearer practical guidance on when PriorGuide is preferable to simpler or amortized alternatives? For instance, how should practitioners trade off its added test-time computation against retraining costs or simpler posterior-correction methods?
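Up to notation (which may differ from the paper's), the score decomposition that this and the preceding review describe has the form

$$
\nabla_{\omega_t}\log p_t^{\,q}(\omega_t \mid x)
= \underbrace{\nabla_{\omega_t}\log p_t(\omega_t \mid x)}_{\text{trained score}}
+ \underbrace{\nabla_{\omega_t}\log \mathbb{E}_{p(\omega_0 \mid \omega_t,\, x)}\!\bigl[r(\omega_0)\bigr]}_{\text{prior-ratio guidance}},
\qquad r(\omega)=\frac{q(\omega)}{p_{\text{train}}(\omega)}.
$$

The expectation over the reverse kernel $p(\omega_0 \mid \omega_t, x)$ is the intractable piece; the Gaussian reverse-kernel and GMM-ratio approximations are what turn it into a closed-form update.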
**Title:** PriorGuide: Test-Time Prior Adaptation for Simulation-Based Inference
**Ratings:** Soundness: 4 (excellent) · Presentation: 3 (good) · Contribution: 4 (excellent) · Rating: 10 (strong accept, should be highlighted at the conference) · Confidence: 5 (absolutely certain; very familiar with the related work and checked the math/other details carefully)
**EditLens Prediction:** Fully human-written

**Summary.** PriorGuide introduces an extension to the framework of simulation-based inference (SBI) with diffusion models that allows pre-trained diffusion models to incorporate new prior distributions at inference time without requiring retraining. This is achieved using a novel Gaussian mixture model approximation to make the target prior tractable for guiding the diffusion sampling process. Empirical results on synthetic data demonstrate PriorGuide's effectiveness in accurately recovering posterior and posterior-predictive distributions across various SBI problems. The method also allows for refinement through Langevin dynamics, enabling a balance between computational cost and accuracy; a sketch of such a corrector follows this review. By enabling adaptation to new priors without retraining, PriorGuide exemplifies a "test-time compute" approach, extending the capabilities of pre-trained models with targeted computations. This decoupling of simulator runs from prior specification offers practical benefits, especially when simulations are costly, including the ability to perform post-hoc prior sensitivity analyses and to incorporate domain-expert knowledge after training, ultimately reducing the computational burden of scientific workflows that employ SBI.

**Strengths.** The manuscript proposes an extension of Simformer modeling to the challenging problem of representing complex joint probability distributions. The core idea is relatively straightforward, yet the authors provide a rigorous mathematical treatment that appropriately identifies both the potential benefits and the inherent limitations of this approach, acknowledging the necessary reliance on approximations. The empirical evaluation is a strength of the paper; the results presented demonstrate promising performance, particularly in the accuracy of the reported uncertainty estimates (Tables 1 & 2). The selection of benchmark use cases is clearly motivated and relevant to the stated goals. Finally, the authors effectively use concise and informative visualizations to support their claims and illustrate the underlying hypothesis.

**Weaknesses.** The manuscript would benefit from a more careful review of its language, as instances of anthropomorphism (e.g., "the model having seen few training examples") can detract from the scientific rigor. Additionally, the presentation of the derivation for `r` is hampered by misaligned page breaks, separating key explanatory text and hindering comprehension. More significantly, while the paper thoroughly explores the technical capabilities of PriorGuide, it lacks a discussion of the broader implications of manipulating the prior distribution within a Bayesian framework. The authors appear to assume familiarity with this interplay between prior, simulator, and the scientific endeavor, which may limit the accessibility and impact of the work for a wider audience.

**Questions.**
- Page 4: the derivation of eqs. 6–9 is central to the paper; please try to rearrange so that the text in lines 216–220 is readily visible close to the equations (e.g., by removing lines 191–195).
- The code for the experiments was not found in the paper, e.g., through an anonymised git repo; this should be corrected for publication.
- Line 232: please do not use anthropomorphic language to describe model behavior ("the model having seen few training examples"); e.g., "the model has consumed" or "was trained on".
- Line 234: "using standard OOD metrics" — as such a standard exists only colloquially, please remove "standard".
- Eq. 13: the text lacks a discussion of whether `K` is a free parameter and how to choose it.
- Line 274: "Notably, when p(θ) is uniform, r(θ) reduce*s* to q(θ), which can be directly specified as a Gaussian mixture." Perhaps also add a hint as to why r reduces to q (due to using the expectation, I guess).
- Line 310: "To incorporate this with regular diffusion ..." — perhaps comment on the computational cost here already.
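As a rough illustration of the Langevin refinement mentioned in the summary above, an unadjusted Langevin corrector applied at low noise levels looks like the following; the step size, number of steps, and function names are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def langevin_refine(theta, adapted_score, step_size, n_steps, rng):
    """A few unadjusted Langevin steps targeting the prior-adapted posterior.

    adapted_score(theta) is assumed to return the pretrained posterior score
    plus the PriorGuide guidance term; more steps trade compute for accuracy."""
    for _ in range(n_steps):
        noise = rng.standard_normal(theta.shape)
        theta = theta + step_size * adapted_score(theta) \
                      + np.sqrt(2.0 * step_size) * noise
    return theta
```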
**Title:** PriorGuide: Test-Time Prior Adaptation for Simulation-Based Inference
**Ratings:** Soundness: 3 (good) · Presentation: 3 (good) · Contribution: 2 (fair) · Rating: 4 (marginally below the acceptance threshold) · Confidence: 4 (confident, but not absolutely certain)
**EditLens Prediction:** Fully human-written

**Summary.** This paper considers the amortized simulator-based inference framework and one of its major limitations: the prior distribution used to generate model parameters is fixed at training time. The authors propose a novel method, PriorGuide, which leverages a guidance approximation for flexible test-time prior adaptation. The method was evaluated on a set of benchmarks. Its most significant strength is requiring no retraining.

**Strengths.** The paper has a few strengths:
1. The "prior-rigidity" of amortized SBI models is a well-known and significant bottleneck for scientific application. This paper offers a direct, "plug-and-play" solution that enables, for the first time, post-hoc prior sensitivity analysis and the incorporation of new expert knowledge using expensive, pre-trained models.
2. The method is built on a sound mathematical foundation. The derivation of the target posterior as a tilted distribution (Proposition 1) is clear and correct.
3. The experiments are convincing. The paper correctly identifies that its method is an approximate sampler and intelligently introduces Langevin correctors to manage the accuracy–compute trade-off.

**Weaknesses.** However, despite its strengths, there are also weaknesses:
1. The entire method hinges on a critical, and likely fragile, approximation: fitting the prior ratio $r(\theta) = q(\theta) / p_{\text{train}}(\theta)$ with a GMM. The paper understates the difficulty of this step. This is a density-ratio estimation problem, which is notoriously difficult, especially as the dimension of $\theta$ increases. The paper's solution (a gradient-based $L_2$ fit) is a heuristic that is not guaranteed to be stable or accurate, particularly if $p_{\text{train}}(\theta)$ is non-trivial.
2. The paper employs two major approximations: the GMM for the ratio and a simple diagonal Gaussian for the reverse kernel. It is the combination of these (especially the latter) that biases the sampler and requires heavy use of Langevin correction steps to achieve good results. This means the method, while "retrain-free," has a high and non-trivial per-sample inference cost.

**Questions.** Besides the weaknesses, I also have a question:
1. The GMM approximation of the prior ratio $r(\theta)$ appears to be the most critical (and fragile) part of the pipeline. How does the $L_2$ fitting procedure scale with the dimensionality of $\theta$? Have you considered replacing this heuristic fit with more robust, modern density-ratio estimators (e.g., flow-based, MINE, etc.)?
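To make the scaling question above concrete, a gradient-based $L_2$ fit of a diagonal GMM to the ratio might look roughly as follows. The callables `log_q` and `log_p_train`, the optimizer, and the diagonal parameterization are all illustrative assumptions, not the paper's actual procedure; inputs are assumed to be torch tensors.

```python
import torch

def fit_ratio_gmm(log_q, log_p_train, theta_samples, K=5, n_iters=500, lr=1e-2):
    """Illustrative gradient-based L2 fit of a diagonal GMM to the ratio
    r(theta) = q(theta) / p_train(theta) on samples from the training prior."""
    N, D = theta_samples.shape
    logits = torch.zeros(K, requires_grad=True)                       # mixture logits
    mus = theta_samples[torch.randperm(N)[:K]].clone().requires_grad_(True)
    log_sigmas = torch.zeros(K, D, requires_grad=True)
    target = torch.exp(log_q(theta_samples) - log_p_train(theta_samples)).detach()
    opt = torch.optim.Adam([logits, mus, log_sigmas], lr=lr)
    for _ in range(n_iters):
        opt.zero_grad()
        w = torch.softmax(logits, dim=0)                              # (K,)
        var = torch.exp(2.0 * log_sigmas)                             # (K, D)
        diff = theta_samples[:, None, :] - mus[None, :, :]            # (N, K, D)
        log_comp = (-0.5 * (diff**2 / var + torch.log(2 * torch.pi * var))).sum(-1)
        r_hat = (w * torch.exp(log_comp)).sum(-1)                     # GMM at theta, (N,)
        loss = ((r_hat - target) ** 2).mean()                         # L2 objective
        loss.backward()
        opt.step()
    return torch.softmax(logits, dim=0), mus, torch.exp(log_sigmas)
```

Even in this simple form, the fit depends on where `theta_samples` land relative to the mass of $q$, which is the heart of the reviewer's concern about stability in higher dimensions.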
**Title:** PriorGuide: Test-Time Prior Adaptation for Simulation-Based Inference
**Ratings:** Soundness: 3 (good) · Presentation: 3 (good) · Contribution: 3 (good) · Rating: 8 (accept, good paper) · Confidence: 4 (confident, but not absolutely certain)
**EditLens Prediction:** Fully human-written

**Summary.** The paper considers the interesting problem of adapting a diffusion model pretrained to sample from a posterior when a different prior (unknown at training time) is adopted by the user. The proposed algorithm, PriorGuide, corrects the bias of the pretrained diffusion model during the sampling stage, providing test-time adaptive inference. The guidance is intractable but is approximated using Gaussian mixture models. The experiments show that, on a wide variety of benchmark datasets, the proposed method outperforms similar diffusion methods without prior adaptation.

**Strengths.** Overall, I think this paper makes a solid contribution to the SBI community. The problem of changing priors is indeed common in many applications, and the ability to adapt a pre-trained posterior model without retraining the entire model is highly desirable; it makes a lot of sense when the user's prior is different and comes in at a later stage. Diffusion guidance is a well-studied technique in generative-model adaptation, so it is natural to leverage this approach to adapt SBI diffusion models. To the best of my knowledge, the proposed idea is novel. The proposed method is mathematically sound, and its resulting formulation is natural, separating the pre-trained model from the guidance term (the expected prior shift). Although the guidance term itself is intractable, the authors show it can be effectively approximated.

**Weaknesses.** My concerns are mostly about the literature review. In the main text, the authors do not provide a comprehensive review or comparison of prior-adaptation methods in the existing literature. As briefly mentioned in the introduction and the experimental section, there are already methods that consider adaptive priors. Although the authors include a separate section in the appendix, a dedicated subsection in Section 2 would be appreciated, as readers would like to understand the specific limitations of these prior-adaptation methods and how (or whether) they relate to the proposed guidance method. In Section 2, recent progress (2022 onward) on applying diffusion models to SBI should also be comprehensively reviewed and cited, as it provides the backdrop (the pre-trained posterior score) for the proposed method. To the best of my knowledge, the earliest attempt to use diffusion models in SBI is often attributed to Geffner et al. (2023), first published on arXiv in 2022. Simformer is a subsequent work built on Geffner et al. and others, introducing masks and a transformer architecture to handle missing observations and more versatile inference tasks.

**Questions.** None.
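As a sketch of where the correction described in this review enters, one guided Euler–Maruyama reverse step under a variance-exploding convention could look like the following; this is purely illustrative, and the paper's sampler and noise schedule may differ.

```python
import numpy as np

def guided_reverse_step(theta_t, g2_dt, pretrained_score, guidance, rng):
    """One reverse-diffusion step with the test-time prior correction:
    theta_{t-dt} = theta_t + g(t)^2 dt * (score + guidance) + g(t) sqrt(dt) * z."""
    adapted = pretrained_score(theta_t) + guidance(theta_t)  # prior-adapted score
    z = rng.standard_normal(theta_t.shape)
    return theta_t + g2_dt * adapted + np.sqrt(g2_dt) * z
```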