ICLR 2026 - Reviews


Reviews

Summary Statistics

| EditLens Prediction   | Count    | Avg Rating | Avg Confidence | Avg Length (chars) |
|-----------------------|----------|------------|----------------|--------------------|
| Fully AI-generated    | 1 (33%)  | 4.00       | 3.00           | 2906               |
| Heavily AI-edited     | 0 (0%)   | N/A        | N/A            | N/A                |
| Moderately AI-edited  | 0 (0%)   | N/A        | N/A            | N/A                |
| Lightly AI-edited     | 2 (67%)  | 6.00       | 3.00           | 2450               |
| Fully human-written   | 0 (0%)   | N/A        | N/A            | N/A                |
| Total                 | 3 (100%) | 5.33       | 3.00           | 2602               |
Review Details
Review 1

Title: Reducing Hallucinations in Generative Models through Truncated Statistics
Soundness: 3: good
Presentation: 3: good
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.

Summary:
This paper presents a novel and theoretically grounded algorithm to mitigate hallucinations in generative models. The authors formulate the problem as a constrained optimization task: maximizing the likelihood of observed valid data while ensuring the model's hallucination rate (the probability mass assigned to a set of invalid outputs) remains below a predefined threshold `α`. The core contribution is a novel connection between this problem and the field of truncated statistics. By using a Lagrangian relaxation, the authors show that the constrained objective can be related to the negative log-likelihood of a truncated distribution, which is a well-studied convex problem for exponential families. The paper proves that for "powerful" model families (a class they define, which includes exponential families), this approach leads to a computationally and query-efficient algorithm. The proposed method, based on projected stochastic gradient descent (PSGD) over sublevel sets of the loss function, is provably effective and settles an open problem posed by Hanneke et al. (2018) regarding the existence of such an algorithm.

Strengths:
1. In a field often dominated by empirical results, this work provides rigorous, provable guarantees on computational efficiency, query efficiency, and correctness (i.e., achieving the target hallucination rate). Proving that the problem is tractable under the "powerful model" assumption, while being NP-hard in general, is a solid theoretical contribution.
2. A key strength is the paper's novel and principled formulation. By framing hallucination reduction as a constrained optimization problem and connecting it to truncated statistics, the authors leverage established mathematical tools to develop a rigorous solution.

Weaknesses:
1. The theoretical guarantees hinge on the "powerful model" assumption, primarily exemplified by exponential families. This creates a significant gap, as it is unclear whether complex Transformer architectures satisfy this property, limiting the direct applicability of the theory to state-of-the-art models.
2. The work is entirely theoretical and lacks empirical experiments, even on synthetic data. This absence makes it difficult to assess the algorithm's practical performance and whether its polynomial complexity is feasible for large-scale applications.
3. The framework presumes access to a perfect and cost-free "invalidity oracle." This overlooks the practical challenges of implementing such an oracle, which is often a noisy, biased, and expensive component (e.g., human feedback or another model) in real-world systems.
4. The algorithm's multi-level structure, involving nested optimization loops and projections, appears complex to implement and tune. The practical difficulties of managing these components, particularly the projection onto sublevel sets, may hinder its adoption compared to simpler methods.

Questions:
Refer to Weaknesses.

EditLens Prediction: Fully AI-generated
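To make the constrained objective described in this review concrete, the following is a minimal illustrative sketch, not the paper's algorithm: it assumes a 1-D Gaussian model family and a hypothetical invalid region H = (-inf, 0), and writes the objective as a Lagrangian-penalized negative log-likelihood whose hallucination mass is estimated by querying an invalidity oracle on model samples.

```python
# Illustrative sketch only (not the paper's algorithm): a Lagrangian-penalized
# negative log-likelihood for a 1-D Gaussian, with the hallucination mass
# p_theta(H) estimated by querying a hypothetical invalidity oracle on model
# samples. The invalid region H = (-inf, 0) is an assumption for illustration.
import numpy as np

def invalidity_oracle(x):
    """Hypothetical oracle: flags samples falling in the invalid region H."""
    return x < 0.0

def hallucination_mass(mu, sigma, n_samples=10_000, seed=0):
    """Monte Carlo estimate of p_theta(H) from oracle queries on model samples."""
    samples = np.random.default_rng(seed).normal(mu, sigma, size=n_samples)
    return invalidity_oracle(samples).mean()

def gaussian_nll(data, mu, sigma):
    """Average negative log-likelihood of the data under N(mu, sigma^2)."""
    return np.mean(0.5 * np.log(2 * np.pi) + np.log(sigma)
                   + (data - mu) ** 2 / (2 * sigma ** 2))

def penalized_objective(data, mu, sigma, lam):
    """NLL plus a Lagrangian penalty lam * p_theta(H) on the invalid mass."""
    return gaussian_nll(data, mu, sigma) + lam * hallucination_mass(mu, sigma)

# Toy usage: "valid" data live on the positive half-line.
data = np.abs(np.random.default_rng(1).normal(2.0, 1.0, size=500))
print(penalized_objective(data, mu=2.0, sigma=1.0, lam=5.0))
```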
Review 2

Title: Reducing Hallucinations in Generative Models through Truncated Statistics
Soundness: 3: good
Presentation: 2: fair
Contribution: 3: good
Rating: 6: marginally above the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.

Summary:
This paper formulates the problem of hallucination in generative models as a constrained loss minimization task. The model is treated as a probability distribution over the data space, and the authors assume access to an invalidity oracle that determines whether a generated instance constitutes a hallucination. The learning objective is to minimize the negative log-likelihood on the training data while constraining the probability mass assigned to the invalid region to remain below a prescribed threshold. The setting is precisely the same as the valid generative modeling framework introduced by Hanneke et al. (2018). The main technical contribution is a computationally efficient algorithm for solving this constrained optimization problem. Crucially, the result holds for a class of so-called “powerful” distribution families, a condition satisfied by any general exponential family. The proof follows by reducing the constrained optimization problem to a truncated negative log-likelihood optimization problem, which admits an efficient solution.

Strengths:
1. Although I did not verify every detail of the proof due to time constraints, the technical result establishing an efficient solution to the constrained negative-log-likelihood problem over general exponential families appears both interesting and significant.
2. The paper also introduces novel conceptual and technical ideas, notably the notion of the “powerfulness” of a distribution family, which provides a useful structural condition linking expressivity to computational tractability.

Weaknesses:
1. In line 114, the authors claim that “the assumption is mild and purely about expressivity: **any** parametric family ….” I understand that the argument is to expand the support by adding an additional element $x^\*$ and extending the family via the construction of $p_m$. However, to establish *powerfulness*, shouldn’t the proof operate within this expanded family as well? This reasoning works for exponential families, but I do not see how it holds for arbitrary parametric families.
2. It seems to me that the proposed algorithmic approach is specifically tailored to exponential families. Can the method be extended to more general model classes?
3. Given that the paper’s main contribution is computational efficiency, it would be helpful to include some empirical validation, which shouldn't be too hard.
4. Is the sentence in line 200 generated by an LLM?

Questions:
Please address the points in the weaknesses section.

EditLens Prediction: Lightly AI-edited
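For readers unfamiliar with the objects this review refers to, the block below is a schematic statement of the constrained program, its Lagrangian relaxation, and the truncated negative log-likelihood, under assumed notation (valid data $x_1,\dots,x_n$, model family $\{p_\theta\}$, invalid set $H$, target rate $\alpha$); the paper's exact reduction between these objectives may differ in detail.

```latex
% Schematic statement of the objectives discussed above (notation assumed;
% the paper's exact relaxation and reduction may differ in detail).
\begin{aligned}
&\text{Constrained program:}\quad
\min_{\theta}\; -\frac{1}{n}\sum_{i=1}^{n} \log p_\theta(x_i)
\quad \text{s.t.}\quad p_\theta(H) \le \alpha,\\
&\text{Lagrangian relaxation:}\quad
L(\theta,\lambda) \;=\; -\frac{1}{n}\sum_{i=1}^{n} \log p_\theta(x_i)
\;+\; \lambda\bigl(p_\theta(H)-\alpha\bigr),\qquad \lambda \ge 0,\\
&\text{Truncated NLL on the valid set } S=\mathcal{X}\setminus H:\quad
\mathrm{NLL}_S(\theta) \;=\; -\frac{1}{n}\sum_{i=1}^{n}
\log\frac{p_\theta(x_i)}{p_\theta(S)}.
\end{aligned}
```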
Review 3

Title: Reducing Hallucinations in Generative Models through Truncated Statistics
Soundness: 3: good
Presentation: 3: good
Contribution: 3: good
Rating: 6: marginally above the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.

Summary:
This paper formulates hallucination reduction as likelihood maximization under an explicit upper bound on the probability of invalid generations. It connects this constrained objective to truncated statistics and proposes a projected SGD–based procedure that actively queries invalid outputs, operating over expressive exponential families. The authors prove computational and query efficiency and provide conditions (“powerful” models) under which proper learning achieves a target hallucination rate.

Strengths:
1) The work gives a statistically, query-, and computationally efficient proper learner that drives the hallucination rate to any target $\alpha$, addressing the efficiency question posed by Hanneke et al. (2018).
2) A key theoretical contribution is the novel bridge established between the generative model hallucination problem and the field of truncated statistics. By reformulating the constrained optimization via a Lagrangian relaxation, the objective function can be mapped to a truncated negative log-likelihood, making the problem tractable and solvable with convex optimization techniques for certain model families.
3) The authors develop a concrete algorithm that is proven to work for "powerful" exponential families, a class of models shown to be sufficiently expressive for this task. The algorithm is efficient, requiring only a polynomial number of invalidity queries and polynomial computation time to find a solution that meets the desired hallucination rate while remaining near-optimal in data likelihood.

Weaknesses:
The paper is entirely theoretical and does not include any experiments to validate its claims. While the theoretical contributions are strong, the work would be significantly strengthened by empirical results, even on a synthetic toy model. Such experiments could demonstrate the algorithm's practical behavior, verify that it achieves the target hallucination rate, and provide insight into its performance in a controlled setting.

Questions:
1) The paper's theoretical results are derived for exponential families, which creates a gap between this setting and the large-scale models like LLMs that motivate the work. Could the authors add a discussion of the primary challenges or potential paths for extending this framework to more general architectures, such as Transformers?
2) Could the authors report empirical results on some widely used hallucination benchmark datasets to validate the proposed method?

EditLens Prediction: Lightly AI-edited
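Since Reviews 2 and 3 both ask for empirical validation, the sketch below shows the kind of synthetic toy experiment they have in mind. It is a hedged illustration, not the authors' algorithm: it assumes a 1-D Gaussian model with invalid region H = (-inf, 0), uses a score-function estimate of the invalid mass from oracle queries, and substitutes a crude parameter clipping step for the paper's projection onto sublevel sets.

```python
# Minimal synthetic sketch of the experiment the reviewers request (assumptions:
# 1-D Gaussian model, invalid region H = (-inf, 0), penalized SGD with a trivial
# clipping projection -- NOT the paper's sublevel-set projection).
import numpy as np

rng = np.random.default_rng(0)
data = np.abs(rng.normal(2.0, 1.0, size=2000))   # "valid" training data
oracle = lambda x: x < 0.0                        # hypothetical invalidity oracle

mu, log_sigma = 0.0, 0.0                          # unconstrained parameterization
lam, lr, alpha = 10.0, 0.05, 0.01

for step in range(2000):
    batch = rng.choice(data, size=64)
    sigma = np.exp(log_sigma)
    # Gradients of the Gaussian NLL w.r.t. (mu, log_sigma).
    z = (batch - mu) / sigma
    g_mu = -(z / sigma).mean()
    g_logsig = (1.0 - z ** 2).mean()
    # Score-function (REINFORCE-style) gradient of the oracle-estimated invalid mass.
    samples = rng.normal(mu, sigma, size=256)
    zs = (samples - mu) / sigma
    flags = oracle(samples).astype(float)
    g_mu += lam * (flags * (zs / sigma)).mean()
    g_logsig += lam * (flags * (zs ** 2 - 1.0)).mean()
    # SGD step followed by a crude projection (clip log_sigma to a bounded range).
    mu -= lr * g_mu
    log_sigma = np.clip(log_sigma - lr * g_logsig, -5.0, 5.0)

est_rate = oracle(rng.normal(mu, np.exp(log_sigma), size=100_000)).mean()
print(f"mu={mu:.2f}, sigma={np.exp(log_sigma):.2f}, "
      f"estimated hallucination rate={est_rate:.4f} (target alpha={alpha})")
```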