ICLR 2026 - Reviews

Reviews

Summary Statistics

EditLens Prediction	Count	Avg Rating	Avg Confidence	Avg Length (chars)
Fully AI-generated	0 (0%)	N/A	N/A	N/A
Heavily AI-edited	1 (33%)	2.00	3.00	2650
Moderately AI-edited	0 (0%)	N/A	N/A	N/A
Lightly AI-edited	0 (0%)	N/A	N/A	N/A
Fully human-written	2 (67%)	2.00	4.50	1132
Total	3 (100%)	2.00	4.00	1638

Title	Ratings	Review Text	EditLens Prediction
Endogenous Communication in Repeated Games with Learning Agents	Soundness: 2: fair Presentation: 1: poor Contribution: 2: fair Rating: 2: reject Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.	The paper studies pre-play communication in infinitely repeated games. Each agent observes a private signal and sends a discrete message via an encoder constrained by a mutual information budget. Policies are learned by mirror descent; encoders maximize expected continuation value minus λ × MI. The authors define a “stable communicating equilibrium” where policies are best responses, encoders are budget-optimal, and learning converges. They show that: (1) if budgets exceed a problem-specific threshold κ*, value-sufficient messages enable efficient payoffs; (2) below the threshold, any equilibrium pools signals into a finite partition bounded by exp(κ), implying a welfare gap; and (3) standard no-regret dynamics are sufficient to reach a near-stable point with O(1/ε^2) data. 1. The paper poses a clear, meaningful problem and introduces a formulation that links repeated-game incentives with information-constrained pre-play communication, with a notion of stable communicating equilibrium. 2. The thresholding results given by Theorems 1–2 are clean and interesting: when the information budget exceeds a problem-specific threshold, value-sufficient messaging can implement efficient outcomes; when it does not, any equilibrium must pool signals, leading to an unavoidable welfare loss. 1. The writing is often unclear. Key terms such as the formal definition of V, the notion of Lipschitz continuity, and the exact meaning and role of the learning rate η are never properly defined. It’s also confusing to bundle assumptions about the game itself and the learning algorithm into one block. The reference to the “standard folk theorem” should be made explicit rather than assumed. 2. The proofs are mostly brief sketches and difficult to follow. The theorems are not stated in a fully formal way, and several terms used in them are never clearly introduced. 3. The related-work discussion is thin. It mentions prior directions in broad terms but does not cite or compare against specific, closely related papers. Overall, the paper is very hard to follow, especially for readers who are not already experts in all relevant literatures. Clearer structure and more careful exposition would make it far more readable. 1. Could the authors clearly define all notation and formally state each theorem, giving precise definitions for verbal notions and complete proofs instead of sketches? The paper is quite hard to follow, and clearer formalization would make it easier to evaluate. 2. In Theorem 3, is the learning-rate choice η_t ∝ t^{-1/2} not inconsistent with Assumption 1’s requirement that Σ_t η_t^2 < ∞?	Heavily AI-edited
Endogenous Communication in Repeated Games with Learning Agents	Soundness: 1: poor Presentation: 1: poor Contribution: 1: poor Rating: 2: reject Confidence: 5: You are absolutely certain about your assessment. You are very familiar with the related work and checked the math/other details carefully.	The authors study a model in which no-regret learning agents are augmented with the ability to send costless messages to each other. I think the intersection of agent communication and learning in games can produce interesting settings and research directions. The paper is very poorly written. There are 10 references, some of which are only tangentially related, and none of which are even mentioned in the main body, unless I have missed something. The proofs are too informal and vague, and have several non-sequiturs. The setup is not specific enough. This is not a length issue either; the paper is only six pages long including appendix, and the extra length could easily have been used to provide much more relevant detail. These writing issues alone are enough to recommend rejection. I implore the authors to add more detail. The setting certainly looks interesting enough that there could be some interesting results and analysis in this paper, but the writing issues meant that I gave up on attempting to parse the paper before being able to come to a complete understanding of what the claims and techniques are. None.	Fully human-written
Endogenous Communication in Repeated Games with Learning Agents	Soundness: 2: fair Presentation: 1: poor Contribution: 2: fair Rating: 2: reject Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.	This paper analyzes endogenous communication among learning agents in infinitely repeated stage games with a costless pre-play channel. Each agent compresses its private signal via an encoder subject to an information budget, then plays the stage game; policies are updated by no-regret learning, while encoders optimize a myopic value-minus-information objective. 1. The paper cleanly ties cheap talk and information bottlenecks: it formalizes value-sufficiency, defines a budget threshold, proves existence of efficient communication above the threshold, and a necessary pooling structure with an explicit welfare-gap lower bound below it. These results offer actionable predictions about when emergent messages become informative vs. collapse. 2. The stability notion is coupled to no-regret policy updates and information-penalized encoder updates, with a convergence guarantee of samples under standard step sizes and ergodicity. The provided alternating scheme makes the framework concrete. 1. The paper is incomplete and lacks a great amount of detail. The proofs are only sketches. 2. Many assumptions are strong and unjustified. NA	Fully human-written
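
The step-size question raised in the first review (η_t ∝ t^{-1/2} versus Σ_t η_t^2 < ∞) can be checked numerically. The following is a minimal sketch, not taken from the submission: it assumes η_t = t^{-p} exactly with proportionality constant 1, and the function name `partial_sum_of_squares` is hypothetical. With p = 1/2 the squared steps form the harmonic series, whose partial sums keep growing roughly like ln(n); with p = 1 the partial sums settle near π²/6.

```python
import math


def partial_sum_of_squares(n: int, exponent: float) -> float:
    """Sum of eta_t**2 for t = 1..n, assuming eta_t = t**(-exponent).

    Hypothetical helper for illustration; the constant factor in
    eta_t is assumed to be 1.
    """
    return sum(t ** (-2.0 * exponent) for t in range(1, n + 1))


if __name__ == "__main__":
    # eta_t = t^{-1/2}: squared steps are 1/t (harmonic series),
    # so the partial sums grow without bound, roughly like ln(n).
    for n in (10**2, 10**4, 10**6):
        print(n, partial_sum_of_squares(n, 0.5), math.log(n))

    # eta_t = t^{-1}: squared steps are 1/t^2, which sum to pi^2 / 6.
    print(partial_sum_of_squares(10**6, 1.0), math.pi**2 / 6)
```

Under these assumptions the t^{-1/2} schedule is not square-summable, which is consistent with the reviewer's concern; whether the paper's Assumption 1 is meant to apply to Theorem 3 is something only the authors can confirm.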
