|
DERMARK: A Dynamic, Efficient and Robust Multi-bit Watermark for Large Language Models |
Soundness: 3: good
Presentation: 3: good
Contribution: 3: good
Rating: 6: marginally above the acceptance threshold
Confidence: 2: You are willing to defend your assessment, but it is quite likely that you did not understand the central parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked. |
The paper proposes DERMARK, a dynamic multi-bit watermarking scheme for autoregressive LLMs that (1) models per-segment embedding success via a CLT/Poisson-binomial approximation, deriving an inequality to decide when a generated token segment has enough capacity to encode one watermark bit, (2) performs online variable-length segmentation during generation to place each watermark bit into just-large-enough segments, and (3) recovers bits with a dynamic-programming extractor that minimizes segmentation + color losses to improve robustness to edits. Empirically the method is evaluated on OPT-1.3b and LLaMA-2-7b and shown to reduce tokens-per-bit, lower embedding-time overhead, and increase robustness vs a Balance-Marking baseline.
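To make the capacity test of points (1)–(2) concrete, here is a minimal sketch of what such a CLT-based stopping rule could look like. The tracked statistic `p_green` (the probability mass assigned to the bit's preferred color list at each token) and the majority-threshold form of the inequality are illustrative assumptions, not the paper's actual implementation.

```python
from statistics import NormalDist

def segment_capacity_reached(p_green, eps=0.05):
    """Hypothetical online stopping rule. p_green[i] is the probability
    mass the (biased) model assigned to the current bit's preferred color
    list at token i of the open segment. The preferred-color token count
    is Poisson-binomial: mean sum(p_i), variance sum(p_i * (1 - p_i)).
    We approximate it as normal and close the segment once a one-sided
    (1 - eps) lower confidence bound clears the majority threshold n/2."""
    n = len(p_green)
    if n == 0:
        return False
    mean = sum(p_green)
    var = sum(p * (1 - p) for p in p_green)
    z = NormalDist().inv_cdf(1 - eps)  # one-sided normal quantile
    return mean - z * max(var, 1e-12) ** 0.5 > n / 2
```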
1. Principled theoretical framing (Poisson-binomial → CLT → inequality) that connects token-level probabilities to required segment length.
2. Practical algorithm: online segmentation during inference with negligible extra compute compared to baseline multi-bit methods. Reported embedding overhead is near zero and extraction is efficient enough for practice.
3. Strong empirical gains on tokens-per-bit and robustness to small insertion/deletion attacks across two model families. Table 1 and the figures show consistent improvements.
1. The CLT approximation may be unreliable when segments are short (the very regime the method targets), and the paper lacks finite-sample error bounds or bootstrap-style corrections.
2. The method relies on many heuristics (λ smoothing, β weighting, iterative ϵ updates). The paper reports defaults, but more ablations on hyperparameter sensitivity and cross-domain robustness (beyond news-like prompts) would be helpful.
3. The authors justify using Balance-Marking as the SOTA baseline and critique MPAC; still, including more recent multi-bit baselines (or carefully reproducing MPAC under comparable settings) would make the empirical claims stronger. The authors discuss this choice, but reviewers may still view it as a gap.
4. The method improves edit robustness but remains vulnerable to large rewrites and reorderings, a limitation inherent to dispersed multi-bit strategies. The paper states this limitation but does not quantify the breakpoint at which robustness collapses.
1. Can you provide a small-N correction or empirical calibration strategy that quantifies the CLT approximation error for segments of, say, 5–20 tokens? A simple calibration table would help; a minimal simulation of the kind that could produce one is sketched after these questions.
2. How sensitive are the final detection rates to β and λ across domains (e.g., code, dialogue, scientific text)? Please provide an ablation sweep or an appendix table.
3. The DP extractor is O(N²); what are the practical limits on N for real documents? Is there a streaming or beam-style approximate extractor that keeps near-optimal segmentation at lower cost?
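On question 1, a calibration table could be produced by simulating the Poisson-binomial count directly and comparing its tail to the normal approximation; in the sketch below, the range from which per-token probabilities are drawn is an arbitrary assumption.

```python
import random
from statistics import NormalDist

def clt_gap(n, trials=100_000, seed=0):
    """Gap between the simulated Poisson-binomial tail P(count > n/2) and
    its normal approximation, for one random draw of heterogeneous
    per-token probabilities (the [0.4, 0.9] range is an assumption)."""
    rng = random.Random(seed)
    p = [rng.uniform(0.4, 0.9) for _ in range(n)]
    mean = sum(p)
    sd = sum(q * (1 - q) for q in p) ** 0.5
    hits = sum(sum(rng.random() < q for q in p) > n / 2
               for _ in range(trials))
    exact = hits / trials
    approx = 1 - NormalDist(mean, sd).cdf(n / 2)
    return exact, approx, abs(exact - approx)

for n in (5, 10, 15, 20):  # the short-segment regime in question
    print(n, clt_gap(n))
```

Tabulating the last column over segment lengths would quantify where the normal approximation becomes unreliable. |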
Fully AI-generated |
|
DERMARK: A Dynamic, Efficient and Robust Multi-bit Watermark for Large Language Models |
Soundness: 2: fair
Presentation: 3: good
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. |
The paper introduces a multi-bit text watermarking method for LLMs that dynamically determines segment lengths for embedding each watermark bit based on a probabilistic criterion derived from the model’s logits. It (1) models watermark embedding as approximately normally distributed, leading to an inequality that estimates whether a segment has enough capacity to reliably encode one bit; (2) uses this condition online during generation to adaptively end a segment and move on to the next bit; and (3) proposes a dynamic-programming extractor that combines a segmentation loss (how tightly the inequality is satisfied) with a “color” imbalance loss to improve robustness to edits. Experiments on OPT-1.3B and LLaMA-2-7B claim fewer tokens per bit and lower time overhead than Balance-Marking, with improved robustness to token insertions/deletions.
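As a reading aid for point (3), the extractor can be pictured as a standard interval DP over segment boundaries. The sketch below is a generic form under that assumption, with `seg_loss` and `color_loss` as stand-ins for the paper's two losses.

```python
def dp_extract(n_tokens, n_bits, seg_loss, color_loss):
    """Generic segmentation DP (assumes n_tokens >= n_bits).
    best[b][j] = minimum total loss of decoding b bits from the first j
    tokens; seg_loss(i, j) and color_loss(i, j) score the token span
    [i, j) and are placeholders for the paper's segmentation and
    color-imbalance losses."""
    INF = float("inf")
    best = [[INF] * (n_tokens + 1) for _ in range(n_bits + 1)]
    back = [[0] * (n_tokens + 1) for _ in range(n_bits + 1)]
    best[0][0] = 0.0
    for b in range(1, n_bits + 1):
        for j in range(b, n_tokens + 1):
            for i in range(b - 1, j):  # candidate previous boundary
                c = best[b - 1][i] + seg_loss(i, j) + color_loss(i, j)
                if c < best[b][j]:
                    best[b][j], back[b][j] = c, i
    bounds, j = [], n_tokens  # backtrack to recover the segmentation
    for b in range(n_bits, 0, -1):
        i = back[b][j]
        bounds.append((i, j))
        j = i
    return best[n_bits][n_tokens], bounds[::-1]
```

Each recovered span then decodes to the bit favored by its color statistics; the triple loop is where the quadratic-in-length cost comes from.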
1. The paper explains why multi-bit watermarking (beyond one-bit detection) is needed for fine-grained attribution (LLM/user) and why fixed-length segmentation can fail, especially on low-entropy text.
2. Derives an inequality from a CLT-style analysis that treats the aligned-token proportion as approximately normal; this enables an online, per-bit stopping rule during generation.
3. For matched detection rates, DERMARK uses fewer tokens per embedded bit, with further gains on low-entropy subsets.
1. The experimental comparison is outdated. Many multi-bit watermarking methods have appeared recently, yet the paper uses only a single 2023 work as a baseline. Comparing against a broader set of recent baselines would strengthen the claims.
2. Most application scenarios for LLM watermarking involve chat, so evaluating performance on a long-form QA dataset and on instruction-tuned models (e.g., Llama-3.1-8B-Instruct) is necessary.
3. The robustness evaluation covers only random insertion/deletion at 5–10% edit rates, which is limited compared with existing work, e.g., [1] (a sketch of this attack class appears after the reference below).
[1] http://arxiv.org/abs/2401.16820
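For context on weakness 3, the evaluated attack class is roughly of the following token-level form; the per-position edit model and the attacker vocabulary are assumptions for illustration.

```python
import random

def random_edit_attack(tokens, rate=0.05, vocab=(), seed=0):
    """Random insertion/deletion at an approximate edit rate: each
    position is deleted with probability rate/2, and a random token from
    an assumed attacker vocabulary is inserted with probability rate/2.
    Paraphrasing and reordering attacks are not captured by this model."""
    rng = random.Random(seed)
    out = []
    for t in tokens:
        if rng.random() < rate / 2:            # deletion
            continue
        out.append(t)
        if vocab and rng.random() < rate / 2:  # insertion
            out.append(rng.choice(vocab))
    return out
```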
Please see above. |
Lightly AI-edited |
|
DERMARK: A Dynamic, Efficient and Robust Multi-bit Watermark for Large Language Models |
Soundness: 2: fair
Presentation: 3: good
Contribution: 3: good
Rating: 6: marginally above the acceptance threshold
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. |
This paper proposes a new watermarking framework for LLMs that dynamically adjusts watermark embedding based on text capacity and token statistics.
1. DERMARK adaptively determines segment boundaries in real time based on token-level statistics, achieving 2–4 fewer tokens per bit at the same detection rate. This dynamic rule substantially enhances embedding efficiency without retraining.
2. The embedding complexity is linear, $O(N)$, extraction is $O(kL^2)$, and the method is tested on a large model (LLaMA-2-70B). The method is fully plug-and-play, requiring no fine-tuning or architectural modification.
3. The inclusion of perplexity (PPL) experiments confirms that semantic quality is largely preserved across different watermark strengths ($\delta$), addressing the concerns about possible generation degradation.
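On point 3, PPL is conventionally measured with an oracle LM over the watermarked text. A minimal sketch, assuming a Hugging Face causal LM as the oracle (the model choice is illustrative, not necessarily the paper's):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

@torch.no_grad()
def perplexity(text, oracle="meta-llama/Llama-2-7b-hf"):
    """Perplexity of `text` under an oracle causal LM."""
    tok = AutoTokenizer.from_pretrained(oracle)
    model = AutoModelForCausalLM.from_pretrained(oracle).eval()
    ids = tok(text, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss  # mean next-token NLL
    return torch.exp(loss).item()
```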
1. While Appendix C discusses MPAC (NAACL 2024) conceptually, the paper still provides no quantitative comparison with recent multi-bit watermarking approaches. Furthermore, I think the method can be extended to multi-bit watermarking methods such as MPAC. A discussion of this point would greatly strengthen the paper.
2. The robustness tests remain restricted to random insertion/deletion attacks. No experiments address paraphrasing, shuffling, gradient-based, or LLM-assisted removal attacks, which are crucial for assessing real-world resilience.
3. The central-limit-theorem assumption in Lemma 2 is untested for short segments, leaving the statistical soundness of the normal approximation uncertain.
4. Detection assumes perfect access to the watermark key and exact segmentation alignment; the paper does not discuss desynchronization or partial-key scenarios.
5. Although equations for bias parameters and color loss are formalized, their conceptual motivation and iterative update dynamics remain only briefly explained.
See above. |
Heavily AI-edited |
|
DERMARK: A Dynamic, Efficient and Robust Multi-bit Watermark for Large Language Models |
Soundness: 2: fair
Presentation: 3: good
Contribution: 2: fair
Rating: 2: reject
Confidence: 5: You are absolutely certain about your assessment. You are very familiar with the related work and checked the math/other details carefully. |
This paper introduces DERMARK, a dynamic multi-bit watermarking framework for large language models (LLMs). The method adaptively determines text segment lengths during generation based on an inequality derived from a normal distribution assumption, aiming to balance watermark capacity, efficiency, and robustness.
* The paper is well-motivated, addressing the limitations of fixed-length segmentation in prior multi-bit watermarking methods.
* The theoretical formulation connecting watermark embedding to a normal approximation is novel and mathematically rigorous.
* **Typos and citation issues**
Yoo et al. (2024a) and Yoo et al. (2024b) refer to the same paper and should be merged.
Line 265: “Eq. equation 4” should be corrected to “Eq. (4)” for consistency.
* **Misrepresentation of prior work (L115)**
The description of Yoo et al. (2024b) is inaccurate. Their method does not assign bits to segments manually; instead, the bit–token mapping is determined via a hash function, as shown in BiMark [1], Robust Multi-bit Watermarking [2], and StealthInk [3]. The paper should revise this discussion to reflect the actual mechanism.
* **Limited experimental comparison**
The comparison in Section 5 includes only Balance-Marking. More recent and representative baselines such as MPAC (Yoo et al., 2024a), BiMark [1], and StealthInk [3] should be incorporated to strengthen the empirical claims. Without these, it is difficult to assess the relative advantage of DERMARK in the evolving landscape of multi-bit watermarking.
[1] Feng, X., Zhang, H., Zhang, Y., Zhang, L. Y., & Pan, S. (2025). BiMark: Unbiased Multilayer Watermarking for Large Language Models. arXiv:2506.21602.
[2] Qu, W., Zheng, W., Tao, T., Yin, D., Jiang, Y., Tian, Z., ... & Zhang, J. (2025). Provably Robust Multi-bit Watermarking for AI-generated Text. USENIX Security 2025.
[3] Jiang, Y., Wu, C., Boroujeny, M. K., Mark, B., & Zeng, K. (2025). StealthInk: A Multi-bit and Stealthy Watermark for Large Language Models. arXiv:2506.05502.
* **Incomplete treatment of text-length limitations**
Section 4.2 discusses handling overly long text but overlooks the case when the generated text is too short to encode the full bit string. Appendix F.1 briefly mentions this issue, but the main paper should explicitly explain how DERMARK behaves or fails in this scenario, and whether it adapts δ or truncates the watermark.
* **Segmentation vulnerability and missing evaluation metrics**
The method’s reliance on per-bit segmentation raises robustness concerns when the text is truncated or edited. As each bit is tied to a segment, any truncation makes recovery of the bits in the affected segments impossible. Moreover, although DERMARK is presented as a multi-bit watermark, the bit match rate (BMR), a standard evaluation metric for multi-bit detection, is not reported. Including this metric would provide a fairer comparison; one standard way to compute it is sketched below.
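For reference, BMR is the fraction of message bits recovered correctly; a minimal sketch under one common convention (bits missing from the extraction, e.g., lost to truncation, count as errors):

```python
def bit_match_rate(true_bits, extracted_bits):
    """Fraction of message bits recovered correctly; positions absent
    from the extraction count as errors under this convention."""
    matches = sum(t == e for t, e in zip(true_bits, extracted_bits))
    return matches / len(true_bits)
```

Reporting BMR alongside the detection rate would make the multi-bit comparison complete. |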
Moderately AI-edited |