ICLR 2026 - Reviews



Summary Statistics

| EditLens Prediction | Count | Avg Rating | Avg Confidence | Avg Length (chars) |
|---|---|---|---|---|
| Fully AI-generated | 1 (33%) | 2.00 | 5.00 | 7360 |
| Heavily AI-edited | 0 (0%) | N/A | N/A | N/A |
| Moderately AI-edited | 0 (0%) | N/A | N/A | N/A |
| Lightly AI-edited | 1 (33%) | 2.00 | 4.00 | 2897 |
| Fully human-written | 1 (33%) | 0.00 | 4.00 | 1585 |
| Total | 3 (100%) | 1.33 | 4.33 | 3947 |
Reviews
Abnaolizer: An AI Agent for Converting Antibodies to Nanobodies

Soundness: 2: fair | Presentation: 2: fair | Contribution: 2: fair | Rating: 2: reject | Confidence: 5: You are absolutely certain about your assessment. You are very familiar with the related work and checked the math/other details carefully.

Summary:
This paper addresses the lack of large-scale paired data and the complexity of molecular recognition in antibody-to-nanobody conversion by proposing AbNanolizer, a physically guided, weakly supervised AI framework. It uses two-stage training (contrastive pre-training for antigen-targeted embedding alignment; multi-task fine-tuning with binding energy as weak supervision), equipped with dual encoders (for antibodies/nanobodies) and a multi-objective scoring head. The contributions of this paper are: 1) it defines "cross-format functional retrieval"; 2) it proposes a data-efficient weakly supervised framework.

Strengths:
1. This paper aims to rapidly convert validated antibodies into "smaller, more stable, and lower-cost nanobodies", avoiding the long cycle and high cost of traditional processes, which holds practical significance for clinical treatment and drug iteration.
2. The established standardized test set and evaluation protocol can serve as a benchmark for the related field.

Weaknesses:
1. Dual limitations in dataset type and scale:
- Only the CoV-AbDab dataset is used, which suffers from a single antigen type (only coronavirus-related targets, excluding cancer/autoimmune-disease targets such as HER2, TNF-α, and PD-1) and a single antibody background (no antibodies from different species such as human, mouse, and camel, and no samples covering different mutation types/experimental conditions), making it impossible to verify the model's generalizability.
- No data deduplication strategy is specified (e.g., whether deduplication is based on sequence similarity or antibody ID), and duplicate sequences may cause training redundancy or overfitting.
2. Insufficient result reliability due to a single structure prediction method:
- Binding energy calculation relies entirely on PyRosetta, with no comparison to newer tools such as AlphaFold3 (full-length protein prediction), Boltz-1 (high accuracy for CDR regions), Protenix (flexible structure prediction), and Chai-1 (multi-template fusion), making it impossible to evaluate the impact of method differences on binding energy (errors may reach 20%-30%).
- No experimental structures (e.g., X-ray crystallography, cryo-EM) are supplied as gold standards to verify the accuracy of the calculated binding energies; PyRosetta has known limitations in predicting flexible CDR3 conformations and antigen-binding-induced conformational changes, which may introduce noise into the supervision signal.
3. Unreasonable antibody-nanobody alignment logic:
- Alignment relies only on "targeting the same antigen", without considering differences in antibody affinity (high-affinity antibodies with KD < 1 nM and low-affinity antibodies with KD > 100 nM have different requirements, and forced alignment introduces noise).
- No alignment method is specified for multiple antibodies targeting the same antigen (e.g., random pairing, pairing by affinity ranking); ambiguous rules lead to poor consistency in the training data.
4. Lack of reproducibility details:
- No code repository or complete preprocessing scripts (e.g., antigen sequence retrieval, structure prediction parameter configuration, deduplication and alignment code) are made public.
- No version information for structure prediction tools (e.g., PyRosetta 4.5/5.0) or hardware resources (CPU cores, memory) is provided, affecting reproducibility.
5. Lack of practical application details:
- No model inference speed (e.g., time to retrieve from 100,000- or 1,000,000-scale nanobody libraries) or deployment cost is reported, failing to meet the high-throughput screening requirements of drug R&D.
- No discussion of the model's applicability to antibody-mutant conversion scenarios (e.g., clinical affinity optimization needs).

Questions:
1. Regarding dataset diversity:
- Do you plan to add multi-source datasets covering non-coronavirus targets? If so, which targets, and what is the expected scale of the added data (number of sequences, number of antigen types)?
- Have you tested the model's performance in cross-dataset transfer scenarios? If so, how do the Top-10 retrieval accuracy and binding energy improvement of each encoder differ from those on the original dataset?
2. Regarding comparison of structure prediction methods:
- Have you tested tools such as AlphaFold3, Boltz-1, Protenix, and Chai-1? What are the differences in the average binding energy of encoders such as ESMC across tools? Is there a pattern where certain tools are more suitable for this task (e.g., does Boltz-1's high CDR-region prediction accuracy improve binding energy reliability)?
- Have you selected experimental structure pairs to verify the MAE between each tool's binding energy and experimental values? What is the MAE of each tool, and why was PyRosetta ultimately chosen?
- Have you considered multi-tool signal fusion? How are the weights determined (e.g., inverse weighting by MAE), and does fusion improve model performance?
3. Regarding data deduplication:
- What deduplication strategy is used (sequence similarity, antibody ID, antigen type)? How is the deduplication threshold (e.g., sequence identity percentage) determined (with reference to industry standards)?
- What are the dataset scales (number of sequences, antigens, and antibody types) before and after deduplication? Have you verified the impact of deduplication on training stability (e.g., loss convergence speed, generalization error)?
4. Regarding antibody-nanobody alignment logic:
- Have you collected antibody affinity data (KD values, EC50)? If so, why was affinity not incorporated into the alignment strategy (e.g., prioritizing pairing high-affinity antibodies with high-binding-energy nanobodies)? If not, do you plan to add such data to optimize the alignment logic?
- When multiple antibodies target the same antigen (e.g., 50 antibody sequences associated with a SARS-CoV antigen in CoV-AbDab), what alignment method is used (e.g., random pairing, pairing by publication time, one-to-many)? Have you tested the impact of different alignment methods on Top-10 accuracy and binding energy improvement?
5. Regarding reproducibility:
- Do you plan to release the complete code (encoder training, multi-tool binding energy calculation, deduplication and alignment scripts) and preprocessing pipeline? If so, when, and will you provide installation and parameter-configuration guidelines for tools such as PyRosetta and AlphaFold3?
- Which versions of tools such as PyRosetta were used? Can different versions cause differences in binding energy, and have you verified the impact of version consistency on the results?
6. Regarding inference efficiency:
- How long does the model take to retrieve from 100,000- or 1,000,000-scale nanobody libraries? Have you considered accelerating retrieval with approximate nearest neighbor methods such as FAISS? Is the accuracy drop after acceleration within an acceptable range (e.g., ≤3%)?
7. Regarding multi-indicator utilization:
- Do the indicators predicted by the scoring head (e.g., interface energy, number of hydrogen bonds) assist in ranking? Have you tested multi-indicator weighted fusion ranking (e.g., weight 0.7 for binding energy, 0.3 for number of hydrogen bonds)? Does fusion improve Top-10 retrieval accuracy?
EditLens Prediction: Fully AI-generated
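The review's question 7 asks whether a weighted fusion of the scoring head's indicators (e.g., weight 0.7 for binding energy, 0.3 for hydrogen-bond count) improves ranking. A minimal sketch of what such a fusion could look like, using z-score normalization to put the two indicators on a common scale; the function name and the numeric values are illustrative, not from the paper:

```python
import numpy as np

def fused_ranking(binding_energy, n_hbonds, w_energy=0.7, w_hbonds=0.3):
    """Rank candidates by a weighted fusion of two predicted indicators.

    Lower binding energy is better and more hydrogen bonds are better,
    so the energy is negated before z-score normalization puts both
    indicators on a common scale. Returns candidate indices, best first.
    """
    e = np.asarray(binding_energy, dtype=float)
    h = np.asarray(n_hbonds, dtype=float)

    def z(x):
        return (x - x.mean()) / (x.std() + 1e-8)

    score = w_energy * z(-e) + w_hbonds * z(h)
    return np.argsort(-score)  # descending fused score

# Illustrative values only (not from the paper):
energies = [-42.1, -35.0, -50.3, -38.7]   # REU, lower = stronger binding
hbonds   = [6, 9, 4, 7]
order = fused_ranking(energies, hbonds)
print(order)
```

Candidate 2 ranks first here because its strong energy term outweighs its low hydrogen-bond count under the 0.7/0.3 weighting; sweeping the weights against Top-10 retrieval accuracy would answer the reviewer's question directly.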
Abnaolizer: An AI Agent for Converting Antibodies to Nanobodies

Soundness: 1: poor | Presentation: 2: fair | Contribution: 1: poor | Rating: 0: | Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.

Summary:
This paper introduces Abnaolizer, a framework aiming at converting conventional antibodies to nanobodies (single-chain antibodies).

Strengths:
- The paper presents an original idea and an ML framework.
- The presented ML framework outperforms the simpler baselines on the selected metrics.

Weaknesses:
- While the proposed task is original, I have serious doubts about its relevance and applicability. In practice, a scenario where it would be interesting to generate a nanobody while already having a developed antibody is extremely rare.
- The proxy for binding used by the authors is based on force-field-estimated energy. This is misleading given the claims the authors make about functional improvements: this aspect was not tested, and there is no convincing evidence in the paper pointing towards it.
- Continuing on the point above, force-field methods are rather weak estimators of binding propensities, even among similar variants of the same protein. Comparing energy scores between largely different proteins (also of different molecular weight, interaction area, etc.) is even riskier, and I would assume the performance is close to random.
- The authors train their method on the CoV-AbDab database, which contains only COVID-19-related immunoglobulins. Due to this, the general claims about converting an arbitrary antibody to a nanobody are somewhat exaggerated and not supported by evidence.
- For the qualitative example / case study: I miss information on how the retrieved nanobody compares to the original query antibody. Are there any sequence similarities or a similar predicted binding mode?

EditLens Prediction: Fully human-written
Abnaolizer: An AI Agent for Converting Antibodies to Nanobodies

Soundness: 1: poor | Presentation: 2: fair | Contribution: 2: fair | Rating: 2: reject | Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.

Summary:
AbNanolizer addresses the practical problem of converting validated conventional antibodies (dual-chain IgG) into functionally equivalent or superior nanobody candidates. The authors formalize this as antigen-conditioned cross-format functional retrieval and multi-objective ranking. The framework uses a dual-encoder backbone (separate encoders for antibody and nanobody sequences) to embed inputs into a shared representation space, followed by a pairwise scoring head. Training proceeds in two phases: contrastive pre-training (InfoNCE) to align antibody–nanobody representations, and multi-task fine-tuning using in silico generated physicochemical labels derived from an AlphaFold2-Multimer → PyRosetta docking/relaxation pipeline. The fine-tuning objective combines contrastive regularization, pairwise ranking (RankNet), and Huber regression to improve both ranking and calibration. Empirical evaluation is conducted on 10 anti–SARS‑CoV‑2 antibodies.

Strengths:
The paper is clearly written.

Weaknesses:
1. Limited scope and uncertain generalization. All experiments are confined to anti–SARS‑CoV‑2 antibodies. Important antigen classes such as influenza, HIV, bacterial toxins, and non-viral targets are absent, leaving the framework's ability to generalize across antigens, epitopes, and sequence/structure regimes untested.
2. Ambiguity in "antigen-conditioned" retrieval. While antigen labels appear to organize training pairs, the antigen does not seem to be provided explicitly as an input at inference. Without an explicit antigen-conditioned mechanism in the scoring function, the claim of antigen conditioning risks being misleading. The paper should clarify whether and how antigen information is encoded and used at retrieval time.
3. The evaluation centers on binding energy proxies computed from predicted complexes. Given the dependence on modeled structures, standard confidence metrics (e.g., AF2-Multimer/AF3 pLDDT, pTM, and ipTM) should be reported.
4. Writing and presentation issues. The manuscript needs careful proofreading and polishing. For instance:
a. Line 256: "We assess our proposed AbNanolizer framework on a series of challenging tasks, including: 1. Functional nanobody retrieval (3.1); 2. Ablation studies of the training strategy (3.2); and 3. A case study analysis (3.3)." Ablations and case studies are methodology analyses, not "challenging tasks".
b. Line 123: missing space; "ihas" should be "i has."
c. Line 124: "j" should be typeset in italics to match the mathematical notation.

Questions:
1. What is the computational cost of the full pipeline (contrastive pre-training, fine-tuning, AF2-Multimer + PyRosetta labeling), and how does it scale to larger libraries?
2. Have you performed any analysis to interpret what the model has learned, such as attention map visualization or embedding-space analysis, to reveal whether the model is focusing on specific biophysical properties?

EditLens Prediction: Lightly AI-edited
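The review's summary describes AbNanolizer's first training phase as aligning antibody and nanobody embeddings with an InfoNCE contrastive objective. A minimal numpy sketch of a symmetric in-batch InfoNCE loss, assuming row i of the two embedding matrices forms a positive pair and all other in-batch combinations act as negatives; the random embeddings below are stand-ins, not outputs of the paper's encoders:

```python
import numpy as np

def log_softmax(x):
    # Numerically stable row-wise log-softmax.
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def info_nce(ab_emb, nb_emb, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    ab_emb[i] and nb_emb[i] are a positive antibody-nanobody pair;
    the loss pushes their cosine similarity above all in-batch negatives,
    averaged over both retrieval directions.
    """
    a = ab_emb / np.linalg.norm(ab_emb, axis=1, keepdims=True)
    b = nb_emb / np.linalg.norm(nb_emb, axis=1, keepdims=True)
    logits = a @ b.T / temperature          # scaled cosine similarities
    diag = np.arange(len(a))
    loss_ab = -log_softmax(logits)[diag, diag].mean()    # antibody -> nanobody
    loss_ba = -log_softmax(logits.T)[diag, diag].mean()  # nanobody -> antibody
    return (loss_ab + loss_ba) / 2

rng = np.random.default_rng(0)
ab = rng.normal(size=(8, 32))
loss_matched = info_nce(ab, ab)                        # identical pairs
loss_random  = info_nce(ab, rng.normal(size=(8, 32)))  # unrelated pairs
print(loss_matched < loss_random)
```

With perfectly aligned pairs the loss approaches its minimum, while unrelated pairs sit near the chance level of log(batch size), which is why the matched loss comes out lower in the toy check above.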
Page 1 of 1 (3 total rows)