Property-Driven Protein Inverse Folding with Multi-Objective Preference Alignment
Soundness: 3: good
Presentation: 3: good
Contribution: 3: good
Rating: 6: marginally above the acceptance threshold
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.
The paper proposes ProtAlign, a multi-objective preference-alignment framework for protein inverse folding that optimizes developability properties without compromising designability. The method uses a semi-online DPO loop: generate rollouts at higher temperature, score them with property predictors, construct pairwise preferences per property, then train offline with an adaptive preference margin to reconcile conflicts among objectives. Instantiated on ProteinMPNN as MoMPNN, the approach is evaluated on CATH 4.3, de novo backbones from RFDiffusion, and realistic de novo binders; results show developability gains while maintaining or improving structural consistency relative to strong baselines.
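For concreteness, the rollout-and-scoring phase of the semi-online loop described above might look roughly like the following (a minimal sketch; `semi_online_round`, `sample_fn`, and `score_fns` are hypothetical names, not the paper's API):

```python
def semi_online_round(sample_fn, score_fns, backbones, n_rollouts, temperature):
    """One rollout-and-scoring phase of the semi-online loop (hypothetical API).

    sample_fn(backbone, temperature) -> sequence
    score_fns: dict mapping property name -> predictor(sequence) -> float
    Returns (backbone, sequence, {property: score}) records that the offline
    DPO step would later turn into per-property preference pairs.
    """
    records = []
    for bb in backbones:
        for _ in range(n_rollouts):
            # sample at an elevated temperature for exploration
            seq = sample_fn(bb, temperature)
            # score every rollout with each in silico property predictor
            records.append((bb, seq, {p: f(seq) for p, f in score_fns.items()}))
    return records
```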
- Method is simple and general: multi-objective DPO with an adaptive preference margin to mitigate conflicts across properties; the training pipeline evenly samples pairwise entries across properties and alternates rollout and training for efficiency.
- Practical semi-online training decouples rollout/evaluation from optimization, enabling batch computation and easier deployment while retaining online exploration benefits.
- Evaluations are broad and application-relevant: crystal redesign, de novo backbones, and realistic binder design; the study systematically integrates developability metrics into inverse-folding evaluation beyond amino acid recovery.
- The presentation is good, with clear figures and an easy-to-follow narrative.
- Limited ablations on multi-objective weights and margin settings. It would be helpful to quantify how the weights, sampling temperature, and margin thresholds shape the Pareto front, and to provide transferable default configurations, since the method relies heavily on these hyperparameters.
- The adaptive preference margin m(yw,yl) is precomputed from auxiliary property deltas and then kept fixed during training. This is simple and fast, but it cannot react if the policy distribution drifts, predictors recalibrate, or property trade-offs evolve; the “right” margin may change as the frontier moves.
- Pair construction may over-represent “easy wins” and under-sample ambiguous regions. Preference pairs are formed by sorting rollouts and pairing top-half vs. bottom-half, with a delta threshold to drop uncertain pairs. While this stabilizes supervision, it can bias learning away from the decision boundary where the frontier is decided. Active pair mining (hard-negative selection) or uncertainty-aware sampling could help learn more from the ambiguous region and reduce label imbalance across properties.
- Can the weights across properties and the adaptive margin be tuned online using objective-improvement rates to more reliably approach a Pareto front across backbones and lengths?
- What is the effect of the number of rollouts and sampling temperature on the stability of training and final metrics in the semi-online loop, given that the paper uses a higher temperature for exploration but evaluates at a lower temperature for ProteinMPNN-family models?
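To make the pair-construction concern above concrete, here is a minimal sketch of the sorting-based scheme as the review understands it (names and exact tie-breaking are assumptions, not the paper's implementation):

```python
def make_pairs(rollouts, scores, delta):
    """Sort rollouts by one property score, pair top half against bottom half,
    and drop pairs whose score gap is below `delta` (too ambiguous).
    Returns (winner, loser) pairs of rollouts."""
    order = sorted(range(len(rollouts)), key=lambda i: scores[i], reverse=True)
    half = len(order) // 2
    pairs = []
    for w, l in zip(order[:half], order[half:]):
        if scores[w] - scores[l] >= delta:
            pairs.append((rollouts[w], rollouts[l]))
    return pairs
```

Note that every pair straddles the median, so samples near the decision boundary (small score gaps) are systematically filtered out by `delta`; a hard-negative variant would instead preferentially keep pairs just above the threshold.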
Moderately AI-edited
Property-Driven Protein Inverse Folding with Multi-Objective Preference Alignment
Soundness: 3: good
Presentation: 3: good
Contribution: 3: good
Rating: 8: accept, good paper
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
This paper applies multi-objective preference optimization to protein inverse folding, using semi-online DPO with adaptive margins to balance structural accuracy against properties such as solubility and thermostability. The resulting model, MoMPNN, beats existing baselines across several benchmarks. The approach is solid but not particularly novel: it essentially transplants techniques from LLM alignment into protein design. That said, the execution is strong: the experiments are thorough and comprehensive, the amino acid distribution analysis shows the model learns sensible patterns, and the framework appears general enough to extend to other properties.
See summary
See summary
No questions.
Moderately AI-edited
Property-Driven Protein Inverse Folding with Multi-Objective Preference Alignment
Soundness: 2: fair
Presentation: 3: good
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
The paper addresses the challenge that protein inverse folding models must balance designability (recovering sequences consistent with a target backbone) with developability properties (e.g., solubility, thermostability). The authors propose ProtAlign, a multi-objective preference alignment framework that fine-tunes pretrained models using a semi-online Direct Preference Optimization (DPO) strategy. The method uses a flexible preference margin to mitigate conflicts between competing objectives and constructs preference pairs using in silico property predictors. Applying this to ProteinMPNN yields MoMPNN. Experiments on CATH 4.3 crystal structures, de novo backbones, and binder design scenarios show that MoMPNN enhances developability properties without compromising structural fidelity compared to baselines.
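For reference, the flexible preference margin mentioned above presumably enters a standard DPO objective as an additive term, along the lines of the following illustrative sketch (not the paper's exact formulation; `beta` and the margin values are placeholders):

```python
import math

def dpo_margin_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, margin, beta=0.1):
    """Per-pair DPO loss with an additive preference margin:
    L = -log sigmoid(beta * delta - margin).

    logp_* are sequence log-probs under the policy; ref_logp_* are under the
    frozen reference model. A larger `margin` demands a wider likelihood gap
    between winner and loser before the loss saturates.
    """
    advantage = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-(advantage - margin))))
```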
This method improves developability metrics through a preference alignment framework that does not require additional curated datasets of experimentally validated proteins.
The authors evaluate MoMPNN on a strong set of tasks beyond standard sequence recovery. This includes redesigning CATH 4.3 crystal structures, designing sequences for de novo generated backbones, and a practical de novo binder design scenario. This rigorous evaluation demonstrates the method's utility in realistic design workflows where other baselines show performance degradation.
It would be better to also report these metrics on the ground-truth sequences, since the metrics rely on prediction models as approximations.
Full names of the abbreviations used in the tables are missing from the captions.
The temperatures used at inference for the different baselines are not identical, which may make the comparison unfair. A fair comparison would use either greedy decoding (no temperature sampling) or the best point on each method's temperature-performance curve; at minimum, results under one identical temperature should be reported.
Is there a typo in Eq. 4? The index $k$ appears in the formula for $m$, yet $m$ seems unrelated to $k$.
An explanation of the relationship between $\mathcal{L}$ and $\mathcal{L}_{MO}$ is needed.
Why is the AAR of ProteinMPNN on the CATH 4.3 test set 0.39? This seems lower than most reproductions of this model (e.g., 0.44 on CATH 4.3 was reported in ProteinInvBench). If this AAR is incorrect, does the corrected baseline indicate a significant compromise of AAR?
RL-based preference methods for inverse folding, such as ProteinDPO, are discussed in the related work section. Why are they not compared as baselines? They would seem to be the most closely related ones.
Regarding the semi-online training strategy, is the preference dataset $\mathcal{D}_k$ at iteration $t$ cumulative (containing all rollouts from iterations $1 \dots t$), or is it replaced entirely by the new rollouts?
The paper provides a compelling comparison against a "Weighted-score DPO" baseline in Appendix A.2, showing MoMPNN is more stable. Can the authors provide more intuition on why the flexible margin (Eq. 4) achieves better and more stable multi-objective optimization compared to directly optimizing a weighted sum of preference scores?
The model is trained on protein monomers but evaluated on a de novo binder design task, which involves protein complexes. Did the authors observe any specific failure modes or performance issues at the binder-target interface, given that the model was not explicitly trained for complex-specific properties?
Fully human-written