Enzyme-Unified: Learning Holistic Representations of Enzyme Function with a Hybrid Interaction Model
Soundness: 2: fair
Presentation: 2: fair
Contribution: 2: fair
Rating: 6: marginally above the acceptance threshold
Confidence: 2: You are willing to defend your assessment, but it is quite likely that you did not understand the central parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
This paper addresses two critical limitations in enzyme function prediction, namely isolated single-property prediction and performance inflated by homology-biased datasets, by proposing ENZYME-UNIFIED, a multi-task learning framework for holistic enzyme property prediction. The core innovation is a Hybrid Interaction Model that dynamically fuses fine-grained local interactions (via cross-attention) and global feature representations (via concatenation) using a trainable gate. The framework simultaneously predicts five key enzyme properties: turnover number, Michaelis constant, catalytic efficiency, optimal temperature, and optimal pH.
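For concreteness, my reading of the fusion mechanism is roughly the following minimal sketch (variable names, pooling choices, and head count are my own assumptions, not the authors' code):

```python
import torch
import torch.nn as nn

class GatedHybridFusion(nn.Module):
    """Sketch: a trainable gate blends a local cross-attention pathway
    with a global concatenation pathway."""
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        # Local pathway: enzyme tokens attend to substrate tokens.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Global pathway: concatenated pooled embeddings, projected back to dim.
        self.global_proj = nn.Linear(2 * dim, dim)
        # Trainable gate deciding how much each pathway contributes.
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, enz_tokens, sub_tokens):
        # (B, L_enz, dim) attends over (B, L_sub, dim), then pooled to (B, dim).
        local, _ = self.cross_attn(enz_tokens, sub_tokens, sub_tokens)
        local = local.mean(dim=1)
        glob = self.global_proj(torch.cat(
            [enz_tokens.mean(dim=1), sub_tokens.mean(dim=1)], dim=-1))
        g = self.gate(torch.cat([local, glob], dim=-1))  # element-wise in (0, 1)
        return g * local + (1.0 - g) * glob
```

If this reading is wrong, a precise description of the gate's inputs and its granularity (scalar vs. per-dimension) would help.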
To enable robust evaluation, the authors construct three large-scale, sequence-dissimilar datasets (clustered at 40% sequence identity to avoid homology leakage) for the five target properties. Experiments show that ENZYME-UNIFIED achieves SOTA performance on the public CataPro benchmark and on the authors' new datasets. Ablation studies validate the synergy of the hybrid architecture and the value of the trainable gate, while a case study on Ribonuclease A (RNase A) confirms the model's ability to identify biochemically relevant catalytic sites, supporting interpretability.
Key contributions include: (1) the ENZYME-UNIFIED framework with a novel Hybrid Interaction Model; (2) three rigorously partitioned, homology-aware datasets for multi-property enzyme prediction; (3) SOTA results across kinetic and environmental property prediction, with validated interpretability.
- Methodological novelty: The gated hybrid architecture elegantly bridges fine-grained molecular interaction modeling with traditional global encoders.
- Careful dataset curation, transparent evaluation, and homology-aware partitioning.
- Consistent improvement over strong baselines (CataPro, UniKP) across multiple properties.
- Limited interpretability generalization: The RNase A case study is convincing but narrow. The model’s ability to identify catalytic sites is only demonstrated for one enzyme (a ribonuclease). Extending this to 2–3 additional enzymes from different EC classes (e.g., lactase, a common hydrolase) would confirm that the attention mechanism consistently targets functional sites across enzyme types, rather than RNase A-specific patterns.
- Limited evidence of cross-task synergy: Each property is modeled independently; a joint multi-output model might better support the “unified” claim.
Have you tested the model's attention mechanism on additional enzymes (e.g., lactase, cytochrome P450) to confirm it consistently identifies catalytic sites across EC classes? If not, could you include this analysis in a revised version?
Fully AI-generated
Enzyme-Unified: Learning Holistic Representations of Enzyme Function with a Hybrid Interaction Model
Soundness: 2: fair
Presentation: 2: fair
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 2: You are willing to defend your assessment, but it is quite likely that you did not understand the central parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
The paper identifies two significant limitations in the current machine learning-based prediction of enzyme properties: 1) models predict properties in isolation, failing to capture the biophysical interplay between them, and 2) models are often evaluated on homology-unaware, biased datasets, leading to inflated performance.
To address this, the authors present two main contributions: (1) three new large-scale, rigorously partitioned datasets for multi-property prediction; (2) ENZYME-UNIFIED, a unified framework for holistic enzyme property prediction, powered by a novel HYBRID INTERACTION MODEL that adaptively fuses global and local interaction features for more powerful and flexible representations.
The authors train independent instances of this model for five properties and demonstrate SOTA performance.
The paper does an excellent job motivating the work. The critique of the "fragmented" single-task paradigm and the practical need for a "holistic view" of an enzyme's profile are very compelling.
The creation and public release of three new, large-scale datasets is a significant contribution to the field.
The case study provides strong evidence that the fine-grained attention pathway is learning biochemically meaningful information, as it correctly identifies the catalytic histidines.
The introduction is built entirely on the need to move beyond the single-task research paradigm. It argues for capturing the intricate biophysical interplay and inter-property relationships with a multi-task learning paradigm that can co-predict multiple, interdependent properties. However, the implementation seems to contradict this directly. Section 3.3 explicitly states: "The Enzyme-Unified hybrid architecture is trained independently for each of the five target properties..." Figure 1 explicitly labels the output as an "All-in-one model", but it appears to be an all-in-one architecture used to train five "one-at-a-time" models. This is a critical distinction, and the current framing overstates the contribution.
If the above judgement is correct, then a direct comparison between the "trained independently" strategy and a true multi-task learning strategy (e.g., a shared hybrid trunk with five separate prediction heads trained jointly, as sketched below) would be very helpful.
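Concretely, the joint strategy I have in mind would look roughly like the following sketch (the trunk stands in for the paper's hybrid interaction module; names, pooling, and the NaN convention for missing labels are my own assumptions):

```python
import torch
import torch.nn as nn

PROPERTIES = ["kcat", "km", "kcat_km", "temp_opt", "ph_opt"]

class MultiTaskEnzymeModel(nn.Module):
    """One shared trunk with five jointly trained regression heads."""
    def __init__(self, trunk: nn.Module, dim: int):
        super().__init__()
        self.trunk = trunk  # e.g., the paper's hybrid interaction module
        self.heads = nn.ModuleDict({p: nn.Linear(dim, 1) for p in PROPERTIES})

    def forward(self, enz, sub):
        h = self.trunk(enz, sub)  # shared representation, shape (B, dim)
        return {p: head(h).squeeze(-1) for p, head in self.heads.items()}

def joint_loss(preds, targets):
    """Sum per-task MSE, skipping examples whose label is missing (NaN)."""
    total = torch.zeros(())
    for p in PROPERTIES:
        mask = ~torch.isnan(targets[p])
        if mask.any():
            total = total + nn.functional.mse_loss(preds[p][mask], targets[p][mask])
    return total
```

Comparing this against five independently trained copies of the same trunk would directly test the multi-task claim.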
The authors use ProtT5 in the baselines but incorporate ProstT5 in their own method without clear explanation (i.e., why the method cannot incorporate ProtT5, or why the baselines cannot use ProstT5).
See above.
Fully human-written
Enzyme-Unified: Learning Holistic Representations of Enzyme Function with a Hybrid Interaction Model
Soundness: 2: fair
Presentation: 3: good
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 5: You are absolutely certain about your assessment. You are very familiar with the related work and checked the math/other details carefully.
The paper establishes a suite of three new datasets for predicting catalytic efficiency, optimal temperature, and optimal pH. It then develops a unified framework to predict five distinct properties simultaneously.
1. The three newly curated datasets of different enzyme properties are a good contribution to the enzyme design and enzyme engineering community.
2. The idea of simultaneously predicting different enzyme properties is useful.
1. The proposed method shows only a very minor improvement over the baseline models; I am not sure whether this improvement is significant.
2. The paper lacks some baselines. Since it targets the limitation of previous works that predict each enzyme property in isolation, there should be baselines that fine-tune a single pretrained protein model on all tasks simultaneously, e.g., fine-tuning ESM2 or ProtT5 with multiple task heads to achieve the same goal as this method (a minimal sketch follows below). Such a baseline would be fairer to the setting of the proposed method and would allow a better comparison between strong baselines and the proposed framework. Additionally, since the performance improvement of the proposed method is quite minor, it would be interesting to see the performance of multi-task fine-tuning on large-scale pretrained models such as ESM2-3B and ESM2-15B.
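To illustrate, the requested baseline could be as simple as the following sketch built on the public Hugging Face ESM2 checkpoints (the checkpoint name, mean pooling, and linear heads are my illustrative choices, and substrate conditioning is omitted for brevity):

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

CKPT = "facebook/esm2_t33_650M_UR50D"  # t36_3B / t48_15B variants also exist
TASKS = ["kcat", "km", "kcat_km", "temp_opt", "ph_opt"]

class ESM2MultiTask(nn.Module):
    """ESM2 backbone fine-tuned jointly, one linear head per property."""
    def __init__(self):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(CKPT)
        dim = self.backbone.config.hidden_size
        self.heads = nn.ModuleDict({t: nn.Linear(dim, 1) for t in TASKS})

    def forward(self, input_ids, attention_mask):
        out = self.backbone(input_ids=input_ids,
                            attention_mask=attention_mask).last_hidden_state
        # Mean-pool residue embeddings over non-padding positions.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (out * mask).sum(dim=1) / mask.sum(dim=1)
        return {t: head(pooled).squeeze(-1) for t, head in self.heads.items()}

tokenizer = AutoTokenizer.from_pretrained(CKPT)
model = ESM2MultiTask()
batch = tokenizer(["MKTAYIAKQR", "GSHMLEV"],  # toy sequences
                  return_tensors="pt", padding=True)
preds = model(**batch)  # dict of five per-sequence predictions
```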
Please see above weaknesses.
Fully human-written
Enzyme-Unified: Learning Holistic Representations of Enzyme Function with a Hybrid Interaction Model
Soundness: 2: fair
Presentation: 3: good
Contribution: 2: fair
Rating: 2: reject
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.
This paper proposes ENZYME-UNIFIED, a multi-task learning framework that holistically predicts five key enzyme properties, including kinetic constants and environmental optima, via a novel Hybrid Interaction Model that fuses fine-grained cross-attention and global feature concatenation. The authors present three rigorously partitioned, sequence-dissimilar benchmark datasets for fair evaluation and demonstrate state-of-the-art results on both public and new benchmarks, supported by strong ablation and interpretability studies.
1. Architectural innovation: The Hybrid Interaction Model dynamically integrates token-level cross-attention and global feature concatenation, with a learned gate, to represent both local and global enzyme-substrate interactions.
2. New dataset construction: Three new, large-scale, non-homologous datasets are described, with careful cluster-based partitioning to enforce sequence dissimilarity.
3. Reproducibility: Datasets and code are promised for release, and hyperparameter details are comprehensive.
4. Clear, concise presentation: The manuscript flows well, with logically organized sections, visual explanations, and clear mathematical exposition.
1. Limited discussion and incorporation of prior multi-task and multi-label enzyme function prediction works: Both the related work section and the experimental comparisons are missing several directly relevant works, such as CLEAN (Yu et al., 2023), EnzymeCAGE (Liu et al., 2024), and EZSpecificity (Cui et al., 2025). These studies have already addressed multi-label or holistic enzyme function prediction using deep learning models. Their methods and results should be discussed, compared, and cited to properly position the contribution of this work, especially since the novelty of ENZYME-UNIFIED relies heavily on its multi-task, unified perspective. This omission hinders the reader's ability to measure the paper's progress relative to the existing literature.
(1) Yu, Tianhao, et al. "Enzyme function prediction using contrastive learning." Science 379.6639 (2023): 1358-1363.
(2) Liu, Yong, et al. "EnzymeCAGE: a geometric foundation model for enzyme retrieval with evolutionary insights." bioRxiv (2024): 2024-12.
(3) Cui, Haiyang, et al. "Enzyme specificity prediction using cross attention graph neural networks." Nature (2025): 1-3.
2. Clarity in loss transformation and objective function: The transformation $T(y)$ (Section 3.3) is piecewise, with distinct forms for the kinetic and environmental tasks. However, it is not clear how this transformation interacts with the MSE loss numerically, or whether the separate losses are weighted; multi-task loss balancing (if present) is not explicitly described, which could affect model optimization in joint settings (a standard formulation the authors could confirm or rule out is sketched after this list).
3. The potential for data leakage in dataset construction is unaddressed.
4. Missing details in modeling token-level chemical interactions: The fine-grained interaction pathway is elegantly formulated, but the actual operationalization of the cross-attention is not deeply detailed.
5. Limited novelty in the model architecture.
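To make weakness 2 concrete: with per-task transforms $T_t$ and per-task MSE, one standard jointly balanced objective is the homoscedastic-uncertainty weighting of Kendall et al. (CVPR 2018), with learnable $\sigma_t$:

$$
\mathcal{L}_t = \frac{1}{N_t}\sum_{i=1}^{N_t}\bigl(f_t(x_i) - T_t(y_i)\bigr)^2,
\qquad
\mathcal{L}_{\text{total}} = \sum_{t=1}^{5}\Bigl(\frac{1}{2\sigma_t^{2}}\,\mathcal{L}_t + \log\sigma_t\Bigr).
$$

If the five models are in fact trained independently, as Section 3.3 suggests, then no balancing applies, and the paper should state this explicitly.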
1. The paper argues that a limitation of existing deep learning models is their tendency to predict only single attributes while ignoring the correlations between them. However, ENZYME-UNIFIED appears to be trained on multiple task datasets separately, without establishing connections between these different task sets. This approach would also seem to ignore inter-attribute correlations. Could the authors please address this apparent contradiction?
2. Numerous previous works have addressed enzyme function prediction tasks. It seems these existing models could be adapted to the dataset presented in this paper by merely modifying their training objectives. What was the consideration for excluding these models from the baseline evaluation?
3. Did dataset construction strictly avoid information leakage from meta information (e.g., substrate names, assay condition annotations), not just sequence similarity? Can the authors provide statistics on maximum sequence identity or substrate overlap between train/test folds?
4. Please clarify the implementation details for handling disparate sequence lengths in cross-attention: is there padding, masking, or special encoding to preserve biochemically plausible alignment or neighborhood context between enzyme and substrate tokens? Could position embeddings (absolute vs. relative) affect fine-grained interaction modeling? A sketch of the standard masked cross-attention I have in mind follows.
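For reference, the usual PyTorch handling of the enzyme/substrate length mismatch is a key-padding mask on the substrate (key/value) side; a minimal sketch under my own assumptions about shapes and naming:

```python
import torch
import torch.nn as nn

B, L_enz, L_sub, D = 2, 512, 64, 1024  # hypothetical batch and lengths
attn = nn.MultiheadAttention(D, num_heads=8, batch_first=True)

enz = torch.randn(B, L_enz, D)  # enzyme residue tokens (queries)
sub = torch.randn(B, L_sub, D)  # substrate tokens (keys/values)
# True marks padded substrate positions, which attention then ignores.
sub_pad = torch.zeros(B, L_sub, dtype=torch.bool)
sub_pad[0, 40:] = True  # e.g., the first substrate has only 40 real tokens

out, weights = attn(enz, sub, sub, key_padding_mask=sub_pad)
# out: (B, L_enz, D); padded substrate tokens receive zero attention weight.
```

Whether the paper does something like this, and which positional encoding enters each token stream, should be stated explicitly.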
Fully AI-generated