ICLR 2026 - Reviews


Reviews

Summary Statistics

EditLens Prediction    Count     Avg Rating   Avg Confidence   Avg Length (chars)
Fully AI-generated     1 (25%)   6.00         3.00             2407
Heavily AI-edited      0 (0%)    N/A          N/A              N/A
Moderately AI-edited   1 (25%)   4.00         1.00             1799
Lightly AI-edited      1 (25%)   4.00         3.00             2989
Fully human-written    1 (25%)   4.00         4.00             2375
Total                  4 (100%)  4.50         2.75             2392
Review 1

Title: MultiCFV: Detecting Control Flow Vulnerabilities in Smart Contracts Leveraging Multimodal Deep Learning
Soundness: 2: fair
Presentation: 2: fair
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.

Summary:
The paper introduces MultiCFV, a deep learning framework for detecting control-flow-related vulnerabilities and code clones in smart contracts. It combines control-flow graphs (CFG) extracted from bytecode with abstract syntax trees (AST) from source code, and exploits a GRU-GCN and a separate network to process them. Moreover, comment embeddings encoded by fine-tuned BERT models are also used. The three input features are concatenated for the final prediction (see the sketch after this review). Experiments on four public datasets show the proposed model outperforming existing static tools.

Strengths:
- This work focuses on a practical problem and addresses real-world vulnerabilities.
- The combination of structural (CFG), syntactic (AST), and semantic (comment) information in one framework is a contribution.
- Implementation details and source code are provided.

Weaknesses:
- The proposed architecture integrates three components (BERT, GCN, and another network) to process three different features (comments, CFG, and AST). However, all of these techniques have already been well explored and widely used in existing approaches. This work represents an incremental extension of earlier multi-encoder frameworks rather than aligning with the current frontier of LLM-driven contract analysis.
- Some implementation details are missing. For example, the AST feature is extracted by a deep learning model, but its architecture is not clearly specified.
- The paper does not include any baseline or discussion involving modern LLM-based approaches.
- The paper claims that deep learning techniques enable faster and more efficient detection, but there are no measurements of inference time or computational cost on the vulnerability detection task.
- The writing needs improvement. There are several grammatical errors and typos (e.g., "To overcome the time-consuming and labor-intensive,").

Questions:
- Please improve the writing quality.
- Could the method generalize to function-level or statement-level vulnerability detection instead of contract-level detection?
- Please clarify the computational cost of MultiCFV relative to the baselines on the vulnerability detection task.
- It would strengthen the paper to include a comparison with modern LLMs and to explain why a fine-tuned GCN+BERT architecture remains necessary in the current LLM era.
- It would benefit the paper if an evaluation on real-world deployed contracts were provided.

EditLens Prediction: Fully human-written
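The summary above describes concatenating CFG, AST, and comment embeddings for the final prediction. The sketch below is a minimal PyTorch illustration of what such a concatenation-style fusion head could look like; the dimensions, layer sizes, and the FusionHead name are assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch of concatenation-based multimodal fusion (illustrative, not the authors' code).
# Assumes three pre-computed contract embeddings: CFG (GRU-GCN), AST, and comments (BERT).
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    def __init__(self, cfg_dim=128, ast_dim=128, cmt_dim=256, hidden=128):
        super().__init__()
        # Concatenated feature -> hidden layer -> single vulnerability logit.
        self.classifier = nn.Sequential(
            nn.Linear(cfg_dim + ast_dim + cmt_dim, hidden),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(hidden, 1),
        )

    def forward(self, f_cfg, f_ast, f_cmt):
        fused = torch.cat([f_cfg, f_ast, f_cmt], dim=-1)  # "vertically stacked" features
        return torch.sigmoid(self.classifier(fused))      # vulnerability probability

# Example with random embeddings for a batch of 4 contracts.
head = FusionHead()
probs = head(torch.randn(4, 128), torch.randn(4, 128), torch.randn(4, 256))
print(probs.shape)  # torch.Size([4, 1])
```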
Review 2

Title: MultiCFV: Detecting Control Flow Vulnerabilities in Smart Contracts Leveraging Multimodal Deep Learning
Soundness: 3: good
Presentation: 3: good
Contribution: 3: good
Rating: 4: marginally below the acceptance threshold
Confidence: 1: You are unable to assess this paper and have alerted the ACs to seek an opinion from different reviewers.

Summary:
The paper introduces a multimodal deep-learning framework to detect erroneous control-flow vulnerabilities in Ethereum smart contracts and to perform contract-level clone detection. It fuses three complementary views into a single contract representation used for verification and similarity search. The authors claim the first application of multimodal deep learning to this class of smart-contract vulnerabilities, outline dataset usage, and note that resources will be open-sourced.

Strengths:
- The paper's multimodal design yields a clearly superior representation, with the full fusion outperforming all single and dual modalities.
- It delivers large, consistent gains over strong baselines across multiple vulnerability types and also shows good transfer to a new dataset, evidencing robustness and generalization.
- Beyond detection, the system adds a practical clone-detection pipeline using the unified contract embedding with an RBF-cosine similarity, broadening its utility for auditing and analysis workflows (see the sketch after this review).

Weaknesses:
- The ablation exploration is narrow. It focuses mainly on learning-rate sweeps and modality combinations, without probing other impactful choices.
- Baseline coverage is thin, and the clone-detection evaluation hinges on a single dataset and heuristic similarity thresholds.
- The paper provides no theoretical analysis to complement its empirical results.

Questions:
- The authors classify a contract as vulnerable when the sigmoid probability exceeds 0.95. Why 0.95? How sensitive are the results to that choice?
- Tables report point metrics. Please add variance estimates.
- Could you also probe hidden sizes, dropout, training epochs/early stopping, and fusion strategies to assess robustness and design choices?
- Please expand the comparison to include more learning-based detectors or recent multimodal methods.

EditLens Prediction: Moderately AI-edited
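Two technical details mentioned above, the RBF-cosine similarity used in the clone-detection pipeline and the 0.95 sigmoid decision threshold the reviewer questions, can be made concrete with the sketch below. This is one plausible reading rather than the paper's code; the gamma value, embedding size, and function names are placeholders.

```python
# Sketch of an RBF kernel applied to cosine distance between contract embeddings,
# plus the probability-threshold decision rule the reviewer asks about.
# gamma and the 0.95 cut-off are illustrative, not the paper's reported settings.
import numpy as np

def rbf_cosine_similarity(a: np.ndarray, b: np.ndarray, gamma: float = 1.0) -> float:
    cos_sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    cos_dist = 1.0 - cos_sim
    return float(np.exp(-gamma * cos_dist ** 2))   # close to 1.0 for near-identical contracts

def is_vulnerable(sigmoid_prob: float, threshold: float = 0.95) -> bool:
    # Sensitivity analysis would sweep this threshold rather than fixing it at 0.95.
    return sigmoid_prob > threshold

query, candidate = np.random.rand(512), np.random.rand(512)
print(rbf_cosine_similarity(query, candidate))
print(is_vulnerable(0.97))  # True
```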
Review 3

Title: MultiCFV: Detecting Control Flow Vulnerabilities in Smart Contracts Leveraging Multimodal Deep Learning
Soundness: 2: fair
Presentation: 2: fair
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.

Summary:
This paper proposes MultiCFV, a multimodal deep learning method for detecting control-flow-related vulnerabilities and code clones in smart contracts. The approach integrates three modalities: Control Flow Graphs (CFG) for structural features, Abstract Syntax Trees (AST) for syntactic features, and code comments for semantic information. The features from these modalities are fused to train a model for vulnerability detection and to build a feature database for clone detection.

Strengths:
- The paper addresses the critical and high-impact problem of smart contract security. It maintains a clear focus on a specific, challenging class of bugs: erroneous control flow vulnerabilities (e.g., reentrancy, unsafe external calls, and delegatecall).
- The ablation study (Table 2) is a strong point of the paper. It clearly demonstrates that the proposed method is effective.
- The core idea of combining structural (CFG), syntactic (AST), and human-semantic (comment) information is logical and provides an intuitive, holistic view for understanding complex code vulnerabilities.

Weaknesses:
- The novelty of this paper is limited. The proposed approach is largely an application of existing, standard components (BERT, GCN, GRU, CNN), and the fusion mechanism appears to be simple feature concatenation ("vertically stacked"). More discussion is required to highlight its specific novelty over contemporaneous multimodal vulnerability detection work (e.g., Jie et al., 2023; Qian et al., 2023) cited in its own related work section.
- There is no experimental comparison to other learning-based SOTA methods for vulnerability detection (e.g., Peculiar, or the other GNN/multimodal approaches mentioned in the related work).
- Some experimental results require deeper discussion. For example, in Table 3, the reported 0% accuracy for both Slither and Mythril on "Access Control" vulnerabilities is puzzling, as these tools are industry standards specifically designed to find such flaws. The authors do not clarify how analysis failures (e.g., contracts that Slither or Mythril failed to parse) were handled in the metrics. Were they excluded, or counted as false negatives?
- The reported 99.13% accuracy (Table 2) seems high and may indicate overfitting. The paper mentions using SMOTE (Section 4.2) to balance the dataset; it is critical to clarify that SMOTE was applied only to the training split. If synthetic samples from the test set's distribution were included in training (a common data leakage pitfall), the validation and test results would be artificially inflated (see the sketch after this review).

Questions:
The paper exhibits several presentation issues that affect clarity and precision. In Section 3.2.1, the authors refer to 256-dimensional vectors from BERT while also describing 128-dimensional node feature vectors in Equation (3), but the relationship between the two is unclear. Moreover, Sections 3.3 and 3.4 reuse the same variable ($F_{ast}$) to represent feature vectors for both the AST and the comments, which may cause confusion.

EditLens Prediction: Lightly AI-edited
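The leakage concern in the final weakness above hinges on where SMOTE is applied in the pipeline. The sketch below shows the leakage-free ordering (split first, oversample only the training portion) using imbalanced-learn; the feature matrix, labels, and split ratio are placeholders rather than the paper's setup.

```python
# Leakage-free SMOTE usage: oversample only the training split, never the held-out data.
import numpy as np
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

X = np.random.rand(1000, 512)              # fused contract embeddings (placeholder)
y = np.random.binomial(1, 0.1, size=1000)  # imbalanced vulnerability labels (placeholder)

# 1) Split first, so no synthetic sample can be derived from test contracts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# 2) Fit SMOTE on the training split only.
X_train_res, y_train_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

# 3) Evaluate on the untouched, still-imbalanced test split.
print(np.bincount(y_train_res), np.bincount(y_test))
```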
Review 4

Title: MultiCFV: Detecting Control Flow Vulnerabilities in Smart Contracts Leveraging Multimodal Deep Learning
Soundness: 3: good
Presentation: 2: fair
Contribution: 3: good
Rating: 6: marginally above the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.

Summary:
This paper introduces MultiCFV, a multimodal deep learning framework for detecting control flow vulnerabilities and code clones in smart contracts. The proposed approach integrates three complementary feature types: Control Flow Graphs (CFGs) extracted from bytecode, Abstract Syntax Trees (ASTs) from source code, and comment-based semantic embeddings, capturing structural, syntactic, and contextual information respectively. The authors employ a GRU-GCN for graph embedding, a CNN with attention for comment feature extraction, and a fusion network for final classification (see the sketch after this review). Extensive experiments are conducted on four benchmark datasets, showing that MultiCFV outperforms existing static analysis tools such as Slither and Mythril in both accuracy and generalization.

Strengths:
1. About design. Combining CFG, AST, and comment information is original and addresses the limitations of unimodal vulnerability detectors.
2. About experiments. The work includes comparisons with several baselines, ablation experiments, and cross-dataset evaluation, establishing strong empirical support.
3. High performance. The model achieves good accuracy and generalizes to unseen vulnerabilities, including unprotected Ether withdrawal cases.

Weaknesses:
1. Incremental novelty. While the multimodal fusion is valuable, it mainly combines known feature extraction techniques rather than introducing a fundamentally new learning paradigm.
2. Limited theoretical justification. The paper lacks a formal explanation of why multimodal integration improves detection robustness beyond the empirical evidence.
3. Dataset dependence. The evaluation relies heavily on public datasets; no large-scale or real-world deployment test is included.
4. Scalability and runtime cost. Although mentioned briefly, there is no quantitative analysis of inference time or computational overhead on large-scale contracts.

Questions:
1. How does MultiCFV handle unseen vulnerability types not represented in the training set?
2. Could you provide runtime benchmarks or a scalability analysis compared with Slither or Mythril?
3. How sensitive is the model to the quality or availability of comments? If comments are sparse or missing, does performance degrade significantly?
4. Were any measures taken to mitigate overfitting given the relatively small vulnerability datasets?
5. Can MultiCFV be adapted for on-chain real-time contract auditing or incremental analysis during contract updates?

EditLens Prediction: Fully AI-generated
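For concreteness, the sketch below shows one minimal form that a "CNN with attention" comment encoder of the kind described in the summary could take; the layer sizes and the CommentEncoder name are assumptions, not the authors' architecture.

```python
# Illustrative "CNN with attention" encoder pooling token embeddings (e.g., from BERT)
# into a single comment-level vector. Dimensions are assumed for the example.
import torch
import torch.nn as nn

class CommentEncoder(nn.Module):
    def __init__(self, embed_dim=256, n_filters=128, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(embed_dim, n_filters, kernel_size, padding=1)
        self.attn = nn.Linear(n_filters, 1)  # additive attention score per position

    def forward(self, token_embeds):                              # (batch, seq_len, embed_dim)
        h = torch.relu(self.conv(token_embeds.transpose(1, 2)))   # (batch, n_filters, seq_len)
        h = h.transpose(1, 2)                                     # (batch, seq_len, n_filters)
        w = torch.softmax(self.attn(h), dim=1)                    # attention weights over positions
        return (w * h).sum(dim=1)                                 # (batch, n_filters) comment vector

enc = CommentEncoder()
print(enc(torch.randn(4, 64, 256)).shape)  # torch.Size([4, 128])
```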