ICLR 2026 - Reviews


Reviews

Summary Statistics

| EditLens Prediction | Count | Avg Rating | Avg Confidence | Avg Length (chars) |
|---|---|---|---|---|
| Fully AI-generated | 1 (25%) | 4.00 | 4.00 | 2723 |
| Heavily AI-edited | 1 (25%) | 6.00 | 5.00 | 2232 |
| Moderately AI-edited | 0 (0%) | N/A | N/A | N/A |
| Lightly AI-edited | 1 (25%) | 8.00 | 1.00 | 1707 |
| Fully human-written | 1 (25%) | 2.00 | 3.00 | 3306 |
| Total | 4 (100%) | 5.00 | 3.25 | 2492 |
Title: Arboreal Neural Network
Soundness: 3: good · Presentation: 2: fair · Contribution: 3: good
Rating: 8: accept, good paper
Confidence: 1: You are unable to assess this paper and have alerted the ACs to seek an opinion from different reviewers.

Summary:
This paper addresses the lack of tree-structured inductive bias in deep neural networks for tabular data. To this end, the authors propose ArbNN, a novel architecture that reformulates decision trees into differentiable neural modules, enabling end-to-end gradient optimization while preserving interpretability. Extensive experimental results on multiple public benchmarks and a large-scale industrial credit risk dataset demonstrate that ArbNN consistently outperforms both traditional tree-based models and neural baselines, achieving superior accuracy and interpretability in tabular learning tasks.

Strengths:
* This paper proposes the ArborCell structure to introduce the inductive bias of decision trees, and I am happy to see that the authors also provide visual comparisons to demonstrate the interpretability of the proposed method.
* The authors discuss the related literature in considerable detail.
* The paper is well written and easy to follow.

Weaknesses:
1. I am not an expert in tabular data, but I am curious about the convergence behavior of the proposed ArbNN. Could the authors provide training curves and compare them with those of other networks to illustrate convergence stability?
2. How does the training cost of the proposed method compare to that of the baselines? In addition, please evaluate computational efficiency during inference, e.g., in terms of FLOPs, memory usage, and inference time.
3. The figures contain text that is too small to read clearly. It is recommended to increase the font size, use vector graphics for better clarity, and include a complete schematic diagram of the model architecture.
4. The authors do not provide code for reproducibility checks.

Questions:
My questions are in the Weaknesses section.

EditLens Prediction: Lightly AI-edited
Title: Arboreal Neural Network
Soundness: 1: poor · Presentation: 2: fair · Contribution: 2: fair
Rating: 2: reject
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.

Summary:
The paper proposes a new architecture for tabular data based on the idea of converting decision trees into a particular variation of two matrix multiplications with a non-linearity. It proposes to initialize such trees with XGBoost and then fine-tune the thresholds and leaf values. The paper also proposes a new credit-scoring dataset, TabCredit. The method is tested on the new dataset and a simple benchmark constructed from pytorch-frame, claiming state-of-the-art performance.

Strengths:
I think that looking into tree-structured models and combining their inner workings with DL models is an interesting pursuit. I had a great time digging through related work on the topic and think there is something in this line of work that could lead to strong and interpretable models; this direction is currently underexplored.

The dataset contribution also seems very timely and important, as there are not a lot of realistic testbeds for tabular machine learning methods readily available in academia. When done right this is a major contribution, so I encourage the authors to go through with it regardless of this review period's decision.

Weaknesses:
At times the writing is very hard to make sense of. In the related work section, for example, I still can't make sense of how challenging instances in datasets are related to the pre-tuned default hyperparameter configurations (lines 91-93). The overall algorithm for constructing an "ArborCell" could also be improved, I believe (see the next point for examples).

I believe the paper does not fully cover the relevant related work. It packages the idea of decision-tree inference in matrix form into an "ArborCell", but this idea does not seem novel; there are very similar existing approaches:
- https://arxiv.org/abs/1604.07143 - Neural Random Forests, which seems to do exactly what the authors propose here
- https://blog.dailydoseofds.com/p/transform-decision-tree-into-matrix - a blog post that does a better job of explaining the same procedure used in the paper

Finally, I do not believe the results are solid, as there are indications of poorly tuned baselines: TabM (a recent SoTA model) performs on par with, or sometimes worse than, an MLP, and there are large performance gains over XGBoost just from tuning the thresholds and leaf values (which may indicate a poorly tuned XGBoost in the first place). I also had trouble understanding some of the results, e.g., which datasets exactly were used (what dataset is CH? why is JA - Jannis seemingly binclass and not multiclass, as it is in the pytorch-frame benchmark?). Without code being available, this is impossible to check further.

I suggest the authors compare to an established and well-tuned set of baselines; for example, the TabArena benchmark publishes reference model scores, which can be loaded as follows:

```python
import pandas as pd

# Reference model scores published by the TabArena benchmark.
df = pd.read_parquet(
    "https://tabarena.s3.us-west-2.amazonaws.com/results/df_results_leaderboard.parquet"
)
```

Comparing the method to a correct set of baselines would greatly increase the reliability of the results.

Questions:
See suggestions in Weaknesses. Regarding the newly introduced dataset: does it have a dedicated time-based train/val/test split, or is it different? Can you provide more details regarding the evaluation and tuning setup on the new dataset?

EditLens Prediction: Fully human-written
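For readers unfamiliar with the "two matrix multiplications with a non-linearity" construction this review refers to (and that the linked Neural Random Forests paper and blog post describe), here is a minimal sketch. It is not taken from the submission; the toy tree and all names (`A`, `b`, `L`, `v`) are illustrative. A first affine layer evaluates every split test at once, and a second layer picks the leaf whose path agrees with all of its tests:

```python
import numpy as np

# Hypothetical depth-2 tree on 2 features (NOT the paper's ArborCell):
#   node 0: x[0] <= 0.5 ?   node 1: x[1] <= 0.3 ?   node 2: x[1] <= 0.7 ?
A = np.array([[1.0, 0.0],    # node 0 tests feature 0
              [0.0, 1.0],    # node 1 tests feature 1
              [0.0, 1.0]])   # node 2 tests feature 1
b = np.array([0.5, 0.3, 0.7])  # split thresholds

# Routing matrix: +1 if the leaf lies in the node's right subtree,
# -1 if in its left subtree, 0 if the node is not on the leaf's path.
L = np.array([[-1, -1,  0],
              [-1, +1,  0],
              [+1,  0, -1],
              [+1,  0, +1]])
depth = np.abs(L).sum(axis=1)       # number of ancestors per leaf
v = np.array([1.0, 2.0, 3.0, 4.0])  # leaf values

def tree_forward(x):
    h = np.sign(A @ x - b)   # first "layer": +1 = go right, -1 = go left
    z = L @ h                # second "layer": z hits depth only at the taken leaf
    return v[np.argmax(z - depth)]

print(tree_forward(np.array([0.2, 0.9])))  # 2.0: left at node 0, right at node 1
print(tree_forward(np.array([0.8, 0.1])))  # 3.0: right at node 0, left at node 2
```

This hard-threshold version reproduces exact tree inference; making it differentiable (e.g., replacing `sign` with `tanh` and `argmax` with `softmax`) is what would enable gradient fine-tuning of the thresholds `b` and leaf values `v`, as the review says the paper does starting from an XGBoost initialization.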
Title: Arboreal Neural Network
Soundness: 3: good · Presentation: 3: good · Contribution: 3: good
Rating: 6: marginally above the acceptance threshold
Confidence: 5: You are absolutely certain about your assessment. You are very familiar with the related work and checked the math/other details carefully.

Summary:
The paper introduces Arboreal Neural Networks (ArbNN), a differentiable architecture that bridges gradient-boosted decision trees and neural networks. The key idea is to encode a pretrained XGBoost model into a neural form by translating its structure (feature splits, thresholds, and leaf values) into matrix operations that can be optimized end-to-end. This allows the model to retain the interpretability and inductive bias of trees while gaining the flexibility of gradient-based learning. Experiments on eight public tabular datasets and one large industrial credit dataset (TabCredit) show that ArbNN consistently matches or outperforms strong baselines.

Strengths:
The paper proposes a novel and well-structured idea that combines the structural bias of decision trees with the flexibility of neural networks. The concept is intuitive yet original, and the formulation is clearly presented. The writing is clean and logically organized, making the technical details easy to follow. The experiments are thorough within the chosen scope and demonstrate consistent improvements over strong baselines such as XGBoost.

Weaknesses:
- **Limited Benchmark Coverage.** The evaluation includes only eight public datasets, which is considerably below the current standard in the tabular learning community. This narrow benchmark scope limits the credibility of the claimed generalization. Given the model's conceptual promise, it would be valuable to test ArbNN on a broader set of heterogeneous tabular tasks.
- **Unclear Motivation and Overemphasis on Industrial Data.** The paper's motivation is not fully convincing. Although the central idea (learning the structural bias of trees) is conceptually interesting, the claimed interpretability advantage remains unsubstantiated, as XGBoost provides only limited transparency. It appears that the work may be driven by a specific industrial objective, possibly related to the proprietary dataset used. If so, this motivation should be stated explicitly and the framing adjusted accordingly. Clarifying how the industrial requirements connect to the model's broader scientific contribution, and analyzing why previous DL models perform worse, would significantly strengthen the paper's coherence and impact.

Questions:
See above.

EditLens Prediction: Heavily AI-edited
Title: Arboreal Neural Network
Soundness: 2: fair · Presentation: 2: fair · Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.

Summary:
The paper proposes Arboreal Neural Networks (ArbNNs), a framework that converts trained decision trees into differentiable neural operators called ArborCells. Each ArborCell encodes a tree's split features, thresholds, structure, and leaf values into four matrices/vectors, enabling end-to-end optimization while preserving the original tree semantics.

Strengths:
1. A differentiable "tree-as-layer" formulation (ArborCell) with an explicit feature–node selection matrix $W$, split-threshold vector $f$, tree-structure/routing matrix $P$, and leaf-value vector $v$ that avoids path-probability products via one-shot matrix aggregation.
2. An algorithm to parse trees into ArborCells, and the ability to decompile trained ArborCells back into refined trees, maintaining symbolic interpretability.
3. Competitive performance on public tabular tasks and consistent vintage-curve improvements over XGBoost on TabCredit under temporal drift.
4. Introduction of TabCredit, an industrial credit-risk dataset with temporal splits to benchmark robustness and interpretability in realistic settings.

Weaknesses:
1. The experimental section does not include comparisons with strong, modern baselines, especially tabular foundation models.
2. Limited gains over XGBoost in Table 2 relative to the method's complexity. On the reported datasets, the improvement over a well-tuned XGBoost baseline is small.
3. The paper evaluates on a relatively small set of benchmarks.
4. Dependence on pretrained tree models for initialization. The core recipe assumes the availability of a strong GBDT (XGBoost/LightGBM) to parse into ArborCells. This limits applicability in settings where (i) trees are hard to train well, or (ii) one would like to learn the structure jointly with the downstream objective. The paper does not show a convincing "from-scratch ArbNN" alternative.

Questions:
1. Can the authors add comparisons with recent tabular foundation models (e.g., TabPFNv2 [1], TabICL [2])?
2. Can the authors clarify the necessity of GBDT-based initialization? The current version treats "compiling from a strong GBDT" as a given prerequisite, but there is no experiment demonstrating whether ArbNN can still achieve comparable performance without it.
3. Can the authors provide more detail on scalability and serving? Since each ArborCell performs a one-shot aggregation over all leaves, how do inference time and memory compare to the original XGBoost model? A brief complexity analysis or inference-time comparison would make the method more practical.

References:
[1] Hollmann, Noah, et al. "Accurate predictions on small data with a tabular foundation model." Nature 637.8045 (2025): 319-326.
[2] Qu, Jingang, et al. "TabICL: A tabular foundation model for in-context learning on large data." ICML 2025.

EditLens Prediction: Fully AI-generated
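The $W$, $f$, $P$, $v$ description above suggests a general shape for such a cell. The following is a speculative sketch, assuming fixed structure matrices with trainable thresholds and leaf values and soft relaxations for differentiability; it is not the paper's actual ArborCell, and the class name, temperature parameter, and relaxation choices are all assumptions:

```python
import torch
import torch.nn as nn

class ArborCellSketch(nn.Module):
    """Speculative differentiable tree cell in the spirit of the W/f/P/v
    description above -- NOT the paper's actual ArborCell formulation."""

    def __init__(self, W, P, f, v, temperature=0.1):
        super().__init__()
        self.register_buffer("W", W)  # (nodes, features) split-feature selector, fixed
        self.register_buffer("P", P)  # (leaves, nodes) routing in {-1, 0, +1}, fixed
        self.register_buffer("depth", P.abs().sum(dim=1))  # ancestors per leaf
        self.f = nn.Parameter(f)      # (nodes,) thresholds, fine-tuned by gradient
        self.v = nn.Parameter(v)      # (leaves,) leaf values, fine-tuned by gradient
        self.t = temperature          # sharpness of the soft decisions

    def forward(self, x):  # x: (batch, features)
        h = torch.tanh((x @ self.W.T - self.f) / self.t)      # soft split decisions
        z = h @ self.P.T                                      # one-shot leaf aggregation
        w = torch.softmax((z - self.depth) / self.t, dim=-1)  # soft leaf selection
        return w @ self.v                                     # (batch,)

# Reusing the toy tree from the earlier NumPy sketch:
W = torch.tensor([[1., 0.], [0., 1.], [0., 1.]])
P = torch.tensor([[-1., -1., 0.], [-1., 1., 0.], [1., 0., -1.], [1., 0., 1.]])
cell = ArborCellSketch(W, P, f=torch.tensor([0.5, 0.3, 0.7]),
                       v=torch.tensor([1., 2., 3., 4.]))
print(cell(torch.tensor([[0.2, 0.9]])))  # ~2.0, matching the hard tree
```

As the temperature goes to zero this recovers hard tree inference, while a finite temperature yields gradients for $f$ and $v$, consistent with the reviews' account that thresholds and leaf values are initialized from XGBoost and then fine-tuned.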