DyBraSS: Dynamic Brain State Modeling with State-Space Model
Soundness: 2: fair
Presentation: 3: good
Contribution: 2: fair
Rating: 6: marginally above the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
The authors propose DyBraSS, which combines a structured state-space model with orthonormal cluster aggregation to simultaneously model temporal dynamics and global spatial context at the ROI level. This approach allows for interpretable modeling of dynamic brain states from rs-fMRI and is used for disease classification (ASD, ADHD). The authors compare their results against multiple state-of-the-art baselines on ABIDE-I and ADHD-200, reporting robust performance improvements and providing individual and group-level brain state analysis.
1. Combining the ROI-wise SSM with "soft assignment to orthogonal clusters (brain states) → aggregation → feedback to ROIs" forms a closed-loop spatiotemporal coupling mechanism, which is conceptually natural and interpretable.
2. On ABIDE-I and ADHD-200, DyBraSS generally outperforms a range of representative baselines in AUROC/ACC/SEN, with the specific improvements reported in Table 1.
3. The authors examine the effects of various global aggregation strategies, as well as the auxiliary loss for predicting next-time-point FC and the TR regularization, providing intuitive comparisons.
4. The paper provides individual- and group-level state-transition differences, dwell-time analysis, and brain-network visualizations, attempting to link the model's findings to the neurobiological literature and to provide clinical interpretability.
1. Table 1 reports mean ± standard deviation, but does not report significance tests for DyBraSS vs. the baselines. Given that the AUROC improvements are mostly ~0.02–0.04, statistical tests demonstrating that they are not due to chance are needed, along with detailed per-fold values or boxplots (a minimal sketch of such a test appears after this list).
2. Appendix D.1 states, "We choose to calculate group-level means under the optimal validation AUROC fold and analyze only subjects correctly classified by the model." Averaging only over correctly classified subjects introduces selection bias and may exaggerate the consistency of the model's interpretable analysis. Please provide the analysis for all subjects, or at least demonstrate that the results restricted to correctly classified samples are consistent with those for the full sample.
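A minimal sketch of the kind of test we have in mind for weakness 1 (scipy; the per-fold AUROC values are hypothetical placeholders, not numbers from the paper):

```python
import numpy as np
from scipy import stats

# Hypothetical per-fold AUROC scores, paired by cross-validation fold
# (placeholder values, not taken from the paper).
dybrass  = np.array([0.74, 0.76, 0.73, 0.77, 0.75])
baseline = np.array([0.72, 0.73, 0.71, 0.74, 0.72])

# Paired t-test across folds; the Wilcoxon signed-rank test is a
# non-parametric alternative that is safer with so few folds.
_, p_t = stats.ttest_rel(dybrass, baseline)
_, p_w = stats.wilcoxon(dybrass, baseline)
print(f"paired t-test p={p_t:.4f}, Wilcoxon p={p_w:.4f}")
```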
1. How are the K cluster centers maintained/updated? Are they learned during training? The paper states that "K cluster centers are defined as an orthogonal basis and obtained from the initial random vector V by Gram-Schmidt", but it is unclear whether the centers are updated during training or how often Gram-Schmidt re-orthogonalization is applied. This determines whether the cluster representation is a static basis or a learnable subspace that evolves with the data, which directly impacts the novelty and interpretability of the method (see the sketch after these questions).
2. The authors use sliding-window Pearson correlation to construct dFC. The choice of sliding window can significantly affect the dFC representation. Please provide a sensitivity analysis over window length/stride, or try a non-windowed approach to verify the robustness of the method.
3. The interpretive use of Captum to derive importance scores is excellent, but the analysis should not rely only on representative cases. Please provide statistical summaries across subjects and explain how consistent the importance scores are with traditional neuroscience metrics.
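Regarding question 1, the distinction between a static basis and a learnable subspace can be made concrete with a short sketch (PyTorch; `K`, `d`, and the function are our own illustrative choices, not the authors' code):

```python
import torch

def gram_schmidt(V: torch.Tensor) -> torch.Tensor:
    """Orthonormalize the rows of V (K x d) via classical Gram-Schmidt."""
    basis = []
    for v in V:
        w = v.clone()
        for b in basis:
            w = w - (w @ b) * b           # remove components along earlier vectors
        basis.append(w / w.norm())        # normalize
    return torch.stack(basis)

K, d = 8, 64                              # hypothetical: number of states, feature dim
V = torch.randn(K, d)

# Option A: static basis -- orthonormalize once at initialization, then freeze.
centers_static = gram_schmidt(V)

# Option B: learnable subspace -- keep V as a parameter and re-orthonormalize
# every forward pass, so the span of the centers evolves during training.
V_param = torch.nn.Parameter(V)
centers_learned = gram_schmidt(V_param)   # gradients flow through Gram-Schmidt
```

Which of these two regimes the paper implements is exactly what question 1 asks.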
Fully human-written |
DyBraSS: Dynamic Brain State Modeling with State-Space Model
Soundness: 2: fair
Presentation: 2: fair
Contribution: 2: fair
Rating: 2: reject
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.
This paper introduces a novel structured state space model (DyBraSS), which jointly models the spatiotemporal dependencies of brain dynamics within a unified framework by incorporating a clustering-based global aggregation module. The work demonstrates significant algorithmic innovation in handling dynamic functional connectivity from fMRI data, with systematic and comprehensive experimental design, showcasing outstanding diagnostic performance and model interpretability across multiple benchmark datasets. However, the paper exhibits notable shortcomings in theoretical rigor, motivation for methodological choices, statistical validation of results, and clarity of graphical representations.
1. DyBraSS successfully integrates ROI-level temporal evolution modeling with global brain state clustering within a single framework, effectively addressing the limitation of treating spatial and temporal dynamics in isolation, as seen in prior methods.
2. The core design, combining a dynamic SSM (Dyn-SSM) with an orthonormal cluster-based global aggregation mechanism, is novel. It preserves the brain's network topology while enhancing both ROI-level modeling capacity and the interpretability of state transitions.
3. The model demonstrates superior performance against a wide range of SOTA baselines (CNNs, Transformers, structured SSMs) on the ABIDE-I and ADHD-200 datasets.
1. The identifiability of the learned state transition matrices ($A_r$) and observation matrices ($C_r$) is not discussed. Given the high noise and low temporal resolution of fMRI data, it remains unclear whether these parameters are uniquely determined or confounded by equivalent parameterizations, which casts doubt on their neuroscientific interpretability (see the worked example after this list).
2. The model introduces a feedback loop from the global clustering module to the local SSMs ($\Phi_r(t)$). However, no stability analysis of this closed-loop dynamical system is provided. It is crucial to demonstrate that this feedback does not lead to divergent hidden states or unstable oscillations.
3. The theoretical motivation for using orthonormal clustering is weak. Orthogonality does not equate to statistical independence or neuroscientific dissociability. A stronger theoretical or empirical justification for why this method is superior to other clustering strategies is required.
4. The rationale behind key design choices is insufficiently explained: 1) Why was Mamba chosen as the foundational framework over other sequence models such as Transformers? A comparative justification based on the related work is needed. 2) What is the motivation for the two-stream design ("x-branch" and "z-branch"), and specifically for the gating mechanism in the z-branch? Why was gating chosen over an attention mechanism? 3) The reasoning for not using separate parameters for each feature dimension (as mentioned in Section 4.2.1, in contrast with the standard formulation in Section 3) is not provided.
5. The methodological description is challenging to follow. The data flow and interaction between modules are not clearly articulated. Providing a structured algorithm box or high-level pseudocode is strongly recommended, with explicit statements of inputs and outputs for each core module.
6. Performance comparison tables (e.g., Table 1) report means and standard deviations but lack statistical significance tests (e.g., t-tests, ANOVA). This makes it impossible to confirm the reliability of the performance improvements. Similarly, differences in transition matrices and dwell times between groups are described qualitatively without quantitative statistical validation.
7. An explanation is needed for why the proposed model does not achieve the best Specificity (SPC) scores in Table 1.
8. The results in Table 3 regarding the impact of Lpred on different metrics (e.g., an increase in SEN but a potential decrease in SPC) require deeper analysis.
9. A detailed analysis of why other aggregation/clustering methods in Table 2 (e.g., Mean, Sum, Attention) perform poorly is necessary.
10. The paper does not report the model's parameter count, training/inference time, or a computational efficiency comparison with baseline methods. This is critical for assessing the method's practicality.
11. The conclusions are primarily based on two public datasets. Including a third independent dataset or conducting cross-dataset generalization experiments would significantly strengthen the claims.
12. Figure 1 (Model Overview) is inadequate. It fails to clearly illustrate the end-to-end data flow, temporal direction, hierarchical information exchange between modules, and the location of components mentioned in ablations (e.g., TR processing, MLPs). A complete redesign is necessary for clarity.
13. The visualization of brain state differences (e.g., in Figures 3 and 4) could be improved with better color schemes and layout. Including a state transition diagram to visually represent the probabilistic transitions between different brain states would greatly enhance interpretability.
14. Figures used to discuss ablation studies should clearly indicate which components were removed.
15. The reference list contains numerous arXiv preprints. The authors should verify if these have been peer-reviewed and published subsequently, and prioritize citing the peer-reviewed versions where available.
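To make weakness 1 concrete, here is the standard similarity-transformation argument for linear SSMs, stated in our own notation and under the assumption that the per-ROI dynamics are linear between global updates: for any invertible matrix $T$,

$$\tilde{h}_{r,t} = T h_{r,t}, \qquad \tilde{A}_r = T A_r T^{-1}, \qquad \tilde{B}_r = T B_r, \qquad \tilde{C}_r = C_r T^{-1}$$

yields exactly the same input-output mapping, so $(A_r, C_r)$ is identifiable at best up to this equivalence class, and element-wise neuroscientific readings of the matrices require additional constraints that pin down $T$.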
Please see the weaknesses above.
Fully AI-generated |
DyBraSS: Dynamic Brain State Modeling with State-Space Model
Soundness: 3: good
Presentation: 3: good
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
The paper introduces DyBraSS, a state-space model designed for analyzing dynamic brain states from resting-state fMRI data. The core idea is to create a model that handles both spatial and temporal aspects of brain dynamics simultaneously. It uses a "global aggregation module" with an orthonormal clustering approach to group evolving brain activity into interpretable states. The method is evaluated on the ABIDE-I and ADHD-200 datasets for diagnostic classification, where it is shown to perform better than several existing methods. The paper also provides an analysis of the learned brain states, suggesting they align with known neurobiological patterns in ASD and ADHD.
1. The proposed method for jointly modeling spatial and temporal dynamics looks sound to me. Directly integrating a global context into the ROI-level updates, rather than treating the ROIs as an ordered sequence, is a more faithful representation of brain topology.
2. The experimental evaluation is quite thorough. The model is compared against a good range of recent SOTA baselines from different architectural families (CNN, Transformer, SSM) on two standard, multi-site datasets. The performance gains shown in Table 1 are clear.
3. The attempt to make the model's outputs interpretable is a significant plus. The brain state analysis in Section 5.3, particularly the visualization of state transitions and network configurations (Figures 3 and 4), connects the model's learned patterns back to clinical neuroscience, which is often missing in purely performance-driven machine learning papers.
1. The motivation for using orthonormal clustering could be stronger. While prior work is cited (lines 280-282), the paper doesn't fully explain why orthogonality is a necessary or superior constraint for defining brain states compared to other clustering approaches. The ablation study (Table 2) shows it works best among the tested options, but the underlying reason isn't entirely clear.
2. The model's complexity seems quite high. With stacked DynBrain-Mamba blocks, multiple MLPs, and several moving parts (lines 769-774), it's hard to tell which components are doing the heavy lifting. The parameterization also involves many choices (e.g., number of clusters, state dimensions) that might be sensitive and difficult to tune.
3. The clinical interpretations, while interesting, feel a bit speculative. For instance, attributing the State 6→4 transition in ASD to "altered coordination between sensorimotor processing and cognitive control systems" (lines 455-456) is a strong claim based on correlational data. While the interpretation aligns with existing literature, the link could be drawn more cautiously.
1. Regarding the global aggregation module (Section 4.2.2): The use of orthonormal bases for cluster centers is an interesting choice. Could the authors elaborate on the neurobiological intuition behind this constraint? Does forcing the cluster centers to be orthogonal impose a structure that is believed to exist in brain functional organization, or is this primarily a choice that was found to be effective empirically?
2. In the ablation study (Table 2), the "Attention" aggregation method performs worse than the clustering-based methods. This is somewhat surprising given the success of attention mechanisms in related fields. Could the authors provide some insight into why this might be the case? Was it a simple attention mechanism, and is it possible a more sophisticated variant could have been more competitive?
3. The paper mentions that the learned brain states align with known neurobiological alterations (lines 089-090). In the ABIDE-I analysis, the TC group shows a tendency to remain in State 1, described as reflecting "stable engagement of internally oriented cognition" (lines 465-466). Could the authors clarify what the functional consequence of not remaining in this state might be for the ASD group? Is the model suggesting that the ASD group is less able to sustain this mode of brain activity?
4. For the dFC calculation (lines 200-204), a sliding-window approach is used. The choice of window length and stride can significantly impact the resulting dynamics. While the paper normalizes these based on TR, how sensitive is the model's performance to the target window size (15 s) and stride (3 s)? Was any analysis done to select these specific values? (A minimal sensitivity-sweep sketch follows.)
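The sweep we have in mind for question 4 could be as simple as the following sketch (numpy; the time series, TR, and grid values are hypothetical placeholders):

```python
import numpy as np

def sliding_window_dfc(ts: np.ndarray, win: int, stride: int) -> np.ndarray:
    """ts: (T, R) ROI time series -> (num_windows, R, R) Pearson dFC."""
    T, _ = ts.shape
    mats = [np.corrcoef(ts[s:s + win].T) for s in range(0, T - win + 1, stride)]
    return np.stack(mats)

rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 116))   # hypothetical: 200 volumes, 116 ROIs
TR = 2.0                               # hypothetical repetition time in seconds

# Sweep window/stride (in seconds) around the paper's 15 s / 3 s targets.
for win_s, stride_s in [(10, 2), (15, 3), (20, 4), (30, 6)]:
    dfc = sliding_window_dfc(ts, int(win_s / TR), int(stride_s / TR))
    print(f"window={win_s}s stride={stride_s}s -> dFC tensor {dfc.shape}")
```

Rerunning training over such a grid (on real data) and reporting the spread of AUROC would answer the robustness part of the question.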
Fully AI-generated |
DyBraSS: Dynamic Brain State Modeling with State-Space Model
Soundness: 2: fair
Presentation: 3: good
Contribution: 2: fair
Rating: 2: reject
Confidence: 5: You are absolutely certain about your assessment. You are very familiar with the related work and checked the math/other details carefully.
This paper presents DyBraSS, a novel structured SSM that unifies spatial and temporal modeling within a single framework, enhancing ROI-level modeling capacity and interpretability through a clustering-based global aggregation module. Compared with various SOTA methods on ABIDE-I and ADHD-200, it shows a stable improvement in metrics such as AUROC/ACC. Brain state analysis at both the individual and group levels reveals that the learned brain states align with known neurobiological alterations, providing valuable insights for computational neuroimaging and clinical applications.
(1) The approach to the problem is natural and significant: combining spatial topology (inter-ROI information) with the SSM jointly models inter-ROI interactions during temporal state evolution. This perspective is highly compatible with how dFC is understood and has potential neuro-interpretability.
(2) The proposed model leverages a global aggregation module that incorporates information from all brain regions into local ROI-level updates, thereby preserving the brain's network topology during state evolution.
(3) The clear organization and presentation of this article make it a pleasure to read.
1. In Appendix D.1, the authors state that "we selected the model from the fold with the best validation AUROC score and analyzed only subjects that were correctly classified by this model to compute group-level averages". This strategy introduces serious selection bias (only the samples the model "selects" are analyzed), which may exaggerate the reported group differences and interpretability conclusions.
2. The comparison between the orthonormal cluster design and alternative solutions is shallow. Although Table 2 compares multiple aggregation methods, it lacks a deeper analysis of why orthonormality brings advantages (e.g., cluster-center visualization or correspondence analysis with known functional brain modules).
3. The explicit appearance of $A_r^{-1}$ in Eq. 10 imposes requirements on the eigenvalues/invertibility of $A_r$. The paper does not discuss how numerical stability is ensured, whether instability (gradient explosion/vanishing) occurs during training, or how $A_r$ is constrained (see the discretization identity after this list).
4. Although DyBraSS performs well on two public datasets, these datasets may not fully cover the variability of fMRI data. The model's generalization should be further verified on more diverse datasets (e.g., ADNI, OASIS, PPMI) to ensure its stability and reliability in practical applications.
5. The comparison methods are somewhat limited, as many existing fMRI analysis approaches were not considered, for example the original Mamba, Graphormer, NAGphormer, NeuroPath, NeuroGraph, and ContrastPool.
6. The paper does not analyze different network scales, i.e., it does not explore the model's performance under different brain atlases (such as AAL-116 or Schaefer-1000).
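As context for weakness 3: if Eq. 10 follows the standard zero-order-hold discretization used in S4/Mamba-style models (our assumption, to be confirmed by the authors), then

$$\overline{A}_r = \exp(\Delta_r A_r), \qquad \overline{B}_r = (\Delta_r A_r)^{-1}\left(\exp(\Delta_r A_r) - I\right)\Delta_r B_r,$$

so the inverse exists only when $A_r$ has no zero eigenvalues, and the computation becomes ill-conditioned as eigenvalues approach zero. Many implementations avoid this by constraining $A_r$ (e.g., a strictly negative diagonal); the paper should state which safeguard, if any, is used.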
(1) Why compute the group average only over subjects correctly classified by the model in the fold with the best validation AUROC? This biases the interpretive analysis toward samples the model "agrees with", thereby exaggerating the differences. If this screening is removed (i.e., the group-level analysis is conducted over all test-set samples, or over the test samples of all folds), are the results consistent?
(2) Why orthonormal bases via Gram-Schmidt (Eq. 12)? How sensitive is the model to the initial $V$ vectors?
(3) The orthonormal cluster construction defines orthogonal bases as cluster centers. In the "Orth (Ours)" mode (Table 2), does this orthogonalization remain fixed during training (initialized only once), or can the cluster centers be updated? If they can be updated, how is orthogonality maintained (or is it not enforced)? If they cannot be updated, does this limit representational capacity?
(4) What is the numerical range and stability of $\Delta_{r,t}$ in Eq. 11 (since $\epsilon_r \in \mathbb{R}$ is a learnable parameter)? $\Delta_r$ determines the scale of $\overline{A}_r = \exp(\Delta_r A_r)$. Can $\Delta_r$ become so large that $\overline{A}_r$ approaches singularity or explodes? Should $\Delta_r$ be clipped or regularized? (A sketch of one common safeguard follows this list.)
(5) Tables 1 and 2 present mean ± std, but do not state whether the differences between methods are significant. Please add significance tests and report $p$-values or confidence intervals.
(6) Please report the training time, the total number of model parameters, and the per-sample computational cost at inference (e.g., FLOPs or milliseconds), and compare them with the main baselines (a minimal measurement sketch follows this list).
(7) The code link provided in the manuscript does not work; it returns a "The requested file is not found" error.
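For question (4), one common safeguard in Mamba-style implementations is to keep $\Delta_r$ positive and bounded; a sketch of that practice (our illustration, not necessarily what the authors do):

```python
import torch
import torch.nn.functional as F

# Hypothetical learnable log-step parameter per ROI (epsilon_r in our reading).
eps_r = torch.nn.Parameter(torch.zeros(116))

# softplus keeps Delta positive; clamping bounds it away from degenerate
# regimes (Delta -> 0 gives an identity update; large Delta risks over/underflow).
delta_r = F.softplus(eps_r).clamp(min=1e-4, max=1e-1)

# With a diagonal A_r constrained to be strictly negative, exp(Delta * A)
# stays in (0, 1), ruling out the explosion the question asks about.
A_r = -torch.exp(torch.randn(116))     # hypothetical negative diagonal of A_r
A_bar = torch.exp(delta_r * A_r)       # element-wise discretization (diagonal case)
print(A_bar.min().item(), A_bar.max().item())
```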
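For question (6), the requested numbers can be produced with a few lines (a sketch; `model` and the input tensor are placeholders to be filled in with the authors' setup):

```python
import time
import torch

def report_cost(model: torch.nn.Module, x: torch.Tensor, runs: int = 50) -> None:
    """Print total parameter count and average inference latency per batch."""
    n_params = sum(p.numel() for p in model.parameters())
    model.eval()
    with torch.no_grad():
        model(x)                                   # warm-up pass
        t0 = time.perf_counter()
        for _ in range(runs):
            model(x)
        ms = (time.perf_counter() - t0) / runs * 1e3
    print(f"params: {n_params / 1e6:.2f} M, inference: {ms:.1f} ms/batch")
```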
Fully human-written |