ICLR 2026 - Reviews


Reviews

Summary Statistics

| EditLens Prediction | Count | Avg Rating | Avg Confidence | Avg Length (chars) |
|----------------------|----------|------------|----------------|--------------------|
| Fully AI-generated | 0 (0%) | N/A | N/A | N/A |
| Heavily AI-edited | 1 (25%) | 4.00 | 4.00 | 2440 |
| Moderately AI-edited | 1 (25%) | 4.00 | 4.00 | 3504 |
| Lightly AI-edited | 2 (50%) | 4.00 | 3.50 | 2567 |
| Fully human-written | 0 (0%) | N/A | N/A | N/A |
| Total | 4 (100%) | 4.00 | 3.75 | 2770 |
Each review below lists the submission title, ratings, review text, and EditLens prediction.
Multivariate Time Series Forecasting with Fourier Neural Filter

Soundness: 3: good
Presentation: 1: poor
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.

Summary:
This paper introduces the Time Filter (TiF) architecture for multivariate time series forecasting. TiF combines: 1) the Fourier Neural Filter (FNF), a spectral backbone with an input-dependent kernel that adaptively mixes time-domain and frequency-domain information while filtering noise; and 2) the Dual Branch Decoupler (DBD), a parallel temporal-spatial architecture that processes the time and variable dimensions separately and fuses them later. The paper provides theoretical motivation via the information bottleneck principle and demonstrates improved performance across 12 datasets with notable efficiency and robustness.

Strengths:
1. The method design is well-motivated. The proposed FNF introduces an adaptive, input-dependent spectral filter that effectively bridges time-domain and frequency-domain modeling, while the DBD offers a parallel architecture for capturing both temporal and spatial dependencies.
2. The paper is written clearly, with good method framing and empirical support, making it a sound backbone for time series forecasting.
3. Consistently better results across 12 datasets demonstrate the generalization and efficiency advantages of the proposed method.

Weaknesses:
1. The theoretical claims in this paper are mostly definitions or intuition. The theoretical proofs in Sections 3.1 and 3.2 read more as descriptive derivations or qualitative reasoning than as rigorous theorems or guarantees. To strengthen credibility, the authors could either (a) provide concrete theoretical statements with clear conditions and supporting lemmas, or (b) reframe these sections as design intuitions supported by experiments.
2. The experiment section in the main paper lacks sufficient description of hyperparameters and training details. The authors state that a grid search over input lookback lengths and other hyperparameters was performed, but it is unclear how this grid search was conducted. Is the performance of the proposed method significantly affected by the choice of lookback window length? Additional experiments using a fixed window size, together with statistical significance tests, would help address these concerns.
3. The length distribution across sections could be better balanced. Sections 3 and 4 occupy a large portion of the paper, leaving the experimental section relatively brief. If Section 3 mainly describes the model architecture, it could be condensed to make room for more detailed experiments; Section 4 could likewise be merged into Section 3. At present, the ablation studies are rather limited, covering only a few modules and not conducted on a unified dataset selection.

Questions:
1. The paper claims that selective activation enhances mid/high frequencies. Frequency or distribution visualizations would make this claim more concrete.
2. Providing pseudo-code for the structures, or releasing an anonymized codebase with experimental configs, would benefit transparency and reproducibility.
3. How is the fusion network for the representations designed? What is the dimension of the linear layer in Eq. 23? Why is it a Linear - LN - GELU - Linear structure, rather than a more conventional MLP? (See the sketch after this review.)

If the authors' response adequately addresses my questions and concerns mentioned above, I am willing to raise my score.

EditLens Prediction: Lightly AI-edited
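Regarding Question 3 above, a minimal sketch of a Linear - LN - GELU - Linear fusion head, written the way the question reads Eq. 23; the dimensions, the concatenation of the two branch outputs, and all names here are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    """Hypothetical Linear -> LayerNorm -> GELU -> Linear fusion head
    (the structure Question 3 asks about). All dimensions are guesses."""
    def __init__(self, d_model: int, hidden: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * d_model, hidden),  # concatenated temporal + spatial features in
            nn.LayerNorm(hidden),            # LN between the two linears, per the question
            nn.GELU(),
            nn.Linear(hidden, d_model),
        )

    def forward(self, h_time: torch.Tensor, h_var: torch.Tensor) -> torch.Tensor:
        # Fuse the two branch representations along the feature axis.
        return self.net(torch.cat([h_time, h_var], dim=-1))
```

The unusual part the question flags is the LayerNorm sitting between the two linear layers; a conventional pre-norm MLP would instead normalize the input and use Linear - GELU - Linear.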
Multivariate Time Series Forecasting with Fourier Neural Filter

Soundness: 2: fair
Presentation: 2: fair
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.

Summary:
This paper proposes a Time Filter (TiF) for multivariate time-series forecasting. TiF employs a Fourier Neural Filter (FNF) as the backbone and a Dual-Branch Decoupler (DBD) as the architectural design. The former provides strong representational capacity, while the latter establishes efficient learning pathways for spatiotemporal modeling.

Strengths:
1. This paper proposes a unified FNF backbone that integrates time-domain and frequency-domain analyses.
2. This paper provides theoretical and empirical evidence for the effectiveness of DBD in spatiotemporal modeling.
3. Comprehensive experiments on long-term and short-term forecasting tasks verify the superior performance of TiF.

Weaknesses:
1. The organization needs improvement to make the paper easier to follow. In the Related Work, it is unclear why distribution shift and non-autoregressive decoding are reviewed, as these topics do not appear central to the paper's main contributions. In the Method (Sections 3.1.1-3.1.6), substantial space is devoted to preliminaries such as complex transforms and global convolution, which obscures the core ideas and innovations of the proposed approach.
2. The paper's novelty appears limited. Simply replacing the fixed kernel in the Fourier Neural Operator with an input-dependent kernel represents only a marginal improvement. In addition, introducing DBD as a parallel paradigm, compared with unified and sequential paradigms, to maintain independent information-processing branches also appears to be an incremental design choice rather than a substantive conceptual advance.
3. More relevant frequency-filter baselines should be considered, such as FilterNet [1] and TSLANet [2].

[1] FilterNet: Harnessing Frequency Filters for Time Series Forecasting. NeurIPS, 2024.
[2] TSLANet: Rethinking Transformers for Time Series Representation Learning. ICML, 2024.

Questions:
Please refer to the weaknesses.

EditLens Prediction: Lightly AI-edited
Multivariate Time Series Forecasting with Fourier Neural Filter

Soundness: 2: fair
Presentation: 2: fair
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.

Summary:
The paper proposes Time Filter (TiF), combining a new Fourier Neural Filter (FNF) backbone with a Dual Branch Decoupler (DBD) architecture. FNF extends FNO with an input-dependent kernel (selective activation), a complex transform, and adaptive truncation to modulate time-frequency information; DBD uses parallel temporal/spatial branches to improve gradient flow and capacity from an information-bottleneck perspective. Experiments on 12 benchmarks (eight LTSF datasets plus four PeMS short-term sets) report strong results, along with ablations and an efficiency plot.

Strengths:
1. FNF formalizes an input-dependent spectral operator with selective activation and adaptive truncation; the mathematical exposition (Definitions 1-7, Remarks 1-6) is explicit and links capabilities to Transformer functions and complexity.
2. DBD's motivation via the information bottleneck and gradient-path analysis is well argued.
3. Results span LTSF and PeMS with comparisons to Transformer/CNN/MLP/Fourier baselines; Table 2 claims a lookback grid search (96-720) for all methods, which ensures a fair comparison.
4. The efficiency analysis (Traffic) and component ablations (AT/SA, LS/GS) give some insight.

Weaknesses:
1. FNF's contributions (input-dependent kernel, selective activation, adaptive truncation) are close in spirit to prior spectral/fractional operators (e.g., FNO/AFNO/FITS/FreMLP). The paper does not clearly establish what capability FNF enables that prior spectral blocks cannot, beyond architectural composition; DBD overlaps with known parallel/dual-path decouplers (e.g., Leddam [1], xPatch [2], MTGNN [3], TimeMixer++ [4], to name a few). A sharper comparison or an operator-replacement study is needed.
2. There is no working anonymous repository or pseudo-code available for review, which limits the reproducibility of the proposed method.
3. For PeMS, the lookback is fixed to 96 for all baselines, which can bias results; fair practice tunes the input length per method (as was already done for Table 2). Please align the protocols across tasks.
4. Table 2 lacks more recent strong baselines (e.g., TimeMixer++ [4] (ICLR 2025), PatchMLP [5] (AAAI 2025), TQNet [6], and TimeBridge [7] (ICML 2025)), which undermines the SOTA claim; please add them with identical splits and tuning.
5. The proposed FNF looks conceptually overlapped with selective state-space models such as Mamba: both realize an input-conditioned long filter with a gated residual/skip path. In FNF, the frequency-domain parameterization (F → complex transform → adaptive truncation → F⁻¹) effectively implements a learnable long convolution; Mamba parameterizes a similar operation via SSM kernels and a selective gate. The manuscript should explicitly position FNF against the Mamba/S4 family, clarifying what FNF does that a selective SSM cannot, and include quantitative comparisons. (See the sketch following the references.)

[1] Revitalizing Multivariate Time Series Forecasting: Learnable Decomposition with Inter-Series Dependencies and Intra-Series Variations Modeling.
[2] xPatch: Dual-Stream Time Series Forecasting with Exponential Seasonal-Trend Decomposition.
[3] Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks.
[4] TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis.
[5] Unlocking the Power of Patch: Patch-Based MLP for Long-Term Time Series Forecasting.
[6] Temporal Query Network for Efficient Multivariate Time Series Forecasting.
[7] TimeBridge: Non-Stationarity Matters for Long-term Time Series Forecasting.

Questions:
See the weaknesses.

EditLens Prediction: Moderately AI-edited
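Regarding Weakness 5, a minimal sketch of the shared pattern, assuming FNF follows the FFT → complex weighting → truncation → inverse FFT pipeline described in the paper; the module, the mean-pooled gate, and all shapes are illustrative, not the authors' code:

```python
import torch
import torch.nn as nn

class SpectralLongConv(nn.Module):
    """Illustrative input-conditioned spectral filter: rfft -> learned complex
    kernel modulated by a gate computed from the input -> low-frequency
    truncation -> irfft. Multiplication in frequency is circular convolution
    in time, so this realizes a learnable long convolution, which is the
    point of overlap with selective SSMs. Assumes a fixed sequence length;
    the hard cutoff here stands in for FNF's adaptive truncation."""
    def __init__(self, seq_len: int, d_model: int, keep_modes: int):
        super().__init__()
        self.keep_modes = keep_modes
        n_freq = seq_len // 2 + 1
        self.w = nn.Parameter(torch.randn(n_freq, d_model, 2) * 0.02)  # complex kernel (real/imag)
        self.gate = nn.Linear(d_model, d_model)  # "selective" part: gate depends on the input

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq_len, d_model)
        g = torch.sigmoid(self.gate(x.mean(dim=1)))      # (batch, d_model) input-conditioned gate
        xf = torch.fft.rfft(x, dim=1)                    # (batch, n_freq, d_model), complex
        k = torch.view_as_complex(self.w)                # (n_freq, d_model) complex kernel
        keep = torch.arange(xf.shape[1], device=x.device) < self.keep_modes
        yf = xf * k * g.unsqueeze(1) * keep.view(1, -1, 1).float()  # filter + truncate
        return torch.fft.irfft(yf, n=x.shape[1], dim=1)  # back to the time domain
```

A selective SSM computes a comparable input-dependent long filter recurrently rather than spectrally, which is what makes the requested head-to-head comparison natural.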
Multivariate Time Series Forecasting with Fourier Neural Filter

Soundness: 2: fair
Presentation: 3: good
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.

Summary:
This paper introduces a new architecture for multivariate time series forecasting with two key components: (1) the Fourier Neural Filter (FNF), an input-dependent integral kernel operator that unifies time-domain and frequency-domain modeling, extending the Fourier Neural Operator (FNO) with adaptive gating, selective activation, and learnable truncation for denoising; and (2) the Dual Branch Decoupler (DBD), a dual-path structure inspired by information-bottleneck theory that decouples temporal and spatial processing for improved gradient flow and representation capacity.

Strengths:
1. This paper presents a new architectural exploration for time series forecasting, offering a meaningful attempt to design a dedicated backbone tailored to the characteristics of temporal data. This is a positive and constructive step for research in this area.
2. The proposed Dual Branch Decoupler (DBD) introduces a parallel-branch mechanism to decouple temporal and spatial feature learning. This is an interesting design that contributes fresh insights to spatiotemporal modeling in time series forecasting.
3. The experimental evaluation is extensive and convincing, covering 12 benchmark datasets and a broad spectrum of competitive baselines, which demonstrates the robustness and general applicability of the proposed approach.

Weaknesses:
1. While the proposed Fourier Neural Filter (FNF) introduces adaptive kernels and learnable truncation mechanisms, much of its formulation builds on existing frameworks such as FNO and AFNO.
2. The DBD parallel design is conceptually sound but lacks empirical exploration of branch interactions (e.g., information-flow visualization or mutual-information analysis). It would strengthen the paper to show why the parallel path quantitatively improves gradient dynamics or representation diversity. (See the sketch after this review.)
3. The Related Work section does not sufficiently discuss prior studies directly related to the paper's two main contributions, i.e., FNF and DBD.
4. The ablation study is relatively limited. It would be useful to further investigate the effect of architectural choices, such as FNF depth, kernel size, and sensitivity to the patch length P, to better understand the robustness of the proposed design.

Questions:
1. It is unclear why Equation (22) is claimed to capture global correlations while Equation (21) captures local correlations, given that both modules employ the Fourier Neural Filter (FNF) backbone.

EditLens Prediction: Heavily AI-edited
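Regarding Weakness 2 and Question 1, a minimal sketch of the parallel decoupling pattern under discussion: one branch mixes along the time axis, the other along the variable axis, and the two run in parallel with separate gradient paths before fusion. The plain linear mixers and the additive fusion are placeholders for the paper's FNF-based branches, not the authors' design:

```python
import torch
import torch.nn as nn

class DualBranchBlock(nn.Module):
    """Illustrative parallel temporal/spatial decoupling in the spirit of DBD.
    Each branch sees the raw input, so gradients reach both branches without
    passing through the other -- the property a mutual-information or
    information-flow analysis could probe."""
    def __init__(self, seq_len: int, n_vars: int):
        super().__init__()
        self.temporal = nn.Linear(seq_len, seq_len)  # mixes along time, per variable
        self.spatial = nn.Linear(n_vars, n_vars)     # mixes along variables, per time step

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, n_vars, seq_len)
        h_t = self.temporal(x)                       # temporal branch
        h_s = self.spatial(x.transpose(1, 2))        # spatial branch on (batch, seq_len, n_vars)
        return h_t + h_s.transpose(1, 2)             # parallel fusion (additive placeholder)
```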