ICLR 2026 - Reviews

Reviews

Summary Statistics

| EditLens Prediction | Count | Avg Rating | Avg Confidence | Avg Length (chars) |
|---|---|---|---|---|
| Fully AI-generated | 1 (33%) | 8.00 | 3.00 | 4399 |
| Heavily AI-edited | 0 (0%) | N/A | N/A | N/A |
| Moderately AI-edited | 0 (0%) | N/A | N/A | N/A |
| Lightly AI-edited | 1 (33%) | 4.00 | 4.00 | 8800 |
| Fully human-written | 1 (33%) | 8.00 | 2.00 | 480 |
| Total | 3 (100%) | 6.67 | 3.00 | 4560 |
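As a quick sanity check, the Total row can be reproduced from the three populated rows, since each of them covers exactly one review. The sketch below is only illustrative: the dictionary layout and helper name are assumptions made for this example, not part of the review system.

```python
# Populated rows of the summary table; each has Count = 1, so a plain
# (unweighted) mean over the rows equals the per-review mean.
rows = {
    "Fully AI-generated": {"rating": 8.00, "confidence": 3.00, "length": 4399},
    "Lightly AI-edited": {"rating": 4.00, "confidence": 4.00, "length": 8800},
    "Fully human-written": {"rating": 8.00, "confidence": 2.00, "length": 480},
}


def mean(key):
    """Average one column over the populated rows."""
    return sum(row[key] for row in rows.values()) / len(rows)


avg_rating = round(mean("rating"), 2)          # (8 + 4 + 8) / 3
avg_confidence = round(mean("confidence"), 2)  # (3 + 4 + 2) / 3
avg_length = round(mean("length"))             # (4399 + 8800 + 480) / 3
```

The rounded values agree with the Total row (6.67, 3.00, 4560).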
Title Ratings Review Text EditLens Prediction
Einstein Fields: A Neural Perspective To Computational General Relativity

Soundness: 2: fair
Presentation: 3: good
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.

This paper presents a neural tensor‑field (tensors in the General Relativity sense) parametrization of GR metrics with a JAX implementation for differential geometry operators wired through automatic differentiation. The demos (Schwarzschild/Kerr orbits, Weyl scalars, ring deformation under a linearized GW) look correct and generate polished images, and the ablations around Jacobian/Hessian supervision are good.

That said, I have major concerns with the claims and evaluations. I believe these are not aligned with numerical relativity (NR) practice, and so this is not yet at the point of being useful for actual science. Thus, the current narrative is a bit misleading. The headline is "compressing 4D NR simulations by 1000-4000x with better derivatives than FD in FLOAT32," but most experiments are static analytic 3D cases (t = 0) plus one simple time‑varying GW; storage comparisons are made against a dense "explicit grid" strawman (modern NR code uses AMR or pseudo-spectral methods); baselines omit spectral/ROM methods; coordinate chart sensitivity is large; and long‑horizon dynamics need FLOAT64 to avoid divergence.

I think it's a promising direction for ML to help, but I think this paper needs to be honest about the current status of such an approach. Doing so would not only be better for the paper, but I think also benefit the authors in that it would point out to the ML community where more work is needed. Basically I would like to see the narrative of this paper modified to be honest about the practicality of this, and about the toy baselines, before I consider acceptance, especially as general ML audiences will not know how to evaluate this.

- I like the clear pipeline and library, and a JAX code for this seems useful for the NR community. The graph from metric to derived quantities is explicit and leverages forward‑mode Jacobians/Hessians with einsum operations, which is nice. This is a useful contribution of reusable tooling for GR in ML, and I think it is a good contribution by itself.
- I liked the ablation accounting done in 4.2; it is nice and I think quite useful to see the effect of every modification to the training process. It is interesting to see that Jacobian/Hessian supervision indeed helps.
- Neat canonical tests: precession, circular/eccentric Kerr orbits, ring response under a linearized GW, and Weyl/Kretschmann diagnostics are shown and largely match analytics on short horizons.
- Multi‑chart training attempt: training/evaluating in multiple coordinate charts acknowledges a real pain point in GR ML (though this is near the end of the appendix; I think it should come earlier).

First, my main concerns:

1. First, I think the scope and storage comparisons are misaligned with NR. The paper advertises 4D compression of NR simulations, but the primary experiments are analytic snapshots at t = 0 (Schwarzschild/Kerr); only the linearized‑gravity toy has time evolution. The compression factors compare MLP weights to explicit dense grids counted as "#points x 4 bytes" in FLOAT32, which is not how NR codes actually store data (they would use adaptive mesh refinement stored with an octree). So the 1000–4000x headline is basically comparing against a strawman and is misleading to the ML community about the state of this domain.
2. The paper itself notes that modern NR "increasingly opts for (pseudo‑)spectral methods ... up to 1000–5000x faster on CPUs than FD on GPUs at comparable accuracy." But all quantitative baselines in this paper are finite difference stencils (on a uniform grid, whereas the only actual finite difference codes used in NR are based on adaptive meshes) and an analytic AD. There's no head‑to‑head vs actual NR codes used in GW modeling. This makes it hard to situate their method, even for someone who knows NR, let alone the ML community.
3. The paper claims up to five orders of magnitude derivative gains over FD in FLOAT32, but geodesic integration requires FLOAT64 and long‑time rollouts still diverge (the authors show this in their own figures and explicitly state that they only surpass FD in single precision). This puts the method far from NR‑readiness, where double precision (or even higher) is standard.
4. There is large coordinate‑chart sensitivity, which is a bit worrisome. Table 8 (deep in the appendix) reports up to three orders of magnitude variation in "Rel‑L2" error across charts for the same spacetime. That undermines generality claims unless the representation or training explicitly handles diffeomorphisms or evaluates with chart‑invariant metrics.
5. The physics is not enforced or audited. The pipeline mentions Bianchi identities, but experiments focus on pointwise tensor errors, scalar invariants, and geodesic tests. While the pointwise tensor errors are no doubt useful in clarifying (to an NR person) that these methods aren't yet ready, there's no reporting of physics checks, like conservation laws, etc., which are exactly the diagnostics one needs to trust a compressed metric in downstream NR workflows.
6. I am a bit confused about the "discretization‑free" claims, since it seems the method is ultimately trained on a grid. Several places describe training on 4D spacetime grids or "4D training and validation grid data," which undercuts the claim of being discretization‑free. Even an INR is ultimately a finite parametrization/basis.
7. The throughput trade‑off is not discussed. Even if the file is tiny, post‑processing requires many MLP queries to reconstruct fields, whereas decompressing spectral grids is reading coefficients + evaluating polynomials. You still likely win on storage, but the compute/runtime trade‑off for analysis & viz should be stated.

Second, some other suggestions/comments:

- I make the following comment purely to help the authors improve their work, and do not include this comment as part of my score for the paper, so feel free to not address this in your rebuttal. It is simply a suggestion/idea. So I think the branding of the method as "Einstein Fields" is not an optimal choice, because it would likely conflict with "Einstein Field Equations" in any search. Also, to the physicist, who I assume you would like to include in the audience of the paper, it does not give them any idea that this is related to machine learning.
- Edit: What about calling it "Neural Einstein Fields"?
- Obviously GR is quite complicated to someone with no background. I am not sure it is possible to give it much of an introduction here, and I worry the current introduction might give the wrong ideas. I think it is better to simply direct the reader to an online resource, rather than give an inevitably incomplete description of the mathematics in the appendix. Try to think about what purpose it serves: (1) for those who don't know GR, this is not going to be nearly enough to introduce them even to the basics; (2) for those who do know GR, they will not need this anyways. So, why include it at all? Consider, for audience (1) I think you need to simply target the intuition for each variable you model. That is the fundamentally useful thing to write about. And for audience (2) (curious physicists, maybe) I think you simply need to _translate_ GR concepts to machine learning for them. So, when you consider these two audiences, the appendix seems to serve little purpose. I recommend trying the split approach above: focus on intuition of the key target variables for the non‑physicists (and point them to other resources), and focus on translation of the machine learning concepts for the physicists. This would be much more effective in my view.
- Also, I did not check through all the math in the appendix. Thus, there could be errors.
- It might be worth calling a GR tensor exactly that: a "GR tensor," to differentiate from the ML meaning.
- Use proper scientific notation instead of "5.37E‑6" in tables.
- Even if the file is small, you still must run tens of thousands of MLP queries for post‑analysis/viz; be transparent about that runtime trade‑off versus reading spectral coefficients.
- Figure 1 is noisy. Consider simplifying or splitting. (The caption itself also states training on a 4D spacetime grid, which conflicts with "discretization‑free" messaging.)
- Please clarify the precise data format used for training ("4D spacetime grid" vs "arbitrary samples"), since the paper simultaneously describes the approach as discretization-free yet refers to training on regular grids.
- Could you provide any comparison, even small-scale, against a spectral or AMR baseline (ideally an actual code used by the GR community) to contextualize the claimed compression?
- Please explain how derivative accuracy scales in FLOAT64 and whether the observed long-time geodesic divergence persists under higher precision.

EditLens Prediction: Lightly AI-edited
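Concern 3 and the closing question on FLOAT64 both turn on how much of a finite-difference (FD) error comes from floating-point round-off and how much from the stencil itself. The sketch below is a hypothetical illustration of that effect, not the paper's benchmark or code. It assumes the Schwarzschild `g_tt` component as the test function, a central-difference stencil with step `h = 1e-3`, and an emulation of single precision that round-trips values through `struct` (real FLOAT32 hardware arithmetic can differ in detail). All function names are invented for this example.

```python
import struct


def to_f32(x):
    """Round a Python float (IEEE double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def identity(x):
    """No rounding: keep full double precision."""
    return x


def g_tt(r, M=1.0):
    """Schwarzschild g_tt in units G = c = 1 (illustrative test function)."""
    return -(1.0 - 2.0 * M / r)


def d_g_tt_exact(r, M=1.0):
    """Analytic radial derivative of g_tt, used as the reference value."""
    return -2.0 * M / r**2


def central_diff(f, r, h, rnd):
    """Central difference with inputs, samples, and result rounded by rnd."""
    rp, rm = rnd(r + h), rnd(r - h)
    numerator = rnd(rnd(f(rp)) - rnd(f(rm)))
    return rnd(numerator / rnd(rp - rm))


def max_rel_error(rnd, h=1e-3, radii=range(5, 16)):
    """Worst relative error of the stencil over a few sample radii."""
    return max(
        abs(central_diff(g_tt, float(r), h, rnd) - d_g_tt_exact(float(r)))
        / abs(d_g_tt_exact(float(r)))
        for r in radii
    )


err_f32 = max_rel_error(to_f32)    # emulated single precision
err_f64 = max_rel_error(identity)  # double precision
```

Under these assumptions the double-precision error is set by the stencil's truncation term, while the emulated single-precision error is dominated by round-off in the subtracted samples. That is why an FD baseline evaluated only in FLOAT32 says little about how the same comparison would look in FLOAT64.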
Einstein Fields: A Neural Perspective To Computational General Relativity

Soundness: 3: good
Presentation: 4: excellent
Contribution: 4: excellent
Rating: 8: accept, good paper
Confidence: 2: You are willing to defend your assessment, but it is quite likely that you did not understand the central parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.

This paper derives a novel neural representation method to compress relativity simulations.

- Outstanding efficiency and accuracy are shown when representing the symmetry of the simulations.
- This paper is especially well written and well presented.
- The problems addressed by the new tool are of interest to a wide community.
- I am not entirely sure how interesting this paper will be for the readership of ICLR, of whom so few are well versed in this area.

See above.

EditLens Prediction: Fully human-written
Einstein Fields: A Neural Perspective To Computational General Relativity

Soundness: 3: good
Presentation: 3: good
Contribution: 3: good
Rating: 8: accept, good paper
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.

This paper introduces Einstein Fields (EinFields), a neural field approach to compress and represent 4D spacetime metrics from general relativity simulations. The method achieves up to 4,000× compression of metric tensor fields with 7-9 decimal digit accuracy (1E-7 to 1E-9 relative precision) while providing discretization-free continuous representations that can be trained on arbitrary point samples and queried at arbitrary resolutions. A key contribution is improved differentiation accuracy through automatic differentiation (AD), computing Christoffel symbols, Riemann tensors, and other derived quantities with up to 10^5 better accuracy than finite difference methods in FLOAT32. The approach parametrizes the metric distortion (deviation from flat space) using MLPs and employs Sobolev supervision. The validation focuses on three analytical solutions to Einstein's field equations: Schwarzschild, Kerr, and linearized gravitational waves, successfully reconstructing key relativistic phenomena.

## Strengths

**Originality**: This paper presents a novel application of neural fields to general relativity, introducing the first implicit neural representation for tensor-valued spacetime geometries. The approach creatively adapts neural field techniques from computer vision to computational physics, with several original contributions.

**Quality**: The paper demonstrates strong technical rigor with comprehensive validation across multiple canonical GR test cases (Schwarzschild, Kerr, gravitational waves). The evaluation methodology is sound. Ablation studies (Table 3) properly isolate the contributions of different design choices. The authors are transparent about limitations.

**Clarity**: The paper is well-written and accessible. The background section (Section 2) effectively introduces both GR concepts and neural fields. Figure 1 provides an excellent conceptual overview of the pipeline. The mathematical notation is consistent and properly defined (though dense in places).

**Significance**: This work addresses genuine computational bottlenecks in numerical relativity: storage (petabytes per simulation) and accurate tensor differentiation. The 4,000× compression factor and 10^5 improvement in derivative accuracy (FLOAT32) represent substantial practical gains.

## Weaknesses

**Limited experimental scope**: The validation is restricted to three analytical solutions to Einstein's field equations (Schwarzschild, Kerr, linearized gravitational waves). While these are canonical test cases, they represent idealized scenarios far simpler than realistic numerical relativity (NR) simulations.

**Limited contextualization within scientific computing:** While the introduction mentions neural fields and ML for scientific computing, it lacks: (1) discussion of prior ML work specifically targeting numerical relativity or gravitational physics, (2) comparison with traditional compression methods used in scientific computing, and (3) detailed positioning relative to neural operators and PINNs. A dedicated related work section would help readers better understand the landscape and the paper's specific contributions.

**Missing Error Quantification:** Tables 1-3 report single-valued metrics without error bars or confidence intervals. Table 1 mentions selecting 'the model with the lowest MAE,' suggesting multiple runs were performed but statistics are not reported.

### Minor issues:

Page 10, line 490: "supplimentary" → "supplementary"

Figure 4 a caption: "Perihilion precession" → "Perihelion precession"

## Questions

* Actual NR simulation data: Your validation strategy using analytical solutions (Schwarzschild, Kerr, linearized GW) with known ground truth is appropriate for demonstrating the method's capabilities. As a natural next step, have you tested EinFields on any actual numerical relativity simulation outputs, even at small scale? What additional challenges arise with real NR data?
* Table 1 mentions selecting "the model with the lowest MAE" - how many training runs were performed? Can you report mean ± standard deviation over multiple random seeds for the key results in Tables 1-3? This is important for assessing reproducibility and typical vs. best-case performance.
* Parameter generalization: Do you train a separate network for each physical configuration (M, a, etc.), or can one network generalize across parameter ranges?

EditLens Prediction: Fully AI-generated
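The third review credits automatic differentiation (AD) for the accuracy of derived quantities such as Christoffel symbols. As a rough illustration of the idea, the sketch below differentiates the analytic Schwarzschild metric with a tiny forward-mode dual-number class and assembles one Christoffel symbol from the result. Everything here is an assumption made for illustration: the reviewed paper is described as using JAX on a learned metric, whereas this stand-in uses only the Python standard library, an analytic metric, and invented names (`Dual`, `christoffel_r_tt`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Dual number val + dot * eps with eps**2 = 0 (forward-mode AD)."""
    val: float
    dot: float = 0.0

    @staticmethod
    def lift(x):
        """Promote plain numbers to constants with zero derivative."""
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.val, -self.dot)

    def __sub__(self, other):
        return self + (-Dual.lift(other))

    def __rsub__(self, other):
        return Dual.lift(other) + (-self)

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = Dual.lift(other)
        return Dual(self.val / other.val,
                    (self.dot * other.val - self.val * other.dot)
                    / (other.val * other.val))

    def __rtruediv__(self, other):
        return Dual.lift(other) / self


def g_tt(r, M=1.0):
    """Schwarzschild g_tt in units G = c = 1; accepts floats or Duals."""
    return -(1.0 - 2.0 * M / r)


def g_rr_inverse(r, M=1.0):
    """Inverse metric component g^{rr} = 1 - 2M/r."""
    return 1.0 - 2.0 * M / r


def christoffel_r_tt(r, M=1.0):
    """Gamma^r_tt = -(1/2) g^{rr} d(g_tt)/dr for a static diagonal metric."""
    d_g_tt_dr = g_tt(Dual(r, 1.0), M).dot  # exact derivative, no step size
    return -0.5 * g_rr_inverse(r, M) * d_g_tt_dr


gamma = christoffel_r_tt(10.0)
analytic = 1.0 * (10.0 - 2.0 * 1.0) / 10.0**3  # M (r - 2M) / r^3
```

Because the derivative is propagated alongside the value, no step size is involved, and the result matches the closed form M(r - 2M)/r^3 up to floating-point rounding. This is the property that separates AD from a finite-difference stencil.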