Gaussian Belief Propagation Network for Depth Completion
Soundness: 3: good
Presentation: 3: good
Contribution: 2: fair
Rating: 2: reject
Confidence: 5: You are absolutely certain about your assessment. You are very familiar with the related work and checked the math/other details carefully.
This paper proposes the Gaussian Belief Propagation Network (GBPN) for depth completion. GBPN leverages a learned Markov Random Field (MRF) structure, constructed dynamically from RGB and sparse depth inputs, and performs inference via Gaussian Belief Propagation (GBP). The paper introduces a hybrid message-passing scheme and evaluates the method on NYUv2 and KITTI benchmarks, reporting competitive results.
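For concreteness, the GBP inference the summary refers to reduces, for scalar depth variables, to the standard information-form GaBP updates. The minimal sketch below is a toy illustration under that assumption, not the authors' implementation; the precision matrix, potential vector, and flooding schedule are made up for the example.

```python
# Textbook scalar Gaussian BP (GaBP) in information form, p(x) ∝ exp(-0.5 x^T A x + b^T x).
# This is NOT the authors' GBPN code; A, b, and the graph are toy values.
import numpy as np

def gabp(A, b, iters=50):
    n = len(b)
    nbrs = [np.flatnonzero((A[i] != 0) & (np.arange(n) != i)) for i in range(n)]
    alpha = np.zeros((n, n))  # alpha[i, j]: precision of the message i -> j
    beta = np.zeros((n, n))   # beta[i, j]:  information of the message i -> j
    for _ in range(iters):    # parallel ("flooding") schedule
        new_a, new_b = alpha.copy(), beta.copy()
        for i in range(n):
            for j in nbrs[i]:
                p = A[i, i] + sum(alpha[k, i] for k in nbrs[i] if k != j)  # cavity precision
                h = b[i] + sum(beta[k, i] for k in nbrs[i] if k != j)      # cavity information
                new_a[i, j] = -A[i, j] ** 2 / p
                new_b[i, j] = -A[i, j] * h / p
        alpha, beta = new_a, new_b
    prec = np.array([A[i, i] + alpha[nbrs[i], i].sum() for i in range(n)])
    info = np.array([b[i] + beta[nbrs[i], i].sum() for i in range(n)])
    return info / prec        # posterior means (exact if GaBP converges)

# Toy 4-node loopy MRF with a diagonally dominant precision, so GaBP converges.
A = np.array([[4., -1.,  0., -1.],
              [-1.,  4., -1.,  0.],
              [ 0., -1.,  4., -1.],
              [-1.,  0., -1.,  4.]])
b = np.array([1., 2., 3., 4.])
print(gabp(A, b))              # matches the exact mean below
print(np.linalg.solve(A, b))
```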
1. Hybrid Learning-Inference Integration: The paper attempts a meaningful integration of graphical model reasoning with deep learning. Learning the MRF structure dynamically from images represents a conceptual advance over fixed priors or fully feed-forward architectures.
2. Principled Treatment of Sparse Inputs: The method embeds sparse depth directly into the probabilistic inference process, rather than processing it as part of standard CNN input, offering a more theoretically grounded approach to the sparsity challenge.
1. Lack of Computational Analysis: The proposed approach introduces significant inference overhead due to iterative GBP and dynamic graph construction. However, the paper does not report any runtime statistics, GPU memory usage, or scalability discussion. Given the growing importance of efficiency in practical systems, this omission is concerning.
2. Limited Empirical Gain: While the method shows SOTA iRMSE on KITTI, it underperforms on other key metrics (RMSE, MAE), suggesting the gain may not be consistent. On NYUv2, although the reported RMSE is strong, the deltas are small and the competitive landscape is already saturated.
3. Weak Justification of Components: Ablation studies show marginal gains (~3mm RMSE difference on NYUv2), raising questions about the necessity of the complex model components, including non-local edge prediction, dynamic updates, and dual-pass U-Nets.
4. Insufficient Analysis on Iterative Behavior: The authors claim that more iterations improve performance (line 290), but Table 2 only reports 3 and 5 iterations. It remains unclear whether the performance plateaus or continues to improve, and at what computational cost.
5. Unclear Sparsity Robustness Comparison: In Figure 2, the RMSE of some methods (e.g., GuideNet, CFormer) increases as input becomes denser, which is counter-intuitive and not explained. Additionally, curves for many methods converge from 500 points onward, making relative robustness claims less persuasive.
6. Presentation and Layout: Several pages are cluttered with tightly packed text and figures (notably pages 6 and 9), negatively impacting readability.
A key concern is that the ablation results are unconvincing, as the reported gain is only about 3 mm. For the same model, it is quite common that retraining multiple times yields variations of around 5 mm, which means such a small improvement is either unsuitable for ablation analysis or insufficient to demonstrate the effectiveness of individual modules. It is also puzzling that the model’s accuracy degrades when the density of depth points increases, yet the explanations provided fail to address this anomaly satisfactorily.
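To make this concern actionable: the kind of ablation reporting that would settle it is mean ± std of RMSE over several retraining seeds per variant, plus a simple significance test. The sketch below uses invented numbers purely to illustrate the requested format; they are not taken from the paper.

```python
# Illustrative only: RMSE values are made up to show the reporting format
# (mean +/- std over seeds and a Welch t-test), not results from the paper.
import numpy as np
from scipy import stats

full_model  = np.array([0.089, 0.091, 0.094, 0.090, 0.092])  # RMSE in metres, 5 seeds
wo_nonlocal = np.array([0.093, 0.095, 0.092, 0.096, 0.094])  # hypothetical ablated variant

for name, r in [("full model", full_model), ("w/o non-local edges", wo_nonlocal)]:
    print(f"{name}: {1000 * r.mean():.1f} +/- {1000 * r.std(ddof=1):.1f} mm")

# Does the ~3 mm gap exceed seed-to-seed noise?
print("p-value:", stats.ttest_ind(full_model, wo_nonlocal, equal_var=False).pvalue)
```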
Even after this issue was raised by reviewers in a previous NeurIPS round, the authors have shown no intention of taking concrete steps to rigorously validate the effectiveness of their method or to improve its interpretability. The current revision remains insufficient for publication, and the reviewer maintains a negative overall assessment of the paper.
Moderately AI-edited
Gaussian Belief Propagation Network for Depth Completion
Soundness: 2: fair
Presentation: 1: poor
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
This paper introduces a hybrid framework, termed the Gaussian Belief Propagation Network (GBPN), for depth completion using sparse depth points and color images. The core idea is to use a deep network (GMCN) to dynamically construct a Markov Random Field (MRF) for each scene, learning both its potential function and graph structure by predicting non-local edges. Subsequently, the Gaussian Belief Propagation (GBP) algorithm is employed to infer dense depth from the constructed MRF. The authors report that the proposed method achieves state-of-the-art performance on the NYUv2 and KITTI datasets and exhibits strong robustness to input sparsity.
1. Framing deep completion as probabilistic inference on dynamically constructed graph models offers a theoretically sound approach for handling sparse and irregular inputs. Extensive evaluations under varying sparsity levels, noise conditions, and cross-dataset settings show that the proposed framework achieves stronger robustness than pure end-to-end regression models.
2. The proposed method not only learns the MRF parameters but also infers the graph structure by predicting non-local edges, which represents a novel contribution. This design allows the model to adaptively capture long-range dependencies from image content, thereby overcoming the fixed-neighborhood constraints of traditional MRFs.
1. The ablation study (Table 2) is not presented clearly, making it difficult to verify the contribution of each model component.
2. Although the paper provides a thorough empirical comparison with competitors such as BP-Net and demonstrates clear advantages in accuracy and robustness, the discussion does not move beyond empirical evidence and lacks a compelling conceptual justification. The authors do not clearly explain why their MRF+GBP paradigm is theoretically or conceptually superior to the direct learning propagation paradigm represented by BP-Net. The contributions appear to represent a highly successful and well-designed paradigm instantiation rather than a fundamental conceptual advancement.
3. The inference time of this method is considerably longer than that of its main competitors (nearly 80% slower than BP-Net on KITTI), yet the accuracy improvement is negligible (only about 0.4%). This trade-off is unacceptable for real-time applications such as autonomous driving.
4. This paper employs loopy belief propagation on dynamically generated graphs, a method that lacks formal convergence guarantees. However, the paper does not discuss or analyze the potential stability and convergence issues associated with this setting.
1. Could you provide a clearer version of Table 2 that explicitly lists the configuration details for each ablation study variant?
2. For practical applications such as autonomous driving, how do you justify a considerable increase in inference latency in exchange for only a marginal gain in accuracy?
3. What conceptual advantages does your approach offer over methods that directly learn propagation operators?
4. When applying loopy belief propagation to dynamically generated graphs, have you observed any cases of non-convergence or oscillation?
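Related to question 4: a cheap per-scene diagnostic the authors could report is whether the induced precision matrix is walk-summable, a known sufficient condition for Gaussian BP convergence (Malioutov et al., 2006). A toy sketch of the check follows, with a hypothetical precision matrix standing in for the learned MRF.

```python
# Walk-summability check, a sufficient condition for GaBP convergence
# (Malioutov et al., 2006).  A is a toy stand-in for the per-scene precision
# matrix induced by the learned MRF, not the paper's actual model.
import numpy as np

def walk_summable(A):
    d = 1.0 / np.sqrt(np.diag(A))
    R = np.eye(len(A)) - d[:, None] * A * d[None, :]   # I - D^-1/2 A D^-1/2
    rho = np.max(np.abs(np.linalg.eigvals(np.abs(R))))
    return rho < 1.0, rho

A = np.array([[4., -1., -1.],
              [-1.,  3., -1.],
              [-1., -1.,  5.]])
print(walk_summable(A))   # (True, rho) -> GaBP converges for this toy precision
```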
Lightly AI-edited
Gaussian Belief Propagation Network for Depth Completion
Soundness: 3: good
Presentation: 3: good
Contribution: 3: good
Rating: 6: marginally above the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
This paper addresses the depth completion task by introducing a Graphical Model Construction Network (GMCN), which constructs a scene-specific graph for a Markov Random Field over which dense depth is inferred from the sparse input via Gaussian Belief Propagation. Experimental results on the KITTI DC and NYU datasets demonstrate state-of-the-art performance, highlighting the effectiveness of the proposed approach.
1. The proposed method achieves state-of-the-art performance on both indoor and outdoor datasets.
2. It shows superior robustness across varying depth sparsity levels compared to existing approaches.
3. The paper provides a comprehensive analysis and extensive experimental results in the supplementary material, which further supports the validity of the proposed approach.
1. In Figure 1, it is recommended to add essential legends for better clarity, such as explaining the meaning of “T” in the top-middle and the significance of the green, blue, and orange lines.
2. The Method section currently occupies a substantial portion of the paper, leaving limited space for the Experiment section. It is suggested to compress the Method section to allow more room for presenting additional experimental results.
3. The influence of local edges and GBP iterations should be analyzed individually. Table 2 appears cluttered, making it difficult to identify corresponding variants. A clearer presentation or separate analysis would improve readability and understanding.
1. In Table 1, could you clarify whether the entry labeled GBPN corresponds to the GBPN-1 or GBPN-2 variant?
Moderately AI-edited
Gaussian Belief Propagation Network for Depth Completion
Soundness: 3: good
Presentation: 3: good
Contribution: 2: fair
Rating: 6: marginally above the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
This paper addresses the depth completion task by developing the Gaussian Belief Propagation Network (GBPN). The GBPN consists of a Graphical Model Construction Network (GMCN) that constructs a scene-specific MRF over dense depth variables, and a Gaussian Belief Propagation (GBP) strategy that infers the dense depth on the learned MRF. The GMCN models the potentials of the MRF and its structure by predicting non-local edges to capture the complex, long-range spatial dependencies guided by image content. The GBP strategy uses a serial & parallel message-passing scheme to enhance information flow. Experiments on KITTI and NYUv2 show that the proposed method achieves SOTA performance. The authors also conduct comprehensive ablations to validate the effectiveness of the proposed modules and the robustness.
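On the serial & parallel scheme mentioned above: since the GBP posterior means on a Gaussian MRF satisfy A mu = b, the two schedules behave loosely like Jacobi vs. Gauss-Seidel sweeps on that linear system, which is one simple way to see why a serial pass speeds up information flow. The toy comparison below is a sketch under that analogy; the chain precision A and right-hand side b are illustrative, not taken from the paper.

```python
# Toy illustration: parallel vs. serial update schedules, viewed as Jacobi vs.
# Gauss-Seidel iterations for A mu = b on a chain-graph precision.  A, b are made up.
import numpy as np

n = 50
A = 2.1 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # chain-graph precision
b = np.random.default_rng(0).normal(size=n)
mu = np.linalg.solve(A, b)
D = np.diag(A)

def jacobi(x):                 # "parallel": every node reads old neighbour values
    return (b - A @ x + D * x) / D

def gauss_seidel(x):           # "serial": each node reads already-updated values
    x = x.copy()
    for i in range(n):
        x[i] = (b[i] - A[i] @ x + A[i, i] * x[i]) / A[i, i]
    return x

xp = xs = np.zeros(n)
for _ in range(20):
    xp, xs = jacobi(xp), gauss_seidel(xs)
print("parallel error:", np.linalg.norm(xp - mu))
print("serial error:  ", np.linalg.norm(xs - mu))   # noticeably smaller after the same budget
```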
The proposed method achieves SOTA performance on the public KITTI and NYUv2 benchmarks.
The authors validate the effectiveness of the proposed modules and their robustness to input sparsity.
The authors also provide detailed information about the method, such as the model structure, proofs, and parameters.
The idea of using Gaussian Belief Propagation for inference is interesting and well motivated by previous methods.
- The strategy of using an MRF for depth estimation has been explored before [1][2]. The authors should provide a discussion of these works.
- What are the advantages of the MRF formulation for depth completion (GMCN & GBP) in comparison with previous propagation-based methods?
- The authors claim that the design "allow[s] the model to adaptively capture complex, long-range spatial dependencies guided by image content". It would be better if some supporting qualitative cases were provided.
- As shown in Tab. 9 of the supplementary material, the proposed method has a higher running time than BP-Net, while their performance is very close. What advantages does the method offer in comparison with BP-Net, apart from fewer parameters?
- A discussion of the serial & parallel propagation scheme should be provided, in particular its efficacy: how does the strategy improve performance, and at what computational cost?
- In which scenarios does the proposed method perform better, and which issues does it solve? Please give more examples and analysis. Since the performance is close to that of the latest methods, the authors should provide more evidence for the effectiveness of the proposed method.
[1] Chen et al., Fast MRF Optimization with Application to Depth Reconstruction.
[2] Liu et al., Deep Convolutional Neural Fields for Depth Estimation from a Single Image.
See the weaknesses above. Please give feedback on each point.
Fully human-written |