Goal-driven Bayesian Optimal Experimental Design for Robust Decision-Making Under Model Uncertainty
Soundness: 3: good
Presentation: 2: fair
Contribution: 2: fair
Rating: 2: reject
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.
Summary:
This paper proposes Goal-driven Bayesian Optimal Experimental Design (GoBOED), a framework that optimizes experimental designs to minimize downstream decision costs rather than to maximize expected information gain (EIG) about parameters, as in traditional BOED. The approach combines variational inference for posterior approximation with convex optimization for robust control under parameter uncertainty, using chance constraints or CVaR to handle uncertainty in the constraints. The authors apply differentiable optimization layers (cvxpylayers) to enable gradient-based design selection and demonstrate the framework on two simulated examples: epidemic management with an SIQR model and pharmacokinetic dose optimization.
Strengths:
1. The paper studies an important problem in BOED, namely that maximizing EIG may not optimize downstream decision-making objectives, and demonstrates this on two concrete applications (epidemic management and pharmacokinetic control).
2. The proposed approach is technically sound, and the use of chance constraints and CVaR for robust optimization looks reasonable to me.
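For reference, my reading of the CVaR-based constraint handling, written in the standard Rockafellar–Uryasev form (this is my own notation with a generic constraint function $g$, not necessarily the paper's exact formulation): a constraint $\mathrm{CVaR}_{\alpha}[g(q,\theta)] \le 0$ under the posterior $p(\theta \mid \xi, y)$ can be expressed as
$$\min_{t \in \mathbb{R}} \; t + \frac{1}{1-\alpha}\,\mathbb{E}_{p(\theta \mid \xi, y)}\big[(g(q,\theta) - t)_{+}\big] \;\le\; 0,$$
which, with $N$ posterior samples $\theta_n \sim p(\theta \mid \xi, y)$, becomes the convex sample-average constraint
$$t + \frac{1}{N(1-\alpha)} \sum_{n=1}^{N} \big(g(q,\theta_n) - t\big)_{+} \;\le\; 0$$
in the joint variables $(q, t)$, provided $g(\cdot,\theta)$ is convex in $q$.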
Weaknesses:
1. The proposed framework primarily applies existing techniques without domain-specific adaptation. Goal-oriented BOED is well established in prior work (as acknowledged in the related work section), and the variational BOED framework with importance sampling is adopted directly from [1] without justification or specific adjustments for the two applications. The paper should better position its contribution, either as novel methodology or as an application study demonstrating feasibility in specific domains.
2. The authors only compare the proposed method against traditional EIG-based BOED, not against other goal-oriented approaches such as [2], which would fit the experimental setting. Additionally, the paper only tests on two applications (SIQR, PK) without evaluating on standard BOED benchmarks commonly used in the literature (e.g., source-localization problems).
3. The experimental results do not sufficiently demonstrate the value of goal-oriented BOED, especially in the SIQR setting. Could the authors elaborate on this part?
4. The paper lacks crucial experimental details, including training procedures (optimizer, learning rate, epochs, batch size), model architecture specifications, and key hyperparameters. More importantly, no ablation studies are provided to justify design choices such as importance sampling versus direct VI sampling (see the sketch after the references below), the impact of N (the number of posterior samples), sensitivity to $\eta$, or the architecture choices.
[1] Foster, Adam, et al. "Variational Bayesian optimal experimental design." Advances in Neural Information Processing Systems 32 (2019).
[2] Bickford Smith, Freddie, et al. "Prediction-oriented Bayesian active learning." International Conference on Artificial Intelligence and Statistics. PMLR, 2023.
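For clarity, the importance-sampling correction referred to in Weakness 4 is, in my understanding, the standard self-normalized form (my own notation; the variational posterior is written $q_\phi$): with samples $\theta_n \sim q_\phi(\theta \mid y, \xi)$ and weights
$$w_n \;\propto\; \frac{p(y \mid \theta_n, \xi)\, p(\theta_n)}{q_\phi(\theta_n \mid y, \xi)}, \qquad \bar{w}_n = \frac{w_n}{\sum_{m=1}^{N} w_m},$$
posterior expectations are estimated as $\mathbb{E}_{p(\theta \mid \xi, y)}[f(\theta)] \approx \sum_{n=1}^{N} \bar{w}_n f(\theta_n)$. An ablation could simply compare this against using the unweighted draws from $q_\phi$.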
Questions:
1. Please see the questions in the Weaknesses part.
2. The authors only conduct single-step experimental design over small discrete design spaces. Why use amortized inference in this setting? Amortization is essential for sequential BOED, where posteriors must be computed repeatedly across multiple rounds, but for single-step designs, evaluating all candidate designs would be computationally feasible and simpler. Can you provide justification or computational cost comparisons demonstrating why amortization is necessary for your experimental setting?
3. In the experiments section, the authors mention that they estimate EIG using nested Monte Carlo with 5,000 outer samples and 3,000 inner samples for the marginal likelihood. Can you elaborate on how these specific numbers were selected? Have you conducted any ablation studies to determine the optimal sample sizes?
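For concreteness, the kind of nested Monte Carlo estimator referred to above is sketched below. This is only an illustrative sketch: `sample_prior`, `simulate`, and `log_lik` are hypothetical model callables, and the vectorised shapes are assumptions, not the authors' code.

```python
import numpy as np
from scipy.special import logsumexp

def nested_mc_eig(design, sample_prior, simulate, log_lik,
                  n_outer=5000, n_inner=3000, seed=0):
    """Nested Monte Carlo estimate of EIG(design) for a fixed design xi."""
    rng = np.random.default_rng(seed)
    theta_out = sample_prior(n_outer, rng)          # theta_n ~ p(theta), shape (n_outer, d_theta)
    y = simulate(design, theta_out, rng)            # y_n ~ p(y | theta_n, xi), shape (n_outer, d_y)
    log_cond = log_lik(y, theta_out, design)        # log p(y_n | theta_n, xi), shape (n_outer,)

    theta_in = sample_prior(n_inner, rng)           # fresh inner samples for the marginal
    # log p(y_n | theta_m, xi) for every outer/inner pair, shape (n_outer, n_inner)
    log_pair = log_lik(y[:, None, :], theta_in[None, :, :], design)
    log_marg = logsumexp(log_pair, axis=1) - np.log(n_inner)   # log of the marginal estimate

    return float(np.mean(log_cond - log_marg))      # EIG estimate, biased upward for small n_inner
```

The relevant point is that the bias of this estimator scales roughly as $O(1/n_{\text{inner}})$ and its variance as $O(1/n_{\text{outer}})$, which is why I am asking how the 5,000/3,000 split was chosen.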
Lightly AI-edited
---
Goal-driven Bayesian Optimal Experimental Design for Robust Decision-Making Under Model Uncertainty
Soundness: 3: good
Presentation: 2: fair
Contribution: 3: good
Rating: 4: marginally below the acceptance threshold
Confidence: 2: You are willing to defend your assessment, but it is quite likely that you did not understand the central parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
Summary:
This paper presents GoBOED, a framework that unifies Bayesian Optimal Experimental Design (BOED) and robust optimal control for decision-making under model uncertainty. Instead of focusing solely on parameter uncertainty reduction (via expected information gain), the proposed method explicitly optimizes experiments to reduce the uncertainty that most impacts downstream decisions. The framework uses variational inference for amortized posterior approximation, convex optimization for tractable robust control, and differentiable decision layers to enable end-to-end gradient-based training. Applications to epidemic management (SIQR model) and pharmacokinetic (PK) control demonstrate the method’s capacity to identify flexible, near-optimal experimental designs that balance decision quality and acquisition cost.
Strengths:
1. **Conceptual contribution**: The idea of integrating BOED with robust control in a single differentiable pipeline is a novel and meaningful conceptual advance over classical information-theoretic approaches, which often overlook decision performance.
2. **Practical relevance**: Applications in epidemic control and pharmacokinetics are both timely and compelling, demonstrating generalizability across domains requiring safe and cost-aware experimental scheduling.
3. **Computational tractability**: The amortized inference and differentiable control layers reduce the sampling and computational overhead that typically affects BOED, addressing a key bottleneck in the literature.
4. **Empirical results**: Case studies show interpretable outcomes (e.g., near-optimal observational “windows”), emphasizing the trade-off between informational and operational objectives.
Weaknesses:
1. **Clarity and structure**:
- The motivation is unclear to me. The introduction section mentions many challenges, including the computational challenges in BOED and parameter uncertainty issues in robust control, but it is unclear which challenge this paper aims to address. It would be helpful if the authors could indicate which section addresses each challenge.
- For the problem formulation, the combined objective in equation (4) is interesting, but there is little interpretation provided. The overall goal remains unclear, and my uncertainty about which problem (BOED, robust control, or both) this paper seeks to address is not resolved even after reading Section 2.
- The exposition is often dense and notation-heavy, particularly in Sections 3–4. Long derivations (e.g., Eq. (11)) could benefit from a clearer explanation of the high-level logic before diving into detailed formulations.
- I think Figure 2 can be significantly improved by labeling the notations directly on the corresponding charts, rather than relying on the text. It would also be very helpful to provide a complete algorithm block for the proposed method.
- Some passages conflate notations from BOED and control optimization, making it difficult to distinguish design variables ($\xi, \xi^*, \xi^\star$), control variables ($q$), and variational parameters ($\phi$) on first read. A summary table of notations would help.
2. **Limited comparison baselines**: Only classical EIG-based BOED is compared. Including a *decision-focused baseline* (e.g., Expected Predictive Information Gain; see the sketch after this list) would provide a fairer benchmark to demonstrate the decision-aware benefits.
3. **Computational efficiency validation**: The authors claim to propose a computationally efficient method, but this claim is not validated by experimental evidence (e.g., runtime or scaling comparisons).
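For reference, the decision-focused baseline I have in mind is EPIG from Bickford Smith et al. (2023), which, as I recall, scores a candidate design by the expected information gained about predictions at target inputs rather than about parameters (the target-input distribution $p_*(x_*)$ below is my own notation, adapted to the design variable $\xi$):
$$\mathrm{EPIG}(\xi) \;=\; \mathbb{E}_{p_*(x_*)}\!\big[\, \mathrm{I}\big(y \,;\, y_* \mid \xi, x_*\big) \,\big] \;=\; \mathbb{E}_{p_*(x_*)\,p(y \mid \xi)}\!\Big[ \mathrm{KL}\big( p(y_* \mid x_*, \xi, y) \,\big\|\, p(y_* \mid x_*) \big) \Big].$$
Even a rough comparison against this kind of criterion would help isolate the benefit of the decision-aware objective.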
Questions:
Please see the weaknesses for my concerns. In particular, I would like the authors to provide a clearer explanation of the overall procedure, for example by including an algorithm block and clearer diagrams.
Fully AI-generated
---
Goal-driven Bayesian Optimal Experimental Design for Robust Decision-Making Under Model Uncertainty
Soundness: 1: poor
Presentation: 1: poor
Contribution: 1: poor
Rating: 2: reject
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.
Summary:
The authors propose an experimental-design method that targets minimal expected loss in a decision of interest and considers parameter constraints. They focus on two particular decision problems: choosing quarantine rates during an epidemic, and choosing a dosing rate for administering a drug. They demonstrate the performance of their method on these two problems.
Strengths:
Originality: this is unclear to me, despite quite a lot of effort to work it out.
Quality: the high-level problem (Section 2 up to Equation 4) and the main empirical results (Figure 2) are clear.
Clarity: the writing is understandable at a low level.
Significance: decision-oriented BED is an important direction.
Weaknesses:
I’m struggling to understand the proposed method. I agree with Equations 3-4, which align with existing decision-oriented objectives (e.g., Bernardo & Smith, 1994; Bickford Smith et al., 2025; Huang et al., 2024; Neiswanger et al., 2022; Raiffa & Schlaifer, 1961) if we set $\rho = \mathbb{E}$, as the authors do in Equation 5, making $h[p(\theta \mid \xi, y)] = \min_{q \in \mathcal{Q}} \mathbb{E}_{p(\theta \mid \xi, y)}[J(q,\theta)]$ the key quantity to target, where $J$ is a loss function, $q$ is an action, and $\theta$ is an unknown ground-truth variable.
Things get confusing thereafter because the authors actually consider a $J$ that is not a function of $\theta$ while placing constraints on $\theta$, leading to Equation 5. Mathematically it’s unclear to me how changing beliefs over $\theta$ lead to a change in the minimising $q$, unless there is some $\theta$ assigned nonzero weight by the prior and zero weight by the posterior: all that matters with regard to $\theta$ is that the constraints are met, and these are set upfront, before any experimentation.
Even if there’s some way this works out, it’s unclear to me why the constraints on $\theta$ should not just be thought of as implying an updated belief state, $p'(\theta \mid \xi, y, \mathcal{C})$ for constraints $\mathcal{C}$, produced by applying the constraints and renormalising. If we had this updated belief state and a $J$ that depends on $\theta$, then I think we are back to the setup from past work.
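To make the suggestion concrete, the reformulation I have in mind (in my own notation, with $\mathcal{C}$ the constraint set and $p'$ the renormalised, constraint-truncated posterior) is
$$p'(\theta \mid \xi, y, \mathcal{C}) \;\propto\; p(\theta \mid \xi, y)\,\mathbf{1}\!\left[\theta \in \mathcal{C}\right], \qquad h\big[p'(\theta \mid \xi, y, \mathcal{C})\big] \;=\; \min_{q \in \mathcal{Q}} \, \mathbb{E}_{p'(\theta \mid \xi, y, \mathcal{C})}\big[J(q,\theta)\big],$$
with the outer expectation over $p(y \mid \xi)$ taken exactly as in Equations 3-4. Once $J$ depends on $\theta$, this appears to recover the decision-theoretic setup of the works cited above.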
Aside from these methodological issues, it looks to me like the proposed method is not compared against existing methods, even though the authors promise to “compare GoBOED with standard BOED baselines”. I think the authors should be comparing against EIG maximisation as well as other non-parameter-oriented methods (e.g., Huang et al., 2024; Kandasamy et al., 2019).
Finally, I think there is a general inflation of the paper’s novelty. The goal-oriented aspect of the work is not new (see, for example, the citations for decision-theoretic methods). Considering the intersection between experimental design and control is not new (e.g., Anderson et al., 2023; DeGroot, 2004; Mesbah & Streif, 2014). Studying SIR and pharmacokinetic models is not new (e.g., Ivanova et al., 2021). The lack of novelty would be fine if there were a compelling contribution otherwise, but that contribution is very unclear to me.
---
Anderson et al (2023). Experiment design with Gaussian process regression with applications to chance-constrained control. Conference on Decision and Control.
Bernardo & Smith (1994). Bayesian Theory. John Wiley & Sons.
Bickford Smith et al (2025). Rethinking aleatoric and epistemic uncertainty. ICML.
DeGroot (2004). Optimal Statistical Decisions. John Wiley & Sons.
Huang et al (2024). Amortized Bayesian experimental design for decision-making. NeurIPS.
Ivanova et al (2021). Implicit deep adaptive design: policy-based experimental design without likelihoods. NeurIPS.
Kandasamy et al (2019). Myopic posterior sampling for adaptive goal oriented design of experiments. ICML.
Mesbah & Streif (2014). A probabilistic approach to robust optimal experiment design with chance constraints. arXiv.
Neiswanger et al (2022). Generalizing Bayesian optimization with decision-theoretic entropies. NeurIPS.
Raiffa & Schlaifer (1961). Applied Statistical Decision Theory. Division of Research, Harvard Business School.
Questions:
Can you show how different beliefs over $\theta$ lead to different optimal $q$?
If so, can you show why constraints cannot just be applied as a belief update over $\theta$?
How well do the abovementioned baseline methods work?
Can you confirm that Equation 2 is correct? I don’t think it matches any estimators from Foster et al (2019).
Fully human-written
---
Goal-driven Bayesian Optimal Experimental Design for Robust Decision-Making Under Model Uncertainty
Soundness: 2: fair
Presentation: 2: fair
Contribution: 2: fair
Rating: 4: marginally below the acceptance threshold
Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work.
Summary:
This paper introduces GoBOED, an integrated framework combining Bayesian Optimal Experimental Design (BOED) with convex optimization–based decision-making under uncertainty. The main idea is to choose experiments that improve downstream decision quality, not merely parameter accuracy. While the concept of linking BOED to decision-aware control is worthwhile, the paper’s claims of “robustness under model uncertainty” and its empirical contributions are overstated relative to what is demonstrated.
Strengths:
The idea of linking Bayesian Optimal Experimental Design (BOED) directly to downstream decision-making is conceptually important, addressing a genuine gap between information-theoretic design and decision-focused inference. The framework combines variational inference (VI) with differentiable convex optimization (via cvxpylayers) to enable end-to-end gradient flow from experiment design through control decisions. This is a clean engineering contribution that improves computational tractability. The use of amortized VI and differentiable convex optimization allows efficient gradient-based optimization over both experiment designs and decision variables without repeated posterior refits.
The approach can, in principle, be applied to multiple convex decision problems, demonstrated with epidemiological (SIQR) and pharmacokinetic (PK) case studies. The main ideas are well structured, with clear visuals (e.g., Fig. 1) illustrating how the BOED and decision layers interact.
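For readers less familiar with this construction, below is a minimal sketch of the kind of differentiable decision layer described above, using cvxpylayers. The toy problem, dimensions, and constraints are my own illustrative assumptions, not the authors' actual control formulation:

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

n, m = 3, 5                       # hypothetical sizes: n decision variables, m posterior-derived features
q = cp.Variable(n)                # control/decision variable
A = cp.Parameter((m, n))          # cost terms assembled from posterior samples upstream
b = cp.Parameter(m)

# A simple convex decision problem with box constraints on the control.
problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ q - b)),
                     [q >= 0, q <= 1.0])
layer = CvxpyLayer(problem, parameters=[A, b], variables=[q])

# Gradients flow from the optimal decision q* back into A and b, and hence
# (in the paper's pipeline) back to the variational posterior and the design.
A_t = torch.randn(m, n, requires_grad=True)
b_t = torch.randn(m, requires_grad=True)
q_star, = layer(A_t, b_t)
q_star.sum().backward()
```

If the paper's decision problem is indeed expressed in this disciplined-parametrized form, reporting the per-solve cost of such a layer would strengthen the computational-tractability point made above.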
Weaknesses:
While the paper’s central idea is conceptually appealing, the current presentation overstates the method's robustness. The method addresses parameter uncertainty within a fixed model rather than broader forms of model misspecification. The robustness achieved through chance constraints or CVaR is useful but conventional, and the paper should more clearly define its scope as parameter-uncertainty-aware rather than fully model-uncertainty-robust. The framework’s differentiable structure is well executed but not fundamentally new in either BOED or robust optimization.
Empirically, the examples in Figure 2 highlight an interesting divergence between EIG-optimal and decision-optimal designs, particularly the emergence of a broader, flatter near-optimal window under risk-sensitive criteria. This is an intriguing and potentially valuable observation. However, the paper does not quantify or interpret why this difference matters or what practical benefit arises from using GoBOED instead of traditional BOED. A more systematic analysis, such as measuring control cost improvements, robustness to posterior misspecification, or constraint-violation frequency, would make the contribution more convincing.
Terminology and exposition could be clearer. Concepts such as risk functional, chance term, and the “discrepancy” corrected through importance sampling are introduced without rigorous definition. Moreover, since the entire method depends on the posterior quality from variational inference, the absence of any evaluation of posterior calibration or its effect on decision reliability leaves an important gap.
The current evaluation relies on a single-shot design, which limits the interpretability of the framework. Extending the experiments to an iterated design, where updated posteriors inform subsequent measurement choices, would better demonstrate the claimed benefits of goal-driven experimental design. Visualizing posterior updates would also provide concrete evidence of how decision-aware objectives reshape uncertainty, rather than inferring these effects indirectly from control costs. Although benchmarking new methods is challenging, the authors could still compare their approach across multiple time points and posterior evaluations for both models, and contrast the results with plain EIG optimization. Posterior evolution could be illustrated visually or through calibration metrics such as L-C2ST.
The literature review misses seminal work and blurs distinctions between related methods.
- The separation between Kleinegesse and Foster’s MI-based BOED methods is overstated; both rely on similar bounds, though Kleinegesse optimized a critic. The paper should also specify which bound is optimized, since MI bias and variance depend on that choice.
- Foundational references such as Lindley (1956) and Barber–Agakov (2003) are missing.
Finally, the literature review takes a very high-level view of the BOED field, hinting at a misunderstanding of some fundamental concepts. The careless exposition makes it difficult for readers to place this paper in the context of prior work, so I recommend edits to the introduction and the background on BOED.
Questions:
1. The framework relies heavily on how the posterior distribution changes under different design choices, but no posterior visualizations are provided. Could the authors show what the posteriors look like for the single-shot examples presented? Even a qualitative comparison between the EIG-optimal and decision-optimal designs would help clarify how the decision-aware objective shapes posterior uncertainty.
2. How would the proposed approach behave in an iterated experimental design setting, where posteriors from earlier measurements inform the next design choice? This seems especially relevant for the SIQR example, where measurements could be taken over multiple time points. Would the decision-aware design criterion lead to different sequences of measurement times compared to classical EIG? A short discussion or pilot experiment illustrating this would strengthen the paper’s argument for real-world applicability.
Fully AI-generated