ICLR 2026 - Reviews

SubmissionsReviews

Reviews

Summary Statistics

EditLens Prediction Count Avg Rating Avg Confidence Avg Length (chars)
Fully AI-generated 0 (0%) N/A N/A N/A
Heavily AI-edited 1 (20%) 6.00 3.00 4067
Moderately AI-edited 0 (0%) N/A N/A N/A
Lightly AI-edited 0 (0%) N/A N/A N/A
Fully human-written 4 (80%) 4.50 3.00 3542
Total 5 (100%) 4.80 3.00 3647
Title Ratings Review Text EditLens Prediction
Efficient Algorithms for Adversarially Robust Approximate Nearest Neighbor Search Soundness: 2: fair Presentation: 3: good Contribution: 3: good Rating: 4: marginally below the acceptance threshold Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked. This paper designs data structures for the ANN problem against an adaptive adversary, who selects a worst-case, size-$n$ dataset and $Q$ *adaptive queries* based on past responses of the data structure. The authors develop a progression of algorithms with provable robustness and efficiency guarantees: they first (i) connect adversarially robust ANN to fair ANN (where outputs are uniformly random among near-neighbors), showing that the latter implies the former, then (ii) reduces search to a decision ANN and solve the latter with a DP mechanism on top of a LSH, and finally (iii) introduce a concentric-annuli LSH construction that privatizes which annulus is predicted to terminate quickly, and then runs a fair ANN only within that annulus. On efficiency, (i) yields a query time that depends on the "density ratio" (i.e. the number of points in the $cr$-ball relative to the $r$-ball for a query $q$) which can be large in a worst-case dataset. In comparison, (ii) removes this data dependence; however, it partitions the data into $\sqrt{n}$-sized buckets, thus the query time is at least $\sqrt{n}$ (i.e. it does not diminish as $c$ grows). Finally, (iii) mitigates this issue and achieves space and query complexities that are data-independent and whose dependence on $n$ diminish as $c$ increases. - I find it appealing that Theorem 1.2 and 1.3 achieve runtime and space bounds independent of dataset-specific quantities (i.e. $s$ in [Feng'25] or $D$ in Theorem 1.1). This makes performance predictable on worst-case datasets and avoids hidden inefficiency due to dense neighborhoods. The search-to-decision reduction that enables this isolates the leakage channel and patches it with a DP mechanism, which feels natural and standard, but is executed nicely. - (Subject to correctness,) the fairness to robustness connection feels neat: framing robustness as a consequence of returning a uniformly random near neighbor yields a simple, reusable principle. As the authors note, this argument extends to any algorithm which is required to answer adaptive queries by picking from a discrete set of candidate values. In my view, this offers an alternative methodology beyond the now-standard DP-for-robustness recipe and could lead to further interesting results. - The presentation is clear. I appreciate how the authors explain the inefficiencies of Theorems 1.1 and 1.2, progressively motivating Theorem 1.3. The efficiency guarantees and trade-offs of the three algorithms are clearly discussed and compared with [Feng'25]. I only checked the first proof (fairness implies robustness) and I'm confused about the following point: In Definition 2.1, both $R$ and $R_{setup}$ are used to denote the randomness used to *initialize* the data structure. So I assume that $R = R_{setup}$ and write $W_i := (R_{setup}, R_1, \cdots , R_i)$. My question concerns Definition 3.1: is the ith answer $a_i$ independent of $(a_1, \cdots, a_{i-1})$ and $R$, or is $a_i$ independent of $(a_1, \cdots, a_{i-1})$ and $W_{i-1}$? Definition 3.1 seems to assert the former, and Definition 2 in [Aumuller et al.'21] only asserts independence from $(a_1, \cdots, a_{i-1})$; however, it appears to me that the proof of Claim 3.3 assumes independence of $(a_1, \cdots, a_{i-1})$ and $W_{i-1}$. In particular, Line 266 states: "since the $R_i$ are fixed, suppose that $f(R) \subset M$ is the set of queries for which A wrongfully answers $\bot$." This sentence makes sense to me if $f(R)$ here is really $f(W_Q)$. Then, to use DPI, line 269 would need to read "$(a_1, \cdots, a_{i-1})$ is independent of $f(W_{i-2})$", which does not seem to be guaranteed by the definition of a fair NN, per the concern above. Alternatively, if $f(R)$ on Line 266 truly means $f(R) = f(R_{setup})$, then I think $f(R_{setup})$ alone does not define the set of queries for which A wrongfully answers. (This could be the case for a specific fair NN construction, e.g. if $R_1, \cdots, R_Q$ are used in a limited way that does not affect the set of incorrect queries. But from the current description I don't think this is generally the case, especially since Definition 2.1 explicitly says that the failure probability is "taken over the algorithm’s entire internal randomness $ (R_{setup}, R_1, \cdots , R_i)$".) - Could you address the question raised in the Weaknesses section? - In [Feng'25], in addition to the space and the query time, the authors also analyze update time and preprocessing time: 1. Both this paper's definition of robust NN and the fair NN definition in [Aumuller et al.'21] appear to assume a static setting where the dataset is fixed up front. Do the algorithms here support (or admit natural modifications to support) dynamic updates, i.e., insertions and deletions? If so, how do the update times compare to [Feng'25]? 2. What are the preprocessing times of the algorithms in this paper? Fully human-written
Efficient Algorithms for Adversarially Robust Approximate Nearest Neighbor Search Soundness: 3: good Presentation: 2: fair Contribution: 2: fair Rating: 4: marginally below the acceptance threshold Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked. The authors address the challenge of approximate nearest neighbors search (NNS) to adversarially constructed queries. In particular, they consider the case that for a fixed randomness, the query designer can use a sequence of $Q$ queries. They make the observation that fairness implies robustness, which allows you to apply fair algorithms for robustness. To overcome the distributional assumption of this fairness algorithm, The paper subsequently introduce a bucketing approach. To break the $\sqrt{n}$ query time of this approach, they introduce their main algorithm which uses concentric annuli. They conclude by giving an algorithm that provides guarantees for any query, not just a fixed one. - Their algorithm appears to give an algorithm that improves over the prior query time of $\sqrt{n}$, when $c$ is large, while maintain robustness to adaptive adversarial queries. - The algorithm using concentric annuli with guarantees that one of the annuli will finish quickly is interesting and appears new. - In general, I am not a fan of how the paper is structured and presented. The paper is presented as a story or a textbook chapter, so that it is unclear what the main results are. In particular, the final section, giving the algorithm with guarantees for all queries, seems fairly disconnected from the previous. At several points in the paper, ideas are presented that sound interesting but are not clearly explained. One such example is the notion of ‘timestamps’ on line 146 that is repeated in the conclusion. - It is not clear why the reader should care about their main result for improving over the $\sqrt{n}$ barrier. It might be nice for example to see experiments that verify a setting where the proposed algorithm is faster. - Other than the concentric annuli algorithm, the proposed approaches/connections between fairness and robustness do not seem surprising. - How does your work differ from [1], which also provides guarantees for worst-case queries? Is it that the guarantees are for interactive querying in your work? They also develop an adversarially robust algorithm for approximate NNS based on LSH. In particular, their algorithm optimizes performance for the worst possible query, which seems to be quite similar to the goal of this work. [1] “Learning to Hash Robustly, Guaranteed”. Andoni , Beaglehole ICML 2022. None. Fully human-written
Efficient Algorithms for Adversarially Robust Approximate Nearest Neighbor Search Soundness: 2: fair Presentation: 2: fair Contribution: 2: fair Rating: 6: marginally above the acceptance threshold Confidence: 2: You are willing to defend your assessment, but it is quite likely that you did not understand the central parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked. This paper tackles ANN under adversarial query sequences where both the dataset and queries are adversarially chosen. For high-dimensional settings, it proposes a progression of algorithms: leveraging fairness to achieve robustness, using differential privacy-based robust deciders with bucketing, and ultimately introducing a concentric-annuli LSH construction to break inherent query time lower bounds. For lower dimensions, it presents for-all algorithms based on metric coverings that guarantee correctness for all queries. The paper introduces new theoretical tools and extends known connections between fairness, robustness, and privacy in ANN search. 1. This work establishes and proves (Claim 3.3) the core theoretical result that exact fair ANN algorithms are adversarially robust. 2. The paper presents a thoughtful progression from fairness-induced robustness (Theorem 1.1) to assumption-free methods via bucketing (Theorem 1.2), culminating in the concentric annuli construction (Theorem 1.3) that achieves sublinear query time even in worst-case datasets. 3. Table 1 provides a summary of algorithmic tradeoffs (query time, space) under varying assumptions and concretes the advances over rival works, notably Feng et al. (2025). 1. This work is entirely theoretical. It lacks experiments to evidence the theoretical results. I would suggest authors to include some experiments to support the claims. 2. Claims are ambiguous. For instance, Theorem 1.1 and Theorem 1.2 offload much of the complexity into density ratios, while we are still not sure about how these density ratios affects in the real application scenarios. 3. In the for-all algorithms (sec 1.1.2), how severe is the intractability when $d$ is only modestly large (e.g., $d = 30$ or $100$)? Are there adaptations that would make these algorithms relevant in practice, beyond toy cases? 4. Table 1 has many notations. I would suggest authors to draw a figure or simplify the notations to highlight the changes between different algorithms. For example, the definition of D, p, beta, s, are actually unknown to readers at the first time. See weakness. Fully human-written
Efficient Algorithms for Adversarially Robust Approximate Nearest Neighbor Search Soundness: 1: poor Presentation: 2: fair Contribution: 3: good Rating: 4: marginally below the acceptance threshold Confidence: 4: You are confident in your assessment, but not absolutely certain. It is unlikely, but not impossible, that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. The paper considers the problem of building adversarially robust approximate near neighbor (ANN) data structures. In the classical ANN setting, the goal is to build a data structure $D$ which when instantiated with a dataset of $n$ $d$-dimensional datapoints, a target distance $r > 0$, and an approximation constant $c > 1$, supports the following queries: Given a query point $q$, return a point $x$ in the dataset with $\|x - q\| \leq cr$ if there exists any point $y$ in the dataset with $\|q - y\| \leq r$. Classical work has produced data structures with query times scaling \emph{sub-linearly} in $n$ and space complexities scaling almost linearly in the size of the dataset. Unfortunately, these classical works often assume that the sequence of queries given to the data structure are independent of the randomness used to instantiate it. In realistic scenarios, where the query sequence may potentially depend on answers to prior queries (and hence, the internal randomness of the data structure), this assumption breaks down and the correctness guarantees of the data structure no longer hold. Consequently, there has been much recent interest in remedying these shortcomings by developing data structures, resilient to these effects. This paper operates in the strong adversarial setting where \emph{both} the dataset and the sequence of queries are assumed to be chosen by a potentially malicious adversary. The paper contains several results applying to different regimes: 1) In the high-dimensional setting ($d = O(\sqrt{Q})$), the paper constructs a series of data structures culminating in a data structure with space complexity $\tilde{O} (\sqrt{Q} n^{1 + \beta})$ and query complexity $d n^{\beta}$ for $\beta \approx \log \log (c) / \log (c)$ and 2) In the low-dimensional setting, the paper mildly improves on prior results providing a stronger \emph{for-all} query guarantee with an increased cost in space complexity $\tilde{O} (d n^{1 + \rho})$ (note when $d$ is large, this space complexity dominates the query-dependent results of the high-dimensional setting). Technically, the main insight of the paper is that algorithms for \emph{Fair}-ANN (a recently developed ANN notion which requires that any answer returned by the data structure must be uniformly chosen from the set of correct answers if non-empty), inherently provide resilience to adversarially chosen inputs. Unfortunately, the query times of prior data structures for Fair ANN rely on the density ratios of two neighborhoods around $q$ (specifically, the neighborhoods within $cr$ and $r$ of $q$) which may be as large as $n$. The main technical contribution of the paper addresses this challenge by breaking down the neighborhoods $[r, cr]$ into a series of annuli and observing that not all radii in this series can feature large density ratios. Hence, an alternative approach is to try different data structures for different annuli and only use ones which terminate within a chosen time complexity threshold (one exists since at least one annuli features a small density ratio). Unfortunately, this idea only results in a relaxed Fair ANN data structure which no longer straightforwardly yields robustness as the answer may potentially leak the precise choice of annuli used which is correlated with the internal randomness. The paper then uses standard techniques from differential privacy to hide only the \emph{choice} of annuli while incurring increased space complexity scaling with $\sqrt{Q}$. The combination of these techniques yields the final result. Overall, the paper considers the natural problem of building adversarially robust data for search problems like ANN. The paper draws a connection to the Fair-ANN problem and expands on these ideas to build data structures with sub-linear query times and space-complexities scaling independently of $d$. However, the remainder of the paper largely relies on standard techniques previously explored in the literature for robust \emph{estimation}. Furthermore, the approach in the paper has runtimes scaling as $n^\beta$ (as opposed to $n^\rho$), representing a substantial degradation. It is not clear that such a degradation is necessary and somewhat dampens the main result of the paper. Finally, the writing and organization of the paper need to be substantially improved. For instance, Theorem 1.3 as stated only concerns Fair-ANN and not robust ANN and the proof of Claim 3.3 (the main insight of the paper) is missing some steps -- how can one condition on the correctness of the algorithm?, what is $R$?, why does $\mathcal{A}_{fair}$ not being adversarially robust mean the corresponding random variables are not independent?. In its current state, I cannot recommend acceptance. See main review See main review See main review Fully human-written
Efficient Algorithms for Adversarially Robust Approximate Nearest Neighbor Search Soundness: 3: good Presentation: 4: excellent Contribution: 3: good Rating: 6: marginally above the acceptance threshold Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked. This paper studies the Approximate Nearest Neighbor (ANN) problem in the presence of an adaptive adversary. In the model considered, the adversary first fixes the dataset but can subsequently select each query point adaptively based on previously observed outputs. Classical ANN algorithms such as locality-sensitive hashing (LSH) assume an oblivious adversary and may fail under adaptive attacks. The authors seek to design algorithms that remain correct and efficient even against a powerful adaptive adversary. The paper introduces several algorithms and provides corresponding theoretical guarantees. First, the authors show that fairness in ANN search implies adversarial robustness, establishing a formal connection between fair selection among valid near neighbors and adaptive security. They then present a bucketing-based robust search algorithm by reducing ANN search to a weak decision problem and leveraging differential privacy machinery to ensure robustness. Finally, they introduce a concentric-annuli LSH construction that overcomes the $\sqrt{n}$ query-time barrier present in their bucketing framework, again utilizing differential privacy techniques to carefully control information leakage. These algorithms yield provable guarantees across both high-dimensional settings $(d = \omega(\sqrt{Q}))$ and low-dimensional ones $(d = O(\sqrt{Q}))$, and the paper establishes the corresponding bounds in Theorems 1–4. The contributions rely on a combination of tools including fair ANN sampling, differential privacy, and careful runtime analysis over concentric geometric partitions. 1. The observation that fairness implies robustness is conceptually elegant and powerful. Its applicability extends beyond the ANN problem and may motivate further exploration of fairness-based defenses in other algorithmic settings. 2. Theorem 3, in particular, presents a strong result for adversarially robust ANN, improving upon prior work and breaking the $\sqrt{n}$ barrier under mild assumptions. 3. In addition to high-dimensional settings, the authors also provide results for low-dimensional cases, strengthening the completeness of the theoretical contributions. 4. The paper is built on solid mathematical foundations and includes rigorous proofs, carefully addressing subtle issues related to privacy, adaptivity, and randomized algorithm behavior. 1. Both Theorems 2 and 3 include a $\sqrt{Q}$ factor, which can be significant when the number of adaptive queries is large. While Theorem 1 avoids this factor, it introduces dependence on data density, and it remains unclear whether the $\sqrt{Q}$ dependence can be eliminated in the general case. 2. The adversary is assumed to fix the dataset in advance but may adaptively choose queries throughout execution. The paper does not provide sufficient justification for this threat model, and it is not obvious that this form of adversary naturally arises in practice. 3. There is a typo on page 4, line 204. 4. Although the theoretical results are compelling, the paper does not include experiments to assess the algorithms’ performance in practical settings or illustrate their behavior under realistic adversarial scenarios. 1. Can the authors provide concrete examples or motivating applications where an adversary selects the dataset a priori but adaptively selects queries during execution? This would help clarify the practical relevance of the adversary model. 2. Could alternative adversary models also be considered? For instance, in an online learning setting where data points arrive sequentially and both data and queries may be chosen adaptively, would the proposed techniques still apply? 3. The conclusion mentions adversaries with “more information,” such as timestamps. Could the authors elaborate on what forms of additional information are considered and how such leakage might affect their guarantees? 4. Do the authors intend to conduct empirical experiments to validate the performance of their algorithms or to evaluate robustness under practical adaptive attack strategies? Heavily AI-edited
PreviousPage 1 of 1 (5 total rows)Next