Transport Clustering: Solving Low-Rank Optimal Transport via Clustering
Soundness: 4: excellent
Presentation: 4: excellent
Contribution: 4: excellent
Rating: 8: accept, good paper
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
The paper builds upon the ideas introduced in [1], which establish a connection between the optimal transport problem and the k-means clustering algorithm through a non-negative factorization of the transport plan. A key distinction, however, is that while [1] focuses on clustering a single dataset, the present work extends these concepts to the co-clustering of two datasets — a framework the authors refer to as Transport Clustering. Since solving the low-rank optimal transport problem is NP-hard, the authors propose a multi-step approximation method and derive theoretical approximation ratios under various metric settings, including negative, kernel, and general metrics.
[1] Meyer Scetbon and Marco Cuturi. “Low-rank Optimal Transport: Approximation, Statistics, and Debiasing.” Advances in Neural Information Processing Systems 35 (2022): 6802–6814.
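For context, my reading of the low-rank OT objective from [1] (notation approximate: $C$ is the cost matrix, $a$ and $b$ the marginals, $\Delta_K$ the simplex over $K$ latent anchors) is
$$\min_{Q \in \Pi(a,g),\; R \in \Pi(b,g),\; g \in \Delta_K} \big\langle C,\, Q\,\mathrm{diag}(1/g)\,R^\top \big\rangle,$$
i.e., the plan is constrained to nonnegative rank at most $K$, and the factors $Q$ and $R$ induce the (co-)clusterings that the present paper builds on.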
The paper is very well written and addresses an important problem, with potential use in the development of computational tools for domain registration and alignment. The proposed methods provide a strong foundation for bridging distributional differences between datasets through transport-based formulations. Furthermore, the techniques presented in the paper could be extended to design large-scale, class-conditioned domain registration frameworks, enabling more structured and semantically meaningful alignment across complex datasets.
One major source of confusion for me is the constant switching between the assignment-form and partition formulations of the low-rank optimal transport problem. In the main body, results are stated in assignment form, whereas the proofs in the appendix are written in the partition formulation, which makes it harder for the reader to follow them.
While the authors apply their technique to synthetic data, co-clustering on CIFAR-10, and cellular transcription data, I wonder whether it could also be applied to a small domain alignment problem.
Fully human-written
Transport Clustering: Solving Low-Rank Optimal Transport via Clustering
Soundness: 3: good
Presentation: 2: fair
Contribution: 2: fair
Rating: 6: marginally above the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
Low-rank optimal transport (OT) aims to approximate the optimal transport plan by constraining it to have small nonnegative rank, effectively routing mass through a few latent anchors. This structure naturally induces a clustering of the supports: points that share an anchor form co-clusters across the source and target distributions. While the motivation for low-rank OT is scalability, the optimization problem itself is nonconvex and NP-hard, in contrast to the convex linear formulation of classical OT. In practice, however, low-rank OT algorithms often yield scalable approximations.
This paper takes the inverse view: instead of discovering clusters through a low-rank constraint, it first computes a full OT map to register the supports, then clusters the induced correspondences to identify anchors, and finally reconstructs a low-rank coupling consistent with this clustering. The resulting “transport clustering” approach reduces low-rank OT to a single generalized k-means problem and achieves a provable constant-factor approximation (between 2× and 3×, depending on the cost function) to the best rank-K OT cost for metric or kernel costs.
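For concreteness, here is a minimal sketch of the pipeline as I read it; it is my own illustrative reconstruction assuming uniform marginals, equal support sizes, a squared-Euclidean cost, and the POT and scikit-learn libraries, not the authors' implementation:

```python
import numpy as np
import ot                                   # POT: Python Optimal Transport
from sklearn.cluster import KMeans

def transport_clustering(X, Y, K):
    """X: (n, d) source points, Y: (n, d) target points, K: target nonnegative rank."""
    n = X.shape[0]
    a = b = np.full(n, 1.0 / n)

    # Step 1: a single full OT solve to register the two supports.
    C = ot.dist(X, Y)                       # squared-Euclidean cost matrix
    P = ot.emd(a, b, C)                     # full-rank plan (a permutation / n in this setting)
    sigma = P.argmax(axis=1)                # match each x_i to y_{sigma(i)}

    # Step 2: one (generalized) k-means problem on the registered pairs.
    Z = np.hstack([X, Y[sigma]])            # concatenate matched pairs
    labels = KMeans(n_clusters=K, n_init=10).fit_predict(Z)

    # Step 3: rebuild a rank-K coupling routing mass through the K clusters,
    # in the factored form P_K = Q diag(1/g) R^T.
    Q = np.zeros((n, K)); R = np.zeros((n, K))
    Q[np.arange(n), labels] = 1.0 / n
    R[sigma, labels] = 1.0 / n
    g = np.maximum(Q.sum(axis=0), 1e-12)    # inner marginal; guard against empty clusters
    P_lowrank = Q @ np.diag(1.0 / g) @ R.T
    return P_lowrank, labels
```

If this reading is right, the one-time full OT solve and the single k-means step are the only expensive components, which is why the runtime accounting discussed below matters.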
The presentation is somewhat confusing. The introduction motivates low-rank OT as a scalable alternative to full OT, yet the proposed method begins by computing a full OT map, which appears contradictory. The paper does not clearly emphasize that the full OT is computed only once to extract reusable structure for downstream efficiency. As written, it reads as if the “scalable” approach is computationally harder, whereas the real contribution is to show that, despite the NP-hardness of low-rank OT, one can obtain a constant-approximation solution through a simple clustering reduction built from a single OT computation.
The paper tackles an interesting compression problem where the goal is to incur a one-time OT computation cost in order to learn clusters that enable faster and more structured future OT computations. The connection to generalized k-means is elegant, and the constant-approximation result is novel and theoretically clean.
The presentation is lacking in clarity; it takes significant effort to piece together the motivation and the sequence of reductions. The introduction could better separate what is meant by “scalability” in this context.
The reported experiments do not account for the cost of the initial full-rank OT computation.
How does this paper relate to the result of Indyk and Price (STOC 2011), where k-median clustering provides a sparse, relative approximation of the Earth Mover’s Distance? Could their compression-based approach be viewed as an implicit low-rank or clustered approximation to OT?
Lightly AI-edited
Transport Clustering: Solving Low-Rank Optimal Transport via Clustering
Soundness: 3: good
Presentation: 3: good
Contribution: 3: good
Rating: 6: marginally above the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
The paper proposes “transport clustering’’ (TC), which reduces the low-rank OT problem to a generalized $K$-means task by first registering the cost matrix with the optimal full-rank transport plan and then solving a single clustering subproblem. The authors obtain polynomial-time constant-factor guarantees—$(1+\gamma)$ for negative-type metrics and $(1+\gamma+\sqrt{2\gamma})$ for kernel costs—and instantiate the idea with mirror-descent (GKMS) and SDP-based solvers. In the experiments, TC outperforms prior low-rank OT methods on synthetic and real datasets.
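Writing $\mathrm{LOT}_K$ for the optimal rank-$K$ transport cost and $\mathrm{TC}_K$ for the cost of the plan returned by TC (my notation, not necessarily the paper's), my reading of the stated guarantees is $\mathrm{LOT}_K \le \mathrm{TC}_K \le (1+\gamma)\,\mathrm{LOT}_K$ for negative-type metric costs and $\mathrm{TC}_K \le (1+\gamma+\sqrt{2\gamma})\,\mathrm{LOT}_K$ for kernel costs, where $\gamma$ is the gap parameter defined in the paper.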
The reduction from low-rank OT to a single clustering problem via cost registration is, in my opinion, novel: it conceptually unifies co-clustering with $K$-means while providing constant-factor guarantees, it leverages mature toolboxes such as Lloyd iterations and SDP relaxations, and it consistently improves transport costs over LOT, FRLC, and LatentOT in the experiments. Together, these make for a genuinely useful new perspective on low-rank OT.
- TC still requires a full OT solve up front, so the practical savings are unclear without runtime data.
- The constant-factor guarantees depend on assumptions and on a small gap $\gamma$ that are not quantified empirically.
- Registration for non-square or weighted marginals is only sketched, with no experiments.
- Implementation details are lacking, for example how positive column sums are ensured, whether initializations are comparable across methods, and how many iterations are used.
- The cost registration may create prohibitive memory demands.
- How sensitive is TC to errors in the transport registration?
- Do the constant-factor guarantees still hold when $n\neq m$ or when the marginals are non-uniform? If so, can you illustrate Kantorovich registration with at least one experiment in that setting?
- Could you provide ablations where baseline solvers are initialized with the same transport-registration (e.g., feeding TC's clusters into LOT/FRLC) to isolate whether the gains come from the reduction or from better initialization?
- Since Algorithm 1 begins by solving a full OT problem, can you quantify the end-to-end complexity savings? In settings where the standard OT solve already dominates runtime, does the K-means reduction still offer any benefit?
Fully AI-generated
Transport Clustering: Solving Low-Rank Optimal Transport via Clustering
Soundness: 3: good
Presentation: 3: good
Contribution: 2: fair
Rating: 6: marginally above the acceptance threshold
Confidence: 3: You are fairly confident in your assessment. It is possible that you did not understand some parts of the submission or that you are unfamiliar with some pieces of related work. Math/other details were not carefully checked.
Optimal Transport (OT) has been a popular technique in machine learning in recent years. Its low-rank variant has also received wide attention due to computational and statistical advantages. This paper proposes a method for low-rank OT, called “transport clustering”, that approximately solves the low-rank OT problem with a clustering subroutine. The approximation error is theoretically analyzed. Experiments demonstrate advantages over existing low-rank OT methods.
1. The transport clustering method is elegantly designed.
2. The authors provide theoretical proofs for the approximation factors.
3. Experimental results such as Figure 2 illustrate its advantages over existing low-rank OT methods.
1. When the marginal distributions are arbitrary, the authors outline an extension of the method (Line 267-Line 276), but the paper seems to lack theoretical guarantees for this extension. For example, does problem (9) even have a solution when the marginals are arbitrary? How robust is this extension?
2. The method relies on existing full-rank OT solvers such as the Hungarian algorithm or the Sinkhorn algorithm to register the cost matrix, which undermines the computational efficiency of low-rank OT. The Hungarian algorithm may take prohibitive runtime for large-scale real-world problems, and it may not be easy to extract a permutation matrix from the transport plan obtained by the Sinkhorn algorithm (see the sketch after this list). In addition, in some applications it may be unnecessary to consider low-rank OT after computing full-rank OT.
3. The considered tasks in Sec. 5 seem a bit too traditional.
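To make the Sinkhorn concern in point 2 concrete, here is a minimal sketch of the kind of extra rounding step that would seem to be needed; this is my own illustration, assuming POT, SciPy, and a squared-Euclidean cost, and is not taken from the paper:

```python
import numpy as np
import ot                                    # POT: Python Optimal Transport
from scipy.optimize import linear_sum_assignment

def sinkhorn_registration(X, Y, reg=0.05):
    n = X.shape[0]
    a = b = np.full(n, 1.0 / n)
    C = ot.dist(X, Y)                        # squared-Euclidean cost (an assumption)
    P = ot.sinkhorn(a, b, C, reg)            # dense, strictly positive entropic plan
    # The entropic plan is not a permutation, so recovering a hard matching requires
    # an extra assignment step (O(n^3) here) or a greedy rounding on top of Sinkhorn.
    rows, cols = linear_sum_assignment(-P)   # matching that maximizes total plan mass
    return cols                              # cols[i] = index of the y matched to x_i
```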
See above.
Fully human-written |