ICLR 2026 - Submissions

Submissions

Summary Statistics

Quantity AI Content	Count	Avg Rating
0-10%	11864 (61%)	4.36
10-30%	3952 (20%)	4.14
30-50%	1846 (9%)	3.93
50-70%	1026 (5%)	3.75
70-90%	494 (3%)	3.39
90-100%	199 (1%)	2.90
Total	19490 (100%)	4.20
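
The Total row can be cross-checked against the bucket rows. The sketch below is a minimal, hypothetical check (the names `BUCKETS`, `weighted_average`, and `shares` are illustrative and not part of the page). It assumes the overall average rating is the count-weighted mean of the per-bucket averages, which is only approximate because the bucket averages are rounded to two decimals and unrated (N/A) submissions may be handled differently. Note that the bucket counts sum to 19381, not 19490; the remaining 109 rows are presumably not assigned to any bucket, which is an assumption.

```python
# Rows copied from the Summary Statistics table: (bucket, count, avg rating).
BUCKETS = [
    ("0-10%", 11864, 4.36),
    ("10-30%", 3952, 4.14),
    ("30-50%", 1846, 3.93),
    ("50-70%", 1026, 3.75),
    ("70-90%", 494, 3.39),
    ("90-100%", 199, 2.90),
]
TOTAL_ROWS = 19490  # from the Total row of the table


def weighted_average(rows):
    """Count-weighted mean of the per-bucket average ratings."""
    total_count = sum(count for _, count, _ in rows)
    weighted_sum = sum(count * rating for _, count, rating in rows)
    return weighted_sum / total_count


def shares(rows, total):
    """Percentage of all rows that falls in each bucket, rounded to integers."""
    return {name: round(100 * count / total) for name, count, _ in rows}


if __name__ == "__main__":
    print(f"weighted average rating: {weighted_average(BUCKETS):.2f}")
    print(f"sum of bucket counts: {sum(c for _, c, _ in BUCKETS)}")
    for name, pct in shares(BUCKETS, TOTAL_ROWS).items():
        print(f"{name}: {pct}%")
```

Under these assumptions the weighted mean rounds to the 4.20 shown in the Total row, and the rounded shares match the percentages listed per bucket.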

Title	Abstract	Avg Rating	Quantity AI Content	Reviews	Pangram Dashboard
TIPS: A Text-Image Pairs Synthesis Framework for Robust Text-based Person Retrieval	Text-based Person Retrieval (TPR) faces critical challenges in practical applications, including zero-shot adaptation, few-shot adaptation, and robustness issues. To address these challenges, we propo...	5.00	0%	See Reviews	View AI Dashboard
An Unlearning-Enhanced General Framework for Test-Time Adaptation	Test-time Adaptation (TTA) aims to mitigate performance degradation caused by distribution shifts during testing time. While various TTA approaches exist, such as entropy minimization, pseudo-labeling...	4.00	0%	See Reviews	View AI Dashboard
Rethinking Cross-lingual Alignment: Balancing Transfer and Cultural Erasure in Multilingual LLMs	Cross-lingual alignment (CLA) aims to align multilingual representations, enabling Large Language Models (LLMs) to seamlessly transfer knowledge across languages. While intuitive, we hypothesize, this...	5.00	0%	See Reviews	View AI Dashboard
Color Blindness Test Images as Seen by Large Vision-Language Models	Large vision-language models (LVLMs) are fairly powerful in understanding this colorful world, yet their reasoning is grounded in highly entangled semantics, leaving open the question of whether they ...	2.00	52%	See Reviews	View AI Dashboard
NullGuard: Null-Space Embedding for Driftless Invisible Image Watermarking	Recent progress in text-to-image diffusion highlights the need for invisible, tamper-resilient watermarking that maintains both visual fidelity and prompt alignment. Existing approaches often compromi...	3.00	36%	See Reviews	View AI Dashboard
TTOM: Test-Time Optimization and Memorization for Compositional Video Generation	Video Foundation Models (VFMs) exhibit remarkable visual generation performance, but struggle in compositional scenarios (e.g., motion, numeracy, and spatial relation). In this work, we introduce **Te...	5.50	0%	See Reviews	View AI Dashboard
How Long Do Model Patches Last? A Temporal Perspective on PortLLM	As large language models (LLMs) undergo regular updates through continual pretraining, the temporal reliability of downstream fine-tuning methods becomes increasingly important. Parameter-efficient me...	4.00	0%	See Reviews	View AI Dashboard
Adaptive Guidance Accelerates Reinforcement Learning of Reasoning Models	We study the process through which reasoning models trained with reinforcement learning on verifiable rewards (RLVR) can learn to solve new problems. We find that RLVR drives performance in two main w...	4.00	4%	See Reviews	View AI Dashboard
Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms	Coarse data arise when learners observe only partial information about samples; namely, a set containing the sample rather than its exact value. This occurs naturally through measurement rounding, sen...	6.50	0%	See Reviews	View AI Dashboard
On the Mechanism and Dynamics of Modular Addition: Fourier Features, Lottery Ticket, and Grokking	We present a comprehensive analysis of how two-layer neural networks learn features to solve the modular addition task. Our work provides a full mechanistic interpretation of the learned model and a t...	3.00	0%	See Reviews	View AI Dashboard
Flow Marching for a Generative PDE Foundation Model	Pretraining on large-scale collections of PDE-governed spatiotemporal trajectories has recently shown promise for building generalizable models of dynamical systems. Yet most existing PDE foundation m...	2.50	0%	See Reviews	View AI Dashboard
Progressive Multistep Data-free Diffusion Distillation	While one-step distillation achieves strong single-step generation, these methods are not inherently flexible for multi-step sampling. Efforts to adapt them beyond one step frequently lead to reliance...	3.50	1%	See Reviews	View AI Dashboard
SPARC: SURVIVAL PSEUDO-LABEL ADAPTIVE REFINEMENT AND CALIBRATION	Accurate survival prediction is critical for oncology, public health, and reliability engineering, yet existing methods remain constrained by limited follow-up, heavy censoring, and static pseudo-labe...	3.00	78%	See Reviews	View AI Dashboard
Tell me Habibi, is it Real or Fake?	Deepfake generation methods are evolving fast, making fake media harder to detect and raising serious societal concerns. Most deepfake detection and dataset creation research focuses on monolingual co...	5.33	8%	See Reviews	View AI Dashboard
DemoReranker: Enhancing the In-context Learning Capability of Multi-modal Large Models via Demonstration Reranking	In the deployment of Large Multi-modal Models (LMMs), researchers and practitioners often rely on simplistic strategies for in-context learning (ICL), such as reusing fixed demonstrations across diver...	N/A	7%	See Reviews	View AI Dashboard
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers	We introduce MCP-Bench, a benchmark for evaluating large language models (LLMs) on realistic, multi-step tasks that demand tool use, cross-tool coordination, precise parameter control, and planning/re...	6.00	26%	See Reviews	View AI Dashboard
Module-Aware Parameter-Efficient Machine Unlearning on Transformers	Transformer has become fundamental to a vast series of pre-trained large models that have achieved remarkable success across diverse applications. Machine unlearning, which focuses on efficiently remo...	3.50	0%	See Reviews	View AI Dashboard
Toward Efficient Exploration by Large Language Model Agents	A burgeoning area within reinforcement learning (RL) is the design of sequential decision-making agents centered around large language models (LLMs). While autonomous decision-making agents powered by...	5.00	0%	See Reviews	View AI Dashboard
LSPO: Length-aware Dynamic Sampling for Policy Optimization in LLM Reasoning	Since the release of Deepseek-R1, reinforcement learning with verifiable rewards (RLVR) has become a central approach for training large language models (LLMs) on reasoning tasks. Recent work has larg...	2.50	6%	See Reviews	View AI Dashboard
AgentAlign: Navigating Safety Alignment in the Shift from Informative to Agentic Large Language Models	The emergence of agentic capabilities in large language models fundamentally transforms their risk profile from passive information providers to autonomous action executors, introducing unprecedented ...	4.00	33%	See Reviews	View AI Dashboard
Federated Hierarchical Anti-Forgetting Framework for Class-Incremental Learning with Large Pre-Trained Models	Large pre-trained models, such as BERT, have demonstrated strong performance across various tasks. However, they are vulnerable to catastrophic forgetting in incremental learning, particularly in fede...	3.33	92%	See Reviews	View AI Dashboard
Physics-Preserving Compression of High-Dimensional Plasma Turbulence Simulations	High-fidelity scientific simulations are now producing unprecedented amounts of data, creating a storage and analysis bottleneck. A single simulation can generate tremendous data volumes, often forcin...	5.50	0%	See Reviews	View AI Dashboard
The LLM Bottleneck: Why Open-Source Vision LLMs Struggle with Hierarchical Visual Understanding	This paper reveals that many open-source language models (LLMs) lack hierarchical knowledge about our visual world, unaware of even well-established biology taxonomies. This shortcoming makes LLMs a ...	4.00	0%	See Reviews	View AI Dashboard
TemporalBench: Evaluating Fine-Grained Temporal Dynamics Understanding for Multimodal Models	Understanding fine-grained temporal dynamics is crucial for multimodal video comprehension and generation. Due to the lack of fine-grained temporal annotations, existing video benchmarks mostly resemb...	4.00	11%	See Reviews	View AI Dashboard
GraphPCB: Graph-encoded Printed Circuit Board Datasets for Component Classification with Graph Neural Networks	We present a graph-based framework for Printed Circuit Board (PCB) image analysis, targeting core hardware assurance tasks such as IC segmentation and component identification. PCB images differ funda...	3.00	0%	See Reviews	View AI Dashboard
HoVer: Holistic Verification for Semantic-Aware Speculative Generation	We introduce HoVer, a semantic-aware speculative generation framework that accelerates large language model (LLM) inference without retraining. HoVer employs Holistic Verification: a lightwei...	3.33	7%	See Reviews	View AI Dashboard
Safety-Biased Policy Optimisation: Towards Hard-Constrained Reinforcement Learning via Trust Regions	Reinforcement learning (RL) in safety-critical domains requires agents to maximise rewards while strictly adhering to safety constraints. Existing approaches, such as Lagrangian and projection-based m...	2.00	6%	See Reviews	View AI Dashboard
LaB-RAG: Label Boosted Retrieval Augmented Generation for Radiology Report Generation	In the current paradigm of image captioning, deep learning models are trained to generate text from image embeddings of latent features. We challenge the assumption that fine-tuning of large, bespoke ...	2.50	0%	See Reviews	View AI Dashboard
Expert or not? Assessing data quality in offline reinforcement learning	Offline reinforcement learning (RL) learns exclusively from static datasets, without further interaction with the environment. In practice, such datasets vary widely in quality, often mixing expert, s...	2.00	10%	See Reviews	View AI Dashboard
FACT: Fine-grained Across-variable Convolution for Multivariate Time Series Forecasting	Modeling the relationships among variables has become increasingly important, particularly in high-dimensional multivariate time series forecasting tasks. However, most existing methods primarily focu...	5.00	9%	See Reviews	View AI Dashboard
Leave No Observation Behind: Real-time Correction for VLA Action Chunks	To improve efficiency and temporal coherence, Vision-Language-Action (VLA) models often predict action chunks; however, this action chunking harms reactivity under inference delay and long horizons. W...	3.50	0%	See Reviews	View AI Dashboard
ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation	While recent generative models synthesize high-quality visual content, they still struggle with generating rare or fine-grained concepts. To address this challenge, we explore the usage of Retrieval-A...	5.00	0%	See Reviews	View AI Dashboard
Knowledgeless Language Models: Decoupling Linguistic Competence and Factual Knowledge	Language models capture a broad spectrum of human knowledge due to being trained on large and diverse real-world datasets. However, this knowledge is not always necessary for linguistic tasks and can ...	4.50	32%	See Reviews	View AI Dashboard
NatADiff: Adversarial Boundary Guidance for Natural Adversarial Diffusion	Adversarial samples exploit irregularities in the manifold "learned" by deep learning models to cause misclassifications. The study of these adversarial samples provides insight into the features a mo...	5.50	0%	See Reviews	View AI Dashboard
FoleyGenEx: Unified Video-to-Audio Generation with Multi-Modal Control, Temporal Alignment, and Semantic Precision	We introduce FoleyGenEx, a unified framework for video-to-audio (VTA) generation that integrates multi-modal control, frame-level temporal alignment, and fine-grained semantic expressivity, enabling s...	4.50	5%	See Reviews	View AI Dashboard
Benchmarking Large Vision-Language Models on Fine-Grained Image Tasks: A Comprehensive Evaluation	Recent advancements in Large Vision-Language Models (LVLMs) have demonstrated remarkable multimodal perception capabilities, garnering significant attention. While numerous evaluation studies have eme...	5.00	7%	See Reviews	View AI Dashboard
The Social Welfare Function Leaderboard: When LLM Agents Allocate Social Welfare	Large language models (LLMs) are increasingly entrusted with high-stakes decisions that affect human welfare. However, the principles and values that guide these models when distributing scarce societ...	3.50	39%	See Reviews	View AI Dashboard
TangleScore: Tangle-Guided Purge and Imprint for Unstructured Knowledge Editing	Large language models (LLMs) struggle with inaccurate and outdated information, driving the emergence of knowledge editing as a lightweight alternative. Despite their effectiveness in modifying struct...	4.50	15%	See Reviews	View AI Dashboard
Monotone Near-Zero-Sum Games	Zero-sum and non-zero-sum (aka general-sum) games are relevant in a wide range of applications. While general non-zero-sum games are computationally hard, researchers focus on the special class of mo...	6.00	0%	See Reviews	View AI Dashboard
Real-IKEA : Simulating What Robots Will Really See and Touch	Robotic manipulation has greatly benefited from simulated data, yet in contact-rich tasks policies often fail to transfer. We trace this sim-to-real gap to three sources: object assets, physical reali...	3.00	41%	See Reviews	View AI Dashboard
Protein Structure Tokenization via Geometric Byte Pair Encoding	Protein structure is central to biological function, and enabling multimodal protein models requires joint reasoning over sequence, structure, and function. A key barrier is the lack of principled pro...	7.50	0%	See Reviews	View AI Dashboard
CORE: Concept-Oriented Reinforcement for Bridging the Definition–Application Gap in Mathematical Reasoning	Large language models (LLMs) often solve drill-style math exercises yet fail to apply the concept right when the problem requires genuine understanding. Popular outcome-based RL pipelines reinforce fi...	5.00	9%	See Reviews	View AI Dashboard
Split Decisions: VLM-Guided Action Sampling for Efficient RL Exploration	Reinforcement learning (RL) offers a general framework for adapting vision-language-action models (VLAs) to new tasks, but its effectiveness is often bottlenecked by inefficient exploration. Existing ...	2.50	0%	See Reviews	View AI Dashboard
CELAD: Compositional Evaluation for Logical Anomaly Detection	Anomaly detection (AD) has attracted significant research interest and now achieves near-perfect performance on most existing benchmarks. However, the majority of prior work has focused on detecting s...	4.50	0%	See Reviews	View AI Dashboard
Video Diffusion Model for Point Tracking	Point tracking aims to estimate pixel trajectories across video frames but remains challenging under large displacements, occlusion, and real-world artifacts. Conventional trackers, built on image-cen...	2.00	0%	See Reviews	View AI Dashboard
DiagVuln: A Holistic Conversational Benchmark for Evaluating LLMs on Vulnerability Assessment	With over 20,000 Common Vulnerabilities and Exposures (CVEs) reported annually, software vulnerabilities represent a critical cybersecurity challenge requiring automated assessment tools. While la...	4.00	0%	See Reviews	View AI Dashboard
AntiFault: A Fault-Tolerant and Self-Recoverable Floating-Point Format for Deep Neural Networks	Artificial Intelligence (AI) is increasingly deployed in safety-critical applications, where reliability is crucial. However, these AI-based systems are vulnerable to soft errors, where even a single ...	2.00	1%	See Reviews	View AI Dashboard
Fair Diffusion Sampling without Demographics	Diffusion models have transformed generative tasks. Despite their expressive power, these models are known to amplify social biases. Existing approaches attempt to address bias during training, which ...	2.67	0%	See Reviews	View AI Dashboard
Towards Adversarially Robust CLIP: A Hierarchical Model Fusion Method Using Optimal Transport	In recent years, multimodal models such as CLIP have achieved impressive performance but remain vulnerable to adversarial perturbations. Although adversarial training can enhance robustness, it often ...	4.50	27%	See Reviews	View AI Dashboard
AVERAGE CONTROLLED AND AVERAGE NATURAL MICRO DIRECT EFFECTS IN SUMMARY CAUSAL GRAPHS	In this paper, we investigate the identifiability of average controlled direct effects and average natural direct effects in causal systems represented by summary causal graphs, which are abstractions...	5.50	0%	See Reviews	View AI Dashboard

Page 24 of 390 (19490 total rows)