|
UniVA: Universal Video Agents towards Next-Generation Video Intelligence |
Recent breakthroughs in visual AI have largely treated video tasks in isolation, with specialized models excelling at generation, editing, segmentation, or understanding individually. We introduce \te... |
4.50 |
59% |
|
Fun2spec: Code Contract Synthesis At Scale |
Specification synthesis -- the problem of inferring program specification from program implementation -- is an undecidable problem. Therefore, machine learning and more specifically, autoregressive la... |
4.00 |
10% |
|
What Is Missing: Interpretable Ratings for Large Language Model Outputs |
Current Large Language Model (LLM) preference learning methods such as Proximal Policy Optimization and Direct Preference Optimization rely on direct rankings or numerical ratings of model outputs as ... |
1.50 |
0% |
|
Step-by-Step Video-to-Audio Synthesis via Negative Audio Guidance |
We propose a step-by-step video-to-audio (V2A) generation method for finer controllability over the generation process and more realistic audio synthesis. Inspired by traditional Foley workflows, our ... |
4.50 |
0% |
|
Seeing What’s Not There: Negation Understanding Needs More Than Training |
Understanding the negation in a sentence is an important part of compositional understanding and logic in natural language. Many practical AI applications, such as autonomous driving, include precise ... |
5.60 |
0% |
|
Transformers tend to memorize geometrically; it is unclear why. |
We present a clean and analyzable phenomenon that contrasts the predominant *associative* view of Transformer memory with a nascent *geometric* view. Concretely, we present an *in-weights* path-findin... |
3.00 |
0% |
|
Enjoy Your Layer Normalization with the Computation Efficiency of RMSNorm |
Layer normalization (LN) is a milestone technique in deep learning and has been widely adopted across various network architectures. However, LN introduces additional computational costs in the infere... |
4.50 |
0% |
|
Human-Alignment and Calibration of Inference-Time Uncertainty in Large Language Models |
There has been much recent interest in evaluating large language models for uncertainty calibration to facilitate model control and modulate user trust. Inference time uncertainty, which may provide a... |
3.20 |
0% |
|
Modality-Aware Quantization: Balancing Visual and Textual Fidelity in Multimodal Compression |
Vision-language models (VLMs) have achieved remarkable capabilities across multimodal tasks, yet their deployment remains constrained by substantial computational requirements. While post-training qua... |
3.33 |
35% |
|
When Agents “Misremember” Collectively: Exploring the Mandela Effect in LLM-based Multi-Agent Systems |
Recent advancements in large language models (LLMs) have significantly enhanced the capabilities of collaborative multi-agent systems, enabling them to address complex challenges. However, within thes... |
5.50 |
4% |
|
ImmunoTrace: A Meta-Agent for Immune History Tracking |
The adaptive immune system encodes an individual's exposure history in the T-cell receptor (TCR) repertoire. We present ImmunoTrace, an AI agent for immune history tracking that estimates past pathoge... |
3.00 |
40% |
|
MemoryBench: A Benchmark for Memory and Continual Learning in LLM Systems |
Scaling up data, parameters, and test-time computation has been the mainstream method to improve LLM systems (LLMsys), but their upper bounds are almost reached due to the gradual depletion of high-q... |
4.50 |
0% |
|
Structure Learning from Time-Series Data with Lag-Agnostic Structural Prior |
Learning instantaneous and time-lagged causal relationships from time-series data is essential for uncovering fine-grained, temporally-aware interactions. Although this problem has been formulated as ... |
5.50 |
8% |
|
From QKV to K/KV: Investigating Minimalist Attention Mechanisms |
Transformers have become the standard solution for various AI tasks. The widely adopted query, key, and value (QKV) formulation has played a significant role in this. Although the performance of trans... |
2.80 |
14% |
|
Discrete Feynman-Kac Correctors |
Discrete diffusion models have recently appeared as a promising alternative to the autoregressive approach for generating discrete sequences. Sample generation via gradual denoising or demasking proce... |
5.50 |
0% |
|
Architectural Plasticity for Continual Learning |
Neural networks for continual reinforcement learning (CRL) often suffer from plasticity loss—a progressive decline in their ability to learn new tasks arising from increased churn and Neural Tangent K... |
2.50 |
18% |
|
Cer-Eval: Certifiable and Cost-Efficient Evaluation Framework for LLMs |
As foundation models continue to scale, the size of trained models grows exponentially, presenting significant challenges for their evaluation. Current evaluation practices involve curating increasing... |
4.67 |
0% |
|
Eliminating Steady-State Oscillations in Distributed Optimization and Learning via Adaptive Stepsize |
Distributed stochastic optimization and learning is gaining increasing traction due to its ability to enable large-scale data processing and model training across multiple agents without the need for ... |
3.00 |
0% |
|
SA-ResGS: Self-Augmented Residual 3D Gaussian Splatting for Next Best View Selection |
We propose Self-Augmented Residual 3D Gaussian Splatting (SA-ResGS), a novel framework for stabilizing uncertainty quantification and enhancing uncertainty-aware supervision in next-best-view selectio... |
5.00 |
28% |
|
Temporal Difference Learning with Constrained Initial Representations |
Recently, there have been numerous attempts to enhance the sample efficiency of off-policy reinforcement learning (RL) agents when interacting with the environment, including architecture improvements... |
2.50 |
0% |
|
Equivariant Asynchronous Diffusion: An Adaptive Denoising Schedule for Accelerated Molecular Conformation Generation |
Recent 3D molecular generation methods primarily use asynchronous auto-regressive or synchronous diffusion models. While auto-regressive models build molecules sequentially, they're limited by a short... |
3.50 |
3% |
|
Difference-Aware Retrieval Polices for Imitation Learning |
Behavior cloning suffers from poor generalization to out-of-distribution states due to compounding errors during deployment. We present Difference-Aware Retrieval Polices for Imitation Learning (DARP)... |
5.50 |
0% |
|
Optimal Dataset Design for Nurture-then-Nature Teaching |
Designing an optimal dataset to teach a target concept to a learner has been a well-studied problem in Machine Learning. Prior works have mostly focused on unconstrained single-phase teaching, where t... |
3.50 |
0% |
|
Are Large Vision-Language Models Good Annotators for Image Tagging? |
Image tagging, a fundamental vision task, traditionally relies on human-annotated datasets to train multi-label classifiers, which incurs significant labor and costs, especially for large-scale label ... |
4.67 |
0% |
|
Critical Confabulation: Can LLMs Hallucinate for Social Good? |
LLMs hallucinate, yet some confabulations can have social affordances if carefully bounded. We propose critical confabulation (inspired by critical fabulation from African American Studies), the use o... |
6.00 |
0% |
|
Addressing divergent representations from causal interventions on neural networks |
A common approach to mechanistic interpretability is to causally manipulate model representations via targeted interventions in order to understand what those representations encode. Here we ask wheth... |
5.20 |
0% |
|
Enhancing Tool Calling in LLMs with the International Tool Calling Dataset |
Tool calling allows large language models (LLMs) to interact with external systems like APIs, enabling applications in customer support, data analysis, and dynamic content generation. Despite recent a... |
2.50 |
38% |
|
HoloGarment: 360° Novel View Synthesis of In-the-Wild Garments |
Novel view synthesis (NVS) of in-the-wild garments is a challenging task due to significant occlusions, complex human poses, and cloth deformations. Prior methods rely on synthetic 3D training data consi... |
3.50 |
0% |
|
Rethinking RL Evaluation: Can Benchmarks Truly Reveal Failures of RL Methods? |
Current benchmarks are inadequate for evaluating progress in reinforcement learning (RL) for large language models (LLMs). Despite recent benchmark gains reported for RL, we find that training on these... |
3.33 |
38% |
|
SPRIG: Improving Large Language Model Performance by System Prompt Optimization |
Large Language Models (LLMs) have shown impressive capabilities in many scenarios, but their performance depends, in part, on the choice of prompt. Past research has focused on optimizing prompts spec... |
4.00 |
0% |
|
GuidedSampling: Steering LLMs Towards Diverse Candidate Solutions at Inference-Time |
Repeated Sampling (RS) is a simple inference-time algorithm that has been shown to improve model performance on complex tasks. Although it is an effective way of scaling inference time, it often strug... |
6.00 |
10% |
|
SimCity: Multi-Agent Urban Development Simulation with Rich Interactions |
Large Language Models (LLMs) open new possibilities for constructing realistic and interpretable macroeconomic simulations. We present SimCity, a multi-agent framework that leverages LLMs t... |
3.00 |
0% |
|
Probing Confidence Regions for Early Exits in Chain-of-Thought |
Chain-of-Thought (CoT) has demonstrated remarkable problem-solving capabilities in many large language models (LLMs), but their reasoning processes often exhibit substantial redundancy. To mitigate th... |
3.50 |
19% |
|
MTS-UNMixers: Multivariate Time Series Forecasting via Channel-Time Dual Unmixing |
Multivariate time series data provide a robust framework for future predictions by leveraging information across multiple dimensions, ensuring broad applicability in practical scenarios. However, thei... |
4.50 |
40% |
|
Augmenting Research Ideation with Data: An Empirical Investigation in Social Science |
Large Language Models (LLMs) show strong potential for generating novel research ideas, yet such ideas often struggle with feasibility and effectiveness. In this paper, we investigate whether augment... |
4.67 |
0% |
|
MRAD: Zero-Shot Anomaly Detection with Memory-Driven Retrieval |
Zero-shot anomaly detection (ZSAD) often leverages pretrained vision or vision-language models, but many existing methods use prompt learning or complex modeling to fit the data distribution, resultin... |
6.00 |
3% |
|
MobileRL: Online Agentic Reinforcement Learning for Mobile GUI Agents |
Building general-purpose graphical user interface (GUI) agents has become increasingly promising with the progress in vision language models. However, developing effective mobile GUI agents with reinf... |
4.50 |
14% |
|
Discrete Diffusion Models with MLLMs for Unified Medical Multimodal Generation |
Advances in generative medical models are often constrained by modality-specific scenarios that hinder the integration of complementary evidence, such as imaging, pathology, and clinical notes. This f... |
4.50 |
16% |
|
MIMIC-VQA: Compiling Agentic Reasoners into Efficient Document VQA Models |
Document Visual Question Answering systems face a fundamental architectural dichotomy: modular agentic frameworks decompose problems into interpretable sub-tasks but incur prohibitive inference latenc... |
3.50 |
69% |
|
FingER: Fact-Level Answerability for Explainable Refusals in Multi-Hop RAG |
Large language models (LLMs) are extensively adopted in retrieval-augmented generation (RAG) systems for solving multi-hop reasoning tasks. While prior works effectively utilize retrieved external kno... |
3.00 |
9% |
|
TimeFK: Towards Time Series Forecasting via Treating LLMs as Fuzzy Key |
Time series forecasting (TSF) aims to predict future values based on historical data. Recent advancements in large language models (LLMs), which integrate cross-modal information (time series data and... |
3.50 |
22% |
|
Efficient Differentiable Contact Model with Long-range Influence |
With the maturation of differentiable physics, its role in various downstream applications—such as model-predictive control, robotic design optimization, and neural PDE solvers—has become increasingly... |
5.50 |
0% |
|
Multi-Modal Spiking Neural Network for Efficient and Robust Underwater Object Detection |
Multi-modal artificial neural networks (ANNs) have demonstrated strong performance gains in object detection by leveraging complementary information from diverse data modalities. However, these gains ... |
3.00 |
46% |
|
Leveraging Recursion for Efficient Federated Learning |
Federated learning algorithms perform multiple local updates on clients before communicating with the parameter server to reduce communication overhead and improve overall training efficiency. However... |
3.33 |
0% |
|
To the Best of Trust: Full-Stage Trusted Multi-modal Clustering |
Multi-modal clustering (MMC) aims to integrate complementary information from different modalities to uncover latent consistent structures and improve clustering performance. However, existing methods ... |
4.00 |
4% |
|
From Motion to Behavior: Hierarchical Modeling of Humanoid Generative Behavior Control |
Human motion generative modeling aims to synthesize complex motions from daily activities. However, current research is fragmented, focusing on either low-level, short-horizon motions or high-level, d... |
4.00 |
21% |
|
Collaborative Dual-Size Large Language Models with Dual-Stage Deferral Risk Control |
Large Language Models (LLMs) have demonstrated remarkable capabilities, yet ensuring their safe deployment remains challenging. Existing safety mechanisms, while effective against malicious inputs, of... |
3.00 |
66% |
|
CaRe-BN: Precise Moving Statistics for Stabilizing Spiking Neural Networks in Reinforcement Learning |
Spiking Neural Networks (SNNs) offer low-latency and energy-efficient decision-making on neuromorphic hardware by mimicking the event-driven dynamics of biological neurons. However, due to the discret... |
4.50 |
32% |
|
VerifyThisBench: Generating Code, Specifications, and Proofs All at Once |
Large language models (LLMs) have demonstrated remarkable progress in code generation, but many existing benchmarks are approaching saturation and offer little guarantee on the trustworthiness of the ... |
4.67 |
17% |
|
Bayesian Influence Functions for Hessian-Free Data Attribution |
Classical influence functions face significant challenges when applied to deep neural networks, primarily due to non-invertible Hessians and high-dimensional parameter spaces. We propose the local Bay... |
5.50 |
4% |