|
Multi-Source Collaborative Style Augmentation and Domain-Invariant Learning for Federated Domain Generalization |
Federated domain generalization aims to learn a generalizable model from multiple decentralized source domains for deploying on the unseen target domain. Style augmentation approaches have achieved si... |
4.00 |
0% |
See Reviews |
View AI Dashboard |
|
VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning |
Reinforcement learning with verifiable rewards (RLVR) improves reasoning in LLMs but struggles with exploration, an issue that still persists for Multimodal LLMs (MLLMs). Current methods treat the vis... |
3.00 |
17% |
See Reviews |
View AI Dashboard |
|
RTG: Reverse Trajectory Generation for Reinforcement Learning Under Sparse Reward |
Deep Reinforcement Learning (DRL) under sparse reward conditions remains a long-standing challenge in robotic learning. In such settings, extensive exploration is often required before meaningful rewa... |
5.00 |
12% |
See Reviews |
View AI Dashboard |
|
SumRA: Parameter Efficient Fine-tuning with Singular Value Decomposition and Summed Orthogonal Basis |
Parameter-efficient fine-tuning (PEFT) aims to adapt large pretrained speech models using fewer trainable parameters while maintaining performance. Low-Rank Adaptation (LoRA) achieves this by decompos... |
5.33 |
5% |
See Reviews |
View AI Dashboard |
|
Strategic Filtering for Content Moderation: Free Speech or Free of Distortion? |
User-generated content (UGC) on social media platforms is vulnerable to incitements and manipulations, necessitating effective regulations. To address these challenges, those platforms often deploy au... |
3.50 |
0% |
See Reviews |
View AI Dashboard |
|
Tree Reward-Aligned Search for TReASURe in Masked Diffusion Language Models |
Tree search has recently emerged as a powerful framework for aligning generative models with task-specific rewards at test time. Applying tree search to Masked Diffusion Language Models, however, int... |
3.33 |
13% |
See Reviews |
View AI Dashboard |
|
Radio Frequency Ray Tracing via Stochastic Geometry |
Radio frequency (RF) propagation modeling is essential for the design, analysis, and optimization of modern wireless sensing and communication systems. However, accurately modeling RF propagation in e... |
4.50 |
4% |
See Reviews |
View AI Dashboard |
|
UniHM: Unified Dexterous Hand Manipulation with Vision Language Model |
Planning physically feasible dexterous hand manipulation is a central challenge in robotic manipulation and Embodied AI. Prior work typically relies on object-centric cues or precise hand-object inter... |
5.50 |
14% |
See Reviews |
View AI Dashboard |
|
Unveiling the Potential of Diffusion Large Language Model in Controllable Generation |
Controllable generation is a fundamental task in NLP with many applications, providing a basis for applications ranging from function calling to agentic communication. However, even state-of-the-art autoregressive Large Langua... |
5.00 |
18% |
See Reviews |
View AI Dashboard |
|
MAV-SLAM: Multi-LLM-Agent Crew for Visual SLAM with 3D Gaussian Splatting |
Visual Simultaneous Localization and Mapping (SLAM) reconstructs the metric structure of the physical world from sensor imagery, enabling precise robotic pose estimation. However, environmentally indu... |
3.00 |
14% |
See Reviews |
View AI Dashboard |
|
Speech Codecs Beyond Compression: Towards Autoregressive Generative Modeling |
Recent advances in speech language models have leveraged discrete speech representations from pretrained codecs to enable scalable training and generation. However, existing codecs are primarily desig... |
3.00 |
37% |
See Reviews |
View AI Dashboard |
|
ROSA: Harnessing Robot States for Vision-Language and Action Alignment |
Vision-Language-Action (VLA) models have recently made significant advances in multi-task, end-to-end robotic control, due to the strong generalization capabilities of Vision-Language Models (VLMs). A ... |
3.00 |
4% |
See Reviews |
View AI Dashboard |
|
DanceTogether: Generating Interactive Multi-Person Video without Identity Drifting |
Controllable video generation (CVG) has advanced rapidly, yet current systems falter when more than one actor must move, interact, and exchange positions under noisy control signals. We address this g... |
4.50 |
13% |
See Reviews |
View AI Dashboard |
|
MTRE: Multi-Token Reliability Estimation for Hallucination Detection in VLMs |
Vision–language models (VLMs) now rival human performance on many multimodal tasks, yet they still hallucinate objects or generate unsafe text. Current hallucination detectors, e.g., single-token line... |
5.00 |
12% |
See Reviews |
View AI Dashboard |
|
Bounding Conditional Value-at-Risk via Auxiliary Distributions with Bounded Discrepancies |
In this paper, we develop a theoretical framework for bounding the CVaR of a random variable $X$ using another related random variable $Y$, under assumptions on their cumulative and density functions.... |
3.00 |
0% |
See Reviews |
View AI Dashboard |
|
Rethinking and Benchmarking Large Language Models for Graph Reasoning |
Large Language Models (LLMs) for Graph Reasoning have been extensively studied over the past two years, with work on enabling LLMs to understand graph structures and reason on graphs to solve various gra... |
4.50 |
0% |
See Reviews |
View AI Dashboard |
|
PATEin: A Privacy-Preserving Framework for Knowledge Integration via Adaptive Teacher Selection in C-LLMs |
In-context learning (ICL) enables task adaptation without modifying model parameters, making it well-suited for commercial large language models (C-LLMs) with closed-source constraints. However, ICL p... |
4.50 |
26% |
See Reviews |
View AI Dashboard |
|
DHEvo: Data-Algorithm Based Heuristic Evolution for Generalizable MILP Solving |
Primal heuristics are crucial for accelerating the solving process of mixed integer linear programming (MILP) problems. While large language models (LLMs) have shown great promise in generating effective heu... |
4.50 |
5% |
See Reviews |
View AI Dashboard |
|
AlignPose: Generalizable 6D Pose Estimation via Multi-view Feature-metric Alignment |
Single-view RGB model-based object pose estimation methods achieve strong generalization performance but are fundamentally limited by depth ambiguity, clutter, and occlusions. Multi-view pose estimati... |
4.00 |
0% |
See Reviews |
View AI Dashboard |
|
Controllable Test-Time Scaling via Sparse Autoencoder‑Based Reasoning Steering |
A common Test-Time Scaling (TTS) strategy for Large Language Model (LLM) reasoning is to allocate additional computation during inference to generate longer Chains-of-Thought (CoTs). However, simpl... |
3.00 |
0% |
See Reviews |
View AI Dashboard |
|
Data-Augmented Few-Shot Neural Emulator for Computer-Model System Identification |
Partial differential equations (PDEs) underpin the modeling of many natural and engineered systems. It can be convenient to express such models as neural PDEs rather than using traditional numerical P... |
3.00 |
7% |
See Reviews |
View AI Dashboard |
|
SeBA: Semi-supervised few-shot learning via Separated-at-Birth Alignment for tabular data |
Learning from scarce labeled data with a larger pool of unlabeled samples, known as semi-supervised few-shot learning (SS-FSL), remains critical for applications involving tabular data in domains like... |
4.50 |
0% |
See Reviews |
View AI Dashboard |
|
MARS: Harmonizing Multimodal Convergence via Adaptive Rank Search |
Fine-tuning Multimodal Large Language Models (MLLMs) with parameter-efficient methods like Low-Rank Adaptation (LoRA) is crucial for task adaptation. However, imbalanced training dynamics across modal... |
4.80 |
38% |
See Reviews |
View AI Dashboard |
|
It takes two for security: A Verifiable Co-Aggregation Protocol for Heterogeneous Federated Distillation |
Federated distillation (FD) enables efficient collaboration among heterogeneous models, yet its rising application in privacy-sensitive fields raises security concerns. Advanced countermeasures have i... |
4.67 |
0% |
See Reviews |
View AI Dashboard |
|
DuoLink: A Dual Perspective on Link Prediction via Line Graphs |
Link prediction is a fundamental task in network science with broad applications, yet state-of-the-art Graph Neural Networks (GNNs) consistently underperform simple heuristic methods on established be... |
3.00 |
23% |
See Reviews |
View AI Dashboard |
|
FlashRNN: Unlocking Parallel Training of Nonlinear RNNs for Large Language Models |
Recurrent Neural Networks (RNNs) laid the foundation for sequence modeling, but their intrinsic sequential nature restricts parallel computation, creating a fundamental barrier to scaling. This has le... |
6.50 |
0% |
See Reviews |
View AI Dashboard |
|
Multi-Human Interactive Talking Dataset |
Existing studies on talking video generation have predominantly focused on single-person monologues or isolated facial animations, limiting their applicability to realistic multi-human interactions. T... |
2.67 |
0% |
See Reviews |
View AI Dashboard |
|
Ego-VGA: A Compact Multimodal Assistant for Egocentric Video–Grounded Reasoning |
Egocentric AI assistants have emerged as a promising paradigm for real-world human–AI interaction, yet existing approaches face a critical trade-off: large language models provide strong reasoning but... |
5.33 |
37% |
See Reviews |
View AI Dashboard |
|
SelvaBox: A high‑resolution dataset for tropical tree crown detection |
Detecting individual tree crowns in tropical forests is essential to study these complex and crucial ecosystems impacted by human interventions and climate change. However, tropical crowns vary widely... |
7.00 |
0% |
See Reviews |
View AI Dashboard |
|
CryptoX: Compositional Reasoning Evaluation of Large Language Models |
Compositional reasoning ability has long been regarded as critical to the generalization and intelligence emergence of large language models (LLMs). However, despite numerous reasoning-related ben... |
3.50 |
0% |
See Reviews |
View AI Dashboard |
|
Medical Decision Tree-Enhanced LLMs for Interpretable Reasoning |
Large Language Models have made significant strides in medical reasoning. However, challenges remain due to their limited medical knowledge and the risk of hallucinations. While RAG methods can mitiga... |
3.00 |
6% |
See Reviews |
View AI Dashboard |
|
Memorization is Not Learning: Delineated through Features and Labels |
Although deep learning is widely adopted for its capability to fit training data effectively, it often memorizes outliers and/or mislabeled instances, a phenomenon known as label memorization. As for ... |
3.00 |
0% |
See Reviews |
View AI Dashboard |
|
The Markovian Thinker |
Reasoning LLMs suffer from quadratic compute growth as their context length increases, making reinforcement learning with verifiable rewards (RLVR) and test-time scaling prohibitively expensive. Prior... |
6.00 |
0% |
See Reviews |
View AI Dashboard |
|
Don't Throw Away Your Pretrained Model |
Alignment training has tradeoffs: it helps language models (LMs) gain in reasoning and instruction following but might lose out on skills such as creativity and calibration, where unaligned base model... |
5.50 |
0% |
See Reviews |
View AI Dashboard |
|
Ego-Foresight: Self-supervised Learning of Agent-Aware Representations for Improved RL |
Despite the significant advances in Deep Reinforcement Learning (RL) observed in the last decade, the amount of training experience necessary to learn effective policies remains one of the primary con... |
5.50 |
0% |
See Reviews |
View AI Dashboard |
|
More Thought, Less Accuracy? On the Dual Nature of Reasoning in Vision-Language Models |
Reasoning has emerged as a pivotal capability in Large Language Models (LLMs). Through Reinforcement Learning (RL), typically Group Relative Policy Optimization (GRPO), these models are able to solve ... |
5.50 |
0% |
See Reviews |
View AI Dashboard |
|
PepGlider: Attribute Regularized VAE for Interpretable and Controllable Peptide Design |
Computational peptide design requires precise control over physicochemical properties that often exhibit complex correlations. Existing generative models rely on simplistic discrete conditioning mecha... |
4.00 |
71% |
See Reviews |
View AI Dashboard |
|
RainDiff: End to End Precipitation Nowcasting Via Token-wise Attention Diffusion |
Precipitation nowcasting, predicting future radar echo sequences from current observations, is a critical yet challenging task due to the inherently chaotic and tightly coupled spatio-temporal dynamic... |
4.00 |
15% |
See Reviews |
View AI Dashboard |
|
EGG-SR: Embedding Symbolic Equivalence into Symbolic Regression via Equality Graph |
Symbolic regression seeks to uncover physical laws from experimental data by searching for closed-form expressions, which is an important task in AI-driven scientific discovery. Yet the exponential gr... |
4.67 |
0% |
See Reviews |
View AI Dashboard |
|
The Consequences of the Intrinsic Gap Between Reward Beliefs and MDP Rewards |
Deep neural policies have gained the ability to learn and execute sequences of decisions in MDPs that involve complex and high-dimensional states. Despite the growing use of reinforcement learning in ... |
4.00 |
0% |
See Reviews |
View AI Dashboard |
|
Reduce What You Use: Input‑Aware Matrix‑Multiplication Pruning for LLMs |
Transformer-based language models achieve strong performance but at high computational cost, raising the question of whether their full dimensional capacity is necessary at inference. We introduce Red... |
3.60 |
37% |
See Reviews |
View AI Dashboard |
|
COFormer: Towards a Foundation Model for Solving Combinatorial Optimization Problems |
Combinatorial Optimization Problems (COPs) encompass a wide range of real-world scenarios. While learning-based methods have achieved notable success on specialized COPs, the development of a unified... |
4.50 |
0% |
See Reviews |
View AI Dashboard |
|
Evidence for Limited Metacognition in LLMs |
The possibility of LLM self-awareness and even sentience is gaining increasing public attention and has major safety and policy implications, but the science of measuring them is still in a nascent st... |
5.33 |
0% |
See Reviews |
View AI Dashboard |
|
Self-Organizing Resonant Network |
We introduce the Self-Organizing Resonant Network (SORN), a novel learning paradigm that operates without backpropagation. To address core challenges in representation quality, learning stability, and... |
4.00 |
96% |
See Reviews |
View AI Dashboard |
|
Circuit Insights: Towards Interpretability Beyond Activations |
The fields of explainable AI and mechanistic interpretability aim to uncover the internal structure of neural networks, with circuit discovery as a central tool for understanding model computations. E... |
4.67 |
0% |
See Reviews |
View AI Dashboard |
|
Toward Resilient Watermark Detection: Stability-Aware Statistical Features for Machine-Generated Text |
The widespread adoption of large language models (LLMs) has intensified the demand for principled methods to distinguish human- from machine-generated text. Watermarking provides a promising avenue, y... |
3.50 |
40% |
See Reviews |
View AI Dashboard |
|
Robobench: A Comprehensive Evaluation Benchmark For Multimodal Large Language Models as Embodied Brain |
Building robots that can perceive, reason, and act in dynamic, unstructured environments remains a core challenge. Recent embodied systems often adopt a dual-system paradigm, where System 2 handles hi... |
3.50 |
7% |
See Reviews |
View AI Dashboard |
|
Private and debiased model training: A fair differential privacy gradient framework |
Deep learning models are vulnerable to leaking private information about the training data. Differential privacy (DP) is increasingly implemented in deep learning to preserve data privacy through dif... |
4.00 |
0% |
See Reviews |
View AI Dashboard |
|
G4Splat: Geometry-Guided Gaussian Splatting with Generative Prior |
Despite recent advances in leveraging generative prior from pre-trained diffusion models for 3D scene reconstruction, existing methods still face two critical limitations. First, due to the lack of re... |
5.00 |
0% |
See Reviews |
View AI Dashboard |
|
T1: One-to-One Channel-Head Binding for Multivariate Time-Series Imputation |
Imputing missing values in multivariate time series remains challenging, especially under diverse missing patterns and heavy missingness. Existing methods suffer from suboptimal performance as corrupt... |
5.50 |
33% |
See Reviews |
View AI Dashboard |