
$DA^2$-VPR: Dynamic Architecture for Domain-Aware Visual Place Recognition
Visual Place Recognition (VPR) systems struggle with training-to-test domain shifts caused by environmental changes such as lighting, weather, and seasonal variations. Existing methods rely on input-i... |
3.50 |
62% |
|
Fed-Duet: Dual Expert-Orchestrated Framework for Continual Federated Vision-Language Learning |
Pretrained vision-language models (VLMs), such as CLIP, have shown promise in federated learning (FL) by bringing strong multimodal representations to edge devices. However, continual adaptation remai... |
5.00 |
38% |
|
On the (In)Significance of Feature Selection in High-Dimensional Datasets |
Feature selection (FS) is assumed to improve predictive performance and highlight meaningful features. We systematically evaluate this across $30$ diverse datasets, including RNA-Seq, mass spectrometr... |
4.67 |
10% |
|
PolySHAP: Extending KernelSHAP with Interaction-Informed Polynomial Regression |
Shapley values have emerged as a central game-theoretic tool in explainable AI (XAI). However, computing Shapley values exactly requires $2^d$ game evaluations for a model with $d$ features. Lundberg ... |
4.50 |
0% |
|
Towards Human-Like Event Boundary Detection in Unstructured Videos through Scene-Action Transition |
Event segmentation research in psychology shows that humans naturally parse continuous activity into meaningful episodes by detecting boundaries marked by changes in perceptual features (e.g., motion)... |
4.40 |
75% |
|
Usage-Aware Sentiment Representations in Large Language Models |
Large language models (LLMs) can encode high-level concepts as linear directions in their representation space, and sentiment has been studied in this framework. However, probe-derived sentiment direc... |
4.00 |
36% |
|
AC-ODM: Actor–Critic Online Data Mixing for Sample-Efficient LLM Pretraining |
Pretraining data coverage and composition strongly influence the generalization of large language models (LLMs). While recent data-mixing approaches transfer domain weights learned by a small proxy mo... |
4.00 |
0% |
|
Training-Free Watermarking for Autoregressive Image Generation |
Invisible image watermarking can protect image ownership and prevent malicious misuse of visual generative models. However, existing generative watermarking methods are mainly designed for diffusion m... |
4.50 |
0% |
|
Token-Level Guided Discrete Diffusion for Membrane Protein Design |
Reparameterized diffusion models (RDMs) have recently matched autoregressive methods in protein generation, motivating their use for challenging tasks such as designing membrane proteins, which posses... |
3.50 |
0% |
|
SymLight: Exploring Interpretable and Deployable Symbolic Policies for Traffic Signal Control |
Deep Reinforcement Learning has achieved significant success in automatically devising effective traffic signal control (TSC) policies. Neural policies, however, tend to be over-parameterized and non...
5.50 |
0% |
|
Beyond Raw Detection Scores: Markov-Informed Calibration for Boosting Machine-Generated Text Detection |
While machine-generated texts (MGTs) offer great convenience, they also pose risks such as disinformation and phishing, highlighting the need for reliable detection. Metric-based methods, which extrac... |
4.50 |
0% |
|
Diff-Fair: Mitigating Intersectional Bias Through Diffusion-Driven Fair Representation |
Algorithmic fairness remains a critical challenge in artificial intelligence, particularly for high-stakes domains where biased predictions can have significant societal consequences. While recent adv...
3.00 |
47% |
|
MARS: Optimizing Dual-System Deep Research via Multi-Agent Reinforcement Learning |
Large Reasoning Models (LRMs) often exhibit a tendency for overanalysis in simple tasks, where the models excessively utilize System 2-type, deliberate reasoning, leading to inefficient token generati... |
3.50 |
46% |
|
Learnable Fractional Fourier and Graph Fractional Operators for Nonstationary Graph Signals Validated with EEG Seizure Detection |
Nonstationary graph signals with time-varying spectral properties and evolving network topologies present fundamental challenges for existing deep learning architectures. We introduce learnable fracti... |
2.00 |
44% |
|
GIE-Bench: Towards Grounded Evaluation for Text-Guided Image Editing |
Editing images using natural language instructions has become a natural and expressive way to modify visual content; yet, evaluating the performance of such models remains challenging. Existing evalua... |
4.00 |
12% |
|
Benchmarking Bias Mitigation Toward Fairness Without Harm from Vision to LVLMs |
Machine learning models trained on real-world data often inherit and amplify biases against certain social groups, raising urgent concerns about their deployment at scale. While numerous bias mitigati... |
5.50 |
0% |
|
Forecasting-Conditioned Reinforcement Learning: Embedding Forecastability as an Inductive Bias |
We introduce Forecasting-Conditioned Reinforcement Learning (FoRL), an extension to model-free Reinforcement Learning (RL) agents that augments the policy with multi-step self-forecasts. FoRL is train... |
4.00 |
37% |
|
The Art of Breaking Words: Rethinking Multilingual Tokenizer Design |
While model architecture and training objectives are well-studied, tokenization, particularly in multilingual contexts, remains a relatively neglected aspect of Large Language Model (LLM) development...
2.50 |
15% |
|
Graph-Theoretic Intrinsic Reward: Guiding RL with Effective Resistance |
Exploration of dynamic environments with sparse rewards is a significant challenge in Reinforcement Learning, often leading to inefficient exploration and brittle policies. To address this, we introdu... |
4.67 |
0% |
|
Typed Chain-of-Thought: A Curry-Howard Framework for Verifying LLM Reasoning |
While Chain-of-Thought (CoT) prompting enhances the reasoning capabilities of large language models, the faithfulness of the generated rationales remains a critical open problem. We propose a novel th... |
2.00 |
64% |
|
Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States |
Autoregressive (AR) models remain the standard for natural language generation but still suffer from high latency due to strictly sequential decoding. Recent diffusion-inspired approaches, such as Lla... |
6.50 |
39% |
|
Hallucination Detection and Mitigation with Diffusion in Multi-Variate Time-Series Foundation Models |
Foundation models for natural language processing have many coherent definitions of hallucination and methods for its detection and mitigation. However, analogous definitions and methods do not exist ... |
3.33 |
0% |
|
Jailbreaking LLMs' Safeguard with Universal Magic Words for Text Embedding Models |
The security issue of large language models (LLMs) has gained wide attention recently, with various defense mechanisms developed to prevent harmful output, among which safeguards based on text embeddi... |
4.67 |
0% |
|
Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs |
Rotary Position Embeddings (RoPE) have become a standard for encoding sequence order in Large Language Models (LLMs) by applying rotations to query and key vectors in the complex plane. Standard imple... |
6.00 |
2% |
|
Scaling Laws for Parameter Pruning in LLMs |
Scaling up model parameters and training data consistently improves the performance of large language models (LLMs), but at the cost of rapidly growing memory and compute requirements, which makes dep... |
2.50 |
25% |
|
Hey, That's My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique |
Growing concerns over the theft and misuse of Large Language Models (LLMs) underscore the need for effective fingerprinting to link a model to its original version and detect misuse. We define five es... |
5.33 |
19% |
|
Focused Diffusion GAN: Object-Centric Image Generation Using Integrated GAN and Diffusion Frameworks |
Generative Adversarial Networks (GANs) and Diffusion Models (DMs) have shown significant progress in synthesizing high-quality object-centric images. However, generating realistic object-centric image... |
3.00 |
77% |
|
Long Chain-of-Thought Reasoning Across Languages |
While large reasoning models have shown remarkable ability to generate long chains-of-thought (CoT) in English, we still lack understanding of how these long-form reasoning abilities transfer to the v... |
5.50 |
0% |
|
PaT: Planning-after-Trial for Efficient Code Generation |
Large language models (LLMs) have demonstrated increasingly sophisticated capabilities for code generation. To extend the problem-solving reach of cost-efficient models to complex problems, strategic ... |
3.50 |
3% |
|
Towards Faithful Reasoning in Remote Sensing: A Perceptually-Grounded GeoSpatial Chain-of-Thought for Vision-Language Models |
Vision-Language Models (VLMs) in remote sensing often fail at complex analytical tasks, a limitation stemming from their end-to-end training paradigm that bypasses crucial reasoning steps and leads to... |
5.00 |
26% |
|
Exploring Non-linearity in Attention |
The representational ability of Transformer architectures arises from two sources of non-linearity: position-wise non-linearity via feed-forward layers and contextual non-linearity through self-attent... |
3.00 |
0% |
|
UNDERSTANDING TRANSFORMERS FOR TIME SERIES FORECASTING: A CASE STUDY ON MOIRAI
We give a comprehensive theoretical analysis of transformers as time series prediction models, with a focus on MOIRAI (Woo et al., 2024). We study its approximation and generalization capabilities...
5.33 |
0% |
|
Helmsman: Autonomous Synthesis of Federated Learning Systems via Multi-Agent Collaboration |
Federated Learning (FL) offers a powerful paradigm for training models on decentralized data, but its promise is often undermined by the immense complexity of designing and deploying robust systems. T... |
4.00 |
84% |
|
MRMR: A Realistic and Expert-Level Multidisciplinary Benchmark for Reasoning-Intensive Multimodal Retrieval |
We introduce MRMR, the first expert-level multidisciplinary multimodal retrieval benchmark requiring intensive reasoning. MRMR contains 1,502 queries spanning 23 domains, with positive documents caref... |
6.50 |
0% |
|
DUET: DISTILLED LLM UNLEARNING FROM AN EFFICIENTLY CONTEXTUALIZED TEACHER |
LLM unlearning is a technique to remove the impacts of undesirable knowledge from the model without retraining from scratch, which is indispensable towards trustworthy AI. Existing unlearning methods ... |
5.00 |
0% |
|
Cost Volume Meets Prompt: Enhancing MVS with Prompts for Autonomous Driving |
Metric depth is foundational for perception, prediction, and planning in autonomous driving. Recent zero-shot metric depth foundation models still exhibit substantial distortions under large-scale ran...
4.00 |
23% |
|
BEEP3D: Box-Supervised End-to-End Pseudo-Mask Generation for 3D Instance Segmentation |
3D instance segmentation is crucial for understanding complex 3D environments, yet fully supervised methods require dense point-level annotations, resulting in substantial annotation costs and labor o... |
5.00 |
24% |
|
Bridging the Distribution Gap to Harness Pretrained Diffusion Priors for Super-Resolution |
Diffusion models, well recognized for their strong generative priors, have recently been increasingly applied to super-resolution (SR) tasks. However, as diffusion models are trained on Gaussian-corr...
5.00 |
0% |
|
Circuits, Features, and Heuristics in Molecular Transformers |
Transformers generate valid and diverse chemical structures, but little is known about the mechanisms that enable these models to understand the rules of molecular representation. We present a mechani... |
5.00 |
65% |
|
MultiCrafter: High-Fidelity Multi-Subject Generation via Spatially Disentangled Attention and Identity-Aware Reinforcement Learning |
Multi-subject image generation aims to synthesize user-provided subjects in a single image while preserving subject fidelity, ensuring prompt consistency, and aligning with human aesthetic preferences... |
4.50 |
9% |
|
DeepResearchGuard: Deep Research with Open Domain Evaluation and Multi-Stage Guardrails for Safety |
Current deep research frameworks lack adequate evaluation procedures and stage-specific safeguards. Prior work primarily treats evaluation as question-answering accuracy. It overlooks report quality, ... |
4.00 |
0% |
|
AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes |
While knowledge distillation has become a mature field for compressing large language models (LLMs) into smaller ones by aligning their outputs or internal representations, the distillation of LLM-bas... |
2.67 |
54% |
|
PitStop: Physics-Informed Training with Gradient Stopping |
Physics-informed learning offers a powerful approach for modeling physical systems by enforcing governing equations directly within the training process. However, optimizing such models remains inhere... |
3.50 |
0% |
|
RDNAS: Robust Dual-Branch Neural Architecture Search |
Deep neural networks have achieved remarkable success but remain highly vulnerable to adversarial perturbations, posing serious challenges in safety-critical applications. We propose RDNAS, a robu...
4.00 |
51% |
|
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence |
Recent advances in depth-recurrent language models show that recurrence can decouple train-time compute and parameter count from test-time compute. In this work, we study how to convert existing pretr...
5.00 |
0% |
|
Adaptive Conformal Guidance for Learning under Uncertainty |
Learning with guidance has proven effective across a wide range of machine learning systems. Guidance may, for example, come from annotated datasets in supervised learning, pseudo-labels in semi-super... |
6.50 |
15% |
|
Fragment-Wise Interpretability in Graph Neural Networks via Molecule Decomposition and Contribution Analysis |
Graph neural networks (GNNs) are widely used in the field of predicting molecular properties. However, their black box nature limits their use in critical areas like drug discovery. Moreover, existing... |
3.20 |
7% |
|
Exposing Hallucinations To Suppress Them: VLMs Representation Editing With Generative Anchors |
Multimodal large language models (MLLMs) have achieved remarkable success across diverse vision-language tasks, yet they remain highly susceptible to hallucinations, producing content that is fluent b... |
4.00 |
0% |
|
Multimodal Few-Shot Point Cloud Segmentation via Agent Adaptation and Discriminative Deconfusion |
Few-shot 3D point cloud segmentation (FS-PCS) aims to leverage a limited amount of annotated data to enable the segmentation of novel categories. Most existing studies rely on single-modal point cloud... |
4.50 |
0% |
|
STARTrack: Learning Spatio-Temporal Representation Evolution for Target-Aware Tracking
Efficient modeling of spatio-temporal representations in videos is crucial for achieving accurate object tracking. Existing popular one-stream tracking frameworks typically introduce memory mechanisms... |
4.50 |
0% |