SynDoc: A Hybrid Discriminative-Generative Framework for Synthetic Domain-Adaptive Document Key Information Extraction
Domain-specific Visually Rich Document Understanding (VRDU) presents significant challenges due to the complexity and sensitivity of documents in fields such as medicine, finance, and material science...
4.50 · 5%

Distribution-Guided Expert Routing for Imbalanced Molecular Property Regression
Molecular property regression often suffers from target distribution imbalance, where standard models tend to overfit to dense target regions and underperform on rare but critical ones. This limitatio...
3.00 · 36%

Waven-Pull: Wavelet-based Anomaly Detection in Dynamic Graphs via Positive-Unlabeled Learning
Anomaly detection in dynamic graphs is vital for identifying evolving threats in domains such as social networks and financial systems. While Graph Neural Networks (GNNs) have shown promise, they typi...
3.50 · 55%

Adaptive Drug-Drug Interaction Prediction via Gauge-Aware Graph Representation and Distribution Alignment
We re-study drug-drug interaction (DDI) prediction under the conditions of data scarcity and distribution shift. In this paper, we propose a practical framework that links a compact gauge-aware graph ...
1.50 · 20%

Design Principles for TD-based Multi-Policy MORL in Infinite Horizons
Multi-Objective Reinforcement Learning (MORL) addresses problems with multiple, often conflicting goals by seeking a set of trade-off policies rather than a single solution. Existing approaches that l...
2.50 · 0%

Representational Alignment between Deep Neural Networks and Human Brain in Speech Processing under Audiovisual Noise
Speech recognition in the human brain is an incremental process that begins with acoustic processing and advances to linguistic processing. While recent studies have revealed that the hierarchy of dee...
3.00 · 0%

SaFT: Spotting Style Imitation and Filtering Content Interference for Zero-Shot LLM-Generated Text Detection
Large language models (LLMs) have achieved advanced text generation capabilities, necessitating the development of reliable LLM-generated text detection to prevent potential misuse. However, current p...
4.00 · 22%

iLRM: An Iterative Large 3D Reconstruction Model
Feed-forward 3D modeling has emerged as a promising approach for rapid and high-quality 3D reconstruction. In particular, directly generating explicit 3D representations, such as 3D Gaussian splatting...
3.33 · 5%

A Large-scale Dataset for Robust Complex Anime Scene Text Detection
Current text detection datasets primarily target natural or document scenes, where text typically appears in regular fonts and shapes, monotonous colors, and orderly layouts. The text is usually arranged a...
3.50 · 6%

Differential Fine-Tuning Large Language Models Towards Better Diverse Reasoning Abilities
Reasoning abilities of large language models (LLMs) require explicit derivations compared to general question-answering; supervised fine-tuning (SFT) can empower multiple reasoning abilities in LLMs v...
5.00 · 0%

Flash-Mono: Feed-Forward Accelerated Gaussian Splatting Monocular SLAM
Monocular 3D Gaussian Splatting SLAM suffers from critical limitations in time efficiency, geometric accuracy, and multi-view consistency. These issues stem from the time-consuming $\textit{Train-from...
5.00 · 5%

AutoWeave: Automating Web Workflow Execution with Prompt-Adaptive Multi-Agent Orchestration
Performing tasks automatically over the web using LLM-based agents has seen growing need and interest. Executing a web task based on the intent expressed by a user requires carrying out a sequence...
3.50 · 0%

Accelerate Diffusion Transformers with Feature Momentum
Diffusion models have demonstrated outstanding generative capabilities in image and video synthesis. However, their heavy computational burden, particularly due to the sequential denoising process and...
4.00 · 0%

Read the Scene, Not the Script: Outcome-Aware Safety for LLMs
Safety-aligned Large Language Models (LLMs) still show two dominant failure modes: they are easily jailbroken, or they over-refuse harmless inputs that contain sensitive surface signals. We trace both...
4.00 · 17%

An Analysis of the Cauchy Method for Different Steplength Coefficient
In this work we take the parameter r (reciprocal of the optimal steplength) as the analysis target and introduce a steplength coefficient t for the classical steepest descent method for convex quadratic optimizatio...
0.50 · 0%
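
For context, the classical steepest-descent (Cauchy) method that this abstract builds on can be sketched as follows. This is a generic illustration, not the paper's algorithm: for a convex quadratic $f(x) = \tfrac{1}{2}x^\top A x - b^\top x$, the abstract's r is the reciprocal of the exact-line-search steplength, and t scales that step (t = 1 recovers the classical method). The names `cauchy_descent`, `A`, `b`, and `t` are placeholders.

```python
import numpy as np

def cauchy_descent(A, b, x0, t=1.0, iters=100):
    """Steepest descent on f(x) = 0.5 x^T A x - b^T x with scaled Cauchy steps."""
    x = x0.astype(float)
    for _ in range(iters):
        g = A @ x - b                # gradient of the quadratic
        gg = g @ g
        if gg < 1e-30:               # gradient vanished: converged
            break
        r = (g @ A @ g) / gg         # reciprocal of the optimal (Cauchy) steplength
        x = x - (t / r) * g          # t = 1 gives the classical exact-line-search step
    return x

# Toy example: minimizer is A^{-1} b = [1, 1]
A = np.array([[3.0, 0.0], [0.0, 1.0]])
b = np.array([3.0, 1.0])
x_star = cauchy_descent(A, b, np.zeros(2))
```

For convex quadratics the classical method (t = 1) converges linearly at a rate governed by the condition number of A; the paper's analysis concerns how other choices of t change this behavior.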

Sublinear Time Quantum Sensitivity Sampling
We present a unified framework for quantum sensitivity sampling, extending the advantages of quantum computing to a broad class of classical approximation problems. Our unified framework provides a st...
5.00 · 0%

ActiveMark: on watermarking of visual foundation models via massive activations
Being trained on large-scale datasets, visual foundation models (VFMs) can be fine-tuned for diverse downstream tasks, achieving remarkable performance and efficiency in various computer vision app...
2.67 · 0%

SuperActivators: Transformers Concentrate Concept Signals in Just a Handful of Tokens
Concept vectors aim to enhance model interpretability by linking internal representations with human-understandable semantics, but their utility is often limited by noisy and inconsistent activations...
5.50 · 0%

PRPO: Collaborative Online Policy Learning in Personalized RLHF
Personalizing Large Language Models (LLMs) requires capturing user preferences without centralizing private data, prompting a multi-agent local fine-tuning setup. While on-policy algorithms, as applie...
3.50 · 0%

Trajectory-Aware Verbalized Optimization for Multi-Agent Systems
Large language model (LLM)-based multi-agent systems have shown significant potential, but their effectiveness often depends on manually engineered prompts, which are refined through labor-intensive t...
2.50 · 60%

Counterfactual Structural Causal Bandits
Causal reasoning lies at the heart of robust and generalizable decision-making, and the *Pearl Causal Hierarchy* provides a formal language for distinguishing between observational ($\mathcal{L}_1$), ...
5.50 · 0%

SimULi: Real-Time LiDAR and Camera Simulation with Unscented Transforms
Rigorous testing of autonomous robots, such as self-driving vehicles, is essential to ensure their safety in real-world deployments. This requires building high-fidelity simulators to test scenarios b...
6.00 · 0%

When to Use Which? An Investigation of Search Methods on Expensive Black-box Optimisation Problems
Many real-world optimisation problems are black-box in the sense that the structure of their objective function is not accessible or exploitable. Some such Black-Box Optimisation (BBO) problems are...
5.00 · 0%

Variational Masked Diffusion Models
Masked diffusion models have recently emerged as a flexible framework for discrete generative modeling. However, a key limitation of standard masked diffusion is its inability to effectively capture d...
4.00 · 0%

FLEXITOKENS: Flexible Tokenization for Evolving Multilingual Language Models
Multilingual language models are challenging to adapt to new data distributions by simple finetuning due to the rigidity of their subword tokenizers, which typically remain unchanged during adaptation...
3.00 · 0%

Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training
While scaling laws for Large Language Models (LLMs) traditionally focus on proxy metrics like pretraining loss, predicting downstream task performance has been considered unreliable. This paper challe...
6.00 · 6%

Smoothie: Smoothing Diffusion on Token Embeddings for Text Generation
Diffusion models have achieved state-of-the-art performance in generating images, audio, and video, but their adaptation to text remains challenging due to its discrete nature. Prior approaches either...
4.50 · 0%

Learning Ordinal Probabilistic Reward from Preferences
Reward models are crucial for aligning large language models (LLMs) with human values and intentions. Existing approaches follow either Generative (GRMs) or Discriminative (DRMs) paradigms, yet both s...
5.00 · 0%

Structured RAG for Answering Aggregative Questions
Retrieval-Augmented Generation (RAG) has become the dominant approach for answering questions over large corpora. However, current datasets and methods are highly focused on cases where only a small p...
4.00 · 0%

Data Pruning: Counting the Frequency of Loss Transition from Above-Average to Below-Average (FATB) During Early Training
In this paper, we propose a novel data pruning algorithm named FATB, which aims to remove potentially redundant data and inherent noise in the original dataset during model training, thereby identifyi...
3.20 · 5%

Harmonized Cone for Feasible and Non-conflict Directions in Training Physics-Informed Neural Networks
Physics-Informed Neural Networks (PINNs) have emerged as a powerful tool for solving PDEs, yet training is difficult due to a multi-objective loss that couples PDE residuals, initial/boundary conditio...
6.00 · 8%

Guided Domain Solver: Structured Exploration of Domain-Specific Tasks with Large Language Models
This work presents a method to solve domain-specific problems by leveraging Monte Carlo Tree Search (MCTS), Knowledge Graphs and Large Language Model (LLM) agents. At the core of this approach lies a ...
1.60 · 5%

Learning to Reason for Hallucination Span Detection
Large language models (LLMs) often generate hallucinations---unsupported content that undermines reliability. While most prior works frame hallucination detection as a binary task, many real-world app...
5.50 · 18%

Abnaolizer: An AI Agent for Converting Antibodies to Nanobodies
Nanobodies, the naturally occurring single-chain antibodies derived from camelids, have emerged as highly promising therapeutic molecules due to their high stability, small size, and ease of engineeri...
1.33 · 22%

Offline Policy Learning for Nonparametric Contextual Bandits under Relaxed Coverage
This paper is concerned with learning an optimal policy in a nonparametric contextual bandit from offline, and possibly adaptively collected data. Existing methods and analyses typically rely on i.i.d...
5.00 · 0%

GuardAlign: Robust Safety Alignment in Multimodal Large Language Models
Multimodal large language models (MLLMs) have achieved remarkable progress in vision–language reasoning tasks, yet ensuring their safety remains a critical challenge. Recent input-side defenses detect...
5.50 · 18%

Exploration Implies Data Augmentation: Generalisation in Contextual MDPs
In the zero-shot policy transfer (ZSPT) setting for contextual Markov decision processes (MDP), agents train on a fixed set of contexts and must generalise to new ones. Recent work has argued and demo...
4.00 · 0%

MedLesionVQA: A Multimodal Benchmark Emulating Clinical Visual Diagnosis for Body Surface Health
Body-surface health conditions, spanning diverse clinical departments, represent some of the most frequent diagnostic scenarios and a primary target for medical multimodal large language models (MLLMs...
5.00 · 0%

ROC-n-reroll: How verifier imperfection affects test-time scaling
Test-time scaling aims to improve language model performance by leveraging additional compute during inference. Many works have empirically studied techniques such as Best-of-N (BoN) and Rejection Sa...
6.50 · 0%
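
The Best-of-N (BoN) technique named in this abstract is simple enough to sketch: draw N candidate responses and keep the one an (imperfect) verifier scores highest. The stubs below (`generate`, `verifier_score`, `best_of_n`) are generic placeholders for illustration, not the paper's models.

```python
import random

def best_of_n(generate, verifier_score, prompt, n=8):
    """Draw n candidates for `prompt` and return the highest-scoring one."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=verifier_score)

# Toy usage: candidates are random numbers and the "verifier" simply
# prefers larger values, so BoN returns the max of n draws.
random.seed(0)
best = best_of_n(lambda p: random.random(), lambda c: c, "prompt", n=8)
```

A verifier that ranks candidates imperfectly (the paper's subject) would make the selected candidate only probabilistically better than a single draw, which is where ROC-style analysis of the verifier comes in.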

DISCO: Mitigating Bias in Deep Learning with Conditional Distance Correlation
Dataset bias often leads deep learning models to exploit spurious correlations instead of task-relevant signals. We introduce the Standard Anti-Causal Model (SAM), a unifying causal framework that cha...
4.50 · 0%

Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training
Large Language Models (LLMs), despite being trained on text alone, surprisingly develop rich visual priors. These priors allow latent visual capabilities to be unlocked for vision tasks with a relativ...
7.00 · 4%

MindPilot: Closed-loop Visual Stimulation Optimization for Brain Modulation with EEG-guided Diffusion
Whereas most brain–computer interface research has focused on decoding neural signals into behavior or intent, the reverse challenge—using controlled stimuli to steer brain activity—remains far less u...
5.50 · 24%

A^2TG: Adaptive Anisotropic Textured Gaussians for Efficient 3D Scene Representation
Gaussian Splatting has emerged as a powerful representation for high-quality, real-time 3D scene rendering. While recent works extend Gaussians with learnable textures to enrich visual appearance, exi...
5.50 · 7%

DiffTrans: Differentiable Geometry-Materials Decomposition for Reconstructing Transparent Objects
Reconstructing transparent objects from a set of multi-view images is a challenging task due to the complicated nature and indeterminate behavior of light propagation. Typical methods are primarily ta...
5.00 · 0%

Generative Counterfactual Manifold Perturbation: A Robust Framework for Treatment Effect Estimation with Unobserved Confounders
Estimating treatment effects from observational data is difficult when unobserved confounders create spurious associations that bias simple estimators. Recent generative approaches learn outcome distr...
2.67 · 38%

Joint Distillation for Fast Likelihood Evaluation and Sampling in Flow-based Models
Log-likelihood evaluation enables important capabilities in generative models, including model comparison, certain fine-tuning objectives, and many downstream applications. Yet paradoxically, some of ...
4.50 · 4%

GateFlow: Mitigating Shortcut Learning in VLA Models via Gated Flow Matching
Vision-Language-Action (VLA) models promise general-purpose robotic intelligence by leveraging pretrained vision-language representations. However, these models suffer from shortcut learning—exploitin...
4.00 · 87%

Multimodal Dataset Distillation via Phased Teacher Models
Multimodal dataset distillation aims to construct compact synthetic datasets that enable efficient compression and knowledge transfer from large-scale image-text data. However, existing approaches oft...
4.50 · 7%

MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency
Current text-to-image generative models are trained on large uncurated datasets to enable diverse generation capabilities. However, this does not align well with user preferences. Recently, reward mod...
4.50 · 43%

Convergence of Muon with Newton-Schulz
We analyze Muon as originally proposed and used in practice---using the momentum orthogonalization with a few Newton-Schulz steps. The prior theoretical results replace this key step in Muon with an e...
6.50 · 0%
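
The Newton-Schulz orthogonalization step this abstract analyzes can be illustrated with the classical cubic iteration below. This is a generic sketch only: Muon in practice uses a tuned polynomial variant with different coefficients, and the function name is a placeholder.

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=30):
    """Cubic Newton-Schulz iteration: drives the singular values of G toward 1,
    converging to (approximately) the orthogonal polar factor of G."""
    X = G / np.linalg.norm(G)            # Frobenius scaling keeps singular values <= 1
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ X.T @ X  # each singular value s maps to 1.5 s - 0.5 s^3
    return X

G = np.random.default_rng(0).standard_normal((4, 4))
Q = newton_schulz_orthogonalize(G)
# Q is approximately orthogonal: Q @ Q.T is close to the identity
```

Because each singular value s in (0, 1] is pushed monotonically toward the fixed point 1 by the map 1.5 s - 0.5 s^3, a handful of matrix multiplies suffices, which is what makes the step cheap enough to run inside an optimizer update.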