|
Forge: Foundational Optimization Representations from Graph Embeddings |
Combinatorial optimization problems are ubiquitous in science and engineering. Still, learning-based approaches to accelerate combinatorial optimization often require solving a large number of difficu... |
5.00 |
0% |
See Reviews |
View AI Dashboard |
|
StepORLM: A Self-Evolving Framework With Generative Process Supervision For Operations Research Language Models |
Large Language Models (LLMs) have shown promising capabilities for solving Operations Research (OR) problems. While reinforcement learning serves as a powerful paradigm for LLM training on OR problem... |
5.00 |
17% |
See Reviews |
View AI Dashboard |
|
Effective Probabilistic Time Series Forecasting with Fourier Adaptive Noise-Separated Diffusion |
Existing diffusion-based time series forecasting methods often target mixed temporal patterns or undifferentiated residuals, limiting the potential of distinct temporal components. In this paper, w... |
4.00 |
0% |
See Reviews |
View AI Dashboard |
|
Nonparametric Teaching for Sequential Property Learners |
Determining the properties of sequence-structured data, e.g., the sentiment of a text, fundamentally requires learning the implicit relationship that maps sequences to their corresponding properties. ... |
3.00 |
0% |
See Reviews |
View AI Dashboard |
|
FG-ATTN: LEVERAGING FINE-GRAINED SPARSITY IN DIFFUSION TRANSFORMERS |
Generating realistic videos/images with diffusion transformers requires evaluating attention over extremely long sequences, with attention layers accounting for the majority of generation latency. Exp... |
4.00 |
0% |
See Reviews |
View AI Dashboard |
|
LSMSeg: Unleashing the Power of Large-Scale Models for Open-Vocabulary Semantic Segmentation |
Open-vocabulary semantic segmentation requires precise pixel-level alignment of visual and textual representations, leveraging text as a universal reference to address visual disparities across divers... |
3.50 |
0% |
See Reviews |
View AI Dashboard |
|
Tackling Time-Series Forecasting Generalization via Mitigating Concept Drift |
Time-series forecasting finds broad applications in real-world scenarios. Due to the dynamic nature of time series data, it is important for time-series forecasting models to handle potential distribu... |
6.00 |
0% |
See Reviews |
View AI Dashboard |
|
SAFE: Improving LLM Systems using Sentence-Level In-generation Attribution |
Large Language Models (LLMs) are increasingly applied in various science domains, yet their broader adoption remains constrained by a critical challenge: the lack of trustworthy, verifiable outputs. C... |
1.50 |
0% |
See Reviews |
View AI Dashboard |
|
SCREEN-SBERT: EMBEDDING FUNCTIONAL SEMANTICS OF GUI SCREENS TO SUPPORT GUI AGENTS |
Recent GUI agent studies show that augmenting LLM prompts with app-related knowledge constructed during a pre-exploration phase can effectively improve task success rates. However, retrieving relevant... |
5.50 |
0% |
See Reviews |
View AI Dashboard |
|
Model Merging with Functional Dual Anchors |
Model merging is an efficient post-training strategy for integrating knowledge from multiple finetuned checkpoints of a shared foundation model. Existing methods operate in the parameter space, combin... |
4.50 |
0% |
See Reviews |
View AI Dashboard |
|
Kimi-Dev: Agentless Training as Skill Prior for SWE-agents |
Large Language Models (LLMs) are increasingly applied to software engineering (SWE), with SWE-bench as a key benchmark. Solutions are split into SWE-Agent frameworks with multi-turn interactions and w... |
7.00 |
0% |
See Reviews |
View AI Dashboard |
|
Fast, Secure, And High-Capacity Image Watermarking With Text Autoencoded Text Vectors |
Most image watermarking systems focus on robustness, capacity, and imperceptibility while treating the embedded payload as meaningless bits. This bit-centric view imposes a hard ceiling on capacity an... |
4.67 |
0% |
See Reviews |
View AI Dashboard |
|
C3-OWD: A Curriculum Cross-modal Contrastive Learning Framework for Open-World Detection |
Object detection has advanced significantly in the closed-set setting, but real-world deployment remains limited by two challenges: poor generalization to unseen categories and insufficient robustness... |
4.50 |
53% |
See Reviews |
View AI Dashboard |
|
Pretrain–Test Task Alignment Governs Generalization in In-Context Learning |
In-context learning (ICL) is a central capability of Transformer models, but the structures in data that enable its emergence and govern its robustness remain poorly understood. In this work, we study... |
6.00 |
0% |
See Reviews |
View AI Dashboard |
|
SimTrack3D: A Simple Sequential Motion Modeling for Efficient 3D Single Object Tracking |
Accurate tracking of objects in 3D point clouds requires continuous and efficient motion modeling across spatial and temporal dimensions. Although voxel-based methods have recently achieved strong per... |
3.50 |
5% |
See Reviews |
View AI Dashboard |
|
Preference Learning from Physics-Based Feedback: Tuning Language Models to Design BCC/B2 Superalloys |
We apply preference learning to the task of language model generation of novel structural alloys. Where prior work focuses on generating stable inorganic crystals, our approach optimizes for the synth... |
3.00 |
0% |
See Reviews |
View AI Dashboard |
|
SoundReactor: Frame-level Online Video-to-Audio Generation |
Prevailing Video-to-Audio (V2A) generation models operate offline, assuming an entire video sequence or chunks of frames are available beforehand. This critically limits their use in interactive appli... |
4.00 |
0% |
See Reviews |
View AI Dashboard |
|
Conformalized Predictions in Hypergraph Neural Networks via Contrastive Learning |
Hypergraph representation learning has gained immense popularity over the last few years due to its applications in real-world domains like social network analysis, recommendation systems, biological ... |
5.33 |
12% |
See Reviews |
View AI Dashboard |
|
On the Limits of Sparse Autoencoders: A Theoretical Framework and Reweighted Remedy |
Sparse autoencoders (SAEs) have recently emerged as a powerful tool for interpreting the features learned by large language models (LLMs). By reconstructing features with sparsely activated networks, ... |
6.00 |
0% |
See Reviews |
View AI Dashboard |
|
Wasserstein Policy Gradient: Implicit Policies, Entropy Regularization and Linear Convergence |
We revisit Wasserstein Proximal Policy Gradient (WPPG) for continuous control in infinite-horizon discounted reinforcement learning. By projecting the iterate of Wasserstein proximal gradient onto a p... |
5.00 |
3% |
See Reviews |
View AI Dashboard |
|
SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations |
Self-Supervised Learning (SSL) excels at learning generic representations of acoustic signals, yet prevailing methods remain domain-specific, tailored to either speech or general audio, hindering the ... |
5.00 |
0% |
See Reviews |
View AI Dashboard |
|
VELR: Efficient Video Reward Feedback via Ensemble Latent Reward Models |
Reward feedback learning (ReFL) is effective for both text-to-image (T2I) and text-to-video (T2V) generation with image reward models (RMs). However, image RMs are misaligned with temporal objectives ... |
4.67 |
58% |
See Reviews |
View AI Dashboard |
|
GaussianFluent: Gaussian Simulation for Dynamic Scenes with Mixed Materials |
3D Gaussian Splatting (3DGS) has emerged as a prominent 3D representation for high-fidelity and real-time rendering. Prior work has coupled physics simulation with Gaussians, but it predominantly targ... |
4.50 |
0% |
See Reviews |
View AI Dashboard |
|
The Matthew Effect of AI Programming Assistants: A Hidden Bias in Software Evolution |
AI-assisted programming is rapidly reshaping software development, with large language models (LLMs) enabling new paradigms such as vibe coding and agentic coding. While prior works have focused on pr... |
4.40 |
72% |
See Reviews |
View AI Dashboard |
|
Pusa V1.0: Unlocking Temporal Control in Pretrained Video Diffusion Models via Vectorized Timestep Adaptation |
The rapid advancement of video diffusion models has been hindered by fundamental limitations in temporal modeling, particularly the rigid synchronization of frame evolution imposed by conventional sca... |
6.00 |
13% |
See Reviews |
View AI Dashboard |
|
VeriRole: Verifiable Role-Awareness through Hint-Guided Reinforcement Learning |
Maintaining role-awareness in Role-Playing Conversational Agents (RPCAs) is a significant challenge, largely because the creative nature of role-playing makes it difficult to design verifiable rewar... |
5.50 |
10% |
See Reviews |
View AI Dashboard |
|
Training-Free Self-Scheduling for Efficient LLM Inference Serving |
The ability to deliver fast responses under strict latency requirements is critical for Large Language Model (LLM) inference serving. Most existing systems rely on a first-come-first-served (FCFS) sc... |
3.60 |
25% |
See Reviews |
View AI Dashboard |
|
Mixed-Curvature Tree-Sliced Wasserstein Distance |
Mixed-curvature spaces have emerged as a powerful alternative to their Euclidean counterpart, enabling data representations better aligned with the intrinsic structure of complex datasets. However, co... |
6.00 |
21% |
See Reviews |
View AI Dashboard |
|
Taming Imperfect Process Verifiers: A Sampling Perspective on Backtracking |
Test-time algorithms that combine the *generative* power of language models with *process verifiers* that assess the quality of partial generations offer a promising lever for eliciting new reasoning ... |
6.50 |
0% |
See Reviews |
View AI Dashboard |
|
Sparsity-promoting Fine-tuning for Equivariant Materials Foundation Model |
Pre-trained materials foundation models, or machine learning interatomic potentials, leverage general physicochemical knowledge to effectively approximate potential energy surfaces. However, they ofte... |
4.50 |
3% |
See Reviews |
View AI Dashboard |
|
Concept-Based Steering of LLMs for Conditional Molecular Generation |
Generating valid, unique, and high-fidelity molecules while precisely controlling for multiple properties simultaneously remains challenging. While prior works with LLMs have achieved success by fine-... |
3.33 |
5% |
See Reviews |
View AI Dashboard |
|
Variational Learning of Disentangled Representations |
Disentangled representations allow models to separate factors shared across conditions from those that are condition-specific. This separation is crucial in domains such as biomedicine, where generali... |
4.00 |
0% |
See Reviews |
View AI Dashboard |
|
Local Distribution-Conditioned Image Synthesis for One-Shot Federated Learning |
One-Shot Federated Learning (OSFL) aims to build a global model with a single round of server–client interaction, making it attractive for practical scenarios. The recent introduction of Diffusion Mod... |
4.00 |
8% |
See Reviews |
View AI Dashboard |
|
ReAlign: Safety-Aligning Reasoning Models with Verifier-Guided Reinforcement Learning |
As Large Reasoning Models (LRMs) become more capable, ensuring their safety without compromising utility is a critical challenge. Traditional safety alignment techniques often result in overly cautiou... |
3.50 |
13% |
See Reviews |
View AI Dashboard |
|
Do Vision-Language Models Respect Contextual Integrity in Location Disclosure? |
Vision-language models (VLMs) have recently demonstrated strong performance in image geolocation, identifying images' location to a precision that now surpasses specialized systems. This capability po... |
5.50 |
0% |
See Reviews |
View AI Dashboard |
|
How reinforcement learning after next-token prediction facilitates learning |
Recent advances in reasoning domains with neural networks have primarily been enabled by a training recipe that optimizes Large Language Models, previously trained to predict the next token in a seque... |
6.00 |
0% |
See Reviews |
View AI Dashboard |
|
Poly-FEVER: A Multilingual Fact Verification Benchmark for Hallucination Detection in Large Language Models |
We present Poly-FEVER, a large-scale multilingual benchmark for fact verification and hallucination detection in large language models (LLMs). Poly-FEVER extends FEVER, Climate-FEVER, and SciFact to 7... |
5.00 |
45% |
See Reviews |
View AI Dashboard |
|
WebGen-R1: Incentivizing LLMs to Generate Functional and Aesthetic Websites with Reinforcement Learning |
Large Language Models (LLMs) have demonstrated strong capabilities in functional-level code generation, yet their performance remains limited in project-level scenarios such as generating large-scale ... |
5.00 |
51% |
See Reviews |
View AI Dashboard |
|
Dolphin: A multimodal large language model for Ultrasound Understanding |
Ultrasound is one of the most widely used imaging modalities in clinical practice. Unlike CT and MRI, ultrasound imaging is highly operator dependent, with significant variations across different anat... |
4.50 |
26% |
See Reviews |
View AI Dashboard |
|
Self-Evolving Vision-Language Models for Image Quality Assessment via Voting and Ranking |
Improving vision-language models (VLMs) in the post-training stage typically relies on supervised fine-tuning or reinforcement learning, methods that necessitate costly, human-annotated data. While se... |
5.00 |
10% |
See Reviews |
View AI Dashboard |
|
Learning Communication between Language Models through Dense Vectors |
Communication between language models plays a crucial role in the inference process of large language models (LLMs), occurring both iteratively within a single model for multi-step reasoning (auto-reg... |
3.50 |
0% |
See Reviews |
View AI Dashboard |
|
Consistent Labeling Across Group Assignments: Variance Reduction in Conditional Average Treatment Effect Estimation |
Numerous algorithms have been developed for Conditional Average Treatment Effect (CATE) estimation. In this paper, we first highlight an overlooked issue in CATE estimation: many algorithms exhibit in... |
2.50 |
19% |
See Reviews |
View AI Dashboard |
|
LATTE: Latent Trajectory Embedding for Diffusion-Generated Image Detection |
The rapid advancement of diffusion-based image generators has made it increasingly difficult to distinguish generated from real images. This erodes trust in digital media, making it critical to develo... |
2.50 |
15% |
See Reviews |
View AI Dashboard |
|
MSRS: Adaptive Multi-Subspace Representation Steering for Attribute Alignment in Large Language Models |
Activation steering offers a promising approach to controlling the behavior of Large Language Models by directly manipulating their internal activations. However, most existing methods struggle to joi... |
4.50 |
27% |
See Reviews |
View AI Dashboard |
|
3DLAND: 3D Lesion Abdominal anomaly Localization Dataset |
Existing medical imaging datasets for abdominal CT often lack three-dimensional annotations, multi-organ coverage, or precise lesion-to-organ associations, hindering robust representation learning and... |
5.00 |
76% |
See Reviews |
View AI Dashboard |
|
DefNTaxS: The Inevitable Need for Context in Classification |
To successfully use generalized vision-language models (VLMs) like CLIP for zero-shot image classification, the semantics of the target classes must be well defined and easily differentiated. However,... |
3.00 |
30% |
See Reviews |
View AI Dashboard |
|
EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling |
Instruction-guided image editing has achieved remarkable progress, yet current models still face challenges with complex instructions and often require multiple samples to produce a desired result. Re... |
5.00 |
0% |
See Reviews |
View AI Dashboard |
|
Judo: A Juxtaposed Domain-oriented Multimodal Reasoner for Industrial Anomaly QA |
Industrial anomaly detection has been significantly advanced by large multimodal models (LMMs), enabling diverse human instructions beyond detection, particularly through visual-grounded reasoning for... |
4.50 |
0% |
See Reviews |
View AI Dashboard |
|
Lifelong control through Neuro-Evolution |
Reinforcement learning (RL) under continual environmental changes has remained a central challenge for decades. Novel designs of loss functions, training procedures and neural network architectures ha... |
3.20 |
12% |
See Reviews |
View AI Dashboard |
|
R2Q: Residual Refinement Quantization for Robust 2-Bit Large Language Models |
The dramatic growth of Large Language Models (LLMs) has been accompanied by significant computational and memory demands, driving the adoption of low-bit quantization. While 8-bit and 4-bit formats ha... |
2.00 |
14% |
See Reviews |
View AI Dashboard |