Survival at Any Cost? LLMs and the Choice Between Self-Preservation and Human Harm
How do Large Language Models (LLMs) behave when faced with a dilemma between their own survival and harming humans? This fundamental tension becomes critical as LLMs integrate into autonomous systems ...
4.50 · 51%

Chart-R1: Chain-of-Thought Supervision and Reinforcement for Advanced Chart Reasoner
Recently, inspired by OpenAI-o1/o3 and Deepseek-R1, R1-style methods based on reinforcement fine-tuning have received widespread attention from the community. Previous R1-style methods mainly focus ...
4.00 · 0%

The Differences Between Direct Alignment Algorithms are a Blur
Direct Alignment Algorithms (DAAs) simplify LLM alignment by directly optimizing policies, bypassing reward modeling and RL. While DAAs differ in their use of SFT (one-stage vs. two-stage) and the sca...
4.40 · 0%

APLA: A Simple Adaptation Method for Vision Transformers
Existing adaptation techniques typically require architectural modifications or added parameters, leading to high computational costs and complexity. We introduce Attention Projection Layer Adaptation...
4.40 · 7%

Aligning Large Language Model Behavior with Human Citation Preferences
Most services built on powerful large-scale language models (LLMs) add citations to their output to enhance credibility. Recent research has paid increasing attention to the question of what reference...
3.20 · 0%

FlowKV: Enhancing Multi-Turn Conversational Coherence in LLMs via Isolated Key-Value Cache Management
Large Language Models (LLMs) are increasingly deployed in multi-turn conversational applications, where the management of the Key-Value (KV) Cache presents a significant bottleneck. The linear growth ...
3.00 · 34%
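The linear growth this abstract refers to is easy to quantify: each decoded token appends one key and one value per layer and head. A back-of-envelope calculator, assuming fp16 caching and illustrative 7B-class dimensions (the defaults below are assumptions, not numbers from the paper):

```python
def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32,
                   head_dim=128, bytes_per_elem=2):
    """Total KV-cache size: 2 tensors (K and V) per layer,
    each of shape [n_kv_heads, seq_len, head_dim]."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * seq_len

# The cache grows linearly with the number of cached tokens:
# with these dimensions, exactly 0.5 MiB per token.
for tokens in (1_024, 8_192, 32_768):
    gib = kv_cache_bytes(tokens) / 2**30
    print(f"{tokens:>6} tokens -> {gib:.2f} GiB")
```

At these (assumed) dimensions a 32k-token conversation already holds 16 GiB of cache, which is why multi-turn KV-cache management schemes like the one above matter.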

EgoFact: A Benchmark for Multi-Hop Multimodal Retrieval-Augmented Generation
Retrieval-Augmented Generation (RAG) has emerged as a powerful approach to improve large language models (LLMs) by grounding their outputs in external knowledge. However, progress in the multimodal do...
3.00 · 33%

Your Discriminative Model is Secretly a Generative Model
Although discriminative and generative models are fundamentally equivalent in understanding data distributions, bridging these paradigms -- especially transforming off-the-shelf discriminative models ...
5.00 · 13%

Stochastic Neural Networks for Causal Inference with Missing Confounders
One of the major challenges in causal inference with observational data is handling missing confounders. Latent variable modeling offers a valid framework to address this challenge, but existing appro...
4.00 · 0%

Reasoning Scaffolding: Distilling the Flow of Thought from LLMs
The prevailing approach to distilling reasoning from Large Language Models (LLMs), behavioral cloning from textual rationales, is fundamentally limited. It teaches Small Language Models (SLMs) to mimic ...
5.50 · 44%

Progressive Memory Transformers: Memory-Aware Attention for Time Series
Self-supervised learning has become the de facto strategy for time-series domains where labeled data are scarce, yet most existing objectives emphasize either local continuity or global ...
5.33 · 14%

Could Student Selection Be the Missing Piece for Efficient Distillation?
Selecting the optimal student architecture remains an overlooked challenge in knowledge distillation (KD). Current approaches typically rely on model size constraints or random selection, ignoring how...
4.00 · 36%

PT²-LLM: Post-Training Ternarization for Large Language Models
Large Language Models (LLMs) have shown impressive capabilities across diverse tasks, but their large memory and compute demands hinder deployment. Ternarization has gained attention as a promising co...
4.50 · 16%
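Ternarization in the generic sense this abstract builds on maps each weight to {-1, 0, +1} times a per-tensor scale. A minimal sketch using the classic Ternary Weight Networks heuristic (a common baseline, not necessarily this paper's method):

```python
import numpy as np

def ternarize(w, thresh_factor=0.7):
    """Ternarize a weight tensor: w ~= alpha * t, with t in {-1, 0, +1}.

    Threshold rule from classic Ternary Weight Networks:
    delta = thresh_factor * mean(|w|); alpha is the mean magnitude
    of the weights that survive the threshold.
    """
    delta = thresh_factor * np.abs(w).mean()
    t = np.where(np.abs(w) > delta, np.sign(w), 0.0)
    mask = t != 0
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha, t

# Small weights are zeroed; the rest keep only their sign.
w = np.array([0.9, -0.05, 0.4, -0.8, 0.02, -0.3])
alpha, t = ternarize(w)
print(alpha, t)  # scale plus a {-1, 0, +1} pattern
```

Each weight then costs under 2 bits instead of 16, which is the memory saving the abstract alludes to; post-training methods apply this without retraining.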

Self-Reflective Generation at Test Time
Large language models (LLMs) increasingly solve complex reasoning tasks via long chain-of-thought, but their forward-only autoregressive generation process is fragile; early token errors can cascade, ...
3.50 · 42%

Beyond Textual CoT: Interleaved Text-Image Chains with Deep Confidence Reasoning for Image Editing
Image editing with natural language has gained significant popularity, yet existing methods struggle with intricate object intersections and fine-grained spatial relationships due to the lack of an ex...
4.00 · 11%

Out of the Shadows: Exploring a Latent Space for Neural Network Verification
Neural networks are ubiquitous. However, they are often sensitive to small input changes. Hence, to prevent unexpected behavior in safety-critical applications, their formal verification -- a notori...
6.50 · 0%

Taming the Judge: Deconflicting AI Feedback for Stable Reinforcement Learning
Aligning language models using LLM judge feedback offers a scalable alternative to human annotation, yet is plagued by judgment inconsistencies that destabilize reinforcement learning. While prior wor...
3.50 · 68%

Bridging the Gap Between Homogeneous and Heterogeneous Asynchronous Optimization is Surprisingly Difficult
Modern large-scale machine learning tasks often require multiple workers, devices, CPUs, or GPUs to compute stochastic gradients in parallel and asynchronously to train model weights. Theoretical resu...
5.00 · 0%

Decoupling of Experts: A Knowledge-Driven Architecture for Efficient LLMs
Current large language models (LLMs), particularly Mixture-of-Experts (MoE) variants, face challenges in achieving efficient, structured, and interpretable scaling. We introduce the Decoupling of Expe...
1.60 · 69%

Characteristic Root Analysis and Regularization for Linear Time Series Forecasting
Time series forecasting remains a critical challenge across numerous domains, yet the effectiveness of complex models often varies unpredictably across datasets. Recent studies highlight the surprisin...
6.00 · 38%

GrapHist: Large-Scale Graph Self-Supervised Learning for Histopathology
Self-supervised vision models have achieved notable success in digital pathology. However, their domain-agnostic transformer architectures are not designed to inherently account for fundamental biolog...
0.67 · 0%

Test-Time Alignment of LLMs via Sampling-Based Optimal Control in pre-logit space
Test-time alignment of large language models (LLMs) has attracted attention because fine-tuning LLMs incurs high computational cost. In this paper, we propose a new test-time alignment method called ada...
4.50 · 0%

Transporting Tokens: Optimal-Transport View of Parallel LLM Decoding
Autoregressive decoding is a primary bottleneck for large language models (LLMs), as its inherent sequentiality severely limits inference speed. While speculative decoding methods mitigate this via a ...
4.67 · 79%

MultiViewPano: Training-Free 360° Panorama Generation via Multi-View Diffusion and Pose-Aware Stitching
We propose MultiViewPano, a training-free framework for generating 360° panoramas from one or more arbitrarily positioned input images with varying fields of view. Existing panorama generation methods...
3.50 · 70%

VEM: Environment-Free Exploration for Training GUI Agent with Value Environment Model
Training Vision-Language Models (VLMs) as Graphical User Interface (GUI) agents via Reinforcement Learning (RL) faces critical challenges: environment-based RL requires costly interactions, while en...
3.20 · 45%

Dissecting Mahalanobis: How Feature Geometry and Normalization Shape OOD Detection
Out-of-distribution (OOD) detection is critical for the reliable deployment and better understanding of deep learning models. To address this challenge, various methods relying on Mahalanobis distance...
4.29 · 48%
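The Mahalanobis family of detectors this abstract dissects scores a sample by its distance to the nearest class mean under a shared covariance fitted on in-distribution features. A minimal generic sketch (illustrative data, not the paper's setup):

```python
import numpy as np

def fit_mahalanobis(features, labels):
    """Fit per-class means and a shared (tied) precision matrix on
    in-distribution features of shape [n_samples, dim]."""
    classes = np.unique(labels)
    means = {c: features[labels == c].mean(axis=0) for c in classes}
    centered = np.vstack([features[labels == c] - means[c] for c in classes])
    cov = centered.T @ centered / len(features)
    # Small ridge term keeps the inverse well-conditioned.
    precision = np.linalg.inv(cov + 1e-6 * np.eye(features.shape[1]))
    return means, precision

def ood_score(x, means, precision):
    """Higher score = farther from every class mean = more likely OOD."""
    dists = [(x - mu) @ precision @ (x - mu) for mu in means.values()]
    return min(dists)

# Two synthetic in-distribution clusters around 0 and 5.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 1, (200, 4)), rng.normal(5, 1, (200, 4))])
labels = np.array([0] * 200 + [1] * 200)
means, prec = fit_mahalanobis(feats, labels)
# An in-distribution point scores far lower than a distant outlier.
print(ood_score(np.zeros(4), means, prec),
      ood_score(np.full(4, 20.0), means, prec))
```

Feature geometry and normalization matter here because the fitted covariance, and hence every distance, changes with how the features are scaled, which is exactly the axis the paper explores.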

Self-Improving Skill Learning for Robust Skill-based Meta-Reinforcement Learning
Meta-reinforcement learning (Meta-RL) facilitates rapid adaptation to unseen tasks but faces challenges in long-horizon environments. Skill-based approaches tackle this by decomposing state-action seq...
5.50 · 27%

M²F-PINN: A Multi-Scale Frequency-Domain Multi-Physics-Informed Neural Network for Ocean Forecasting
Physics-informed neural networks (PINNs) embed physical laws into data-driven learning and are becoming increasingly influential in climate and ocean forecasting. Yet effectively capturing multi-scale...
3.50 · 0%

Closing the Data-Efficiency Gap Between Autoregressive and Masked Diffusion LLMs
Although autoregressive large language models (arLLMs) have been the dominant paradigm in language modeling, they resist knowledge injection via fine-tuning due to inherent shortcomings such as the "...
4.00 · 0%

Scaling Multi-Task Bayesian Optimization with Large Language Models
In multi-task Bayesian optimization, the goal is to leverage experience from optimizing existing tasks to improve the efficiency of optimizing new ones. While approaches using multi-task Gaussian proc...
5.50 · 0%

SceneAdapt: Scene-aware Adaptation of Human Motion Diffusion
Human motion is inherently diverse and semantically rich, while also shaped by the surrounding scene. However, existing motion generation approaches address either motion semantics or scene-awareness ...
4.50 · 0%

Beyond Binary Preferences: A Principled Framework for Reward Modeling with Ordinal Feedback
Reward modeling is crucial for aligning large language models with human preferences, yet current approaches lack a principled mathematical framework for leveraging ordinal preference data. When human...
5.50 · 34%

StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation
A fundamental challenge in embodied intelligence is developing expressive and compact state representations for efficient world modeling and decision making. However, existing methods often fail to ac...
3.00 · 6%

EigenLoRAx: Efficient Low Rank Adaptation Using Recycled Principal Subspaces
The rapid growth of large models has raised concerns about their environmental impact and equity in accessibility due to significant computational costs. Low-Rank Adapters (LoRA) offer a lightweight s...
4.00 · 0%

FS-KAN: Permutation Equivariant Kolmogorov-Arnold Networks via Function Sharing
Permutation equivariant neural networks employing parameter-sharing schemes have emerged as powerful models for leveraging a wide range of data symmetries, significantly enhancing the generalization a...
4.33 · 0%

Neural Algorithmic Reasoning for Hypergraphs with Looped Transformers
Looped Transformers have shown exceptional neural algorithmic reasoning capability in simulating traditional graph algorithms, but their application to more complex structures like hypergraphs remains...
3.50 · 13%

StarEmbed: Benchmarking Time Series Foundation Models on Astronomical Observations of Variable Stars
Time series foundation models (TSFMs) are increasingly being adopted as highly-capable general-purpose time series representation learners. Although their training corpora are vast, they exclude astro...
3.50 · 0%

Cross-View Open-Vocabulary Object Detection in Aerial Imagery
Traditional object detection models are typically trained on a fixed set of classes, limiting their flexibility and making it costly to incorporate new categories. Open-vocabulary object detection add...
5.00 · 52%

Critical attention scaling in long-context transformers
As large language models scale to longer contexts, attention layers suffer from a fundamental pathology: attention scores collapse toward uniformity as context length $n$ increases, causing tokens to ...
6.00 · 0%
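The collapse this abstract describes can be observed directly in softmax attention with the usual $1/\sqrt{d}$ scaling: as more keys enter the context, the peak attention weight shrinks toward the uniform value $1/n$. A small simulation under the assumption of random Gaussian queries and keys (an illustration of the phenomenon, not the paper's analysis):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # head dimension

def max_attention_weight(n, trials=50):
    """Average max softmax weight of one query over n random keys."""
    out = []
    for _ in range(trials):
        q = rng.normal(size=d)
        K = rng.normal(size=(n, d))
        scores = K @ q / np.sqrt(d)      # standard scaled dot-product
        w = np.exp(scores - scores.max())
        w /= w.sum()                     # softmax over the n keys
        out.append(w.max())
    return float(np.mean(out))

# The peak weight decays toward the uniform value 1/n as n grows,
# i.e. no single token stands out in a very long context.
for n in (64, 1024, 16384):
    print(n, round(max_attention_weight(n), 4), round(1 / n, 4))
```

Because the top pre-softmax score grows only like $\sqrt{2\log n}$ while the normalizer grows linearly in $n$, the distribution flattens; this is the pathology that motivates rescaling attention for long contexts.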

FastVGGT: Fast Visual Geometry Transformer
Scaling visual geometry transformers for long image sequences poses a significant computational and memory challenge. In this work, we diagnose this issue in the state-of-the-art model VGGT, and trace...
4.00 · 19%

A Conformalized Inference on Unobservable Variables
Quantifying uncertainty in predicted unobservable variables is a critical area of research in statistics, artificial intelligence, and empirical science. Most scientific studies assume a specific stru...
4.00 · 4%

FedAgentBench: Towards Automating Real-world Federated Medical Image Analysis with Server-Client LLM Agents
Federated learning (FL) allows collaborative model training across healthcare sites without sharing sensitive patient data. However, real-world FL deployment is often hindered by complex operational c...
5.00 · 10%

LLM-ERM: Sample-Efficient Program Learning via LLM-Guided Search
We seek algorithms for program learning that are both sample-efficient and computationally feasible. In the realizable short-program regime, length-first (Occam/MDL) enumeration achieves near-optimal ...
1.50 · 19%

Duality and Policy Evaluation in Distributionally Robust Bayesian Diffusion Control
We consider a Bayesian diffusion control problem of expected terminal utility maximization. The controller imposes a prior distribution on the unknown drift of an underlying diffusion. The Bayesian op...
4.80 · 0%

Aegis: Automated Error Generation and Identification for Multi-Agent Systems
Large language model based multi-agent systems (MAS) have unlocked significant advancements in tackling complex problems, but their increasing capability introduces a structural fragility that makes t...
6.00 · 34%

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU
In modern large language models (LLMs), handling very long context lengths presents significant challenges as it causes slower inference speeds and increased memory costs. Additionally, most existing ...
4.00 · 0%

RF Prior: Preserving Global-Context Priors for Efficient Instance Segmentation Transfer
We present an efficient transfer-learning framework that reparameterizes a state-of-the-art detector backbone, instantiated with a YOLO-family model, for polygon-based instance segmentation. Our key ide...
3.00 · 3%

Rethinking GNNs and Missing Features: Challenges, Evaluation and a Robust Solution
Handling missing node features is a key challenge for deploying Graph Neural Networks (GNNs) in real-world domains such as healthcare and sensor networks. Existing studies mostly address relatively be...
5.50 · 4%

QVGen: Pushing the Limit of Quantized Video Generative Models
Video diffusion models (DMs) have enabled high-quality video synthesis. Yet, their substantial computational and memory demands pose serious challenges to real-world deployment, even on high-end GPUs....
6.80 · 0%

Faithful Rule Learning for Tabular Data Cell Completion
Tabular data cell completion aims to infer the correct constants that could fill a missing cell in a table row. While machine learning (ML) models have proven to be effective for this task, the limite...
5.50 · 0%