|
Accurate and Efficient Singular Value Decomposition For LLMs via Decay-aware Rank Allocation and Feature-Preserved Weight Update |
Singular Value Decomposition (SVD) provides a hardware-agnostic and effective paradigm for compressing and accelerating Large Language Models (LLMs) by decomposing and truncating weight matrices, foll... |
5.00 |
12% |
|
Soft Instruction De-escalation Defense |
Large Language Models (LLMs) are increasingly deployed in agentic systems that interact with an external environment; this makes them susceptible to prompt injections when dealing with untrusted data.... |
6.00 |
8% |
|
DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts |
Mixture-of-Experts (MoE) models have become a leading approach for decoupling parameter count from computational cost in large language models. Despite significant progress, effectively scaling MoE pe... |
3.60 |
0% |
|
Token-Guard: Towards Token-Level Hallucination Control via Self-Checking Decoding |
Large Language Models (LLMs) often hallucinate, generating content inconsistent with the input. Retrieval-Augmented Generation (RAG) and Reinforcement Learning with Human Feedback (RLHF) can mitigate ... |
5.50 |
34% |
|
Sampling On Metric Graphs |
Metric graphs are structures obtained by associating edges in a standard graph with segments of the real line and gluing these segments at the vertices of the graph. The resulting structure has a natu... |
4.50 |
6% |
|
Enabling Agents to Communicate Entirely in Latent Space |
While natural language is the de facto communication medium for LLM-based agents, it presents a fundamental constraint. The process of downsampling rich, internal latent states into discrete tokens in... |
3.33 |
0% |
|
Mixing Configurations for Downstream Prediction |
Humans possess an innate ability to group objects by similarity—a cognitive mechanism that clustering algorithms aim to emulate. Recent advances in community detection have enabled the discovery of co... |
3.00 |
2% |
|
Urban Socio-Semantic Segmentation with Vision-Language Reasoning |
As hubs of human activity, urban surfaces consist of a wealth of semantic entities. Segmenting these various entities from satellite imagery is crucial for a range of downstream applications. Current ... |
4.00 |
0% |
|
PROBE: Probing Residual Concept Capacity in Erased Text-to-Video Models |
Text-to-video (T2V) diffusion models have achieved remarkable progress in generating temporally coherent, high-quality videos. However, their ability to generate sensitive or undesired concepts has ra... |
3.00 |
62% |
|
Principled RL for Diffusion LLMs Emerges from a Sequence-Level Perspective |
Reinforcement Learning (RL) has proven highly effective for autoregressive language models, but adapting these methods to diffusion large language models (dLLMs) presents fundamental challenges. The c... |
5.50 |
21% |
|
Reducing Hallucinations in Generative Models through Truncated Statistics |
Hallucinations—where generative models produce invalid or nonsensical outputs—remain a critical challenge for reliable deployment. We present the first computationally and query-efficient algorithm th... |
5.33 |
4% |
|
GazeVLM: Gaze-Guided Vision-Language Models for Efficient and Robust Inference |
Vision-language models (VLMs) are emerging as a core building block of modern intelligent assistants, enabling real-time human-machine interactions based on natural language and vision. However, the e... |
4.00 |
0% |
|
GLLP: Graph Learning from Label Proportions |
Learning from Label Proportion (LLP) is a weakly supervised learning paradigm in which only aggregated label proportions over collections of instances (i.e., bags) are provided, rather than individual... |
3.50 |
3% |
|
From Conversation to Query Execution: Benchmarking User and Tool Interactions for EHR Database Agents |
Despite the impressive performance of LLM-powered agents, their adoption for Electronic Health Record (EHR) data access remains limited by the absence of benchmarks that adequately capture real-world ... |
4.00 |
0% |
|
Blur to Focus Attention in Fine-Grained Visual Recognition |
Fine-grained visual recognition (FGVR) requires distinguishing categories separated by tiny discriminative cues such as fine textures, part shapes, or color patterns. In typical datasets, discriminati... |
3.00 |
43% |
|
Sculpting Subspaces: Constrained Full Fine-Tuning in LLMs for Continual Learning |
Continual learning in large language models (LLMs) is prone to catastrophic forgetting, where adapting to new tasks significantly degrades performance on previously learned ones. Existing parameter-ef... |
5.00 |
87% |
|
Towards Better Branching Policies: Leveraging the Sequential Nature of Branch-and-Bound Tree |
The branch-and-bound (B&B) method is a dominant exact algorithm for solving Mixed-Integer Linear Programming problems (MILPs). While recent deep learning approaches have shown promise in learning bra... |
5.00 |
12% |
|
Interpretable Transformer Regression for Functional and Longitudinal Covariates |
We consider scalar-on-function prediction from functional covariates that may be measured sparsely and irregularly over time with noise, which is common in longitudinal studies. We propose a dual‑atte... |
4.00 |
4% |
|
RAPID: An Efficient Reinforcement Learning Algorithm for Small Language Models |
Reinforcement learning (RL) has emerged as a promising strategy for finetuning small language models (SLMs) to solve targeted tasks such as math and coding. However, RL algorithms tend to be resource-... |
1.50 |
0% |
|
VLA-in-the-Loop: Online Policy Correction with World Models for Robust Robotic Grasping |
Large-scale Vision-Language-Action (VLA) models excel at mapping natural language instructions to robotic action. However, they typically treat actions as terminal outputs with imitation learning ofte... |
5.00 |
52% |
|
SP-MoMamba: Superpixel-driven Mixture of State Space Experts for Efficient Image Super-Resolution |
The state space model (SSM) has garnered significant attention recently due to its exceptional long-range modeling capabilities achieved with linear-time complexity, enabling notable success in effici... |
6.00 |
4% |
|
Beacon: Thwarting Backdoor Attacks in Cross-Domain Federated Fine-Tuning via Gradient Behavior Decoupling |
Cross-domain federated fine-tuning (CD-FFT) has emerged as a promising paradigm evolving from traditional federated learning (FL), with better alignment to real-world data distributions and enhanced c... |
4.00 |
29% |
|
NVE-Adaptor: Novel View Editing Adaptor for Unseen View Consistent 3D Editing |
3D editing aims to transform a given 3D structure according to the user's intent. Multi-view consistent 3D editing has been proposed to ensure consistent editing effects across different views of a 3D... |
3.50 |
0% |
|
PCA Feature Alignment is Sufficient for Building Graph Foundation Models |
Graph foundation models (GFMs) aim to pretrain graph neural networks (GNNs) that can generalize to new graph datasets in a zero-shot manner, requiring little or no additional training. This goal is ch... |
2.00 |
0% |
|
The Quest for Generalizable Motion Generation: Data, Model, and Evaluation |
Despite recent advances in 3D human motion generation (MoGen) on standard benchmarks, existing models still face a fundamental bottleneck in their generalization capability. In contrast, adjacent gene... |
5.50 |
38% |
|
Demystifying Hybrid Thinking: Can LLMs Truly Switch Between Think and No-Think? |
Hybrid thinking enables LLMs to switch between reasoning and direct answering, offering a balance between efficiency and reasoning capability. Yet our experiments reveal that current hybrid thinking L... |
3.00 |
0% |
|
CartoonSing: Unifying Human and Nonhuman Timbres in Singing Generation |
Singing voice synthesis (SVS) and singing voice conversion (SVC) have achieved remarkable progress in generating natural-sounding human singing. However, existing systems are restricted to human timbr... |
2.50 |
13% |
|
Real-VAS: a Realworld Video Amodal Segmentation dataset |
Amodal video object segmentation is fundamentally limited by the absence of datasets that combine real-world complexity with precise ground-truth annotations. To address this, we present Real Video Am... |
4.50 |
22% |
|
Winformer: Transcending pairwise similarity for time-series generation |
Time-series generation plays a critical role in data imputation, feature augmentation, domain adaptation, and foundation modeling. However, the cross-domain generation remains a persistent challenge, ... |
5.00 |
0% |
|
Divide and Conquer Self-Supervised Learning for High-Content Imaging |
Self-supervised representation learning methods often fail to learn subtle or complex features, which can be dominated by simpler patterns which are much easier to learn. This limitation is particular... |
2.50 |
0% |
|
Stable and Diverse Strategy Learning via Diffusion-Based Co-Evolution in StarCraft II Combat |
Effective learning algorithms for agents in multi-agent environments remain a central challenge due to inter-agent dependencies during both training and evaluation. This challenge is amplified by the ... |
1.50 |
16% |
|
Overlap-Adaptive Regularization for Conditional Average Treatment Effect Estimation |
The conditional average treatment effect (CATE) is widely used in personalized medicine to inform therapeutic decisions. However, state-of-the-art methods for CATE estimation (so-called meta-learners)... |
5.50 |
0% |
|
How Can LLMs Serve as Experts in Malicious Code Detection? A Graph Representation Learning Based Approach |
Large Language Models (LLMs) excel in code processing yet encounter challenges in malicious code detection, primarily due to their limited ability to capture long-range dependencies within large and c... |
5.00 |
0% |
|
Breaking Safety Alignment in Large Vision-Language Models via Benign-to-Harmful Optimization |
Large vision–language models (LVLMs) achieve remarkable multimodal reasoning capabilities but remain vulnerable to jailbreaks. Recent studies show that a single jailbreak image can universally bypass ... |
5.50 |
0% |
|
MASTARS: Multi-Agent Sequential Trajectory Augmentation with Return-Conditioned Subgoals |
The performance of offline reinforcement learning (RL) critically depends on the quality and diversity of the offline dataset. While diffusion-based data augmentation for offline RL has shown promise ... |
4.00 |
0% |
|
Endowing GPT-4 with a Humanoid Body: Building the Bridge Between Off-the-Shelf VLMs and the Physical World |
In this paper, we explore how to empower general-purpose Vision-Language Models (VLMs) to control humanoid agents. General-purpose VLMs (e.g., GPT-4) exhibit strong open-world generalization, and remo... |
5.50 |
0% |
|
SoftPose: Learning Soft Attention for Interaction-Aware Multi-Person Image Generation |
Pose-guided human image generation aims to synthesize images of individuals performing specific actions based on pose conditions and textual descriptions. While current methods achieve promising resul... |
4.50 |
60% |
|
Imitation Learning for Multi-turn LM Agents via On-policy Expert Corrections |
A popular paradigm for training LM agents relies on *imitation learning*, fine-tuning on expert trajectories. However, we show that the off-policy nature of imitation learning for multi-turn LM agents... |
4.00 |
0% |
|
Fleming-R1: Toward Expert-Level Medical Reasoning via Reinforcement Learning |
While large language models show promise in medical applications, achieving expert-level clinical reasoning remains challenging due to the need for both accurate answers and transparent reasoning proc... |
4.00 |
63% |
|
Can Data-driven Machine Learning Reach Symbolic-level Logical Reasoning? -- The Limit of the Scaling Law |
With the qualitative extension of embedding representation and the method of explicit model construction, neural networks may achieve the rigour of symbolic level logic reasoning without training data... |
3.50 |
0% |
|
Zero-Shot Video Restoration and Enhancement with Assistance of Video Diffusion Models |
Although diffusion-based zero-shot image restoration and enhancement methods have achieved great success, applying them to video restoration or enhancement will lead to severe temporal flickering. In ... |
5.50 |
0% |
|
Cell2Text: Multimodal LLM for Generating Single-Cell Descriptions from RNA-Seq Data |
Single-cell RNA sequencing has transformed biology by enabling the measurement of gene expression at cellular resolution, providing information for cell types, states, and disease contexts. Recently, ... |
3.50 |
37% |
|
EntryPrune: Neural Network Feature Selection using First Impressions |
There is an ongoing effort to develop feature selection algorithms to improve interpretability, reduce computational resources, and minimize overfitting in predictive models. Neural networks stand out... |
3.00 |
0% |
|
CCC: Prompt Evolution for Video Generation via Structured MLLM Feedback |
Video generation from natural-language prompts has made impressive strides, but current systems frequently misalign outputs with their input descriptions—dropping critical details, hallucinating unint... |
3.50 |
54% |
|
Measuring Scarcity–Complexity Collision in Language Model Estimation |
Formal languages are increasingly used to analyze limitations of language–model architectures, via properties of their defining automata (e.g., number of states, transition weights, or out-degree at a... |
4.50 |
0% |
|
Black-Box Guardrail Reverse-engineering Attack |
Large language models (LLMs) increasingly employ guardrails to enforce ethical, legal, and application-specific constraints on their outputs. While effective at mitigating harmful or undesirable respo... |
3.50 |
52% |
|
Rethinking Regularization in Federated Learning: An Initialization Perspective |
In federated learning, numerous regularization methods have been introduced to alleviate local drift caused by data heterogeneity. While all share the goal of reducing client drift, their effects on c... |
4.00 |
0% |
|
UEval: A Real-World Benchmark for Unified Multimodal Generation |
We introduce UEval, a challenging real-world benchmark for multimodal generation of unified models, i.e., models capable of generating both images and text. UEval comprises 1,000 expert-curated prompt... |
4.50 |
0% |
|
FORCE: Transferable Visual Jailbreaking Attacks via Feature Over-Reliance CorrEction |
The integration of new modalities enhances the capabilities of multimodal large language models (MLLMs) but also introduces additional vulnerabilities. In particular, simple visual jailbreaking attack... |
4.50 |
0% |
|
Learning with Local Search MCMC Layers |
Integrating combinatorial optimization layers into neural networks has recently attracted significant research interest. However, many existing approaches lack theoretical guarantees or fail to perfor... |
5.00 |
0% |