|
HURST: Learning Heterogeneity-Adaptive Urban Foundation Models for Spatiotemporal Prediction via Self-Partitional Mixture-of-Spatial-Experts |
Urban foundation models (UFMs) are pre-trained spatiotemporal (ST) prediction models with the ability to generalize to different tasks. Such models have the potential to transform urban intelligence b... |
4.50 |
0% |
|
Stable Preference Optimization: Learning preference is more important than imitation |
Direct Preference Optimization (DPO; \citet{rafailov2023direct}) is a widely used method for aligning large language models (LLMs) with human feedback. However, its objective often leads to reward hac... |
2.50 |
50% |
|
Robust Latent Neural Operators through Augmented Sparse Observation Encoding |
Neural operator methods have achieved significant success in the efficient simulation and inverse problems of complex systems by learning a mapping between two infinite-dimensional Banach spaces. Howe... |
5.00 |
0% |
|
Dyana: Benchmarking Dynamic Hand Intelligence |
Most existing hand grasping benchmarks focus on static objects, which fails to capture the challenges of dynamic, real-world scenarios where targets move and precise timing becomes critical. We first ... |
3.50 |
19% |
|
cgDDI: Controllable Generation of Diverse Dermatological Imagery for Fair and Efficient Malignancy Classification |
Skin diseases impact the lives of millions of people around the world from different backgrounds and ethnicities. Therefore, accurate diagnosis in the dermatological domain requires focused work towar... |
4.67 |
0% |
|
Adaptive Logit Adjustment for Debiasing Multimodal Language Models |
Vision-Language Models (VLMs) and Large Multimodal Models (LMMs) have significantly advanced image-to-text generation tasks such as image captioning and visual question answering (VQA). However, thes... |
5.33 |
14% |
|
Not Just a Flash in Time: Interpreting Long Event Streams through Language |
Event cameras operate asynchronously with microsecond-level temporal precision and generate sparse event streams, enabling low-latency visual perception under high dynamic range conditions. However, c... |
4.00 |
5% |
|
Training Variable Long Sequences With Data-centric Parallel |
Training deep learning models on variable long sequences poses significant computational challenges. Existing methods force a difficult trade-off between efficiency and ease-of-use. Simple approaches ... |
4.00 |
4% |
|
Physics-informed Residual Flows |
Physics-Informed Neural Networks (PINNs) embed physical laws into deep learning models. However, conventional PINNs often suffer from failure modes leading to inaccurate solutions. We trace these fail... |
5.00 |
47% |
|
Randomness Helps Rigor: A Probabilistic Learning Rate Scheduler Bridging Theory and Deep Learning Practice |
Learning rate schedulers have shown great success in speeding up the convergence of learning algorithms in practice. However, their convergence to a minimum has not been theoretically proven. This dif... |
3.60 |
0% |
|
GeoFunFlow: Geometric Function Flow Matching for Inverse Operator Learning over Complex Geometries |
Inverse problems governed by partial differential equations (PDEs) are crucial in science and engineering. They are particularly challenging due to ill-posedness, data sparsity, and the added complexi... |
4.00 |
39% |
|
Resolving Extreme Data Scarcity by Explicit Physics Integration: An Application to Groundwater Heat Transport |
Machine learning methods often struggle with real-world applications in science and engineering due to an insufficient amount or quality of training data. In this work, the example of subsurface porou... |
3.00 |
0% |
|
OptimSyn: Influence-Guided Rubrics Optimization for Synthetic Data Generation |
Large language models (LLMs) achieve strong downstream performance largely due to abundant supervised fine-tuning (SFT) data that imparts problem-solving capabilities. However, as applications expand,... |
5.00 |
21% |
|
How Do Transformers Learn to Associate Tokens: Gradient Leading Terms Bring Mechanistic Interpretability |
Semantic associations such as the link between "bird" and "flew" are foundational for language modeling as they enable models to go beyond memorization and instead generalize and generate coherent tex... |
7.20 |
0% |
|
PRISM: Performer RS-IMLE for Single-pass Multisensory Imitation Learning |
Robotic imitation learning typically requires models that capture multimodal action distributions while operating in real-time control rates and accommodating multiple sensing modalities. Although rec... |
4.00 |
18% |
|
LLMs as Rules Oracles: Exploring Real-World Multimodal Reasoning in Tabletop Strategy Game Environments |
We introduce **LudoBench**, a multimodal reasoning benchmark that evaluates whether vision-enabled large language models (LMs) can acquire, integrate, and reason over heterogeneous game knowledge in m... |
4.67 |
0% |
|
LLM Probability Concentration: How Alignment Shrinks the Generative Horizon |
Despite their impressive capabilities, aligned large language models (LLMs) often generate outputs that lack diversity. What drives this stability in the generation? We investigate this phenomenon thr... |
3.60 |
9% |
|
From movement to cognitive maps: recurrent neural networks reveal how locomotor development shapes hippocampal spatial coding |
The hippocampus contains neurons whose firing correlates with an animal's location and orientation in space. Collectively, these neurons are held to support a cognitive map of the environment, enablin... |
6.50 |
10% |
|
Learning Semantics, Not Addresses: Runtime Neural Prefetching for Far Memory |
Memory prefetching has long boosted CPU caches and is increasingly vital for far-memory systems, where large portions of memory are offloaded to cheaper, remote tiers. While effective prefetching requ... |
3.00 |
0% |
|
Hierarchical Contrastive Reinforcement Learning: learn representation more suitable for RL environments |
Goal-conditioned reinforcement learning holds significant importance for real-world environments, but its inherent sparse reward structure poses challenges. In recent years, some researchers have atte... |
3.00 |
0% |
|
CoIn: Coverage and Informativeness-Guided Token Reduction for Efficient Large Multimodal Models |
Large Multimodal Models (LMMs) have shown remarkable success in image understanding tasks. LMMs encode visual and textual inputs into tokens, which are then fed into Large Language Models (LLMs). Howe... |
4.00 |
17% |
|
KV-Prune: Key–Value Similarity for Online Structured Pruning for Large Language Models |
Pruning has emerged as a promising direction for accelerating large language model (LLM) inference, yet existing approaches often suffer from instability because they rely on offline calibration data ... |
4.00 |
19% |
|
Goal Reaching with Eikonal-Constrained Hierarchical Quasimetric Reinforcement Learning |
Goal-Conditioned Reinforcement Learning (GCRL) mitigates the difficulty of reward design by framing tasks as goal reaching rather than maximizing hand-crafted reward signals. In this setting, the opti... |
6.00 |
3% |
|
Dynamic Relational Priming Improves Transformer in Multivariate Time Series |
Standard attention mechanisms in transformers employ static token representations that remain unchanged across all pair-wise computations in each layer. This limits their representational alignment wi... |
4.67 |
0% |
|
LLMs Can Get "Brain Rot"! |
We propose and test the **LLM Brain Rot Hypothesis**: continual exposure to *junk web text* induces lasting cognitive decline in large language models (LLMs). To causally isolate data quality, we run ... |
4.50 |
0% |
|
Latent Light Source Modeling for Scene Reconstruction under Dynamic Illumination |
Modeling scenes under unknown, varying single-point illumination is crucial for applications such as interactive relighting, augmented reality, and robotics. However, existing dynamic novel-view synth... |
2.50 |
9% |
|
T-GINEE: A Tensor-Based Multi-Graph Representation Learning |
While traditional network analysis focuses on single-layer networks, real-world systems often exhibit multiple types of relationships simultaneously, forming multilayer networks. However, existing mul... |
4.00 |
69% |
|
DeepSketcher: Internalizing Visual Manipulation for Multimodal Reasoning |
The ''thinking with images'' paradigm represents a pivotal shift in the reasoning of Vision Language Models (VLMs), moving from text-dominant chain-of-thought to image-interactive reasoning. By invoki... |
4.00 |
0% |
|
Spherical Watermark: Encryption-Free, Lossless Watermarking for Diffusion Models |
Diffusion models have revolutionized image synthesis but raise concerns around content provenance and authenticity. Digital watermarking offers a means of tracing generated media, yet traditional sche... |
7.50 |
3% |
|
LogicSR: A Unified Benchmark for Logical Discovery from Data |
Discovering underlying logical expressions from data is a critical task for interpretable AI and scientific discovery, yet it remains poorly served by existing research infrastructure. The field of Sy... |
6.00 |
10% |
|
What Matters for Batch Online Reinforcement Learning in Robotics? |
The ability to learn from large batches of autonomously collected data for policy improvement---a paradigm we refer to as batch online reinforcement learning---holds the promise of enabling truly scal... |
4.50 |
0% |
|
Explaining the Reasoning of Large Language Models Using Attribution Graphs |
Large language models (LLMs) exhibit remarkable capabilities, yet their reasoning remains opaque, raising safety and trust concerns. Attribution methods, which assign credit to input features, have pr... |
4.00 |
0% |
|
SIRI: Scaling Iterative Reinforcement Learning with Interleaved Compression |
We introduce SIRI, **S**caling **I**terative **R**einforcement Learning with **I**nterleaved Compression, a simple yet effective RL approach for Large Reasoning Models (LRMs) that enables more efficie... |
3.50 |
0% |
|
Pixel to Gaussian: Ultra-Fast Continuous Super-Resolution with 2D Gaussian Modeling |
Arbitrary-scale super-resolution (ASSR) aims to reconstruct high-resolution (HR) images from low-resolution (LR) inputs with arbitrary upsampling factors using a single model, addressing the limitatio... |
6.00 |
0% |
|
Huxley-G\"odel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine |
Recent studies operationalize self-improvement through coding agents that edit their own codebases, grow a tree of self-modifications through expansion strategies that favor higher software engineerin... |
6.00 |
5% |
|
Matched-Pair Experimental Design with Active Learning |
Matched-pair experimental designs aim to detect treatment effects by pairing participants and comparing within-pair outcome differences. In many situations, the overall effect size across the entire p... |
4.00 |
0% |
|
Accelerated Parallel Tempering via Neural Transports |
Markov Chain Monte Carlo (MCMC) algorithms are essential tools in computational statistics for sampling from unnormalised probability distributions, but can be fragile when targeting high-dimensional,... |
4.00 |
0% |
|
LLMs Must Think Thrice to Solve Executable Counterfactuals |
Counterfactual reasoning, a hallmark of intelligence, consists of three steps: inferring latent variables from observations (abduction), constructing alternative situations (interventions), and predic... |
6.00 |
0% |
|
Structural Prognostic Event Modeling for Multimodal Cancer Survival Analysis |
The integration of histology images and gene profiles has shown great promise for improving survival prediction in cancer. However, current approaches often struggle to model intra- and inter-modal in... |
5.00 |
0% |
|
Test-Time Layer Recurrence Enables Ultra-Deep Thinking in LLMs Without Chain-of-Thought |
Transformers possess a \textbf{neural depth} of only $O(1)$, which restricts them to solving primarily \textbf{inductive} reasoning problems of bounded depth. In contrast, recurrent models allow the l... |
2.50 |
66% |
|
HoP: Homeomorphic Polar Learning for Hard Constrained Optimization |
Constrained optimization demands highly efficient solvers, which promotes the development of learn-to-optimize (L2O) approaches. As a data-driven method, L2O leverages neural networks to efficiently p... |
3.00 |
3% |
|
Accurate Estimation of Mutual Information in High Dimensional Data |
Mutual information (MI) is a fundamental measure of statistical dependence between two variables, yet accurate estimation from finite data remains notoriously difficult. No estimator is universally re... |
3.00 |
0% |
|
PhyMAGIC: Physical Motion-Aware Generative Inference with Confidence-guided LLM |
Recent advances in 3D content generation have amplified demand for dynamic models that are both visually realistic and physically consistent. However, state-of-the-art video diffusion models frequentl... |
4.00 |
47% |
|
Convergence and Connectivity: Asymptotic Dynamics of Multi-Agent Q-Learning in Random Networks |
Beyond specific settings, many multi-agent learning algorithms fail to converge to an equilibrium solution, instead displaying complex, non-stationary behaviours such as recurrent or chaotic orbits. I... |
4.50 |
0% |
|
MMWebGen: Benchmarking Multimodal Webpage Generation |
Multimodal generative models have advanced text-to-image generation and image editing. Recent unified models (UMs) can even craft interleaved images and text. However, the capacity of such models to s... |
3.50 |
0% |
|
Elucidating Guidance in Variance Exploding Diffusion Models: Fast Convergence and Better Diversity |
Recently, conditional diffusion models have shown impressive performance in many areas, such as text-to-image, 3D, and video. To achieve better alignment with the given condition, guidance-ba... |
4.50 |
0% |
|
Autoregressive Direct Preference Optimization |
Direct preference optimization (DPO) has emerged as a promising approach for aligning large language models (LLMs) with human preferences. However, the widespread reliance on the response-level Bradle... |
4.00 |
0% |
|
Reward Shaping Control Variates for Off-Policy Evaluation Under Sparse Rewards |
Off-policy evaluation (OPE) is essential for deploying reinforcement learning in safety-critical settings, yet existing estimators such as importance sampling and doubly robust (DR) often exhibit proh... |
4.00 |
26% |
|
Story-Iter: A Training-free Iterative Paradigm for Long Story Visualization |
This paper introduces **Story-Iter**, a new training-free iterative paradigm to enhance long-story generation. Unlike existing methods that rely on fixed reference images to construct a complete story... |
5.33 |
0% |
|
FAFO: Lossy KV Cache Compression for Lossless Inference Acceleration via Draftless Fumble Decoding |
Lossy KV cache compression is a well-explored subfield of machine learning efficiency, with improved latency being one of its major gains. However, lossy compression techniques can fumble from time to... |
4.50 |
0% |