Submitted by fuvty 63 Cache-to-Cache: Direct Semantic Communication Between Large Language Models Tsinghua-NICS-EFC 23 2
Submitted by forde450 59 Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer inclusionAI 62 1
Submitted by taesiri 39 Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Alpha-VLLM 743 1
Submitted by dcml0714 32 SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models · 10 authors 1
Submitted by zoeyuchao 30 RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training RLinf 475 1
Submitted by taesiri 29 MATRIX: Mask Track Alignment for Interaction-aware Video Generation · 8 authors 22 1
Submitted by whyu 20 Artificial Hippocampus Networks for Efficient Long-Context Modeling ByteDance Seed 28 1
Submitted by imsheriff 18 The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP University of California, Los Angeles 1
Submitted by amphora 18 Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought KO-REAson 1
Submitted by huggingaaaaa 17 Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention Tsinghua University 2 1
Submitted by tangzhy 17 CALM Before the STORM: Unlocking Native Reasoning for Optimization Modeling · 12 authors 1
Submitted by FSCCS 14 OBS-Diff: Accurate Pruning For Diffusion Models in One-Shot Westlake University 24 1
Submitted by kazemnejad 14 The Markovian Thinker Mila – Quebec Artificial Intelligence Institute 13 1
Submitted by ZetangForward 13 Revisiting Long-context Modeling from Context Denoising Perspective Soochow University 2 2
Submitted by XinXuNLPer 12 When Benchmarks Age: Temporal Misalignment through Large Language Model Factuality Evaluation McAuley-Lab 3 1
Submitted by Chenfei-Liao 10 Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods · 13 authors 1
Submitted by MingyuLiu 10 StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation Zhejiang University 2
Submitted by JimmyMa99 9 Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs · 14 authors 12 1
Submitted by taesiri 8 TTRV: Test-Time Reinforcement Learning for Vision Language Models · 10 authors 5 1
Submitted by XuWuLingYu 5 WristWorld: Generating Wrist-Views via 4D World Models for Robotic Manipulation Peking University 1
Submitted by myownskyW7 5 G^2RPO: Granular GRPO for Precise Reward in Flow Models IXCLab@Shanghai AI Lab 17 1
Submitted by talzoomanzoo 4 Revisiting the Uniform Information Density Hypothesis in LLM Reasoning Traces Yonsei University 1
Submitted by taesiri 3 AlphaApollo: Orchestrating Foundation Models and Professional Tools into a Self-Evolving System for Deep Agentic Reasoning · 17 authors 7 1
Submitted by taesiri 2 U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking · 10 authors 28 1
Submitted by RajveeSheth 2 Beyond Monolingual Assumptions: A Survey of Code-Switched NLP in the Era of Large Language Models Lingo Research Group 1 1
Submitted by EddyLuo 2 Code Agent can be an End-to-end System Hacker: Benchmarking Real-world Threats of Computer-use Agent Momoka 1
Submitted by yasNing 2 DeepTravel: An End-to-End Agentic Reinforcement Learning Framework for Autonomous Travel Planning Agents Didi Chuxing 1
Submitted by Yanran21 1 D^3QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection · 8 authors 3 1
Submitted by sam-motamed - TRAVL: A Recipe for Making Video-Language Models Better Judges of Physics Implausibility Institute for Computer Science, Artificial Intelligence and Technology 1
Submitted by Dragongon - PuzzlePlex: Benchmarking Foundation Models on Reasoning and Planning with Puzzles · 9 authors 1
Submitted by Dragongon - FinLFQA: Evaluating Attributed Text Generation of LLMs in Financial Long-Form Question Answering · 5 authors 1