EmbeddingGemma: Powerful and Lightweight Text Representations Paper • 2509.20354 • Published 2 days ago • 27
MERIT: Multilingual Semantic Retrieval with Interleaved Multi-Condition Query Paper • 2506.03144 • Published Jun 3 • 5
Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras Paper • 2507.17664 • Published Jul 23 • 1
Grounding Multilingual Multimodal LLMs With Cultural Knowledge Paper • 2508.07414 • Published Aug 10 • 1
RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback Paper • 2507.15024 • Published Jul 20 • 13
LBM: Latent Bridge Matching for Fast Image-to-Image Translation Paper • 2503.07535 • Published Mar 10 • 3
Leveraging Vision-Language Pre-training for Human Activity Recognition in Still Images Paper • 2506.13458 • Published Jun 16
FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation Paper • 2506.01144 • Published Jun 1 • 14
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 182
KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction Paper • 2505.23416 • Published May 29 • 11
Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity Paper • 2102.03065 • Published Feb 5, 2021
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps Paper • 2505.18675 • Published May 24 • 24
Date Fragments: A Hidden Bottleneck of Tokenization for Temporal Reasoning Paper • 2505.16088 • Published May 22 • 3