Scaling Laws for Native Multimodal Models Scaling Laws for Native Multimodal Models Paper • 2504.07951 • Published 28 days ago • 27
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency Paper • 2503.20785 • Published Mar 26 • 21
Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals Paper • 2503.19953 • Published Mar 25 • 3
FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement Paper • 2503.04919 • Published Mar 6 • 8
Open Deep Search: Democratizing Search with Open-source Reasoning Agents Paper • 2503.20201 • Published Mar 26 • 46
Cosmos Transfer1 Collection Multimodal Conditional World Generation for World2World Transfer • 6 items • Updated 3 days ago • 15
FFN Fusion: Rethinking Sequential Computation in Large Language Models Paper • 2503.18908 • Published Mar 24 • 19
Vision-R1: Evolving Human-Free Alignment in Large Vision-Language Models via Vision-Guided Reinforcement Learning Paper • 2503.18013 • Published Mar 23 • 19
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering Paper • 2503.16422 • Published Mar 20 • 14
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse Paper • 2503.16365 • Published Mar 20 • 40
Physical AI Collection Collection of commercial-grade datasets for physical AI developers • 10 items • Updated 3 days ago • 50