InternScenes: A Large-scale Simulatable Indoor Scene Dataset with Realistic Layouts Paper • 2509.10813 • Published 6 days ago • 29
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling Paper • 2509.12201 • Published 3 days ago • 91
LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence Paper • 2509.12203 • Published 3 days ago • 15
UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning Paper • 2509.11543 • Published 4 days ago • 42
MCP-AgentBench: Evaluating Real-World Language Agent Performance with MCP-Mediated Tools Paper • 2509.09734 • Published 9 days ago • 14
World Modeling with Probabilistic Structure Integration Paper • 2509.09737 • Published 8 days ago • 10
QuantAgent: Price-Driven Multi-Agent LLMs for High-Frequency Trading Paper • 2509.09995 • Published 7 days ago • 7
Inpainting-Guided Policy Optimization for Diffusion Large Language Models Paper • 2509.10396 • Published 6 days ago • 14
HANRAG: Heuristic Accurate Noise-resistant Retrieval-Augmented Generation for Multi-hop Question Answering Paper • 2509.09713 • Published 11 days ago • 22
InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis Paper • 2509.10441 • Published 6 days ago • 27
X-Part: high fidelity and structure coherent shape decomposition Paper • 2509.08643 • Published 9 days ago • 23
The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs Paper • 2509.09677 • Published 7 days ago • 28
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing Paper • 2509.08721 • Published 8 days ago • 576
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model Paper • 2509.09372 • Published 8 days ago • 185
Can Understanding and Generation Truly Benefit Together -- or Just Coexist? Paper • 2509.09666 • Published 7 days ago • 32
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning Paper • 2509.08519 • Published 9 days ago • 113
SpatialVID: A Large-Scale Video Dataset with Spatial Annotations Paper • 2509.09676 • Published 7 days ago • 28
FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark Paper • 2509.09680 • Published 7 days ago • 36