A Survey of Context Engineering for Large Language Models • Paper • 2507.13334 • Published Jul 17 • 252
ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents • Paper • 2507.22827 • Published Jul 30 • 98
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency • Paper • 2508.18265 • Published 19 days ago • 185
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search • Paper • 2509.07969 • Published 4 days ago • 54
Visual Representation Alignment for Multimodal Large Language Models • Paper • 2509.07979 • Published 4 days ago • 71