MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML Paper • 2509.06806 • Published 11 days ago • 60
Sherlock: Self-Correcting Reasoning in Vision-Language Models Paper • 2505.22651 • Published May 28 • 50
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data Paper • 2505.18445 • Published May 24 • 64
PixelHacker: Image Inpainting with Structural and Semantic Consistency Paper • 2504.20438 • Published Apr 29 • 44
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13 • 194
Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection Paper • 2409.08513 • Published Sep 13, 2024 • 15
Robust Dual Gaussian Splatting for Immersive Human-centric Volumetric Videos Paper • 2409.08353 • Published Sep 12, 2024 • 13
Apollo: Band-sequence Modeling for High-Quality Audio Restoration Paper • 2409.08514 • Published Sep 13, 2024 • 12
A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis Paper • 2409.08947 • Published Sep 13, 2024 • 14
IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation Paper • 2409.08240 • Published Sep 12, 2024 • 22
Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources Paper • 2409.08239 • Published Sep 12, 2024 • 21
TextBoost: Towards One-Shot Personalization of Text-to-Image Models via Fine-tuning Text Encoder Paper • 2409.08248 • Published Sep 12, 2024 • 16
Can OOD Object Detectors Learn from Foundation Models? Paper • 2409.05162 • Published Sep 8, 2024 • 9
DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors Paper • 2409.08278 • Published Sep 12, 2024 • 15
PiTe: Pixel-Temporal Alignment for Large Video-Language Model Paper • 2409.07239 • Published Sep 11, 2024 • 15