Collections
Collections including paper arxiv:2504.11427
- CoRAG: Collaborative Retrieval-Augmented Generation
  Paper • 2504.01883 • Published • 10
- VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
  Paper • 2504.08837 • Published • 42
- Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
  Paper • 2504.10068 • Published • 30
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
  Paper • 2504.10481 • Published • 84

- GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
  Paper • 2504.01016 • Published • 29
- TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
  Paper • 2503.05638 • Published • 18
- StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos
  Paper • 2409.07447 • Published
- DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
  Paper • 2409.02095 • Published • 37

- AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM
  Paper • 2503.04504 • Published • 2
- Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion
  Paper • 2503.15851 • Published • 10
- NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors
  Paper • 2504.11427 • Published • 17

- WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
  Paper • 2401.09985 • Published • 17
- CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
  Paper • 2401.09962 • Published • 9
- Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
  Paper • 2401.10404 • Published • 10
- ActAnywhere: Subject-Aware Video Background Generation
  Paper • 2401.10822 • Published • 13