Collections
Discover the best community collections!
Collections including paper arxiv:2401.01256
- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 24
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 111
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 74
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 33

- UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
  Paper • 2311.09257 • Published • 48
- VideoPoet: A Large Language Model for Zero-Shot Video Generation
  Paper • 2312.14125 • Published • 47
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
  Paper • 2312.16862 • Published • 31
- VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM
  Paper • 2401.01256 • Published • 21

- SIGNeRF: Scene Integrated Generation for Neural Radiance Fields
  Paper • 2401.01647 • Published • 13
- Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions
  Paper • 2401.01827 • Published • 18
- VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM
  Paper • 2401.01256 • Published • 21
- TrailBlazer: Trajectory Control for Diffusion-Based Video Generation
  Paper • 2401.00896 • Published • 16

- StarVector: Generating Scalable Vector Graphics Code from Images
  Paper • 2312.11556 • Published • 36
- Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model
  Paper • 2312.12423 • Published • 13
- SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing
  Paper • 2312.11392 • Published • 20
- stabilityai/stable-video-diffusion-img2vid-xt
  Image-to-Video • Updated • 492k • 3.15k

- FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline
  Paper • 2311.13073 • Published • 58
- MetaDreamer: Efficient Text-to-3D Creation With Disentangling Geometry and Texture
  Paper • 2311.10123 • Published • 18
- GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
  Paper • 2311.12631 • Published • 15
- VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models
  Paper • 2312.00845 • Published • 39

- A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
  Paper • 2310.16656 • Published • 47
- CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images
  Paper • 2310.16825 • Published • 36
- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 43
- I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
  Paper • 2311.04145 • Published • 35