gao

ym9

AI & ML interests

None yet

Recent Activity

upvoted 2 papers 23 days ago

Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis

Paper • 2509.09595 • Published 24 days ago • 47

HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning

Paper • 2509.08519 • Published 25 days ago • 124

upvoted 2 papers 25 days ago

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published 25 days ago • 55

RewardDance: Reward Scaling in Visual Generation

Paper • 2509.08826 • Published 25 days ago • 69

upvoted a paper about 2 months ago

NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale

Paper • 2508.10711 • Published Aug 14 • 142

upvoted 2 papers 2 months ago

Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4 • 256

ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts

Paper • 2507.20939 • Published Jul 28 • 56

upvoted 3 papers 3 months ago

XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation

Paper • 2506.21416 • Published Jun 26 • 28

GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning

Paper • 2506.16141 • Published Jun 19 • 27

AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models

Paper • 2506.19851 • Published Jun 24 • 59

upvoted a paper 6 months ago

GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation

Paper • 2504.02782 • Published Apr 3 • 57

liked 2 Spaces 6 months ago

FLUX.1 [dev]

Space • Generate images from text prompts • 9.1k

PuLID-FLUX

Space • Generate images from text prompts and ID images • 2.04k

upvoted 3 papers 7 months ago

DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers

Paper • 2503.14487 • Published Mar 18 • 27

Unleashing Vecset Diffusion Model for Fast Shape Generation

Paper • 2503.16302 • Published Mar 20 • 43

OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting

Paper • 2503.08677 • Published Mar 11 • 29

liked a Space 7 months ago

Image Generation & Editing

Space • Generate and Edit images with Gemini 2.0 • 141

upvoted 2 papers 7 months ago

TPDiff: Temporal Pyramid Video Diffusion Model

Paper • 2503.09566 • Published Mar 12 • 45

How far can we go with ImageNet for Text-to-Image generation?

Paper • 2502.21318 • Published Feb 28 • 26

upvoted a paper 8 months ago

Fast Video Generation with Sliding Tile Attention

Paper • 2502.04507 • Published Feb 6 • 51
