xiaobin zhuang

xiaobinzhuang

https://scholar.google.com/citations?user=a-crUqgAAAAJ&hl=zh-CN

auzxb

AI & ML interests

multimodal; audio generation; post-training

Recent Activity

authored a paper 9 days ago

Seed-TTS: A Family of High-Quality Versatile Speech Generation Models

authored a paper 9 days ago

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

authored a paper 9 days ago

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Organizations

None yet

xiaobinzhuang's activity

authored 4 papers 9 days ago

Seed-TTS: A Family of High-Quality Versatile Speech Generation Models

Paper • 2406.02430 • Published Jun 4, 2024 • 37

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

Paper • 2502.03930 • Published Feb 6

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Paper • 2504.08685 • Published 18 days ago • 122

KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke

Paper • 2110.09121 • Published Oct 18, 2021

liked a dataset about 1 month ago

google/Synthetic-Persona-Chat

Viewer • Updated Mar 1, 2024 • 10.9k • 895 • 107

liked a model 5 months ago

THUDM/cogvlm2-llama3-caption

Video-Text-to-Text • Updated Jan 22 • 7.26k • 95

New activity in Sunbread/isekai-rolename-vae 8 months ago

a question about reparameterize

#1 opened 8 months ago by

liked a model 10 months ago

apple/DFN5B-CLIP-ViT-H-14

Updated Feb 28 • 58.2k • 42

liked a Space 11 months ago

Enclap

liked a model 11 months ago

THUDM/glm-4-9b-chat

Updated Mar 13 • 220k • 677

updated a model 11 months ago

xiaobinzhuang/videofoley

Text-to-Image • Updated Jun 2, 2024 • 10

liked a Space about 1 year ago

NaturalSpeech3 FACodec

Convert and reconstruct speech files