link to HF space
README.md
CHANGED
@@ -41,7 +41,7 @@ pipeline_tag: text-to-speech
 
 GitHub project: https://github.com/DanRuta/xVA-Synth
 
-The base model for training other xVASynth's "xVAPitch" type models (v3). Model itself is used by the xVATrainer TTS model training app and not for inference. All created by Dan ["@dr00392"](https://huggingface.co/dr00392) Ruta.
+The base model for training other [🤗 xVASynth's](https://huggingface.co/spaces/Pendrokar/xVASynth-TTS) "xVAPitch" type models (v3). Model itself is used by the xVATrainer TTS model training app and not for inference. All created by Dan ["@dr00392"](https://huggingface.co/dr00392) Ruta.
 
 `The v3 model now uses a slightly custom tweaked VITS/YourTTS model. Tweaks including larger capacity, bigger lang embedding, custom symbol set (a custom spec of ARPAbet with some more phonemes to cover other languages), and I guess a different training script.` - Dan Ruta
 
@@ -52,6 +52,8 @@ xVAPitch_5820651 model sample: <audio controls>
 Your browser does not support the audio element.
 </audio>
 
+There are hundreds of fine-tuned models on the web. But most of them use non-permissive datasets.
+
 Papers:
 - VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech - https://arxiv.org/abs/2106.06103
 - YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for Everyone - https://arxiv.org/abs/2112.02418