Create README.md
README.md
---
license: cc-by-nc-4.0
datasets:
- OOPPEENN/56697375616C4E6F76656C5F44617461736574
- amphion/Emilia-Dataset
- litagin/ehehe-corpus
- joujiboi/japanese-anime-speech
language:
- ja
base_model:
- HKUSTAudio/Llasa-1B-Multilingual
pipeline_tag: text-to-speech
---

# Galgame-Llasa-1B-v3

## Overview

This is version 3 of Galgame-Llasa-1B, a text-to-speech (TTS) model fine-tuned for Japanese. It is based on [HKUSTAudio/Llasa-1B-Multilingual](https://huggingface.co/HKUSTAudio/Llasa-1B-Multilingual).

## What's New in v3?

The primary improvement in v3 is a **modified text normalization process** applied during training.

This change makes speech synthesis more consistent and accurate, building on the improvements introduced in v2.

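This README does not describe what the modified normalization actually does. As a rough illustration of the kind of step a Japanese TTS text normalizer often includes, here is a minimal sketch using Python's standard `unicodedata` module. The function name `normalize_text` and the choice of NFKC plus whitespace collapsing are assumptions made for illustration, not the pipeline used to train this model.

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Illustrative normalization only (hypothetical; not the v3 training pipeline)."""
    # NFKC folds full-width ASCII letters and digits to their ASCII forms,
    # converts half-width katakana to full-width katakana, and maps the
    # ideographic space (U+3000) to a regular space.
    text = unicodedata.normalize("NFKC", text)
    # Collapse any run of whitespace into a single space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


if __name__ == "__main__":
    print(normalize_text("ＡＢＣ　１２３"))  # ABC 123
    print(normalize_text("ｶﾞﾙｹﾞｰ"))  # ガルゲー
```

Whatever normalization a model was trained with, applying the same rules to input text at inference time generally avoids mismatched character forms. Note that NFKC also rewrites some punctuation (for example, full-width `！` becomes `!`), which may or may not be desirable for a given model.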

## What's New in v2 (from v1)?

Version 2 was trained on a larger and more diverse dataset that combined the original Galgame dataset with other sources.

As a result, v2 offered several key improvements over v1:

- **Improved Kanji Reading:** The model read Kanji characters more accurately.
- **Enhanced Prosody:** The generated speech had more natural intonation and expressiveness.
- **Greater Voice Diversity:** The model could produce a wider range of voice styles than the previous version.

## License

This model is licensed under **CC-BY-NC-4.0**.