refs etc
README.md
CHANGED
@@ -14,13 +14,33 @@ datasets:
---

# Non-timbral Embeddings extractor
This model produces embeddings that represent the non-timbral traits (prosody, accent, ...) of a speaker's voice. These embeddings can be used in the same way as classical
automatic speaker verification (ASV) embeddings: to compare two voice signals, extract an embedding for each of them and compute the cosine similarity between the two embeddings.
The main difference from classical ASV embeddings is that only the non-timbral traits are compared here.
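
The comparison step described above can be sketched in a few lines. This is a minimal illustration, not the model's own code: the embedding values are hypothetical stand-ins for the vectors the model would produce, the function name `cosine_similarity` is an assumption, and how the embeddings are extracted is not shown here.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional embeddings (assumption: real embeddings
# come from the model and have a much larger dimension).
emb_a = [0.2, -0.1, 0.7, 0.4]
emb_b = [0.1, -0.2, 0.6, 0.5]
emb_c = [-0.2, 0.1, -0.7, -0.4]  # points in the opposite direction to emb_a

score_ab = cosine_similarity(emb_a, emb_b)
score_ac = cosine_similarity(emb_a, emb_c)
print(f"a vs b: {score_ab:.3f}")
print(f"a vs c: {score_ac:.3f}")
```

A score close to 1 indicates similar non-timbral traits, while a score close to -1 indicates opposite directions in the embedding space. Any decision threshold would have to be chosen on held-out data; none is given here.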

The model has been derived from the self-supervised pretrained model [WavLM-large](https://huggingface.co/microsoft/wavlm-large).

See the section below for an explanation of how to compute the non-timbral embeddings.

# Publication
Details about the method used to build this model have been published at Interspeech 2024 in the paper entitled
[Disentangling prosody and timbre embeddings via voice conversion](https://www.isca-archive.org/interspeech_2024/gengembre24_interspeech.pdf).

## Citation
Gengembre, N., Le Blouch, O., Gendrot, C. (2024) Disentangling prosody and timbre embeddings via voice conversion. Proc. Interspeech 2024, 2765-2769, doi: 10.21437/Interspeech.2024-207

## BibTeX citation
```
@inproceedings{gengembre24_interspeech,
  title = {Disentangling prosody and timbre embeddings via voice conversion},
  author = {Nicolas Gengembre and Olivier {Le Blouch} and Cédric Gendrot},
  year = {2024},
  booktitle = {Interspeech 2024},
  pages = {2765--2769},
  doi = {10.21437/Interspeech.2024-207},
  issn = {2958-1796},
}
```

# Usage
code