Orange/Speaker-wavLM-pro

ggmbr committed on Feb 5

Commit 00bcf21 · 1 Parent(s): bbef05c

refs etc

Files changed (1)

README.md +25 -5

README.md CHANGED

@@ -14,13 +14,33 @@ datasets:
 ---
 # Non-timbral Embeddings extractor
-This model has been derived from the self-supervised pretrained model WavLM-large [link]. It produces embeddings that represent the non-timbral traits (prosody, accent, ...) of a speaker,
-which can be used in the same way as classical ASV (automatic speaker verification) embeddings, except that only the non-timbral traits are compared.
-See the section below for an explanation of how to use these embeddings.
-# Citation
-paper
 # Usage
 code

 ---
 # Non-timbral Embeddings extractor
+This model produces embeddings that represent the non-timbral traits (prosody, accent, ...) of a speaker's voice. These embeddings can be used in the same way as classical
+automatic speaker verification (ASV) embeddings: to compare two voice signals, extract an embedding for each of them and compute the cosine similarity between the two embeddings.
+The main difference from classical ASV embeddings is that here only the non-timbral traits are compared.
+The model has been derived from the self-supervised pretrained model [WavLM-large](https://huggingface.co/microsoft/wavlm-large).
+See the section below for an explanation of how to compute the non-timbral embeddings.
+# Publication
+Details about the method used to build this model have been published at Interspeech 2024 in the paper entitled
+[Disentangling prosody and timbre embeddings via voice conversion](https://www.isca-archive.org/interspeech_2024/gengembre24_interspeech.pdf).
+## Citation
+Gengembre, N., Le Blouch, O., Gendrot, C. (2024) Disentangling prosody and timbre embeddings via voice conversion. Proc. Interspeech 2024, 2765-2769, doi: 10.21437/Interspeech.2024-207
+## BibTeX citation
+'''
+@inproceedings{gengembre24_interspeech,
+  title     = {Disentangling prosody and timbre embeddings via voice conversion},
+  author    = {Nicolas Gengembre and Olivier {Le Blouch} and Cédric Gendrot},
+  year      = {2024},
+  booktitle = {Interspeech 2024},
+  pages     = {2765--2769},
+  doi       = {10.21437/Interspeech.2024-207},
+  issn      = {2958-1796},
+}
+'''
 # Usage
 code
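
The comparison step described in the updated README text (one embedding per voice signal, then the cosine similarity between the two) can be sketched as follows. This is a minimal illustration, not the model's own code: the Usage section of this commit does not yet show how embeddings are extracted, so the two vectors below are made-up placeholders, and `cosine_similarity` is a hypothetical helper name.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder values (assumption): in practice these would be the
# non-timbral embeddings extracted from two voice signals.
emb_a = [0.2, -0.1, 0.7, 0.4]
emb_b = [0.1, -0.2, 0.6, 0.5]

score = cosine_similarity(emb_a, emb_b)
print(f"non-timbral similarity: {score:.3f}")
```

A score close to 1 means the two vectors point in nearly the same direction; how to threshold it for a given application is not specified in the source.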