Saba
About the model
Saba is a BERT model for Italian poetry.
It was obtained via continued pretraining of dbmdz/bert-base-italian-xxl-cased
on ~40k Italian poems from Wikisource and Biblioteca Italiana.
The objective was Masked Language Modeling (MLM).
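As a rough illustration of the MLM objective, the sketch below applies the masking recipe from the original BERT paper: about 15% of positions are selected, and of those 80% become `[MASK]`, 10% become a random token, and 10% are left unchanged. These rates, the function name `mask_tokens`, and the use of whole words instead of WordPiece IDs are assumptions made for readability; the actual settings used to train Saba are in the training code and may differ.

```python
import random

MASK_TOKEN = "[MASK]"  # mask token used by BERT tokenizers


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Corrupt `tokens` BERT-style and return (corrupted, labels).

    labels[i] is the original token at positions selected for prediction
    and None elsewhere, so a loss would only be computed on selected positions.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for token in tokens:
        # Leave most positions untouched and unscored.
        if rng.random() >= mask_prob:
            corrupted.append(token)
            labels.append(None)
            continue
        labels.append(token)
        roll = rng.random()
        if roll < 0.8:
            corrupted.append(MASK_TOKEN)         # 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted.append(rng.choice(vocab))  # 10%: replace with a random token
        else:
            corrupted.append(token)              # 10%: keep the original token
    return corrupted, labels


verse = "Nel mezzo del cammin di nostra vita".split()
corrupted, labels = mask_tokens(verse, vocab=verse, seed=0)
print(corrupted)
print(labels)
```

The model is then trained to recover the original token at every position where `labels` is not `None`.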
The training code is available on GitHub.
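A minimal usage sketch with the Hugging Face `transformers` fill-mask pipeline is shown below. It assumes the checkpoint published as `mattiaferrarini/saba` includes the MLM head and uses the standard BERT `[MASK]` token; the helper name `top_predictions` is hypothetical and not part of the model's repository.

```python
def top_predictions(text, model_id="mattiaferrarini/saba", top_k=5):
    """Return (token, score) pairs for the single [MASK] token in `text`."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id)
    # With exactly one mask in the input, the pipeline returns a list of dicts.
    return [(p["token_str"], p["score"]) for p in fill_mask(text, top_k=top_k)]
```

Calling `top_predictions("Nel mezzo del cammin di nostra [MASK].")` downloads the checkpoint on first use and returns the five most likely fillers with their scores. Passing `model_id="dbmdz/bert-base-italian-xxl-cased"` runs the same query against the base model for comparison.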
Evaluation
The base model and the adapted model were tested on a held-out set of ~1k poems with the following results:
| Model | MLM Loss | Perplexity |
|---|---|---|
| Base | 3.39 | 29.56 |
| Saba | 1.94 | 6.94 |
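The two columns are related: perplexity is conventionally the exponential of the mean cross-entropy loss. The sketch below recomputes it from the rounded losses, assuming the loss is a mean over masked tokens measured in nats; the card does not state exactly how the reported perplexities were computed.

```python
import math


def perplexity(mean_mlm_loss):
    """Perplexity as exp of the mean cross-entropy loss (in nats)."""
    return math.exp(mean_mlm_loss)


# (loss, perplexity) pairs as reported in the table.
reported = {"Base": (3.39, 29.56), "Saba": (1.94, 6.94)}

for name, (loss, ppl) in reported.items():
    print(f"{name}: exp({loss}) = {perplexity(loss):.2f}, reported {ppl}")
```

The recomputed values are close to the reported ones but not identical, which is consistent with the losses having been rounded to two decimals before being listed.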
An evaluation of the learned representations will be made available once a suitable dataset has been created or identified.
Why Saba?
Following the tradition of giving Italian names to BERT models for the Italian language (see AlBERTo, GilBERTo, UmBERTo), we dedicate this model to the Italian poet and novelist Umberto Saba (9 March 1883 – 25 August 1957).
Model tree for mattiaferrarini/saba
Base model
dbmdz/bert-base-italian-xxl-cased