Update README.md

README.md

The **Llama-3.1-8B-Adapted** collection of large language models (LLMs) consists of 8B generative models (text in/text out) adapted from **Llama-3.1-8B**.

*Llama-3.1-8B-Italian-FVT* is a Llama model that was continually trained after its tokenizer was substituted.

The tokenizer of this model after adaptation is the same as that of [Minerva-3B](https://huggingface.co/sapienzanlp/Minerva-3B-base-v1.0).
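
The "FVT" in the model's name suggests fast vocabulary transfer as the substitution method. The sketch below is a minimal illustration of that idea under an assumed mean-subtoken initialization, not the authors' implementation (see the repository linked at the end of this card); the two checkpoints are the public Llama and Minerva repos.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed FVT-style initialization: each token of the new (Minerva) vocabulary
# starts as the mean of the base model's embeddings of the sub-tokens that the
# old (Llama) tokenizer produces for the same string.
old_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
new_tok = AutoTokenizer.from_pretrained("sapienzanlp/Minerva-3B-base-v1.0")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")

old_emb = model.get_input_embeddings().weight.detach()
new_emb = torch.zeros(len(new_tok), old_emb.shape[1], dtype=old_emb.dtype)

for token_id in range(len(new_tok)):
    piece = new_tok.decode([token_id])
    old_ids = old_tok.encode(piece, add_special_tokens=False)
    if old_ids:  # skip tokens whose string the old tokenizer cannot encode
        new_emb[token_id] = old_emb[old_ids].mean(dim=0)

# Shrink the embedding matrices to the new vocabulary and copy the transfer
# (the output head, resized alongside, can be initialized the same way).
model.resize_token_embeddings(len(new_tok))
model.get_input_embeddings().weight.data.copy_(new_emb)
```

Continual training on the Italian-skewed mix described below then adapts the model to the new vocabulary.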

**Model developers:** SapienzaNLP, ISTI-CNR, ILC-CNR

**Model Architecture:** Llama-3.1-8B-Adapted is an auto-regressive language model that uses an optimized transformer architecture.

## Data used for the adaptation

The **Llama-3.1-8B-Adapted** model was trained on a collection of Italian and English data extracted from [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX).

The data was skewed toward Italian, with English making up one quarter of the mix: the first 9B tokens from the Italian portion of CulturaX and the first 3B tokens from the English portion.
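
As a sketch of that extraction (the selection logic and token counting are assumptions, not the authors' script), the `datasets` streaming API can pull documents from each language split until a token budget is reached:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Tokenizer used only to count tokens against the budget (an assumption;
# the Minerva tokenizer is the one the adapted model uses).
tok = AutoTokenizer.from_pretrained("sapienzanlp/Minerva-3B-base-v1.0")

def first_n_tokens(lang: str, budget: int):
    """Yield CulturaX documents for `lang` until about `budget` tokens are seen."""
    stream = load_dataset("uonlp/CulturaX", lang, split="train", streaming=True)
    seen = 0
    for doc in stream:
        yield doc["text"]
        seen += len(tok(doc["text"]).input_ids)
        if seen >= budget:
            break

italian_docs = first_n_tokens("it", 9_000_000_000)  # ~9B Italian tokens
english_docs = first_n_tokens("en", 3_000_000_000)  # ~3B English tokens
```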

The model can be queried with the `transformers` text-generation pipeline:

```python
import transformers

model_id = "..."  # placeholder: this model's Hugging Face repository id

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
)
pipeline("Cosa si può fare in una bella giornata di sole?")
```

Code: https://github.com/SapienzaNLP/sava

## Citation

If you use any part of this work, please consider citing the paper as follows: