Update README.md
README.md
CHANGED
@@ -13,6 +13,8 @@ base_model:
 
 <img src="https://huggingface.co/datasets/tokyotech-llm/swallow-math/resolve/main/figures/swallow-code-math-log.png" alt="SwallowMath Icon" width="600">
 
+<img src="https://huggingface.co/datasets/tokyotech-llm/swallow-code/resolve/main/assets/experiments.png" width="800">
+
 ## Model Summary
 
 This model is a continual pre-training of [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) on a mix of the Python subset of [The-Stack-v2-train-smol-ids](https://huggingface.co/datasets/bigcode/the-stack-v2-train-smol-ids) (from [SwallowCode, Experiment 1](https://huggingface.co/datasets/tokyotech-llm/swallow-code)) and multilingual text datasets.