Commit bc1650c · verified · 1 Parent(s): e6ca2fd
Committed by WonsukYangTL and nielsr (HF Staff)

Add paper link to model card (#5)


- Add paper link to model card (797ecea8a3e604c7070e169fc1354b49fce51349)


Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -1,15 +1,15 @@
  ---
- license: apache-2.0
- tags:
- - finetuned
- - chat
  language:
  - en
  - ko
  - ja
  - zh
- pipeline_tag: text-generation
  library_name: transformers
+ license: apache-2.0
+ pipeline_tag: text-generation
+ tags:
+ - finetuned
+ - chat
  ---

  # Trillion-7B-preview
@@ -22,7 +22,7 @@ library_name: transformers

  ## Introduction

- We introduce Trillion-7B-preview, a preview of our latest large language model designed to push the boundaries of multilingual scalability and performance.
+ We introduce Trillion-7B-preview, a preview of our latest large language model designed to push the boundaries of multilingual scalability and performance. This model is presented in the paper: [Trillion-7B-preview](https://huggingface.co/papers/2504.15431).


  When comparing performance to training FLOPs for Trillion-7B-preview with competitive models, our model pushes the Pareto frontier, achieving around 66.5% average performance while using significantly fewer compute (~9.3×10²² FLOPs). It outperforms models like Mistral-7B-Instruct-v0.3 and SOLAR-10.7B-Instruct-v1.0 while remaining competitive with models requiring 3-8× more compute such as Qwen2.5-7B-Instruct and EXAONE-3.5-7.8B-Instruct. For full benchmark results, see tables below.
@@ -240,4 +240,4 @@ This model repository is licensed under the Apache-2.0 License.
  }
  ```
  ## Contact
- For inquiries, please contact: info@trillionlabs.co
+ For inquiries, please contact: info@trillionlabs.co
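
For context, the reordered front matter declares `library_name: transformers` and `pipeline_tag: text-generation`, and tags the model as `chat`. Below is a minimal sketch of what that metadata implies for loading the model with the `transformers` library; the repository id `trillionlabs/Trillion-7B-preview` and the presence of a chat template are assumptions, not something stated in this commit.

```python
# Minimal sketch, not part of this commit: loading the model as suggested by the
# `library_name: transformers` and `pipeline_tag: text-generation` metadata.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "trillionlabs/Trillion-7B-preview"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The `chat` tag suggests a chat template; apply it to build the prompt.
messages = [{"role": "user", "content": "Introduce yourself in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```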