Correct pipeline tag and add Github link
#1 opened by nielsr (HF Staff)

README.md CHANGED
@@ -1,11 +1,11 @@
 ---
-library_name: transformers
-license: apache-2.0
-language:
-- en
 base_model:
 - answerdotai/ModernBERT-large
-
 author: Shreyan C (@thethinkmachine)
 ---
@@ -88,9 +88,7 @@ print("Scaled Complexity Score:", get_scaled_complexity_score(query))
 
 ### Training Data
 
-We use the [BhabhaAI/DEITA-Complexity](https://huggingface.co/datasets/BhabhaAI/DEITA-Complexity) dataset for training the model. The dataset contains 66.5K diverse English instructions along with complexity scores computed using the DEITA-Evol-Complexity scoring scheme, which uses an LLM judge to rank a sextuple of 1 seed + 5 progressively complexified (*evolved*) instructions by complexity and difficulty. The scheme assigns scores in the [1, 6] range, with 1 the least complex and 6 the most complex.
-
-However, the training dataset was observed to contain instruction-score pairs spanning the range [0, 9]. We suspect this range includes scoring errors, as the anomalous scores (0, 7, 8, 9) account for less than 1% of the total instructions.
 
 The distribution of scores within the dataset is as follows:
 | Score | Frequency | Relative Freq. |
@@ -142,7 +140,7 @@ You are advised to use the model keeping these factors in mind.
 
 ### CO2 Emissions
 
-Experiments were conducted using Google Cloud Platform in region asia-south1, which has a carbon efficiency of 0.92 kgCO2eq/kWh. A cumulative of 13.24 hours of computation was performed on hardware of type L4 (TDP of 72W)
 
 Total emissions are estimated to be 0.87 kgCO2eq, of which 100% was directly offset by the cloud provider.
@@ -164,4 +162,5 @@ For any queries, suggestions or feedback, please contact Shreyan C at *shreyan(a
 - [[2312.15685] What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning](https://arxiv.org/abs/2312.15685)
 - [[2404.02948] PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models](https://arxiv.org/abs/2404.02948)
 - [DEITA-Complexity](https://huggingface.co/datasets/BhabhaAI/DEITA-Complexity)
-- [ModernBERT-Large](https://huggingface.co/answerdotai/ModernBERT-large)
 ---
 base_model:
 - answerdotai/ModernBERT-large
+language:
+- en
+library_name: transformers
+license: apache-2.0
+pipeline_tag: text-generation
 author: Shreyan C (@thethinkmachine)
 ---
 
 ### Training Data
 
+We use the [BhabhaAI/DEITA-Complexity](https://huggingface.co/datasets/BhabhaAI/DEITA-Complexity) dataset for training the model. The dataset contains 66.5K diverse English instructions along with complexity scores computed using the DEITA-Evol-Complexity scoring scheme, which uses an LLM judge to rank a sextuple of 1 seed + 5 progressively complexified (*evolved*) instructions by complexity and difficulty. The scheme assigns scores in the [1, 6] range, with 1 the least complex and 6 the most complex. However, the training dataset was observed to contain instruction-score pairs spanning the range [0, 9]. We suspect this range includes scoring errors, as the anomalous scores (0, 7, 8, 9) account for less than 1% of the total instructions.
 
 The distribution of scores within the dataset is as follows:
 | Score | Frequency | Relative Freq. |
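The cleanup this paragraph implies, dropping the anomalous scores (0, 7, 8, 9) that account for under 1% of rows, can be sketched as below. This is a minimal illustration on toy data; the field name `score` and the list-of-dicts layout are assumptions for the sketch, not the dataset's actual schema.

```python
# Sketch: keep only instruction-score pairs whose score falls inside the
# DEITA-Evol-Complexity range [1, 6]. The "score" key is an assumption
# for illustration; check the real dataset schema before using it.
def filter_anomalous(pairs, lo=1, hi=6):
    """Return only the pairs whose score lies within [lo, hi]."""
    return [p for p in pairs if lo <= p["score"] <= hi]

# Toy rows mimicking the reported anomalies (scores 0 and 9 are invalid).
data = [
    {"instruction": "Define entropy.", "score": 2},
    {"instruction": "Prove Fermat's little theorem.", "score": 6},
    {"instruction": "Garbled row", "score": 0},
    {"instruction": "Another garbled row", "score": 9},
]

clean = filter_anomalous(data)
print(len(clean))  # → 2 (the two in-range rows survive)
```

The same predicate could be passed to `datasets.Dataset.filter` when working with the Hub copy of the dataset.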
 
 ### CO2 Emissions
 
+Experiments were conducted using Google Cloud Platform in region asia-south1, which has a carbon efficiency of 0.92 kgCO2eq/kWh. A cumulative 13.24 hours of computation was performed on hardware of type L4 (TDP of 72W).\
 
 Total emissions are estimated to be 0.87 kgCO2eq, of which 100% was directly offset by the cloud provider.
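The emissions figure follows directly from the stated inputs: energy = hours × TDP, emissions = energy × carbon efficiency. A quick arithmetic check (assuming, as the estimate implies, that the GPU ran at its full 72 W TDP throughout):

```python
# Reproduce the README's CO2 estimate from its stated inputs.
hours = 13.24        # cumulative compute time
tdp_kw = 72 / 1000   # L4 TDP of 72 W, converted to kW
carbon_eff = 0.92    # kgCO2eq per kWh for GCP asia-south1

energy_kwh = hours * tdp_kw          # ≈ 0.953 kWh
emissions = energy_kwh * carbon_eff  # ≈ 0.877 kgCO2eq
print(round(emissions, 2))
```

This yields ≈0.88 kgCO2eq, matching the README's 0.87 figure up to rounding.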
 - [[2312.15685] What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning](https://arxiv.org/abs/2312.15685)
 - [[2404.02948] PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models](https://arxiv.org/abs/2404.02948)
 - [DEITA-Complexity](https://huggingface.co/datasets/BhabhaAI/DEITA-Complexity)
+- [ModernBERT-Large](https://huggingface.co/answerdotai/ModernBERT-large)
+- [Github](https://github.com/thethinkmachine/Maxwell-Task-Complexity-Scorer)