Update README.md
README.md
CHANGED
@@ -6,16 +6,8 @@ library_name: transformers
 
 
 
-
-
-
-Linear models offer a promising approach to significantly reducing computational costs at scale, particularly for large context lengths. This enables a >1000x improvement in inference costs, allowing o1-style inference-time thinking and wider AI accessibility.
-
-As demonstrated with our Qwerky-72B-Preview and prior models such as QRWKV6-32B Instruct Preview, we have successfully converted Qwen 2.5 QwQ 32B into an RWKV variant without requiring a pretrain on the base model or retraining the model from scratch. This allows us to test and validate the more efficient RWKV linear attention with a much smaller budget. Since our preview, we have continued to refine our technique and have improved the model over the preview iteration.
-
-As with our previous models, the model's inherent knowledge and training data are inherited from its "parent" model. Consequently, unlike previous RWKV models trained on 100+ languages, the QRWKV model is limited to the approximately 30 languages supported by the Qwen line of models.
-
-You may find details of the process in our previous release, [here](https://huggingface.co/recursal/QRWKV6-32B-Instruct-Preview-v0.1).
+- Try out the model on [featherless.ai](https://featherless.ai/models/featherless-ai/Qwerky-QwQ-32B)
+- Model details can be found on [our blog post here](https://substack.recursal.ai/p/qwerky-72b-and-32b-training-large)!
 
 Benchmarks are as follows for both the Qwerky-QwQ-32B and Qwerky-72B models:
 
@@ -80,4 +72,14 @@ generated_ids = [
 ]
 
 response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
-```
+```
+
+## Model notes
+
+Linear models offer a promising approach to significantly reducing computational costs at scale, particularly for large context lengths. This enables a >1000x improvement in inference costs, allowing o1-style inference-time thinking and wider AI accessibility.
+
+As demonstrated with our Qwerky-72B-Preview and prior models such as QRWKV6-32B Instruct Preview, we have successfully converted Qwen 2.5 QwQ 32B into an RWKV variant without requiring a pretrain on the base model or retraining the model from scratch. This allows us to test and validate the more efficient RWKV linear attention with a much smaller budget. Since our preview, we have continued to refine our technique and have improved the model over the preview iteration.
+
+As with our previous models, the model's inherent knowledge and training data are inherited from its "parent" model. Consequently, unlike previous RWKV models trained on 100+ languages, the QRWKV model is limited to the approximately 30 languages supported by the Qwen line of models.
+
+You may find details of the process in our previous release, [here](https://huggingface.co/recursal/QRWKV6-32B-Instruct-Preview-v0.1).
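
A rough way to see where a ">1000x" figure can come from at long context: per generated token, standard softmax attention attends over the entire KV cache, so its work grows with the context length, while an RWKV-style linear-attention layer updates a fixed-size state, so its per-token work stays roughly constant. The sketch below is a back-of-the-envelope illustration of that ratio, not a measurement from this model; the head count and head dimension are illustrative assumptions, not Qwerky's actual configuration.

```python
# Back-of-the-envelope comparison of per-token decode cost (illustrative only).
# The shapes below are assumptions for the sketch, not the real model config;
# only the scaling behaviour with context length is the point.

def softmax_attention_cost(context_len: int, head_dim: int = 128, n_heads: int = 64) -> float:
    """Softmax attention: each new token attends over the whole KV cache (~O(T) work)."""
    return context_len * head_dim * n_heads


def linear_attention_cost(head_dim: int = 128, n_heads: int = 64) -> float:
    """Linear attention: update a fixed head_dim x head_dim state per head (~O(1) work)."""
    return head_dim * head_dim * n_heads


if __name__ == "__main__":
    for context_len in (4_096, 131_072, 1_000_000):
        ratio = softmax_attention_cost(context_len) / linear_attention_cost()
        print(f"context={context_len:>9,}  softmax / linear cost ratio ~ {ratio:,.0f}x")
```

With these assumed shapes the ratio is simply `context_len / head_dim`, so it passes 1000x around a 128k-token context; the real-world gap additionally depends on KV-cache memory traffic, kernel quality, and the rest of the network, which this sketch ignores.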