tags:
- reasoning
- llm
- DIRA
- qwen
- unsloth
- transformers
---

# Diraya-3B-Instruct-Ar

## Model Description

**Diraya-3B-Instruct-Ar** is an Arabic reasoning-specialized language model fine-tuned from Qwen2.5-3B.

This model is part of the **DIRA (Diraya Arabic Reasoning AI)** collection, which focuses on enhancing the logical inference and mathematical reasoning capabilities of Arabic language models.

## Key Features

**Model Type**: Instruction-tuned causal language model

**Parameter Count**: 3.09B (2.77B non-embedding)

**Architecture**:
- 36 transformer layers
- Context length: 32,768 tokens
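
As a quick check of the parameter figures above, the counts can be read off the loaded model with transformers' built-in helper. A minimal sketch; exact totals can vary slightly with the checkpoint revision:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Omartificial-Intelligence-Space/Diraya-3B-Instruct-Ar"
)

# Total vs. non-embedding parameter counts, expected near 3.09B and 2.77B.
print(f"total:         {model.num_parameters():,}")
print(f"non-embedding: {model.num_parameters(exclude_embeddings=True):,}")
```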

**Training Approach**:
- Fine-tuned using GRPO (Group Relative Policy Optimization)
- Training focused on structured reasoning output format using XML tags
- Optimized for mathematical reasoning using the Arabic GSM8K dataset
- Multiple reward functions including correctness, format adherence, and output structure (a sketch follows this list)
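
To make the reward design concrete, here is a minimal sketch of a format reward and a correctness reward. These are illustrative rather than the exact functions used in training, and the `<reasoning>`/`<answer>` tag names are an assumption:

```python
import re

# Assumed tag pair; the card says "XML tags" but the exact names are not shown here.
FORMAT_PATTERN = r"<reasoning>.*?</reasoning>\s*<answer>(.*?)</answer>"

def format_reward(completion: str) -> float:
    """Score adherence to the expected XML output structure."""
    return 1.0 if re.search(FORMAT_PATTERN, completion, re.DOTALL) else 0.0

def correctness_reward(completion: str, gold_answer: str) -> float:
    """Score an exact match between the extracted answer and the reference."""
    match = re.search(FORMAT_PATTERN, completion, re.DOTALL)
    if match is None:
        return 0.0
    return 2.0 if match.group(1).strip() == gold_answer.strip() else 0.0
```

In GRPO, rewards like these are summed per sampled completion, and the group-normalized scores drive the policy update.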

The model is designed to output structured reasoning in the following XML-tagged format.
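
The template itself is not shown in this excerpt; a plausible shape, assuming the reasoning/answer tag pair used above:

```
<reasoning>
(step-by-step solution in Arabic)
</reasoning>
<answer>
(final numeric answer)
</answer>
```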

### Example Usage

```python
from unsloth import FastLanguageModel

max_seq_length = 2048  # not defined in the original snippet; assumed value (the model supports up to 32,768)
lora_rank = 64         # not defined in the original snippet; assumed LoRA rank

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Omartificial-Intelligence-Space/Diraya-3B-Instruct-Ar",
    max_seq_length = max_seq_length,
    load_in_4bit = True,        # False for LoRA 16bit
    fast_inference = True,      # Enable vLLM fast inference
    max_lora_rank = lora_rank,
)

# System prompt to enforce XML structure
system_prompt = """
...
"""

# ...

print(response)
```
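
The snippet above omits the prompt body and the generation call. As a self-contained starting point, here is a sketch of the same flow using plain `transformers`; the prompt wording, tag names, and sample question are illustrative assumptions rather than the card's exact text:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Omartificial-Intelligence-Space/Diraya-3B-Instruct-Ar"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Illustrative system prompt; the card only specifies that output must be XML-tagged.
system_prompt = """Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>"""

# A sample GSM8K-style question in Arabic ("Sarah has 3 pens and bought 5 more...").
question = "لدى سارة 3 أقلام واشترت 5 أقلام أخرى، فكم قلماً لديها الآن؟"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": question},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```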

This model was primarily fine-tuned on:

- [**Arabic GSM8K Dataset**](https://huggingface.co/datasets/Omartificial-Intelligence-Space/Arabic-gsm8k): a comprehensive collection of grade school math problems translated to Arabic, requiring multi-step reasoning
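
To inspect the training data directly, the dataset loads straight from the Hub; the split name below is an assumption, so check the dataset card:

```python
from datasets import load_dataset

# "train" is an assumed split name; see the dataset card for the exact configs.
arabic_gsm8k = load_dataset("Omartificial-Intelligence-Space/Arabic-gsm8k", split="train")
print(arabic_gsm8k[0])  # one grade-school problem with its reference solution
```
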
## Training and Evaluation Results

The model demonstrates strong performance on Arabic mathematical reasoning tasks:
- Following the required XML output format
- Arriving at correct numerical answers for multi-step problems

## Limitations

- Specialized for reasoning tasks and may not perform as well on general conversational tasks
- Performance may vary on complex mathematical problems beyond grade-school level
- Limited to the Arabic language

## Responsible Use

This model is intended for educational and research purposes. While it excels at mathematical reasoning, please note:
- It should not replace human judgment for critical decisions
- Results should be verified when used in educational contexts
- The model inherits limitations from its base model, Qwen2.5-3B
## Citation

This model builds upon the Qwen2.5-3B model by the Qwen Team and utilizes optimization techniques from Unsloth.

```
...
  journal={arXiv preprint arXiv:2407.10671},
  year={2024}
}
```