# Model Details

This model is a fine-tuned version of the base model [meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct), trained with LoRA on the [train_qa_wo_students.csv](https://drive.google.com/file/d/1uv-kVP0z3E8u9-u8PWAKA9tkr3ENHeZv/view?usp=sharing) dataset, which combines materials from FEM courses taught by [Prof. Krishna Garikipati](https://viterbi.usc.edu/directory/faculty/Garikipati/Krishna).
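
As a quick sanity check, the training CSV can be inspected locally once downloaded from the link above. This is a minimal sketch: the local filename is assumed to match the linked file, and no particular column layout is presumed.

```python
import pandas as pd

# Assumes the CSV linked above has been downloaded to the working directory
# under its original name. The column layout is not documented here, so we
# print it rather than assuming column names.
df = pd.read_csv("train_qa_wo_students.csv")
print(df.shape)
print(df.columns.tolist())
```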

Compared with [TOMMI-0.35](https://huggingface.co/my-ai-university/TOMMI-0.35/), TOMMI-1.0 uses the optimal hyperparameters (without student-asked QA pairs) and increases the token length from 500 to 700.

## **Hyperparameters**

* learning_rate: 5e-5
* gradient_accumulation_steps: 2
* epoch: 5
* r (lora rank): 45
* lora_alpha: 65
* lora_dropout: 0.05
* target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
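
For reference, here is a minimal sketch of the LoRA settings above expressed as a Hugging Face `peft` `LoraConfig`. It mirrors the listed values but is a reconstruction, not necessarily the exact training script used for TOMMI-1.0; the `task_type` is an assumption.

```python
from peft import LoraConfig

# The listed LoRA hyperparameters expressed as a peft LoraConfig.
# Reconstructed from the list above, not taken from the original script.
lora_config = LoraConfig(
    r=45,                   # LoRA rank
    lora_alpha=65,          # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",  # assumption: causal language-modeling objective
)
```

Targeting all attention and MLP projection matrices is the common full-coverage LoRA setup for Llama-style models.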
## **Usage**
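
A minimal inference sketch with `transformers` and `peft`, assuming the adapter is published under the repo id `my-ai-university/TOMMI-1.0` (inferred from the TOMMI-0.35 link above); the example question is illustrative only.

```python
import torch
from transformers import MllamaForConditionalGeneration, AutoProcessor
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"
adapter_id = "my-ai-university/TOMMI-1.0"  # assumed repo id for this model

# Load the base vision-instruct model, then attach the LoRA adapter.
model = MllamaForConditionalGeneration.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
processor = AutoProcessor.from_pretrained(base_id)

# Text-only QA prompt; the model card targets question answering.
messages = [
    {"role": "user", "content": [
        {"type": "text", "text": "State the weak form of the 1D Poisson equation."}
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, return_tensors="pt").to(model.device)

# max_new_tokens matches the 700-token length noted above.
output = model.generate(**inputs, max_new_tokens=700)
print(processor.decode(output[0], skip_special_tokens=True))
```

For standalone deployment, the adapter can also be merged into the base weights with `model.merge_and_unload()`.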