Update README.md

In order to train the model, the following datasets were used (see the loading example below):

- ABSC amazon review
- SST2
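
For reference, a minimal sketch of loading one of these datasets (SST-2) with the Hugging Face `datasets` library; the exact SST-2 release used for training is not specified in this README, so the `stanfordnlp/sst2` id below is an assumption.

```python
# Hypothetical sketch: loading SST-2 with the Hugging Face datasets library.
# The dataset id is an assumption; the README does not name the exact release.
from datasets import load_dataset

sst2 = load_dataset("stanfordnlp/sst2")  # splits: train / validation / test

example = sst2["train"][0]
print(example["sentence"])  # raw review sentence
print(example["label"])     # 0 = negative, 1 = positive
```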

# Evaluation and comparison with the Vanilla and GPT-4o models

| Dataset          | Model   | F1     | Accuracy |
|------------------|---------|--------|----------|
| **sst2**         | Vanilla | 0.0000 | 0.0000   |
|                  | Our     | 0.6161 | 0.9231   |
|                  | GPT-4o  | 0.6113 | 0.8605   |
| **sent-eng**     | Vanilla | 0.2453 | 0.5820   |
|                  | Our     | 0.6289 | 0.6470   |
|                  | GPT-4o  | 0.4611 | 0.5870   |
| **sent-twi**     | Vanilla | 0.0889 | 0.1538   |
|                  | Our     | 0.3368 | 0.3488   |
|                  | GPT-4o  | 0.5049 | 0.5385   |
| **mixed**        | Vanilla | 0.0000 | 0.0000   |
|                  | Our     | 0.5644 | 0.7786   |
|                  | GPT-4o  | 0.5336 | 0.6863   |
| **absc-laptop**  | Vanilla | 0.1475 | 0.2842   |
|                  | Our     | 0.5513 | 0.6682   |
|                  | GPT-4o  | 0.6679 | 0.7642   |
| **absc-rest**    | Vanilla | 0.1045 | 0.1858   |
|                  | Our     | 0.6149 | 0.7726   |
|                  | GPT-4o  | 0.7057 | 0.8385   |
| **stanford**     | Vanilla | 0.1455 | 0.2791   |
|                  | Our     | 0.8352 | 0.8353   |
|                  | GPT-4o  | 0.8045 | 0.8032   |
| **amazon-var**   | Vanilla | 0.0000 | 0.0000   |
|                  | Our     | 0.6432 | 0.9647   |
|                  | GPT-4o  | -----  | 0.9450   |

F1 scores are computed with macro averaging.
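
For clarity, a minimal sketch of how the reported metrics can be computed with scikit-learn; the labels below are placeholders, not actual model output.

```python
# Minimal sketch: the metrics reported above, computed with scikit-learn.
# y_true / y_pred are placeholder labels, not real predictions.
from sklearn.metrics import accuracy_score, f1_score

y_true = ["positive", "negative", "neutral", "positive", "negative"]
y_pred = ["positive", "negative", "positive", "positive", "neutral"]

# average="macro" scores each class separately and takes the unweighted mean,
# so rare classes weigh as much as frequent ones.
print("F1 (macro):", f1_score(y_true, y_pred, average="macro"))
print("Accuracy:", accuracy_score(y_true, y_pred))
```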

# Source code

[GitHub Repository](https://github.com/alexdrk14/mDeBERTa-v3-multi-sent)
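
A minimal usage sketch, assuming the repository ships a standard Hugging Face checkpoint; the checkpoint path below is a placeholder, and the label mapping comes from the checkpoint's own config.

```python
# Hypothetical inference sketch for the fine-tuned mDeBERTa-v3 classifier.
# "path/to/checkpoint" is a placeholder; see the repository for real weights.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "path/to/checkpoint"  # placeholder, not a published model id
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

inputs = tokenizer("The battery life is great!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(dim=-1).item()
print(model.config.id2label[pred])  # mapping defined in the checkpoint config
```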