---
license: mit
datasets:
- tyqiangz/multilingual-sentiments
- cardiffnlp/tweet_sentiment_multilingual
- mteb/tweet_sentiment_multilingual
- Sp1786/multiclass-sentiment-analysis-dataset
- stanfordnlp/sst2
language:
- en
- de
- es
- fr
- ja
- zh
- id
- ar
- hi
- it
- ms
- pt
metrics:
- accuracy
- f1
base_model:
- microsoft/deberta-v3-base
tags:
- sentiment
---
# Model
Multilingual sentiment classification model built on the Microsoft [DeBERTa-v3 base model](https://huggingface.co/microsoft/deberta-v3-base).
The model was trained on multiple datasets covering multiple languages, with additional class weights applied to the sentiment categories (Negative, Positive, Neutral).
The following datasets were used for training:
- tyqiangz/multilingual-sentiments
- cardiffnlp/tweet_sentiment_multilingual
- mteb/tweet_sentiment_multilingual
- Sp1786/multiclass-sentiment-analysis-dataset
- ABSC Amazon review
- SST2
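
The card does not say how the class weights were derived. As a minimal sketch, assuming the common inverse-frequency ("balanced") heuristic, weights for the three sentiment categories could be computed as follows. The function name and the label counts are hypothetical and are not taken from the training code.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    Uses the "balanced" heuristic: n_samples / (n_classes * class_count).
    This is an assumed scheme, not necessarily the one used for this model.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical, deliberately imbalanced label distribution.
labels = ["Negative"] * 2 + ["Neutral"] * 2 + ["Positive"] * 4
weights = inverse_frequency_weights(labels)

for cls, w in sorted(weights.items()):
    print(f"{cls}: {w:.4f}")
```

Rarer classes receive larger weights, so errors on them contribute more to the loss. In PyTorch, such weights are typically passed as the `weight` tensor of `torch.nn.CrossEntropyLoss`, ordered by class index.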
# Evaluation and comparison with the GPT-4o model
| Dataset | Model | F1 | Accuracy |
|------------------|--------|--------|----------|
| **sst2** | Our | 0.6161 | 0.9231 |
| | GPT-4 | 0.6113 | 0.8605 |
| **sent-eng** | Our | 0.6289 | 0.6470 |
| | GPT-4 | 0.4611 | 0.5870 |
| **sent-twi** | Our | 0.3368 | 0.3488 |
| | GPT-4 | 0.5049 | 0.5385 |
| **mixed** | Our | 0.5644 | 0.7786 |
| | GPT-4 | 0.5336 | 0.6863 |
| **absc-laptop** | Our | 0.5513 | 0.6682 |
| | GPT-4 | 0.6679 | 0.7642 |
| **absc-rest** | Our | 0.6149 | 0.7726 |
| | GPT-4 | 0.7057 | 0.8385 |
| **stanford** | Our | 0.8352 | 0.8353 |
| | GPT-4 | 0.8045 | 0.8032 |
| **amazon-var** | Our | 0.6432 | 0.9647 |
| | GPT-4 | 0.0000 | 0.9450 |
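
Several rows report accuracy well above F1 (for example, sst2 at 0.9231 accuracy and 0.6161 F1). The card does not state how F1 is averaged. One possible explanation, sketched below under the assumption of macro-averaged F1 over all three sentiment classes, is that a class that is rare or absent in a dataset pulls the average down. The labels in the example are hypothetical and are not drawn from the evaluation data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, cls):
    """F1 for a single class; defined as 0.0 when the class never occurs."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


CLASSES = ["Negative", "Neutral", "Positive"]

# Hypothetical data: mostly Positive, one Negative, no Neutral at all.
y_true = ["Positive"] * 9 + ["Negative"]
y_pred = ["Positive"] * 10

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, CLASSES)
print(f"accuracy={acc:.4f} macro_f1={f1:.4f}")
```

Here nine of ten predictions are correct, yet two of the three classes score an F1 of zero, so the macro average is far below the accuracy.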
# Source code
[Repo](https://github.com/alexdrk14/DeBerta-v3-base-Sent)