---
license: mit
datasets:
- tyqiangz/multilingual-sentiments
- cardiffnlp/tweet_sentiment_multilingual
- mteb/tweet_sentiment_multilingual
- Sp1786/multiclass-sentiment-analysis-dataset
- stanfordnlp/sst2
language:
- en
- de
- es
- fr
- ja
- zh
- id
- ar
- hi
- it
- ms
- pt
metrics:
- accuracy
- f1
base_model:
- microsoft/deberta-v3-base
tags:
- sentiment
---

# Model 

Multilingual sentiment classification model built on the Microsoft [DeBERTa-v3 base model](https://huggingface.co/microsoft/deberta-v3-base).
The model was trained on multiple datasets covering multiple languages, with additional class weights applied to the sentiment categories (Negative, Positive, Neutral).
The following datasets were used for training:
 - tyqiangz/multilingual-sentiments
 - cardiffnlp/tweet_sentiment_multilingual
 - mteb/tweet_sentiment_multilingual
 - Sp1786/multiclass-sentiment-analysis-dataset
 - ABSC amazon review
 - SST2
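
The class weighting mentioned above can be illustrated with a small sketch. This card does not state how the weights were computed, so the inverse-frequency scheme, the function names, and the toy label counts below are all assumptions made for illustration, not the training code from the repository.

```python
import math
from collections import Counter

# Sentiment categories named in this card.
LABELS = ["Negative", "Neutral", "Positive"]


def inverse_frequency_weights(labels):
    """Assumed scheme: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(LABELS)
    # Classes absent from the data get no weight entry.
    return {
        name: n_samples / (n_classes * counts[name])
        for name in LABELS
        if counts[name]
    }


def weighted_cross_entropy(logits, target, weights):
    """Cross-entropy for one example, scaled by the weight of its true class.

    `logits` is a list of raw scores in LABELS order.
    """
    # Numerically stable log-sum-exp.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    log_prob = logits[LABELS.index(target)] - log_sum
    return -weights[target] * log_prob


# Hypothetical, imbalanced toy label distribution.
train_labels = ["Positive"] * 6 + ["Negative"] * 3 + ["Neutral"] * 1
class_weights = inverse_frequency_weights(train_labels)

# A mistake on the rare "Neutral" class is penalized more heavily.
loss = weighted_cross_entropy([0.2, 1.5, 0.3], "Neutral", class_weights)
```

With weights of this kind, errors on under-represented sentiment classes contribute more to the loss than errors on frequent ones.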

# Evaluation and comparison with the GPT-4o model

| Dataset          | Model  | F1     | Accuracy |
|------------------|--------|--------|----------|
| **sst2**         | Our    | 0.6161 | 0.9231   |
|                  | GPT-4  | 0.6113 | 0.8605   |
| **sent-eng**     | Our    | 0.6289 | 0.6470   |
|                  | GPT-4  | 0.4611 | 0.5870   |
| **sent-twi**     | Our    | 0.3368 | 0.3488   |
|                  | GPT-4  | 0.5049 | 0.5385   |
| **mixed**        | Our    | 0.5644 | 0.7786   |
|                  | GPT-4  | 0.5336 | 0.6863   |
| **absc-laptop**  | Our    | 0.5513 | 0.6682   |
|                  | GPT-4  | 0.6679 | 0.7642   |
| **absc-rest**    | Our    | 0.6149 | 0.7726   |
|                  | GPT-4  | 0.7057 | 0.8385   |
| **stanford**     | Our    | 0.8352 | 0.8353   |
|                  | GPT-4  | 0.8045 | 0.8032   |
| **amazon-var**   | Our    | 0.6432 | 0.9647   |
|                  | GPT-4  | 0.0000 | 0.9450   |
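
Several rows in the table pair a high accuracy with a much lower F1. The sketch below shows how that can happen on imbalanced data. The card does not say which F1 averaging was used, so macro averaging, the helper names, and the toy predictions here are assumptions for illustration only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        # No true positives: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical imbalanced test set: a classifier that always says "Positive".
y_true = ["Positive"] * 18 + ["Negative"] * 2
y_pred = ["Positive"] * 20

acc = accuracy(y_true, y_pred)
f1_negative = f1_for_class(y_true, y_pred, "Negative")
f1_macro = macro_f1(y_true, y_pred, ["Negative", "Positive"])
```

In this toy case the accuracy is high because the majority class dominates, while the F1 for the minority class is zero and pulls the macro average well below the accuracy.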

# Source code
[Repo](https://github.com/alexdrk14/DeBerta-v3-base-Sent)