metadata
license: mit
datasets:
  - tyqiangz/multilingual-sentiments
  - cardiffnlp/tweet_sentiment_multilingual
  - mteb/tweet_sentiment_multilingual
  - Sp1786/multiclass-sentiment-analysis-dataset
  - stanfordnlp/sst2
language:
  - en
  - de
  - es
  - fr
  - ja
  - zh
  - id
  - ar
  - hi
  - it
  - ms
  - pt
metrics:
  - accuracy
  - f1
base_model:
  - microsoft/deberta-v3-base
tags:
  - sentiment

Model

Multilingual sentiment classification model built on the Microsoft DeBERTa-v3 base model. The model was trained on multiple datasets covering multiple languages, with additional class weights applied to the sentiment categories (Negative, Positive, Neutral). The following datasets were used for training:

  • tyqiangz/multilingual-sentiments
  • cardiffnlp/tweet_sentiment_multilingual
  • mteb/tweet_sentiment_multilingual
  • Sp1786/multiclass-sentiment-analysis-dataset
  • ABSC Amazon review
  • SST2
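
The card mentions class weights but does not say how they were computed or applied. As a minimal sketch, assuming inverse-frequency weighting and a weighted cross-entropy loss (both assumptions, not confirmed by the source), the idea can be illustrated in plain Python. The function names and the toy label distribution below are hypothetical.

```python
import math
from collections import Counter

LABELS = ["Negative", "Neutral", "Positive"]

def inverse_frequency_weights(labels):
    """Assumed scheme: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / (len(LABELS) * counts[c]) for c in LABELS if counts[c] > 0}

def weighted_cross_entropy(probs, gold, weights):
    """Weighted mean of -log p(gold label).

    probs: list of dicts mapping label -> predicted probability
    gold:  list of gold labels, same length as probs
    """
    total = sum(weights[g] * -math.log(p[g]) for p, g in zip(probs, gold))
    norm = sum(weights[g] for g in gold)
    return total / norm

# Toy, made-up label distribution: Positive is frequent, Neutral is rare.
train_labels = ["Positive"] * 6 + ["Negative"] * 3 + ["Neutral"] * 1
weights = inverse_frequency_weights(train_labels)

# A model that predicts a uniform distribution for every example.
uniform = [{label: 1 / 3 for label in LABELS}] * len(train_labels)
loss = weighted_cross_entropy(uniform, train_labels, weights)
```

With this scheme the rare Neutral class gets the largest weight, so mistakes on it contribute more to the loss than mistakes on the frequent Positive class.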

Evaluation and comparison with GPT-4o model:

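| Dataset     | Model | F1     | Accuracy |
|-------------|-------|--------|----------|
| sst2        | Our   | 0.6161 | 0.9231   |
| sst2        | GPT-4 | 0.6113 | 0.8605   |
| sent-eng    | Our   | 0.6289 | 0.6470   |
| sent-eng    | GPT-4 | 0.4611 | 0.5870   |
| sent-twi    | Our   | 0.3368 | 0.3488   |
| sent-twi    | GPT-4 | 0.5049 | 0.5385   |
| mixed       | Our   | 0.5644 | 0.7786   |
| mixed       | GPT-4 | 0.5336 | 0.6863   |
| absc-laptop | Our   | 0.5513 | 0.6682   |
| absc-laptop | GPT-4 | 0.6679 | 0.7642   |
| absc-rest   | Our   | 0.6149 | 0.7726   |
| absc-rest   | GPT-4 | 0.7057 | 0.8385   |
| stanford    | Our   | 0.8352 | 0.8353   |
| stanford    | GPT-4 | 0.8045 | 0.8032   |
| amazon-var  | Our   | 0.6432 | 0.9647   |
| amazon-var  | GPT-4 | 0.0000 | 0.9450   |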
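
F1 and accuracy can diverge sharply on imbalanced data. The card does not state how F1 was averaged, so the following is only a sketch of per-class and macro-averaged F1 next to accuracy, on made-up labels, to show how the two metrics are computed and why they can disagree. The helper names are hypothetical.

```python
def accuracy(gold, pred):
    """Fraction of predictions that match the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def f1_per_class(gold, pred, label):
    """F1 for one class; defined as 0.0 when there are no true positives."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(gold, pred, labels):
    """Unweighted mean of per-class F1 (one possible averaging choice)."""
    return sum(f1_per_class(gold, pred, l) for l in labels) / len(labels)

# Made-up, heavily imbalanced example: the classifier always says "Negative".
gold = ["Negative"] * 95 + ["Positive"] * 5
pred = ["Negative"] * 100

acc = accuracy(gold, pred)                      # 0.95
f1_pos = f1_per_class(gold, pred, "Positive")   # 0.0
f1_macro = macro_f1(gold, pred, ["Negative", "Positive"])
```

Here accuracy is high because the majority class dominates, while the F1 of the minority class is zero, so both columns of the table are worth reading together.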

Source code

Repo