Sakinah-AI: Optimized CAMeL-BERT for Arabic Mental Health Question Classification

Sakinah-AI Project Banner

This repository contains the official fine-tuned model Sakinah-AI-CAMEL-BERT-Optimized, our submission to the MentalQA 2025 Shared Task (Track 1).

By: Fatimah Emad Elden & Mumina Abukar

Cairo University & The University of South Wales



Model Description

This model is a fine-tuned version of CAMeL-Lab/bert-base-arabic-camelbert-mix-sentiment for multi-label classification of Arabic questions related to mental health. It was trained on the AraHealthQA dataset.

Our approach involved a comprehensive hyperparameter search with the Optuna framework to find the optimal configuration. To address the inherent class imbalance in the dataset, the model was trained with a custom Focal Loss function. This strategy proved highly effective: the resulting model is our best-performing fine-tuned system, achieving a Weighted F1-score of 0.597 on the official blind test set.
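
The training code is not reproduced in this card, but a minimal sketch of a multi-label focal loss in the spirit of the description above might look as follows. The class name and standalone usage are illustrative rather than the official implementation; the alpha and gamma arguments correspond to the focal_alpha and focal_gamma hyperparameters reported below.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiLabelFocalLoss(nn.Module):
    # Illustrative focal loss built on per-label binary cross-entropy
    def __init__(self, alpha=1.232, gamma=2.624):
        super().__init__()
        self.alpha = alpha
        self.gamma = gamma

    def forward(self, logits, targets):
        # Un-reduced BCE so each (example, label) term can be re-weighted
        bce = F.binary_cross_entropy_with_logits(logits, targets.float(), reduction="none")
        # p_t: the model's probability for the correct decision on each label
        p_t = torch.exp(-bce)
        # Down-weight easy terms by (1 - p_t)^gamma and scale by alpha
        return (self.alpha * (1.0 - p_t) ** self.gamma * bce).mean()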

The model predicts one or more of the following labels for a given question:

  • A: Diagnosis (Interpreting symptoms)
  • B: Treatment (Seeking therapies or medications)
  • C: Anatomy and Physiology (Basic medical knowledge)
  • D: Epidemiology (Course, prognosis, causes of diseases)
  • E: Healthy Lifestyle (Diet, exercise, mood control)
  • F: Provider Choices (Recommendations for doctors)
  • Z: Other (Does not fit other categories)

πŸš€ How to Use

You can use this model directly with the transformers library pipeline for text-classification.

from transformers import pipeline

# Load the classification pipeline
classifier = pipeline(
    "text-classification",
    model="FatimahEmadEldin/Sakinah-AI-CAMEL-BERT-Optimized",
    return_all_scores=True  # Return a score for every label (in newer transformers, top_k=None is the equivalent)
)

# Example question in Arabic
question = "Ω…Ψ§ Ω‡ΩŠ Ψ£ΨΉΨ±Ψ§ΨΆ Ψ§Ω„Ψ§ΩƒΨͺΨ¦Ψ§Ψ¨ ΩˆΩƒΩŠΩ ΩŠΩ…ΩƒΩ† ΨΉΩ„Ψ§Ψ¬Ω‡ΨŸ"
# (Translation: "What are the symptoms of depression and how can it be treated?")

results = classifier(question)

# --- Post-processing to get final labels ---
# Apply the tuned base_threshold (0.2462, see Hyperparameters below)
threshold = 0.2462
predicted_labels = [item['label'] for item in results[0] if item['score'] > threshold]

print(f"Question: {question}")
# For this example the predicted labels would likely include Diagnosis and Treatment
print(f"Predicted Labels: {predicted_labels}")
# e.g. ['A', 'B']
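
If you prefer explicit control over the multi-label decision, you can also load the model directly and apply a sigmoid to the logits before thresholding. The sketch below assumes the checkpoint's config carries an id2label mapping with the A–Z category codes; if it only contains generic LABEL_i names, map them to the categories yourself.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "FatimahEmadEldin/Sakinah-AI-CAMEL-BERT-Optimized"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

question = "Ω…Ψ§ Ω‡ΩŠ Ψ£ΨΉΨ±Ψ§ΨΆ Ψ§Ω„Ψ§ΩƒΨͺΨ¦Ψ§Ψ¨ ΩˆΩƒΩŠΩ ΩŠΩ…ΩƒΩ† ΨΉΩ„Ψ§Ψ¬Ω‡ΨŸ"
inputs = tokenizer(question, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Independent per-label probabilities (multi-label, so sigmoid rather than softmax)
probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.2462]
print(predicted)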

βš™οΈ Training Procedure

This model was fine-tuned using a rigorous hyperparameter optimization process.

Hyperparameters

The best hyperparameters found by Optuna and used for this model are:

Hyperparameter      Value
learning_rate       6.416e-05
num_train_epochs    14
weight_decay        0.0480
focal_alpha         1.2320
focal_gamma         2.6240
base_threshold      0.2462
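
The search script itself is not included in this card. As a rough illustration of how such a study can be wired up with Optuna, the sketch below samples the same hyperparameters; the search ranges, the train_and_evaluate placeholder, and the number of trials are assumptions, not the official setup.

import optuna

def train_and_evaluate(params):
    # Placeholder: fine-tune the model with `params` (focal loss, thresholding, etc.)
    # and return the weighted F1-score on the validation split.
    raise NotImplementedError

def objective(trial):
    # Sample the same hyperparameters reported in the table above
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 1e-4, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 3, 15),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.1),
        "focal_alpha": trial.suggest_float("focal_alpha", 0.5, 2.0),
        "focal_gamma": trial.suggest_float("focal_gamma", 1.0, 3.0),
        "base_threshold": trial.suggest_float("base_threshold", 0.1, 0.5),
    }
    return train_and_evaluate(params)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)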

Frameworks

  • PyTorch
  • Hugging Face Transformers
  • Optuna

πŸ“Š Evaluation Results

The model was evaluated on the blind test set provided by the MentalQA organizers.

Final Test Set Scores

Metric              Score
Weighted F1-Score   0.5972
Jaccard Score       0.4502

Per-Label Performance (Test Set)

              precision    recall  f1-score   support
           A       0.61      0.96      0.75        84
           B       0.56      0.98      0.72        85
           C       0.14      0.40      0.21        10
           D       0.25      0.91      0.39        34
           E       0.34      0.53      0.41        38
           F       0.07      0.33      0.11         6
           Z       0.00      0.00      0.00         3

   micro avg       0.42      0.85      0.57       260
   macro avg       0.28      0.59      0.37       260
weighted avg       0.47      0.85      0.60       260
 samples avg       0.44      0.88      0.56       260
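
The scores and the per-label report above correspond to standard scikit-learn multi-label metrics. A minimal sketch of how they can be reproduced from binarized gold and predicted label matrices follows; the toy arrays and the samples-averaged Jaccard are assumptions for illustration, not the official evaluation script.

import numpy as np
from sklearn.metrics import f1_score, jaccard_score, classification_report

labels = ["A", "B", "C", "D", "E", "F", "Z"]
# Toy placeholders; in practice these are 0/1 matrices of shape [num_questions, 7]
y_true = np.array([[1, 1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 1, 0, 0]])
y_pred = np.array([[1, 1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0]])

print("Weighted F1:", f1_score(y_true, y_pred, average="weighted", zero_division=0))
print("Jaccard (samples avg):", jaccard_score(y_true, y_pred, average="samples"))
print(classification_report(y_true, y_pred, target_names=labels, zero_division=0))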

πŸ“œ Citation

If you use our work, please cite our paper:

@inproceedings{elden2025sakinahai,
    title={{Sakinah-AI at MentalQA: A Comparative Study of Few-Shot, Optimized, and Ensemble Methods for Arabic Mental Health Question Classification}},
    author={Elden, Fatimah Emad and Abukar, Mumina},
    year={2025},
    booktitle={Proceedings of the MentalQA 2025 Shared Task},
    eprint={25XX.XXXXX},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}