Add comprehensive model card
README.md
CHANGED
@@ -1,458 +1,216 @@
---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
widget:
- sentences:
  - شخصان يصعدان على الدرج
  - الناس يجلسون
  - رجل يجلس ويستمع للمحادثات
- source_sentence: لاعب كرة قدم يرتدي زيًا أحمر وأسود يحمل الرقم 3 وخوذة سوداء يحمل الكرة ويحيط به لاعبون معارضون يرتدون زيًا أبيض وأرجواني بيكسفيل.
  sentences:
  - لاعب كرة قدم يحمل كرة
  - الرجل مستعد لالتقاط كرة القدم
  - الكلاب بالخارج
- source_sentence: بعثة لوس أنجلوس هي عيادة مجانية
  sentences:
  - إنها مساعدة ممرضة في بعثة لوس أنجلوس
  - تعمل كطبيبة رئيسة في "لوس أنجلوس ميسيون" عيادة مجانية في حي فقير
  - التوافق مطلوب من الأجهزة أو البرمجيات.
- source_sentence: رجل يرتدي قميصًا بنيًا مخططًا يقف يثني ذراعيه على قمة مبنى على سطح منزل.
  sentences:
  - رجل ينظر من نافذة المطبخ
  - شخص على السطح
  - لا يجوز إظهار أي مبلغ من الأصول في الميزانية العمومية للمهمة الفيدرالية
- source_sentence: الحيوانات الأليفة تلعب دور الجدار
  sentences:
  - كلبان يلعبان في منطقة محصورة من الحصى.
  - الكلاب تجري لالتقاط عصا عبر الشارع.
  - يمكن تطوير التكنولوجيا.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
model-index:
- name: SentenceTransformer based on aubmindlab/bert-base-arabertv02
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: arabic-nli-dev
      type: arabic-nli-dev
    metrics:
    - type: pearson_cosine
      value: 0.5891
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.5933
      name: Spearman Cosine
---

# SentenceTransformer based on aubmindlab/bert-base-arabertv02

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [aubmindlab/bert-base-arabertv02](https://huggingface.co/aubmindlab/bert-base-arabertv02). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [aubmindlab/bert-base-arabertv02](https://huggingface.co/aubmindlab/bert-base-arabertv02) <!-- at revision 016fb9d6768f522a59c6e0d2d5d5d43a4e1bff60 -->
- **Maximum Sequence Length:** 75 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 75, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("AhmedZaky1/arabic-bert-nli-matryoshka")
# Run inference; the original example sentences were lost, these are taken
# from the widget examples above
sentences = [
    "الحيوانات الأليفة تلعب دور الجدار",
    "كلبان يلعبان في منطقة محصورة من الحصى.",
    "الكلاب تجري لالتقاط عصا عبر الشارع.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Evaluation

### Metrics

#### Semantic Similarity

* Dataset: `arabic-nli-dev`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.5891     |
| **spearman_cosine** | **0.5933** |
## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 457,216 training samples
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>sentence_2</code>
* Approximate statistics based on the first 1000 samples:
  | | sentence_0 | sentence_1 | sentence_2 |
  |:--------|:-----------|:-----------|:-----------|
  | type | string | string | string |
  | details | <ul><li>min: 4 tokens</li><li>mean: 12.5 tokens</li><li>max: 66 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 12.33 tokens</li><li>max: 68 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 9.59 tokens</li><li>max: 33 tokens</li></ul> |
* Samples:
  | sentence_0 | sentence_1 | sentence_2 |
  |:-----------|:-----------|:-----------|
  | <code>يجلس طفل أحمر الشعر ينظر من خلال السور إلى الماء بينما يلعب الناس على الشاطئ في المسافة.</code> | <code>طفل أحمر الشعر مهتم بالماء والناس يلعبون على الشاطئ في المسافة.</code> | <code>فتى شقراء يراقب القارب مع الناس عليه يبحر بعيدا.</code> |
  | <code>عامل نظافة على وشك التنظيف في محطة القطار</code> | <code>البواب سيقوم بتنظيف محطة القطار</code> | <code>البواب يجلس في محطة القطار</code> |
  | <code>رجل يرتدي قميصاً أخضر وبنطال جينز ينحني فوق مرمى الهوكي الأحمر مع ثقب فوقه.</code> | <code>رجل يرتدي قميصاً أخضر.</code> | <code>امرأة ترتدي قميصاً أخضر.</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [768, 512, 256, 128, 64],
      "matryoshka_weights": [1, 1, 1, 1, 1],
      "n_dims_per_step": -1
  }
  ```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `fp16`: True
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `tp_size`: 0
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: round_robin

</details>

### Training Logs

| Epoch  | Step  | Training Loss | arabic-nli-dev_spearman_cosine |
|:------:|:-----:|:-------------:|:------------------------------:|
| 0.1400 | 500   | 10.0831       | -                              |
| 0.1999 | 714   | -             | 0.4417                         |
| 0.2800 | 1000  | 5.6335        | -                              |
| 0.3998 | 1428  | -             | 0.5157                         |
| 0.4199 | 1500  | 4.7627        | -                              |
| 0.5599 | 2000  | 4.3656        | -                              |
| 0.5997 | 2142  | -             | 0.5443                         |
| 0.6999 | 2500  | 4.085         | -                              |
| 0.7996 | 2856  | -             | 0.5569                         |
| 0.8399 | 3000  | 3.8314        | -                              |
| 0.9798 | 3500  | 3.5961        | -                              |
| 0.9994 | 3570  | -             | 0.5612                         |
| 1.0    | 3572  | -             | 0.5617                         |
| 1.1198 | 4000  | 3.2502        | -                              |
| 1.1993 | 4284  | -             | 0.5819                         |
| 1.2598 | 4500  | 3.1274        | -                              |
| 1.3992 | 4998  | -             | 0.5848                         |
| 1.3998 | 5000  | 3.0461        | -                              |
| 1.5398 | 5500  | 2.9606        | -                              |
| 1.5991 | 5712  | -             | 0.5930                         |
| 1.6797 | 6000  | 2.9263        | -                              |
| 1.7990 | 6426  | -             | 0.5906                         |
| 1.8197 | 6500  | 2.8313        | -                              |
| 1.9597 | 7000  | 2.7663        | -                              |
| 1.9989 | 7140  | -             | 0.5868                         |
| 2.0    | 7144  | -             | 0.5888                         |
| 2.0997 | 7500  | 2.4814        | -                              |
| 2.1988 | 7854  | -             | 0.5864                         |
| 2.2396 | 8000  | 2.3545        | -                              |
| 2.3796 | 8500  | 2.3052        | -                              |
| 2.3987 | 8568  | -             | 0.5898                         |
| 2.5196 | 9000  | 2.3227        | -                              |
| 2.5985 | 9282  | -             | 0.5924                         |
| 2.6596 | 9500  | 2.3185        | -                              |
| 2.7984 | 9996  | -             | 0.5933                         |
| 2.7996 | 10000 | 2.2571        | -                              |
| 2.9395 | 10500 | 2.2335        | -                              |
| 2.9983 | 10710 | -             | 0.5925                         |
| 3.0    | 10716 | -             | 0.5933                         |

### Framework Versions
- Python: 3.11.11
- Sentence Transformers: 4.1.0
- Transformers: 4.50.0.dev0
- PyTorch: 2.6.0+cu124
- Accelerate: 1.4.0
- Datasets: 3.3.2
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss

```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
---
language:
- ar
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- matryoshka
- arabic
- natural-language-inference
- bert
- nli
- arabert
datasets:
- Omartificial-Intelligence-Space/Arabic-NLi-Pair-Class
metrics:
- cosine_accuracy
- cosine_f1
- accuracy
- f1
library_name: sentence-transformers
pipeline_tag: sentence-similarity
base_model: aubmindlab/bert-base-arabertv02
license: apache-2.0
model-index:
- name: Arabic BERT NLI Matryoshka
  results:
  - task:
      type: natural-language-inference
      name: Natural Language Inference
    dataset:
      type: Omartificial-Intelligence-Space/Arabic-NLi-Pair-Class
      name: Arabic NLI Pair Classification
    metrics:
    - type: accuracy
      value: 0.8125
      name: Best Accuracy (128 dim)
    - type: f1
      value: 0.8142
      name: Best F1 (256 dim)
---

# Arabic BERT NLI Matryoshka Embeddings

## Model Description

This model is a **Matryoshka representation learning** version of AraBERT, fine-tuned for Arabic Natural Language Inference (NLI) tasks. It generates embeddings that can be truncated to different dimensions (768, 512, 256, 128, 64) while maintaining strong performance at every size.

The model is based on `aubmindlab/bert-base-arabertv02` and trained with the Matryoshka Representation Learning approach, so a single model serves all embedding dimensions without retraining.
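Recent versions of sentence-transformers (v2.7+) can apply this truncation at load time via the `truncate_dim` argument, so downstream code never sees the full vector. A minimal sketch:

```python
from sentence_transformers import SentenceTransformer

# encode() will return 256-dimensional vectors directly
# (truncate_dim requires sentence-transformers >= 2.7)
model_256 = SentenceTransformer("AhmedZaky1/arabic-bert-nli-matryoshka", truncate_dim=256)

embeddings = model_256.encode(["الطقس جميل اليوم"])  # "The weather is nice today"
print(embeddings.shape)  # (1, 256)
```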

## Key Features

- 🔄 **Flexible Dimensions**: Single model supports embeddings of size 768, 512, 256, 128, and 64
- 🚀 **High Performance**: Consistently outperforms the base model across all dimensions
- 📊 **Arabic NLI Optimized**: Specifically trained on Arabic Natural Language Inference tasks
- ⚡ **Efficient**: Smaller dimensions offer faster inference with minimal performance loss
- 🎯 **Binary Classification**: Optimized for entailment vs contradiction classification

## Performance Results

Our model shows significant improvements over the base AraBERT model across all embedding dimensions:

| Dimension | Matryoshka Accuracy | Base Accuracy | Matryoshka F1 | Base F1 | F1 Improvement |
|-----------|---------------------|---------------|---------------|---------|----------------|
| 768       | 80.3%               | 56.8%         | 81.15%        | 41.94%  | +39.21%        |
| 512       | 80.6%               | 56.9%         | 81.36%        | 44.32%  | +37.05%        |
| 256       | 80.95%              | 55.65%        | 81.42%        | 38.7%   | +42.72%        |
| 128       | 81.25%              | 56.7%         | 81.37%        | 40.6%   | +40.77%        |
| 64        | 81.0%               | 55.8%         | 80.51%        | 37.92%  | +42.59%        |
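A sketch of how numbers like these can be reproduced: encode both sides of each labeled pair, threshold the cosine similarity, and score accuracy/F1 at each truncation width. The split and column names below are assumptions, not the exact evaluation script; check the dataset card for the real ones.

```python
import numpy as np
from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sklearn.metrics import accuracy_score, f1_score  # assumes scikit-learn is installed

model = SentenceTransformer("AhmedZaky1/arabic-bert-nli-matryoshka")
# Hypothetical split and column names
ds = load_dataset("Omartificial-Intelligence-Space/Arabic-NLi-Pair-Class", split="test")

emb1 = model.encode(ds["premise"], convert_to_numpy=True)
emb2 = model.encode(ds["hypothesis"], convert_to_numpy=True)
labels = np.array(ds["label"])  # assumed: 1 = entailment, 0 = contradiction

for dim in (768, 512, 256, 128, 64):
    a, b = emb1[:, :dim], emb2[:, :dim]
    # Cosine similarity on the truncated vectors
    sims = (a * b).sum(1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    preds = (sims > 0.5).astype(int)  # 0.5 is an arbitrary threshold for this sketch
    print(dim, accuracy_score(labels, preds), f1_score(labels, preds))
```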

## Quick Start

### Installation

```bash
pip install sentence-transformers torch
```

### Basic Usage

```python
from sentence_transformers import SentenceTransformer

# Load the model
model = SentenceTransformer('AhmedZaky1/arabic-bert-nli-matryoshka')

# Example sentences
sentences = [
    "الطقس جميل اليوم",      # "The weather is nice today"
    "إنه يوم مشمس وجميل",    # "It is a sunny, beautiful day"
    "أحب قراءة الكتب",       # "I love reading books"
]

# Generate embeddings (default: full 768 dimensions)
embeddings = model.encode(sentences)
print(f"Full embeddings shape: {embeddings.shape}")  # (3, 768)

# Use different dimensions by truncating
embeddings_256 = embeddings[:, :256]  # first 256 dimensions
embeddings_128 = embeddings[:, :128]  # first 128 dimensions
embeddings_64 = embeddings[:, :64]    # first 64 dimensions

print(f"256-dim embeddings shape: {embeddings_256.shape}")  # (3, 256)
```

### Similarity Computation

```python
from sentence_transformers import util

# Compute similarity between two sentences
sentence1 = "القطة تجلس على السجادة"   # "The cat sits on the rug"
sentence2 = "الكلب يلعب في الحديقة"    # "The dog plays in the garden"

embeddings = model.encode([sentence1, sentence2])
similarity = util.cos_sim(embeddings[0], embeddings[1])
print(f"Similarity: {similarity.item():.4f}")
```

### NLI Classification

```python
def classify_nli_pair(premise, hypothesis, threshold=0.6):
    """
    Classify the Natural Language Inference relationship between two sentences.

    Args:
        premise: The premise sentence
        hypothesis: The hypothesis sentence
        threshold: Cosine-similarity threshold for classification

    Returns:
        str: 'entailment' if similarity > threshold, else 'contradiction'
    """
    embeddings = model.encode([premise, hypothesis])
    similarity = util.cos_sim(embeddings[0], embeddings[1]).item()
    return 'entailment' if similarity > threshold else 'contradiction'

# Example usage
premise = "الرجل يقرأ كتاباً في المكتبة"   # "The man is reading a book in the library"
hypothesis = "شخص يقرأ في مكان هادئ"       # "Someone is reading in a quiet place"

result = classify_nli_pair(premise, hypothesis)
print(f"Relationship: {result}")
```
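The 0.6 threshold above is a reasonable default, not a tuned value. If you have a labeled dev set, a simple grid sweep picks a better one; `tune_threshold` below is a hypothetical helper (it reuses `model` and `util` from the snippets above):

```python
import numpy as np

def tune_threshold(pairs, labels, grid=None):
    """Pick the cosine threshold that maximizes accuracy on labeled pairs.

    pairs:  list of (premise, hypothesis) tuples
    labels: 1 for entailment, 0 for contradiction
    """
    grid = grid if grid is not None else np.arange(0.30, 0.90, 0.01)
    sims = np.array([
        util.cos_sim(*model.encode([p, h])).item() for p, h in pairs
    ])
    labels = np.array(labels)
    accs = [((sims > t).astype(int) == labels).mean() for t in grid]
    return float(grid[int(np.argmax(accs))])
```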

### Choosing the Right Dimension

- **768 dimensions**: Maximum accuracy for critical applications
- **512 dimensions**: Best balance of performance and efficiency
- **256 dimensions**: Good performance with 3x smaller vectors and faster similarity search
- **128 dimensions**: Suitable for real-time applications
- **64 dimensions**: Ultra-fast similarity search for large-scale processing
+
## Training Details
|
|
|
|
|
|
|
157 |
|
158 |
+
### Dataset
|
159 |
+
- **Training Data**: Arabic-NLI-Pair-Class dataset from Omartificial-Intelligence-Space
|
160 |
+
- **Language**: Modern Standard Arabic (MSA)
|
161 |
+
- **Task Type**: Binary classification (entailment vs contradiction)
|
162 |
|
163 |
+
### Training Configuration
|
164 |
+
- **Base Model**: aubmindlab/bert-base-arabertv02
|
165 |
+
- **Max Sequence Length**: 75 tokens
|
166 |
+
- **Batch Size**: 64
|
167 |
+
- **Epochs**: 5
|
168 |
+
- **Matryoshka Dimensions**: [768, 512, 256, 128, 64]
|
169 |
+
- **Loss Function**: MatryoshkaLoss with CosineSimilarityLoss
|
170 |
+
- **Optimization**: AdamW with automatic mixed precision (AMP)
|
171 |
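The exact training script is not published; the following is a minimal sketch of the configuration described above using the sentence-transformers `MatryoshkaLoss` wrapper. The trainer wiring and data preparation are assumptions.

```python
from sentence_transformers import SentenceTransformer, losses

# Start from the AraBERT base model (mean pooling is added automatically)
model = SentenceTransformer("aubmindlab/bert-base-arabertv02")

# Wrap the base loss so it is applied at every Matryoshka dimension
base_loss = losses.CosineSimilarityLoss(model)
loss = losses.MatryoshkaLoss(
    model,
    base_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
)
# `loss` is then passed to SentenceTransformerTrainer / model.fit as usual
```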

## Use Cases

1. **Arabic Text Similarity**: Measure semantic similarity between Arabic texts
2. **Natural Language Inference**: Determine logical relationships between Arabic sentences
3. **Information Retrieval**: Find relevant Arabic documents based on queries
4. **Semantic Search**: Build Arabic search engines with semantic understanding (see the sketch after this list)
5. **Text Classification**: Use embeddings as features for downstream Arabic NLP tasks
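A small semantic-search sketch over a hypothetical Arabic corpus, using `util.semantic_search`:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("AhmedZaky1/arabic-bert-nli-matryoshka")

corpus = [
    "القاهرة هي عاصمة مصر",        # "Cairo is the capital of Egypt"
    "كرة القدم رياضة شعبية",       # "Football is a popular sport"
    "البرمجة بلغة بايثون ممتعة",   # "Programming in Python is fun"
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query = "ما هي عاصمة مصر؟"          # "What is the capital of Egypt?"
query_embedding = model.encode(query, convert_to_tensor=True)

hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], round(hit["score"], 4))
```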

## Limitations

- Primarily trained on Modern Standard Arabic (MSA)
- Performance may vary on dialectal Arabic
- Optimized for shorter texts (up to 75 tokens)
- Binary classification focus (entailment/contradiction)

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{arabic-bert-nli-matryoshka,
  title={Arabic BERT NLI Matryoshka Embeddings},
  author={Ahmed Mouad},
  year={2025},
  url={https://huggingface.co/AhmedZaky1/arabic-bert-nli-matryoshka}
}
```

## Acknowledgments

- **AraBERT Team**: For the excellent base model (aubmindlab/bert-base-arabertv02)
- **Sentence Transformers**: For the robust training framework
- **Matryoshka Representation Learning**: For the innovative approach to nested embeddings
- **Arabic NLI Dataset**: Omartificial-Intelligence-Space for the training data

## License

This model is released under the Apache 2.0 License.

---

**Model Version**: 1.0
**Last Updated**: May 2025
**Framework**: sentence-transformers
**Language**: Arabic (العربية)
|