|
--- |
|
tags: |
|
- sentence-transformers |
|
- sentence-similarity |
|
- feature-extraction |
|
- generated_from_trainer |
|
- dataset_size:1225740 |
|
- loss:MultipleNegativesRankingLoss |
|
base_model: BAAI/bge-m3 |
|
widget: |
|
- source_sentence: yoghurt cow chocolate chip sugar reduced half skimmed in plastic |
|
container commercial supermarket shop organic shop </s> This facet allows recording |
|
the place where the food was prepared for consumption. Only one descriptor from |
|
this facet can be added to each entry. |
|
sentences: |
|
- Product obtained during the processing of screened, dehusked barley into pearl |
|
barley, semolina or flour. It consists principally of particles of endosperm with |
|
fine fragments of the outer skins and some grain screenings. |
|
- Produced by industry in the form it arrives to the final consumer |
|
- Tree nuts from the plant classified under the species Corylus avellana L., commonly |
|
known as Hazelnuts or Cobnuts or Common hazelnut. The part consumed/analysed is |
|
not specified. When relevant, information on the part consumed/analysed has to |
|
be reported with additional facet descriptors. In case of data collections related |
|
to legislations, the default part consumed/analysed is the one defined in the |
|
applicable legislation. |
|
- source_sentence: sauce cold liquid preservation method onion mint croutons sweet |
|
pepper prepared at a restaurant </s> This facet collects ingredients and/or flavour |
|
note. Regarding ingredients this facet serves the purpose of providing information |
|
on ingredients of a composite food being important from some point of view, like |
|
allergic reactions, hazards, but also aspect, taste. The descriptors for this |
|
facet are taken from a selected subset of the main list (actually a relevant part |
|
of the food list). More (none contradicting) descriptors can be applied to each |
|
entry. |
|
sentences: |
|
- Spices from the fruits of the plant classified under the species Piper cubeba |
|
L. f., commonly known as Cubeb fruit or Tailed pepper. The part consumed/analysed |
|
is not specified. When relevant, information on the part consumed/analysed has |
|
to be reported with additional facet descriptors. In case of data collections |
|
related to legislations, the default part consumed/analysed is the one defined |
|
in the applicable legislation. |
|
- Tree nuts from the plant classified under the genus Juglans L. spp., commonly |
|
known as Walnuts or Walnut Black or Walnut English or Walnut Persian. The part |
|
consumed/analysed is not specified. When relevant, information on the part consumed/analysed |
|
has to be reported with additional facet descriptors. In case of data collections |
|
related to legislations, the default part consumed/analysed is the one defined |
|
in the applicable legislation. |
|
- Fruiting vegetables from the plant classified under the species Capsicum annuum |
|
var. grossum (L.) Sendtner or Capsicum annuum var. longum Bailey, commonly known |
|
as Sweet peppers or Bell peppers or Paprika or PeppersLong or Pimento or Pimiento. |
|
The part consumed/analysed is not specified. When relevant, information on the |
|
part consumed/analysed has to be reported with additional facet descriptors. In |
|
case of data collections related to legislations, the default part consumed/analysed |
|
is the one defined in the applicable legislation. |
|
- source_sentence: yoghurt with fruits cow passion fruit sweetened with sugar sucrose |
|
fat content in plastic container commercial supermarket shop organic shop </s> |
|
This facet provides some principal claims related to important nutrients-ingredients, |
|
like fat, sugar etc. It is not intended to include health claims or similar. The |
|
present guidance provides a limited list, to be eventually improved during the |
|
evolution of the system. More than one descriptor can be applied to each entry, |
|
provided they are not contradicting each other. |
|
sentences: |
|
- Product where all or part of the sugar has been added during processing and is |
|
not naturally contained |
|
- Infusion materials from flowers of the plant classified under the genus Rosa L. |
|
spp., commonly known as Rose infusion flowers. The part consumed/analysed is not |
|
specified. When relevant, information on the part consumed/analysed has to be |
|
reported with additional facet descriptors. In case of data collections related |
|
to legislations, the default part consumed/analysed is the one defined in the |
|
applicable legislation. |
|
- Molecules providing intensive sweet sensation, used to substitute natural sugars |
|
in food formulas |
|
- source_sentence: pepper sweet green facets desc physical state form as quantified |
|
grated cooking method stir fried sauted preservation method fresh </s> This facet |
|
describes the form (physical aspect) of the food as reported by the consumer (as |
|
estimated during interview or as registered in the diary) (Consumption Data) or |
|
as expressed in the analysis results in the laboratory (Occurrence Data). Only |
|
one descriptor from this facet can be added to each entry, apart from the specification |
|
“with solid particles”. This facet should only be used in case of raw foods and |
|
ingredients (not for composite foods). |
|
sentences: |
|
- Unprocessed and not stored over any long period |
|
- Paste coarsely divided, where particles are still recognisable at naked eye |
|
- The food item is considered in its form with skin |
|
- source_sentence: tome des bauges raw milk aoc in plastic container brand product |
|
name </s> This facet allows recording whether the food list code was chosen because |
|
of lack of information on the food item or because the proper entry in the food |
|
list was missing. Only one descriptor from this facet can be added to each entry. |
|
sentences: |
|
- The food list item has been chosen because none of the more detailed items corresponded |
|
to the available information. Please consider the eventual addition of a new term |
|
in the list |
|
- The food item has a fat content which, when rounded with the standard rules of |
|
rounding, equals 25 % (weight/weight) |
|
- 'Deprecated term that must NOT be used for any purpose. Its original scopenote |
|
was: The group includes any type of Other fruiting vegetables (exposure). The |
|
part consumed/analysed is by default unspecified. When relevant, information on |
|
the part consumed/analysed has to be reported with additional facet descriptors.' |
|
pipeline_tag: sentence-similarity |
|
library_name: sentence-transformers |
|
metrics: |
|
- cosine_accuracy@1 |
|
- cosine_accuracy@3 |
|
- cosine_accuracy@5 |
|
- cosine_accuracy@10 |
|
- cosine_precision@1 |
|
- cosine_precision@3 |
|
- cosine_precision@5 |
|
- cosine_precision@10 |
|
- cosine_recall@1 |
|
- cosine_recall@3 |
|
- cosine_recall@5 |
|
- cosine_recall@10 |
|
- cosine_ndcg@10 |
|
- cosine_mrr@10 |
|
- cosine_map@100 |
|
model-index: |
|
- name: SentenceTransformer based on BAAI/bge-m3 |
|
results: |
|
- task: |
|
type: device-aware-information-retrieval |
|
name: Device Aware Information Retrieval |
|
dataset: |
|
name: Unknown |
|
type: unknown |
|
metrics: |
|
- type: cosine_accuracy@1 |
|
value: 0.9849655460430152 |
|
name: Cosine Accuracy@1 |
|
- type: cosine_accuracy@3 |
|
value: 0.9989559406974317 |
|
name: Cosine Accuracy@3 |
|
- type: cosine_accuracy@5 |
|
value: 0.9997911881394863 |
|
name: Cosine Accuracy@5 |
|
- type: cosine_accuracy@10 |
|
value: 1.0 |
|
name: Cosine Accuracy@10 |
|
- type: cosine_precision@1 |
|
value: 0.9849655460430152 |
|
name: Cosine Precision@1 |
|
- type: cosine_precision@3 |
|
value: 0.41713649335282244 |
|
name: Cosine Precision@3 |
|
- type: cosine_precision@5 |
|
value: 0.25370641052411774 |
|
name: Cosine Precision@5 |
|
- type: cosine_precision@10 |
|
value: 0.12752140321570266 |
|
name: Cosine Precision@10 |
|
- type: cosine_recall@1 |
|
value: 0.8690666019440294 |
|
name: Cosine Recall@1 |
|
- type: cosine_recall@3 |
|
value: 0.993924343214383 |
|
name: Cosine Recall@3 |
|
- type: cosine_recall@5 |
|
value: 0.998536283094646 |
|
name: Cosine Recall@5 |
|
- type: cosine_recall@10 |
|
value: 0.9999462151268373 |
|
name: Cosine Recall@10 |
|
- type: cosine_ndcg@10 |
|
value: 0.9936056206465634 |
|
name: Cosine Ndcg@10 |
|
- type: cosine_mrr@10 |
|
value: 0.9919155008004455 |
|
name: Cosine Mrr@10 |
|
- type: cosine_map@100 |
|
value: 0.9909164791232326 |
|
name: Cosine Map@100 |
|
--- |
|
|
|
# SentenceTransformer based on BAAI/bge-m3 |
|
|
|
This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3). It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
|
|
|
## Model Details |
|
|
|
### Model Description |
|
- **Model Type:** Sentence Transformer |
|
- **Base model:** [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) <!-- at revision 5617a9f61b028005a4858fdac845db406aefb181 --> |
|
- **Maximum Sequence Length:** 96 tokens |
|
- **Output Dimensionality:** 1024 dimensions |
|
- **Similarity Function:** Cosine Similarity |
|
<!-- - **Training Dataset:** Unknown --> |
|
<!-- - **Language:** Unknown --> |
|
<!-- - **License:** Unknown --> |
|
|
|
### Model Sources |
|
|
|
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net) |
|
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers) |
|
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers) |
|
|
|
### Full Model Architecture |
|
|
|
``` |
|
SentenceTransformer( |
|
(0): Transformer({'max_seq_length': 96, 'do_lower_case': False}) with Transformer model: XLMRobertaModel |
|
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True}) |
|
) |
|
``` |
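
For reference, the mean-pooling step above corresponds to an attention-mask-weighted average of the token embeddings. The following is a minimal sketch with plain 🤗 Transformers, assuming the checkpoint also loads outside of Sentence Transformers (as such checkpoints normally do); the example texts are illustrative:

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "disi-unibo-nlp/foodex-facet-descriptors-retriever"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id)

texts = ["peach fresh flesh baked with skin", "Cooking by dry heat in or as if in an oven"]
batch = tokenizer(texts, padding=True, truncation=True, max_length=96, return_tensors="pt")

with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state  # (batch, seq_len, 1024)

# Mean pooling: average the token embeddings, ignoring padding positions.
mask = batch["attention_mask"].unsqueeze(-1).float()
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embeddings.shape)  # torch.Size([2, 1024])
```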
|
|
|
## Usage |
|
|
|
### Direct Usage (Sentence Transformers) |
|
|
|
First install the Sentence Transformers library: |
|
|
|
```bash |
|
pip install -U sentence-transformers |
|
``` |
|
|
|
Then you can load this model and run inference. |
|
```python |
|
from sentence_transformers import SentenceTransformer |
|
|
|
# Download from the 🤗 Hub |
|
model = SentenceTransformer("disi-unibo-nlp/foodex-facet-descriptors-retriever") |
|
# Run inference |
|
sentences = [ |
|
'tome des bauges raw milk aoc in plastic container brand product name </s> This facet allows recording whether the food list code was chosen because of lack of information on the food item or because the proper entry in the food list was missing. Only one descriptor from this facet can be added to each entry.', |
|
'The food list item has been chosen because none of the more detailed items corresponded to the available information. Please consider the eventual addition of a new term in the list', |
|
'Deprecated term that must NOT be used for any purpose. Its original scopenote was: The group includes any type of Other fruiting vegetables (exposure). The part consumed/analysed is by default unspecified. When relevant, information on the part consumed/analysed has to be reported with additional facet descriptors.', |
|
] |
|
embeddings = model.encode(sentences) |
|
print(embeddings.shape) |
|
# [3, 1024] |
|
|
|
# Get the similarity scores for the embeddings |
|
similarities = model.similarity(embeddings, embeddings) |
|
print(similarities.shape) |
|
# [3, 3] |
|
``` |
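
The training data pairs facet-annotated food descriptions (query plus the scope note of the facet being coded, separated by `</s>`) with descriptor scope notes, so a typical downstream use is ranking candidate descriptors for a query. A minimal sketch; the candidate strings below are illustrative:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("disi-unibo-nlp/foodex-facet-descriptors-retriever")

# Query: food description + scope note of the facet being coded, as in training.
query = (
    "pepper sweet green grated stir fried preservation method fresh </s> "
    "This facet describes the form (physical aspect) of the food as reported by the consumer."
)
# Candidate descriptor scope notes to rank (illustrative).
candidates = [
    "Unprocessed and not stored over any long period",
    "Paste coarsely divided, where particles are still recognisable at naked eye",
    "The food item is considered in its form with skin",
]

query_emb = model.encode([query])
candidate_embs = model.encode(candidates)
scores = model.similarity(query_emb, candidate_embs)[0]  # cosine similarities

for candidate, score in sorted(zip(candidates, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:.3f}  {candidate}")
```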
|
|
|
<!-- |
|
### Direct Usage (Transformers) |
|
|
|
<details><summary>Click to see the direct usage in Transformers</summary> |
|
|
|
</details> |
|
--> |
|
|
|
<!-- |
|
### Downstream Usage (Sentence Transformers) |
|
|
|
You can finetune this model on your own dataset. |
|
|
|
<details><summary>Click to expand</summary> |
|
|
|
</details> |
|
--> |
|
|
|
<!-- |
|
### Out-of-Scope Use |
|
|
|
*List how the model may foreseeably be misused and address what users ought not to do with the model.* |
|
--> |
|
|
|
## Evaluation |
|
|
|
### Metrics |
|
|
|
#### Device Aware Information Retrieval |
|
|
|
* Evaluated with <code>src.utils.eval_functions.DeviceAwareInformationRetrievalEvaluator</code> |
|
|
|
| Metric | Value | |
|
|:--------------------|:-----------| |
|
| cosine_accuracy@1 | 0.985 | |
|
| cosine_accuracy@3 | 0.999 | |
|
| cosine_accuracy@5 | 0.9998 | |
|
| cosine_accuracy@10 | 1.0 | |
|
| cosine_precision@1 | 0.985 | |
|
| cosine_precision@3 | 0.4171 | |
|
| cosine_precision@5 | 0.2537 | |
|
| cosine_precision@10 | 0.1275 | |
|
| cosine_recall@1 | 0.8691 | |
|
| cosine_recall@3 | 0.9939 | |
|
| cosine_recall@5 | 0.9985 | |
|
| cosine_recall@10 | 0.9999 | |
|
| **cosine_ndcg@10** | **0.9936** | |
|
| cosine_mrr@10 | 0.9919 | |
|
| cosine_map@100 | 0.9909 | |
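
`DeviceAwareInformationRetrievalEvaluator` is a project-specific class and is not shipped with this card. The same family of metrics can be computed on your own query/corpus split with the built-in `InformationRetrievalEvaluator`; a minimal sketch with toy IDs and texts:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("disi-unibo-nlp/foodex-facet-descriptors-retriever")

# Toy evaluation split; replace with your own queries, corpus and relevance judgements.
queries = {
    "q1": "yoghurt with fruits cow passion fruit sweetened with sugar sucrose </s> "
          "This facet provides some principal claims related to important nutrients-ingredients."
}
corpus = {
    "d1": "Product where all or part of the sugar has been added during processing and is not naturally contained",
    "d2": "Molecules providing intensive sweet sensation, used to substitute natural sugars in food formulas",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="foodex-dev",
)
results = evaluator(model)
print(results)  # accuracy@k, precision@k, recall@k, ndcg@10, mrr@10, map@100 under cosine similarity
```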
|
|
|
<!-- |
|
## Bias, Risks and Limitations |
|
|
|
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.* |
|
--> |
|
|
|
<!-- |
|
### Recommendations |
|
|
|
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.* |
|
--> |
|
|
|
## Training Details |
|
|
|
### Training Dataset |
|
|
|
#### Unnamed Dataset |
|
|
|
* Size: 1,225,740 training samples |
|
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>sentence_2</code> |
|
* Approximate statistics based on the first 1000 samples: |
|
| | sentence_0 | sentence_1 | sentence_2 | |
|
|:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------| |
|
| type | string | string | string | |
|
| details | <ul><li>min: 37 tokens</li><li>mean: 89.82 tokens</li><li>max: 96 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 39.38 tokens</li><li>max: 96 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 39.59 tokens</li><li>max: 96 tokens</li></ul> | |
|
* Samples: |
|
| sentence_0 | sentence_1 | sentence_2 | |
|
|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| |
|
| <code>peach fresh flesh baked with skin </s> This facet allows recording different characteristics of the food: preservation treatments a food item underwent, technological steps or treatments applied while producing a food item, the way a food item has been heat treated before consumption and the way a food item has been prepared for final consumption (particularly needed for consumption surveys and includes preparation (like battering or breading) as well as heat treatment steps). More (none contradicting) descriptors can be applied to each entry.</code> | <code>Cooking by dry heat in or as if in an oven</code> | <code>Previously cooked or heat-treated fodd, heated again in order to raise its temperature (all different techniques)</code> | |
|
| <code>turkey breast with bones frozen barbecued without skin </s> This facet allows recording different characteristics of the food: preservation treatments a food item underwent, technological steps or treatments applied while producing a food item, the way a food item has been heat treated before consumption and the way a food item has been prepared for final consumption (particularly needed for consumption surveys and includes preparation (like battering or breading) as well as heat treatment steps). More (none contradicting) descriptors can be applied to each entry.</code> | <code>Preserving by freezing sufficiently rapidly to avoid spoilage and microbial growth</code> | <code>Drying to a water content low enough to guarantee microbiological stability, but still keeping a relatively soft structure (often used for fruit)</code> | |
|
| <code>yoghurt flavoured cow blueberry sweetened with sugar sucrose whole in glass commercial supermarket shop organic shop brand product name </s> This facet provides some principal claims related to important nutrients-ingredients, like fat, sugar etc. It is not intended to include health claims or similar. The present guidance provides a limited list, to be eventually improved during the evolution of the system. More than one descriptor can be applied to each entry, provided they are not contradicting each other.</code> | <code>The food item has all the natural (or average expected )fat content (for milk, at least the value defined in legislation, when available). In the case of cheese, the fat on the dry matter is 45-60%</code> | <code>The food item has an almost completely reduced amount of fat, with respect to the expected natural fat content (for milk, at least the value defined in legislation, when available). For meat, this is the entry for what is commercially intended as 'lean' meat, where fat is not visible.In the case of cheese, the fat on the dry matter is 10-25%</code> | |
|
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters: |
|
```json |
|
{ |
|
"scale": 20.0, |
|
"similarity_fct": "cos_sim" |
|
} |
|
``` |
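
  A minimal sketch of how this loss is instantiated with the parameters above; with `MultipleNegativesRankingLoss`, every other in-batch example acts as a negative in addition to the explicit hard-negative column:

  ```python
  from sentence_transformers import SentenceTransformer, util
  from sentence_transformers.losses import MultipleNegativesRankingLoss

  model = SentenceTransformer("BAAI/bge-m3")
  loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)
  ```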
|
|
|
### Training Hyperparameters |
|
#### Non-Default Hyperparameters |
|
|
|
- `eval_strategy`: steps |
|
- `per_device_train_batch_size`: 48 |
|
- `per_device_eval_batch_size`: 48 |
|
- `fp16`: True |
|
- `multi_dataset_batch_sampler`: round_robin |
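
A sketch of how these non-default values map onto `SentenceTransformerTrainingArguments` (the output directory and the one-row dataset are placeholders; the original training script is not included in this card):

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-m3")
loss = MultipleNegativesRankingLoss(model)  # defaults match the parameters above (scale=20.0, cos_sim)

# Placeholder triplet; the real dataset has 1,225,740 (sentence_0, sentence_1, sentence_2) rows,
# interpreted as (anchor, positive, hard negative).
train_dataset = Dataset.from_dict({
    "sentence_0": ["peach fresh flesh baked with skin </s> This facet allows recording ..."],
    "sentence_1": ["Cooking by dry heat in or as if in an oven"],
    "sentence_2": ["Previously cooked or heat-treated food, heated again in order to raise its temperature"],
})

args = SentenceTransformerTrainingArguments(
    output_dir="foodex-facet-descriptors-retriever",  # placeholder
    num_train_epochs=3,
    per_device_train_batch_size=48,
    per_device_eval_batch_size=48,
    fp16=True,
    multi_dataset_batch_sampler="round_robin",
    # eval_strategy="steps" was also set; it additionally requires an eval dataset or evaluator.
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```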
|
|
|
#### All Hyperparameters |
|
<details><summary>Click to expand</summary> |
|
|
|
- `overwrite_output_dir`: False |
|
- `do_predict`: False |
|
- `eval_strategy`: steps |
|
- `prediction_loss_only`: True |
|
- `per_device_train_batch_size`: 48 |
|
- `per_device_eval_batch_size`: 48 |
|
- `per_gpu_train_batch_size`: None |
|
- `per_gpu_eval_batch_size`: None |
|
- `gradient_accumulation_steps`: 1 |
|
- `eval_accumulation_steps`: None |
|
- `torch_empty_cache_steps`: None |
|
- `learning_rate`: 5e-05 |
|
- `weight_decay`: 0.0 |
|
- `adam_beta1`: 0.9 |
|
- `adam_beta2`: 0.999 |
|
- `adam_epsilon`: 1e-08 |
|
- `max_grad_norm`: 1.0 |
|
- `num_train_epochs`: 3 |
|
- `max_steps`: -1 |
|
- `lr_scheduler_type`: linear |
|
- `lr_scheduler_kwargs`: {} |
|
- `warmup_ratio`: 0.0 |
|
- `warmup_steps`: 0 |
|
- `log_level`: passive |
|
- `log_level_replica`: warning |
|
- `log_on_each_node`: True |
|
- `logging_nan_inf_filter`: True |
|
- `save_safetensors`: True |
|
- `save_on_each_node`: False |
|
- `save_only_model`: False |
|
- `restore_callback_states_from_checkpoint`: False |
|
- `no_cuda`: False |
|
- `use_cpu`: False |
|
- `use_mps_device`: False |
|
- `seed`: 42 |
|
- `data_seed`: None |
|
- `jit_mode_eval`: False |
|
- `use_ipex`: False |
|
- `bf16`: False |
|
- `fp16`: True |
|
- `fp16_opt_level`: O1 |
|
- `half_precision_backend`: auto |
|
- `bf16_full_eval`: False |
|
- `fp16_full_eval`: False |
|
- `tf32`: None |
|
- `local_rank`: 0 |
|
- `ddp_backend`: None |
|
- `tpu_num_cores`: None |
|
- `tpu_metrics_debug`: False |
|
- `debug`: [] |
|
- `dataloader_drop_last`: False |
|
- `dataloader_num_workers`: 0 |
|
- `dataloader_prefetch_factor`: None |
|
- `past_index`: -1 |
|
- `disable_tqdm`: False |
|
- `remove_unused_columns`: True |
|
- `label_names`: None |
|
- `load_best_model_at_end`: False |
|
- `ignore_data_skip`: False |
|
- `fsdp`: [] |
|
- `fsdp_min_num_params`: 0 |
|
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False} |
|
- `fsdp_transformer_layer_cls_to_wrap`: None |
|
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None} |
|
- `deepspeed`: None |
|
- `label_smoothing_factor`: 0.0 |
|
- `optim`: adamw_torch |
|
- `optim_args`: None |
|
- `adafactor`: False |
|
- `group_by_length`: False |
|
- `length_column_name`: length |
|
- `ddp_find_unused_parameters`: None |
|
- `ddp_bucket_cap_mb`: None |
|
- `ddp_broadcast_buffers`: False |
|
- `dataloader_pin_memory`: True |
|
- `dataloader_persistent_workers`: False |
|
- `skip_memory_metrics`: True |
|
- `use_legacy_prediction_loop`: False |
|
- `push_to_hub`: False |
|
- `resume_from_checkpoint`: None |
|
- `hub_model_id`: None |
|
- `hub_strategy`: every_save |
|
- `hub_private_repo`: None |
|
- `hub_always_push`: False |
|
- `gradient_checkpointing`: False |
|
- `gradient_checkpointing_kwargs`: None |
|
- `include_inputs_for_metrics`: False |
|
- `include_for_metrics`: [] |
|
- `eval_do_concat_batches`: True |
|
- `fp16_backend`: auto |
|
- `push_to_hub_model_id`: None |
|
- `push_to_hub_organization`: None |
|
- `mp_parameters`: |
|
- `auto_find_batch_size`: False |
|
- `full_determinism`: False |
|
- `torchdynamo`: None |
|
- `ray_scope`: last |
|
- `ddp_timeout`: 1800 |
|
- `torch_compile`: False |
|
- `torch_compile_backend`: None |
|
- `torch_compile_mode`: None |
|
- `dispatch_batches`: None |
|
- `split_batches`: None |
|
- `include_tokens_per_second`: False |
|
- `include_num_input_tokens_seen`: False |
|
- `neftune_noise_alpha`: None |
|
- `optim_target_modules`: None |
|
- `batch_eval_metrics`: False |
|
- `eval_on_start`: False |
|
- `use_liger_kernel`: False |
|
- `eval_use_gather_object`: False |
|
- `average_tokens_across_devices`: False |
|
- `prompts`: None |
|
- `batch_sampler`: batch_sampler |
|
- `multi_dataset_batch_sampler`: round_robin |
|
|
|
</details> |
|
|
|
### Training Logs |
|
<details><summary>Click to expand</summary> |
|
|
|
| Epoch | Step | Training Loss | cosine_ndcg@10 | |
|
|:------:|:-----:|:-------------:|:--------------:| |
|
| 0 | 0 | - | 0.0266 | |
|
| 0.0196 | 500 | 1.5739 | - | |
|
| 0.0392 | 1000 | 0.9043 | - | |
|
| 0.0587 | 1500 | 0.8234 | - | |
|
| 0.0783 | 2000 | 0.7861 | - | |
|
| 0.0979 | 2500 | 0.7628 | - | |
|
| 0.1175 | 3000 | 0.7348 | - | |
|
| 0.1371 | 3500 | 0.7184 | - | |
|
| 0.1566 | 4000 | 0.7167 | - | |
|
| 0.1762 | 4500 | 0.7002 | - | |
|
| 0.1958 | 5000 | 0.6791 | 0.9264 | |
|
| 0.2154 | 5500 | 0.6533 | - | |
|
| 0.2350 | 6000 | 0.6628 | - | |
|
| 0.2545 | 6500 | 0.6637 | - | |
|
| 0.2741 | 7000 | 0.639 | - | |
|
| 0.2937 | 7500 | 0.6395 | - | |
|
| 0.3133 | 8000 | 0.6358 | - | |
|
| 0.3329 | 8500 | 0.617 | - | |
|
| 0.3524 | 9000 | 0.6312 | - | |
|
| 0.3720 | 9500 | 0.6107 | - | |
|
| 0.3916 | 10000 | 0.6083 | 0.9518 | |
|
| 0.4112 | 10500 | 0.6073 | - | |
|
| 0.4307 | 11000 | 0.601 | - | |
|
| 0.4503 | 11500 | 0.6047 | - | |
|
| 0.4699 | 12000 | 0.5986 | - | |
|
| 0.4895 | 12500 | 0.5913 | - | |
|
| 0.5091 | 13000 | 0.5992 | - | |
|
| 0.5286 | 13500 | 0.5911 | - | |
|
| 0.5482 | 14000 | 0.5923 | - | |
|
| 0.5678 | 14500 | 0.5816 | - | |
|
| 0.5874 | 15000 | 0.582 | 0.9628 | |
|
| 0.6070 | 15500 | 0.5815 | - | |
|
| 0.6265 | 16000 | 0.5827 | - | |
|
| 0.6461 | 16500 | 0.5885 | - | |
|
| 0.6657 | 17000 | 0.5737 | - | |
|
| 0.6853 | 17500 | 0.577 | - | |
|
| 0.7049 | 18000 | 0.5687 | - | |
|
| 0.7244 | 18500 | 0.5744 | - | |
|
| 0.7440 | 19000 | 0.5774 | - | |
|
| 0.7636 | 19500 | 0.5792 | - | |
|
| 0.7832 | 20000 | 0.5645 | 0.9739 | |
|
| 0.8028 | 20500 | 0.5769 | - | |
|
| 0.8223 | 21000 | 0.5659 | - | |
|
| 0.8419 | 21500 | 0.5635 | - | |
|
| 0.8615 | 22000 | 0.5677 | - | |
|
| 0.8811 | 22500 | 0.5693 | - | |
|
| 0.9007 | 23000 | 0.5666 | - | |
|
| 0.9202 | 23500 | 0.5526 | - | |
|
| 0.9398 | 24000 | 0.5591 | - | |
|
| 0.9594 | 24500 | 0.563 | - | |
|
| 0.9790 | 25000 | 0.555 | 0.9808 | |
|
| 0.9986 | 25500 | 0.5585 | - | |
|
| 1.0 | 25537 | - | 0.9811 | |
|
| 1.0181 | 26000 | 0.5595 | - | |
|
| 1.0377 | 26500 | 0.5507 | - | |
|
| 1.0573 | 27000 | 0.5582 | - | |
|
| 1.0769 | 27500 | 0.5543 | - | |
|
| 1.0964 | 28000 | 0.5598 | - | |
|
| 1.1160 | 28500 | 0.5613 | - | |
|
| 1.1356 | 29000 | 0.5457 | - | |
|
| 1.1552 | 29500 | 0.5524 | - | |
|
| 1.1748 | 30000 | 0.5324 | 0.9836 | |
|
| 1.1943 | 30500 | 0.5531 | - | |
|
| 1.2139 | 31000 | 0.5505 | - | |
|
| 1.2335 | 31500 | 0.5623 | - | |
|
| 1.2531 | 32000 | 0.5505 | - | |
|
| 1.2727 | 32500 | 0.5583 | - | |
|
| 1.2922 | 33000 | 0.548 | - | |
|
| 1.3118 | 33500 | 0.5485 | - | |
|
| 1.3314 | 34000 | 0.5509 | - | |
|
| 1.3510 | 34500 | 0.54 | - | |
|
| 1.3706 | 35000 | 0.5478 | 0.9835 | |
|
| 1.3901 | 35500 | 0.5416 | - | |
|
| 1.4097 | 36000 | 0.5438 | - | |
|
| 1.4293 | 36500 | 0.543 | - | |
|
| 1.4489 | 37000 | 0.547 | - | |
|
| 1.4685 | 37500 | 0.5362 | - | |
|
| 1.4880 | 38000 | 0.5536 | - | |
|
| 1.5076 | 38500 | 0.5356 | - | |
|
| 1.5272 | 39000 | 0.5382 | - | |
|
| 1.5468 | 39500 | 0.5481 | - | |
|
| 1.5664 | 40000 | 0.5302 | 0.9880 | |
|
| 1.5859 | 40500 | 0.5275 | - | |
|
| 1.6055 | 41000 | 0.5327 | - | |
|
| 1.6251 | 41500 | 0.5414 | - | |
|
| 1.6447 | 42000 | 0.5354 | - | |
|
| 1.6643 | 42500 | 0.536 | - | |
|
| 1.6838 | 43000 | 0.5364 | - | |
|
| 1.7034 | 43500 | 0.5391 | - | |
|
| 1.7230 | 44000 | 0.5342 | - | |
|
| 1.7426 | 44500 | 0.5369 | - | |
|
| 1.7621 | 45000 | 0.5387 | 0.9894 | |
|
| 1.7817 | 45500 | 0.5312 | - | |
|
| 1.8013 | 46000 | 0.5297 | - | |
|
| 1.8209 | 46500 | 0.5222 | - | |
|
| 1.8405 | 47000 | 0.5255 | - | |
|
| 1.8600 | 47500 | 0.5379 | - | |
|
| 1.8796 | 48000 | 0.5317 | - | |
|
| 1.8992 | 48500 | 0.5312 | - | |
|
| 1.9188 | 49000 | 0.5307 | - | |
|
| 1.9384 | 49500 | 0.5375 | - | |
|
| 1.9579 | 50000 | 0.527 | 0.9908 | |
|
| 1.9775 | 50500 | 0.538 | - | |
|
| 1.9971 | 51000 | 0.5312 | - | |
|
| 2.0 | 51074 | - | 0.9911 | |
|
| 2.0167 | 51500 | 0.5346 | - | |
|
| 2.0363 | 52000 | 0.5279 | - | |
|
| 2.0558 | 52500 | 0.517 | - | |
|
| 2.0754 | 53000 | 0.5193 | - | |
|
| 2.0950 | 53500 | 0.5286 | - | |
|
| 2.1146 | 54000 | 0.5229 | - | |
|
| 2.1342 | 54500 | 0.5183 | - | |
|
| 2.1537 | 55000 | 0.5194 | 0.9915 | |
|
| 2.1733 | 55500 | 0.5362 | - | |
|
| 2.1929 | 56000 | 0.5186 | - | |
|
| 2.2125 | 56500 | 0.5202 | - | |
|
| 2.2321 | 57000 | 0.5276 | - | |
|
| 2.2516 | 57500 | 0.5266 | - | |
|
| 2.2712 | 58000 | 0.5334 | - | |
|
| 2.2908 | 58500 | 0.5206 | - | |
|
| 2.3104 | 59000 | 0.5229 | - | |
|
| 2.3300 | 59500 | 0.5111 | - | |
|
| 2.3495 | 60000 | 0.5175 | 0.9928 | |
|
| 2.3691 | 60500 | 0.5235 | - | |
|
| 2.3887 | 61000 | 0.5127 | - | |
|
| 2.4083 | 61500 | 0.5291 | - | |
|
| 2.4278 | 62000 | 0.5122 | - | |
|
| 2.4474 | 62500 | 0.5196 | - | |
|
| 2.4670 | 63000 | 0.5159 | - | |
|
| 2.4866 | 63500 | 0.5207 | - | |
|
| 2.5062 | 64000 | 0.5157 | - | |
|
| 2.5257 | 64500 | 0.5094 | - | |
|
| 2.5453 | 65000 | 0.5283 | 0.9937 | |
|
| 2.5649 | 65500 | 0.5256 | - | |
|
| 2.5845 | 66000 | 0.524 | - | |
|
| 2.6041 | 66500 | 0.5324 | - | |
|
| 2.6236 | 67000 | 0.5132 | - | |
|
| 2.6432 | 67500 | 0.5203 | - | |
|
| 2.6628 | 68000 | 0.5224 | - | |
|
| 2.6824 | 68500 | 0.5255 | - | |
|
| 2.7020 | 69000 | 0.5132 | - | |
|
| 2.7215 | 69500 | 0.525 | - | |
|
| 2.7411 | 70000 | 0.5257 | 0.9936 | |
|
| 2.7607 | 70500 | 0.5206 | - | |
|
| 2.7803 | 71000 | 0.514 | - | |
|
| 2.7999 | 71500 | 0.5175 | - | |
|
| 2.8194 | 72000 | 0.5245 | - | |
|
| 2.8390 | 72500 | 0.5144 | - | |
|
| 2.8586 | 73000 | 0.5246 | - | |
|
| 2.8782 | 73500 | 0.5227 | - | |
|
| 2.8978 | 74000 | 0.5199 | - | |
|
| 2.9173 | 74500 | 0.5216 | - | |
|
| 2.9369 | 75000 | 0.5253 | 0.9936 | |
|
| 2.9565 | 75500 | 0.5303 | - | |
|
| 2.9761 | 76000 | 0.5148 | - | |
|
| 2.9957 | 76500 | 0.5248 | - | |
|
| 3.0 | 76611 | - | 0.9936 | |
|
|
|
</details> |
|
|
|
### Framework Versions |
|
- Python: 3.10.12 |
|
- Sentence Transformers: 3.4.1 |
|
- Transformers: 4.49.0 |
|
- PyTorch: 2.6.0+cu124 |
|
- Accelerate: 1.4.0 |
|
- Datasets: 3.3.1 |
|
- Tokenizers: 0.21.0 |
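
To approximately reproduce this environment, the versions above can be pinned directly (a sketch; the exact CUDA build of PyTorch may differ by platform):

```bash
pip install sentence-transformers==3.4.1 transformers==4.49.0 torch==2.6.0 \
    accelerate==1.4.0 datasets==3.3.1 tokenizers==0.21.0
```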
|
|
|
## Citation |
|
|
|
### BibTeX |
|
|
|
#### Sentence Transformers |
|
```bibtex |
|
@inproceedings{reimers-2019-sentence-bert, |
|
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks", |
|
author = "Reimers, Nils and Gurevych, Iryna", |
|
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing", |
|
month = "11", |
|
year = "2019", |
|
publisher = "Association for Computational Linguistics", |
|
url = "https://arxiv.org/abs/1908.10084", |
|
} |
|
``` |
|
|
|
#### MultipleNegativesRankingLoss |
|
```bibtex |
|
@misc{henderson2017efficient, |
|
title={Efficient Natural Language Response Suggestion for Smart Reply}, |
|
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil}, |
|
year={2017}, |
|
eprint={1705.00652}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.CL} |
|
} |
|
``` |
|
|
|
<!-- |
|
## Glossary |
|
|
|
*Clearly define terms in order to be accessible across audiences.* |
|
--> |
|
|
|
<!-- |
|
## Model Card Authors |
|
|
|
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.* |
|
--> |
|
|
|
<!-- |
|
## Model Card Contact |
|
|
|
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.* |
|
--> |