---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:72
  - loss:BatchAllTripletLoss
base_model: cl-nagoya/sup-simcse-ja-base
widget:
  - source_sentence: 打放し型枠(B種)
    sentences:
      - 埋込み(B種)(手間)
      - 埋込み(C種)(手間)
      - 盛土A種
  - source_sentence: 埋込み[B種]
    sentences:
      - 打放し型枠(A種)
      - 盛土(C種)(手間)
      - 埋戻し[C種]
  - source_sentence: 盛土[C種]
    sentences:
      - 埋込み[C種]
      - 盛土(A種)
      - 盛土[A種]
  - source_sentence: 埋戻し[A種]
    sentences:
      - 打放し型枠C種
      - 打放し型枠(C種)(損料・手間)
      - 盛土[B種]
  - source_sentence: 埋込み(B種)(損料・手間)
    sentences:
      - 埋戻し(A種)(損料)
      - 埋戻し(C種)(損料・手間)
      - 埋戻し(B種)(手間)
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

SentenceTransformer based on cl-nagoya/sup-simcse-ja-base

This is a sentence-transformers model finetuned from cl-nagoya/sup-simcse-ja-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: cl-nagoya/sup-simcse-ja-base
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: https://www.sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers
  • Hugging Face: https://huggingface.co/models?library=sentence-transformers

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
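
A quick way to check these properties after loading the model (a minimal sketch; it uses the repository id from the Usage section below):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-for-standard-name-v0_9_11")

print(model.max_seq_length)                      # 512
print(model.get_sentence_embedding_dimension())  # 768
print(model[1].pooling_mode_cls_token)           # True: CLS-token pooling, as listed above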

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-for-standard-name-v0_9_11")
# Run inference
sentences = [
    '埋込み(B種)(損料・手間)',
    '埋戻し(A種)(損料)',
    '埋戻し(B種)(手間)',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
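
This repository also ships an OpenVINO export of the model. If the optional OpenVINO dependencies are installed (for example via pip install sentence-transformers[openvino]), recent Sentence Transformers releases can load that export by passing the backend argument; a minimal sketch:

from sentence_transformers import SentenceTransformer

# Load the OpenVINO export for CPU-oriented inference (assumes the export is present
# in this repository and the OpenVINO extras are installed).
model = SentenceTransformer(
    "Detomo/cl-nagoya-sup-simcse-ja-for-standard-name-v0_9_11",
    backend="openvino",
)
embeddings = model.encode(["打放し型枠(B種)", "埋込み(B種)(手間)"])
print(embeddings.shape)
# (2, 768)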

Training Details

Training Dataset

Unnamed Dataset

  • Size: 72 training samples
  • Columns: sentence and label
  • Approximate statistics based on the first 72 samples:
    sentence (type: string): min 11 tokens, mean 16.21 tokens, max 27 tokens
    label (type: int), class distribution:
    • 0: ~0.50%
    • 1: ~0.50%
    • 2: ~0.50%
    • 3: ~0.50%
    • 4: ~0.50%
    • 5: ~0.50%
    • 6: ~0.50%
    • 7: ~0.50%
    • 8: ~0.50%
    • 9: ~0.50%
    • 10: ~0.50%
    • 11: ~0.50%
    • 12: ~0.50%
    • 13: ~0.50%
    • 14: ~0.50%
    • 15: ~0.50%
    • 16: ~0.50%
    • 17: ~0.50%
    • 18: ~0.50%
    • 19: ~0.50%
    • 20: ~0.50%
    • 21: ~0.50%
    • 22: ~0.50%
    • 23: ~0.50%
    • 24: ~0.50%
    • 25: ~0.50%
    • 26: ~0.50%
    • 27: ~0.50%
    • 28: ~0.50%
    • 29: ~0.50%
    • 30: ~0.50%
    • 31: ~0.50%
    • 32: ~0.50%
    • 33: ~0.50%
    • 34: ~0.50%
    • 35: ~0.50%
    • 36: ~0.50%
    • 37: ~0.50%
    • 38: ~0.50%
    • 39: ~0.50%
    • 40: ~0.50%
    • 41: ~0.50%
    • 42: ~0.50%
    • 43: ~0.50%
    • 44: ~0.60%
    • 45: ~0.70%
    • 46: ~0.50%
    • 47: ~0.50%
    • 48: ~0.50%
    • 49: ~0.50%
    • 50: ~0.50%
    • 51: ~0.50%
    • 52: ~0.50%
    • 53: ~0.50%
    • 54: ~0.50%
    • 55: ~0.50%
    • 56: ~0.50%
    • 57: ~0.80%
    • 58: ~0.50%
    • 59: ~0.50%
    • 60: ~0.50%
    • 61: ~0.50%
    • 62: ~0.50%
    • 63: ~0.50%
    • 64: ~0.50%
    • 65: ~0.50%
    • 66: ~0.50%
    • 67: ~0.50%
    • 68: ~0.50%
    • 69: ~0.50%
    • 70: ~0.50%
    • 71: ~0.50%
    • 72: ~0.50%
    • 73: ~0.50%
    • 74: ~0.50%
    • 75: ~0.50%
    • 76: ~0.50%
    • 77: ~0.50%
    • 78: ~0.50%
    • 79: ~0.50%
    • 80: ~0.50%
    • 81: ~0.50%
    • 82: ~0.50%
    • 83: ~0.50%
    • 84: ~0.50%
    • 85: ~0.50%
    • 86: ~0.50%
    • 87: ~0.50%
    • 88: ~0.60%
    • 89: ~0.50%
    • 90: ~0.50%
    • 91: ~0.50%
    • 92: ~0.50%
    • 93: ~0.50%
    • 94: ~0.50%
    • 95: ~1.20%
    • 96: ~1.70%
    • 97: ~3.90%
    • 98: ~0.50%
    • 99: ~0.50%
    • 100: ~0.50%
    • 101: ~0.60%
    • 102: ~0.50%
    • 103: ~0.50%
    • 104: ~0.50%
    • 105: ~0.50%
    • 106: ~0.50%
    • 107: ~1.20%
    • 108: ~0.50%
    • 109: ~0.50%
    • 110: ~0.50%
    • 111: ~0.50%
    • 112: ~0.50%
    • 113: ~0.50%
    • 114: ~0.50%
    • 115: ~0.50%
    • 116: ~0.50%
    • 117: ~0.50%
    • 118: ~0.50%
    • 119: ~0.50%
    • 120: ~0.50%
    • 121: ~0.50%
    • 122: ~0.50%
    • 123: ~0.50%
    • 124: ~0.50%
    • 125: ~0.50%
    • 126: ~0.50%
    • 127: ~0.50%
    • 128: ~0.50%
    • 129: ~0.50%
    • 130: ~0.50%
    • 131: ~0.50%
    • 132: ~0.50%
    • 133: ~0.50%
    • 134: ~0.50%
    • 135: ~0.50%
    • 136: ~0.50%
    • 137: ~0.50%
    • 138: ~0.50%
    • 139: ~0.50%
    • 140: ~0.50%
    • 141: ~0.50%
    • 142: ~0.50%
    • 143: ~0.50%
    • 144: ~0.50%
    • 145: ~0.50%
    • 146: ~0.70%
    • 147: ~0.50%
    • 148: ~3.10%
    • 149: ~0.50%
    • 150: ~2.30%
    • 151: ~0.50%
    • 152: ~0.50%
    • 153: ~0.50%
    • 154: ~0.50%
    • 155: ~0.50%
    • 156: ~0.50%
    • 157: ~0.50%
    • 158: ~0.50%
    • 159: ~0.50%
    • 160: ~0.50%
    • 161: ~0.50%
    • 162: ~0.50%
    • 163: ~0.50%
    • 164: ~0.50%
    • 165: ~0.50%
    • 166: ~0.50%
    • 167: ~0.50%
    • 168: ~0.50%
    • 169: ~0.50%
    • 170: ~0.50%
    • 171: ~0.50%
    • 172: ~0.50%
    • 173: ~0.50%
    • 174: ~0.50%
    • 175: ~0.50%
    • 176: ~0.50%
    • 177: ~0.10%
  • Samples (sentence → label):
    科目:コンクリート。名称:免震基礎天端グラウト注入。 → 0
    科目:コンクリート。名称:免震基礎天端グラウト注入。 → 0
    科目:コンクリート。名称:免震基礎天端グラウト注入。 → 0
  • Loss: BatchAllTripletLoss
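
BatchAllTripletLoss forms every valid (anchor, positive, negative) triplet inside a batch from these integer labels, which is why the dataset only needs a sentence column and a label column. A minimal sketch of how such a dataset and loss can be set up (the second class below is a hypothetical illustration, not taken from the training data); the matching training arguments are sketched after the hyperparameter list in the next section:

from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import BatchAllTripletLoss

model = SentenceTransformer("cl-nagoya/sup-simcse-ja-base")

# One sentence per row with an integer class label; rows sharing a label act as
# positives for each other, rows with different labels as negatives.
train_dataset = Dataset.from_dict({
    "sentence": [
        "科目:コンクリート。名称:免震基礎天端グラウト注入。",
        "科目:コンクリート。名称:免震基礎天端グラウト注入。",
        "科目:コンクリート。名称:免震基礎。",  # hypothetical second class for illustration
    ],
    "label": [0, 0, 1],
})

loss = BatchAllTripletLoss(model)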

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 250
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: group_by_label
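
A minimal sketch of how these non-default values map onto the Sentence Transformers training API, continuing the dataset/loss sketch from the Training Dataset section above (the output directory is a placeholder):

from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",                         # placeholder
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    learning_rate=1e-5,
    weight_decay=0.01,
    num_train_epochs=250,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.GROUP_BY_LABEL,  # keeps several examples per label in each batch
)

trainer = SentenceTransformerTrainer(
    model=model,                  # model, train_dataset and loss from the sketch above
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()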

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 250
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: group_by_label
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
10.0 10 1.6508
20.0 20 1.2554
30.0 30 0.8495
40.0 40 0.7182
50.0 50 0.6614
60.0 60 0.575
70.0 70 0.5027
80.0 80 0.32
90.0 90 0.1543
100.0 100 0.0102
110.0 110 0.012
120.0 120 0.1164
130.0 130 0.0
140.0 140 0.0
150.0 150 0.0
160.0 160 0.0157
170.0 170 0.0794
180.0 180 0.0
190.0 190 0.0
200.0 200 0.0141
210.0 210 0.0
220.0 220 0.0
230.0 230 0.1115
240.0 240 0.0
250.0 250 0.0
260.0 260 0.0
270.0 270 0.0
280.0 280 0.0
290.0 290 0.0
300.0 300 0.0
310.0 310 0.0
320.0 320 0.0
330.0 330 0.0
340.0 340 0.0
350.0 350 0.0
360.0 360 0.0197
370.0 370 0.0649
380.0 380 0.0
390.0 390 0.0
400.0 400 0.0
410.0 410 0.0
420.0 420 0.0
430.0 430 0.0
440.0 440 0.0
450.0 450 0.0
460.0 460 0.0
470.0 470 0.0
480.0 480 0.0
490.0 490 0.0
500.0 500 0.0
3.1842 100 0.6748
6.3684 200 0.5883
9.5526 300 0.5815
12.7368 400 0.5338
16.1053 500 0.5498
19.2895 600 0.5359
22.4737 700 0.5359
25.6579 800 0.4893
29.0263 900 0.4665
32.2105 1000 0.4205
35.3947 1100 0.4383
38.5789 1200 0.4552
41.7632 1300 0.4003
45.1316 1400 0.3816
48.3158 1500 0.3744
51.5 1600 0.3504
54.6842 1700 0.359
58.0526 1800 0.3019
61.2368 1900 0.3109
64.4211 2000 0.3151
67.6053 2100 0.3292
70.7895 2200 0.2813
74.1579 2300 0.2697
77.3421 2400 0.1975
80.5263 2500 0.2492
83.7105 2600 0.2608
87.0789 2700 0.2401
90.2632 2800 0.2265
93.4474 2900 0.2032
96.6316 3000 0.2368
99.8158 3100 0.2066
103.1842 3200 0.1558
106.3684 3300 0.2029
109.5526 3400 0.244
112.7368 3500 0.1894
116.1053 3600 0.193
119.2895 3700 0.1769
122.4737 3800 0.1821
125.6579 3900 0.0912
129.0263 4000 0.1834
132.2105 4100 0.1391
135.3947 4200 0.1718
138.5789 4300 0.1585
141.7632 4400 0.1829
145.1316 4500 0.1246
148.3158 4600 0.1327
151.5 4700 0.1396
154.6842 4800 0.1028
158.0526 4900 0.0907
161.2368 5000 0.1179
164.4211 5100 0.1496
167.6053 5200 0.1156
170.7895 5300 0.1148
174.1579 5400 0.1275
177.3421 5500 0.1354
180.5263 5600 0.1334
183.7105 5700 0.0874
187.0789 5800 0.0922
190.2632 5900 0.1109
193.4474 6000 0.0708
196.6316 6100 0.0943
199.8158 6200 0.1164
203.1842 6300 0.0785
206.3684 6400 0.0853
209.5526 6500 0.0674
212.7368 6600 0.1009
216.1053 6700 0.0846
219.2895 6800 0.078
222.4737 6900 0.0958
225.6579 7000 0.0811
229.0263 7100 0.0452
232.2105 7200 0.0705
235.3947 7300 0.0664
238.5789 7400 0.0501
241.7632 7500 0.0696
245.1316 7600 0.0736
248.3158 7700 0.08

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.50.2
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1
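
To reproduce this environment, the listed library versions can be pinned at install time (a sketch; nearby versions will usually also work):

pip install sentence-transformers==3.4.1 transformers==4.50.2 torch==2.6.0 accelerate==1.5.2 datasets==3.5.0 tokenizers==0.21.1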

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

BatchAllTripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}