metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:16547
  - loss:CustomBatchAllTripletLoss
widget:
  - source_sentence: 科目:ユニット及びその他。名称:突出サイン。
    sentences:
      - 科目:ユニット及びその他。名称:カウンター上ガラス台。
      - 科目:ユニット及びその他。名称:シャワーユニット枠。
      - 科目:ユニット及びその他。名称:エントランスサイン。
  - source_sentence: 科目:タイル。名称:階段踏面タイル。
    sentences:
      - 科目:コンクリート。名称:基礎部コンクリート。摘要:FC42N/mm2 スランプ21高性能AE減水剤。備考:代価表    0056
      - 科目:ユニット及びその他。名称:配膳棚。
      - 科目:コンクリート。名称:コンクリート打設。
  - source_sentence: 科目:ユニット及びその他。名称:オーバーフロー管。
    sentences:
      - 科目:ユニット及びその他。名称:自動支払機サイン。
      - 科目:ユニット及びその他。名称:SPボックス。
      - 科目:コンクリート。名称:捨てコンクリート。
  - source_sentence: 科目:ユニット及びその他。名称:執務室#-#規格品カウンター。
    sentences:
      - 科目:ユニット及びその他。名称:床コンクリート平板デッキ。
      - 科目:ユニット及びその他。名称:#階守衛室カウンター。
      - 科目:ユニット及びその他。名称:コンクリート舗装。
  - source_sentence: 科目:ユニット及びその他。名称:#FHCU#床室カウンター。
    sentences:
      - 科目:ユニット及びその他。名称:室名サイン。
      - 科目:ユニット及びその他。名称:#階数表示(階段室内・踊り場)。
      - 科目:ユニット及びその他。名称:Co-#ピクトサイン。
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer

This is a sentence-transformers model. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: https://www.sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
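
The printed stack above can be reproduced by hand if you want to verify the CLS-token pooling setup. A minimal sketch, assuming the base checkpoint is cl-nagoya/sup-simcse-ja-base (inferred from the model name, not stated in this card):

from sentence_transformers import SentenceTransformer, models

# Module (0): BERT encoder, truncating inputs at 512 tokens.
# The checkpoint name below is an assumption; substitute the real base model.
word_embedding_model = models.Transformer("cl-nagoya/sup-simcse-ja-base", max_seq_length=512)

# Module (1): keep only the [CLS] token embedding (pooling_mode_cls_token=True above).
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768
    pooling_mode="cls",
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
print(model)  # should print the same two-module architecture as above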

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-nss-v1_0_2")
# Run inference
sentences = [
    '科目:ユニット及びその他。名称:#FHCU#床室カウンター。',
    '科目:ユニット及びその他。名称:#階数表示(階段室内・踊り場)。',
    '科目:ユニット及びその他。名称:Co-#ピクトサイン。',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
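
The introduction mentions semantic search as a target use case; below is a minimal retrieval sketch built on the same API (the corpus lines reuse the widget examples above, and the query is illustrative):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-nss-v1_0_2")

# Tiny corpus reusing the widget examples above
corpus = [
    '科目:ユニット及びその他。名称:室名サイン。',
    '科目:ユニット及びその他。名称:シャワーユニット枠。',
    '科目:コンクリート。名称:捨てコンクリート。',
]
query = '科目:ユニット及びその他。名称:突出サイン。'

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode(query)

# model.similarity uses the model's similarity function (cosine similarity here)
scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, 3]
best = int(scores.argmax())
print(corpus[best], float(scores[0, best]))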

Training Details

Training Dataset

Unnamed Dataset

  • Size: 16,547 training samples
  • Columns: sentence and label
  • Approximate statistics based on the first 1000 samples:
    • sentence: type string; min: 11 tokens, mean: 18.77 tokens, max: 72 tokens
    • label: type int; class distribution:
    • 0: ~0.30%
    • 1: ~0.30%
    • 2: ~0.30%
    • 3: ~0.30%
    • 4: ~0.30%
    • 5: ~3.40%
    • 6: ~0.30%
    • 7: ~0.30%
    • 8: ~0.30%
    • 9: ~0.30%
    • 10: ~0.30%
    • 11: ~0.30%
    • 12: ~0.40%
    • 13: ~0.30%
    • 14: ~0.30%
    • 15: ~0.40%
    • 16: ~0.30%
    • 17: ~0.30%
    • 18: ~0.30%
    • 19: ~0.90%
    • 20: ~0.30%
    • 21: ~0.30%
    • 22: ~1.10%
    • 23: ~0.30%
    • 24: ~0.30%
    • 25: ~0.30%
    • 26: ~0.30%
    • 27: ~0.30%
    • 28: ~0.30%
    • 29: ~0.30%
    • 30: ~0.30%
    • 31: ~0.30%
    • 32: ~0.30%
    • 33: ~0.30%
    • 34: ~0.30%
    • 35: ~0.30%
    • 36: ~0.30%
    • 37: ~0.30%
    • 38: ~0.30%
    • 39: ~0.30%
    • 40: ~0.40%
    • 41: ~0.30%
    • 42: ~0.30%
    • 43: ~0.30%
    • 44: ~0.60%
    • 45: ~0.70%
    • 46: ~0.30%
    • 47: ~0.30%
    • 48: ~0.30%
    • 49: ~0.30%
    • 50: ~0.30%
    • 51: ~0.30%
    • 52: ~0.30%
    • 53: ~0.30%
    • 54: ~0.30%
    • 55: ~0.30%
    • 56: ~0.30%
    • 57: ~0.80%
    • 58: ~0.30%
    • 59: ~0.30%
    • 60: ~0.60%
    • 61: ~0.30%
    • 62: ~0.30%
    • 63: ~0.30%
    • 64: ~0.50%
    • 65: ~0.30%
    • 66: ~0.30%
    • 67: ~0.30%
    • 68: ~0.30%
    • 69: ~0.50%
    • 70: ~0.60%
    • 71: ~0.30%
    • 72: ~0.30%
    • 73: ~0.30%
    • 74: ~0.30%
    • 75: ~0.30%
    • 76: ~0.30%
    • 77: ~0.30%
    • 78: ~0.50%
    • 79: ~0.30%
    • 80: ~0.30%
    • 81: ~0.30%
    • 82: ~0.30%
    • 83: ~0.80%
    • 84: ~0.60%
    • 85: ~0.50%
    • 86: ~0.30%
    • 87: ~0.30%
    • 88: ~16.50%
    • 89: ~0.30%
    • 90: ~0.30%
    • 91: ~0.30%
    • 92: ~0.30%
    • 93: ~0.30%
    • 94: ~0.30%
    • 95: ~0.30%
    • 96: ~0.30%
    • 97: ~0.50%
    • 98: ~0.30%
    • 99: ~0.30%
    • 100: ~0.30%
    • 101: ~0.30%
    • 102: ~0.30%
    • 103: ~0.30%
    • 104: ~0.30%
    • 105: ~0.70%
    • 106: ~0.70%
    • 107: ~0.30%
    • 108: ~3.20%
    • 109: ~0.30%
    • 110: ~0.40%
    • 111: ~2.30%
    • 112: ~0.30%
    • 113: ~0.30%
    • 114: ~0.50%
    • 115: ~0.50%
    • 116: ~0.50%
    • 117: ~0.40%
    • 118: ~0.30%
    • 119: ~0.30%
    • 120: ~0.30%
    • 121: ~0.80%
    • 122: ~0.30%
    • 123: ~0.30%
    • 124: ~0.30%
    • 125: ~0.30%
    • 126: ~0.30%
    • 127: ~0.30%
    • 128: ~0.30%
    • 129: ~0.30%
    • 130: ~0.50%
    • 131: ~0.30%
    • 132: ~0.40%
    • 133: ~0.30%
    • 134: ~0.30%
    • 135: ~0.30%
    • 136: ~0.30%
    • 137: ~0.30%
    • 138: ~0.30%
    • 139: ~0.30%
    • 140: ~0.30%
    • 141: ~0.30%
    • 142: ~0.30%
    • 143: ~0.40%
    • 144: ~0.30%
    • 145: ~0.30%
    • 146: ~0.30%
    • 147: ~0.30%
    • 148: ~0.30%
    • 149: ~0.30%
    • 150: ~0.70%
    • 151: ~0.30%
    • 152: ~0.30%
    • 153: ~0.30%
    • 154: ~1.30%
    • 155: ~0.30%
    • 156: ~0.40%
    • 157: ~0.30%
    • 158: ~0.30%
    • 159: ~0.30%
    • 160: ~1.50%
    • 161: ~0.30%
    • 162: ~0.30%
    • 163: ~0.30%
    • 164: ~0.30%
    • 165: ~0.30%
    • 166: ~0.30%
    • 167: ~0.30%
    • 168: ~1.50%
    • 169: ~0.30%
    • 170: ~0.30%
    • 171: ~7.20%
    • 172: ~0.30%
    • 173: ~1.00%
    • 174: ~0.30%
    • 175: ~0.30%
    • 176: ~0.30%
    • 177: ~1.80%
    • 178: ~0.30%
    • 179: ~0.50%
    • 180: ~0.70%
    • 181: ~0.10%
  • Samples (sentence → label):
    • 科目:コンクリート。名称:免震基礎天端グラウト注入。 → 0
    • 科目:コンクリート。名称:免震基礎天端グラウト注入。 → 0
    • 科目:コンクリート。名称:免震基礎天端グラウト注入。 → 0
  • Loss: sentence_transformer_lib.custom_batch_all_trip_loss.CustomBatchAllTripletLoss
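
The CustomBatchAllTripletLoss implementation itself is not included with this card. As a rough public stand-in, the stock BatchAllTripletLoss in sentence_transformers consumes exactly the (sentence, label) columns shown above; a minimal sketch under that assumption:

from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import BatchAllTripletLoss

# Assumed base checkpoint, inferred from the model name
model = SentenceTransformer("cl-nagoya/sup-simcse-ja-base")

# Same column layout as the samples above: one sentence column, one integer label column
train_dataset = Dataset.from_dict({
    "sentence": [
        "科目:コンクリート。名称:免震基礎天端グラウト注入。",  # sample from the table above (label 0)
        "科目:コンクリート。名称:コンクリート打設。",          # widget example; its label here is illustrative
    ],
    "label": [0, 1],
})

# Batch-all triplet loss builds every valid (anchor, positive, negative) triplet within
# a batch, which is why training uses the group_by_label batch sampler (see below).
loss = BatchAllTripletLoss(model)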

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 200
  • warmup_ratio: 0.15
  • fp16: True
  • batch_sampler: group_by_label

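For reference, the non-default values above translate to SentenceTransformerTrainingArguments roughly as follows; a hedged sketch, where output_dir is a placeholder and model, train_dataset, and loss refer to the loss sketch above:

from sentence_transformers import (
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",                          # placeholder
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    learning_rate=1e-5,
    weight_decay=0.01,
    num_train_epochs=200,
    warmup_ratio=0.15,
    fp16=True,
    batch_sampler=BatchSamplers.GROUP_BY_LABEL,   # keeps same-label samples in one batch
)

trainer = SentenceTransformerTrainer(
    model=model,                  # model, train_dataset, loss from the sketch above
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
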
All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 200
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.15
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: group_by_label
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
1.7576 50 0.0447
3.7576 100 0.048
5.7576 150 0.0472
7.7576 200 0.0505
9.7576 250 0.0547
11.7576 300 0.0548
13.7576 350 0.0527
15.7576 400 0.0522
17.7576 450 0.0496
19.7576 500 0.0506
21.7576 550 0.048
23.7576 600 0.0508
25.7576 650 0.0499
27.7576 700 0.0474
29.7576 750 0.0467
31.7576 800 0.0483
33.7576 850 0.0438
35.7576 900 0.0457
37.7576 950 0.0445
39.7576 1000 0.0452
41.7576 1050 0.046
43.7576 1100 0.0433
45.7576 1150 0.0419
47.7576 1200 0.0407
49.7576 1250 0.0397
51.7576 1300 0.043
53.7576 1350 0.0393
55.7576 1400 0.0411
57.7576 1450 0.0434
59.7576 1500 0.0446
61.7576 1550 0.0396
63.7576 1600 0.0375
65.7576 1650 0.0413
67.7576 1700 0.0398
69.7576 1750 0.0382
71.7576 1800 0.0346
73.7576 1850 0.0388
75.7576 1900 0.0347
77.7576 1950 0.0349
79.7576 2000 0.0402
81.7576 2050 0.039
83.7576 2100 0.0343
85.7576 2150 0.0465
87.7576 2200 0.033
89.7576 2250 0.0385
91.7576 2300 0.0305
93.7576 2350 0.0367
95.7576 2400 0.0377
97.7576 2450 0.0322
99.7576 2500 0.0354
101.7576 2550 0.0332
103.7576 2600 0.0365
105.7576 2650 0.0357
107.7576 2700 0.0301
109.7576 2750 0.0323
111.7576 2800 0.0328
113.7576 2850 0.0339
115.7576 2900 0.0379
117.7576 2950 0.0334
119.7576 3000 0.0338
121.7576 3050 0.0328
123.7576 3100 0.0281
125.7576 3150 0.0316
127.7576 3200 0.0387
129.7576 3250 0.0327
131.7576 3300 0.026
133.7576 3350 0.0247
135.7576 3400 0.0319
137.7576 3450 0.0299
139.7576 3500 0.0252
141.7576 3550 0.0265
143.7576 3600 0.0244
145.7576 3650 0.0317
147.7576 3700 0.0291
149.7576 3750 0.03
151.7576 3800 0.0299
153.7576 3850 0.0303
155.7576 3900 0.0296
157.7576 3950 0.0303
159.7576 4000 0.0282
161.7576 4050 0.0301
163.7576 4100 0.027
165.7576 4150 0.0259
167.7576 4200 0.0294
169.7576 4250 0.0267
171.7576 4300 0.0303
173.7576 4350 0.0199
175.7576 4400 0.0253
177.7576 4450 0.0254
179.7576 4500 0.0202
181.7576 4550 0.0263
183.7576 4600 0.0302
185.7576 4650 0.0292
187.7576 4700 0.0264
189.7576 4750 0.0289
191.7576 4800 0.026
193.7576 4850 0.0285
195.7576 4900 0.0234
197.7576 4950 0.0297
199.7576 5000 0.0238

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 3.4.1
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CustomBatchAllTripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}