SentenceTransformer

This is a sentence-transformers model. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
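
As a quick sanity check, the loaded model's sequence length, embedding dimensionality, and pooling mode can be read back programmatically. This is a minimal sketch assuming only the model id from the Usage section below.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-nss-v_1_0_3")

# These values should match the description above
print(model.max_seq_length)                      # 512
print(model.get_sentence_embedding_dimension())  # 768
print(model[1].get_pooling_mode_str())           # "cls"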

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-nss-v_1_0_3")
# Run inference
sentences = [
    '科目:ユニット及びその他。名称:HWC荷物棚。',
    '科目:コンクリート。名称:地上部暑中コンクリート。',
    '科目:コンクリート。名称:普通コンクリート。摘要:JIS A5308   FC=36       S18粗骨材20 高性能AE減水剤。備考:刊コンクリート   2。',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
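
The similarity matrix above compares every sentence with every other one. For a simple semantic-search lookup, a query can instead be scored against a small corpus; the following sketch reuses strings from the example above purely as illustration.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-nss-v_1_0_3")

# Illustrative corpus and query (construction line items from the example above)
corpus = [
    '科目:ユニット及びその他。名称:HWC荷物棚。',
    '科目:コンクリート。名称:普通コンクリート。摘要:JIS A5308   FC=36       S18粗骨材20 高性能AE減水剤。備考:刊コンクリート   2。',
]
query = '科目:コンクリート。名称:地上部暑中コンクリート。'

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# Cosine similarity between the query and every corpus entry
scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, len(corpus)]
best = int(scores[0].argmax())
print(corpus[best], float(scores[0][best]))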

Training Details

Training Dataset

Unnamed Dataset

  • Size: 11,961 training samples
  • Columns: sentence and label
  • Approximate statistics based on the first 1000 samples:
    sentence (string):
      • min: 11 tokens
      • mean: 18.2 tokens
      • max: 54 tokens
    label (int) distribution:
    • 0: ~0.30%
    • 1: ~0.30%
    • 2: ~0.30%
    • 3: ~0.30%
    • 4: ~0.30%
    • 5: ~1.10%
    • 6: ~0.30%
    • 7: ~0.30%
    • 8: ~0.30%
    • 9: ~0.30%
    • 10: ~0.30%
    • 11: ~0.30%
    • 12: ~0.30%
    • 13: ~0.30%
    • 14: ~0.30%
    • 15: ~0.30%
    • 16: ~0.40%
    • 17: ~0.30%
    • 18: ~0.30%
    • 19: ~0.30%
    • 20: ~0.90%
    • 21: ~0.30%
    • 22: ~0.40%
    • 23: ~0.30%
    • 24: ~1.10%
    • 25: ~0.30%
    • 26: ~0.30%
    • 27: ~0.30%
    • 28: ~0.30%
    • 29: ~0.30%
    • 30: ~0.30%
    • 31: ~0.30%
    • 32: ~0.30%
    • 33: ~0.30%
    • 34: ~0.30%
    • 35: ~0.30%
    • 36: ~0.30%
    • 37: ~0.30%
    • 38: ~0.30%
    • 39: ~0.30%
    • 40: ~0.30%
    • 41: ~0.30%
    • 42: ~0.40%
    • 43: ~0.30%
    • 44: ~0.30%
    • 45: ~0.30%
    • 46: ~0.60%
    • 47: ~0.70%
    • 48: ~0.30%
    • 49: ~0.30%
    • 50: ~0.30%
    • 51: ~0.30%
    • 52: ~0.30%
    • 53: ~0.30%
    • 54: ~0.30%
    • 55: ~0.30%
    • 56: ~0.30%
    • 57: ~0.30%
    • 58: ~0.30%
    • 59: ~0.30%
    • 60: ~0.30%
    • 61: ~0.50%
    • 62: ~0.30%
    • 63: ~0.30%
    • 64: ~0.30%
    • 65: ~0.30%
    • 66: ~0.30%
    • 67: ~0.30%
    • 68: ~0.30%
    • 69: ~0.30%
    • 70: ~0.30%
    • 71: ~0.30%
    • 72: ~0.30%
    • 73: ~0.30%
    • 74: ~0.30%
    • 75: ~0.30%
    • 76: ~0.30%
    • 77: ~0.80%
    • 78: ~0.60%
    • 79: ~0.30%
    • 80: ~0.30%
    • 81: ~0.30%
    • 82: ~0.30%
    • 83: ~0.30%
    • 84: ~0.30%
    • 85: ~0.30%
    • 86: ~0.50%
    • 87: ~0.30%
    • 88: ~0.30%
    • 89: ~0.30%
    • 90: ~0.30%
    • 91: ~0.80%
    • 92: ~0.60%
    • 93: ~0.50%
    • 94: ~0.30%
    • 95: ~0.30%
    • 96: ~16.50%
    • 97: ~0.30%
    • 98: ~0.30%
    • 99: ~0.30%
    • 100: ~0.30%
    • 101: ~0.30%
    • 102: ~0.30%
    • 103: ~0.30%
    • 104: ~0.30%
    • 105: ~0.50%
    • 106: ~0.30%
    • 107: ~0.30%
    • 108: ~0.30%
    • 109: ~0.30%
    • 110: ~0.30%
    • 111: ~0.30%
    • 112: ~0.30%
    • 113: ~0.30%
    • 114: ~0.70%
    • 115: ~0.30%
    • 116: ~0.30%
    • 117: ~0.30%
    • 118: ~0.40%
    • 119: ~2.10%
    • 120: ~2.10%
    • 121: ~0.30%
    • 122: ~0.30%
    • 123: ~0.50%
    • 124: ~0.50%
    • 125: ~0.50%
    • 126: ~0.40%
    • 127: ~0.30%
    • 128: ~0.30%
    • 129: ~0.30%
    • 130: ~0.80%
    • 131: ~0.30%
    • 132: ~0.30%
    • 133: ~0.30%
    • 134: ~0.30%
    • 135: ~0.30%
    • 136: ~0.30%
    • 137: ~0.30%
    • 138: ~0.30%
    • 139: ~0.30%
    • 140: ~0.30%
    • 141: ~0.30%
    • 142: ~0.30%
    • 143: ~0.50%
    • 144: ~0.30%
    • 145: ~0.40%
    • 146: ~0.30%
    • 147: ~0.30%
    • 148: ~0.30%
    • 149: ~0.30%
    • 150: ~0.30%
    • 151: ~0.30%
    • 152: ~0.30%
    • 153: ~0.30%
    • 154: ~0.30%
    • 155: ~0.30%
    • 156: ~0.30%
    • 157: ~0.40%
    • 158: ~0.30%
    • 159: ~0.30%
    • 160: ~0.30%
    • 161: ~0.30%
    • 162: ~0.30%
    • 163: ~0.30%
    • 164: ~0.70%
    • 165: ~0.30%
    • 166: ~0.30%
    • 167: ~0.30%
    • 168: ~1.30%
    • 169: ~0.30%
    • 170: ~0.40%
    • 171: ~0.30%
    • 172: ~0.30%
    • 173: ~0.30%
    • 174: ~1.50%
    • 175: ~0.30%
    • 176: ~0.30%
    • 177: ~0.30%
    • 178: ~0.30%
    • 179: ~0.30%
    • 180: ~0.30%
    • 181: ~0.30%
    • 182: ~1.60%
    • 183: ~0.30%
    • 184: ~0.30%
    • 185: ~7.20%
    • 186: ~0.30%
    • 187: ~1.00%
    • 188: ~0.30%
    • 189: ~0.30%
    • 190: ~0.30%
    • 191: ~1.80%
    • 192: ~0.30%
    • 193: ~0.50%
    • 194: ~0.70%
    • 195: ~0.30%
  • Samples:
    sentence | label
    科目:コンクリート。名称:免震基礎天端グラウト注入。 | 0
    科目:コンクリート。名称:免震基礎天端グラウト注入。 | 0
    科目:コンクリート。名称:免震基礎天端グラウト注入。 | 0
  • Loss: sentence_transformer_lib.custom_batch_all_trip_loss.CustomBatchAllTripletLoss

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 200
  • warmup_ratio: 0.15
  • fp16: True
  • batch_sampler: group_by_label

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 200
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.15
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: group_by_label
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
2.3333 50 0.0589
4.6667 100 0.0668
7.125 150 0.0677
9.4583 200 0.0655
11.7917 250 0.062
14.25 300 0.0601
16.5833 350 0.0604
19.0417 400 0.0602
21.375 450 0.0546
23.7083 500 0.0575
26.1667 550 0.0569
28.5 600 0.0533
30.8333 650 0.0527
33.2917 700 0.0518
35.625 750 0.0487
38.0833 800 0.0514
40.4167 850 0.0469
42.75 900 0.0464
45.2083 950 0.0481
47.5417 1000 0.0502
49.875 1050 0.0511
52.3333 1100 0.0449
54.6667 1150 0.0439
57.125 1200 0.0443
59.4583 1250 0.0445
61.7917 1300 0.0455
64.25 1350 0.0417
66.5833 1400 0.0397
69.0417 1450 0.0392
71.375 1500 0.0411
73.7083 1550 0.0375
76.1667 1600 0.0444
78.5 1650 0.0353
80.8333 1700 0.0402
83.2917 1750 0.0353
85.625 1800 0.0354
88.0833 1850 0.0347
90.4167 1900 0.0368
92.75 1950 0.0353
95.2083 2000 0.0374
97.5417 2050 0.0375
99.875 2100 0.0324
1.7576 50 0.0365
3.7576 100 0.0372
5.7576 150 0.0392
7.7576 200 0.0392
9.7576 250 0.0386
11.7576 300 0.0402
13.7576 350 0.0342
15.7576 400 0.037
17.7576 450 0.0355
19.7576 500 0.0341
21.7576 550 0.0354
23.7576 600 0.0322
25.7576 650 0.0361
27.7576 700 0.0316
29.7576 750 0.0338
31.7576 800 0.0311
33.7576 850 0.0288
35.7576 900 0.0311
37.7576 950 0.0307
39.7576 1000 0.0288
41.7576 1050 0.0324
43.7576 1100 0.0276
45.7576 1150 0.0304
47.7576 1200 0.0267
49.7576 1250 0.0272
51.7576 1300 0.0269
53.7576 1350 0.0264
55.7576 1400 0.0324
57.7576 1450 0.0278
59.7576 1500 0.0315
61.7576 1550 0.0285
63.7576 1600 0.0241
65.7576 1650 0.0288
67.7576 1700 0.0263
69.7576 1750 0.0295
71.7576 1800 0.0238
73.7576 1850 0.0214
75.7576 1900 0.0281
77.7576 1950 0.0269
79.7576 2000 0.0268
81.7576 2050 0.0242
83.7576 2100 0.0226
85.7576 2150 0.0249
87.7576 2200 0.0254
89.7576 2250 0.0226
91.7576 2300 0.0181
93.7576 2350 0.019
95.7576 2400 0.0207
97.7576 2450 0.0205
99.7576 2500 0.0241
101.7576 2550 0.0219
103.7576 2600 0.0237
105.7576 2650 0.0194
107.7576 2700 0.0184
109.7576 2750 0.0206
111.7576 2800 0.0189
113.7576 2850 0.0216
115.7576 2900 0.0234
117.7576 2950 0.0192
119.7576 3000 0.0193
121.7576 3050 0.0211
123.7576 3100 0.0161
125.7576 3150 0.022
127.7576 3200 0.0176
129.7576 3250 0.0227
131.7576 3300 0.0224
133.7576 3350 0.0172
135.7576 3400 0.0168
137.7576 3450 0.0165
139.7576 3500 0.016
141.7576 3550 0.0143
143.7576 3600 0.0165
145.7576 3650 0.0202
147.7576 3700 0.0118
149.7576 3750 0.0163
151.7576 3800 0.0188
153.7576 3850 0.0137
155.7576 3900 0.0172
157.7576 3950 0.0175
159.7576 4000 0.0204
161.7576 4050 0.0175
163.7576 4100 0.0169
165.7576 4150 0.0184
167.7576 4200 0.0176
169.7576 4250 0.0102
171.7576 4300 0.014
173.7576 4350 0.0164
175.7576 4400 0.0203
177.7576 4450 0.0099
179.7576 4500 0.0143
181.7576 4550 0.0182
183.7576 4600 0.009
185.7576 4650 0.0157
187.7576 4700 0.015
189.7576 4750 0.0168
191.7576 4800 0.0172
193.7576 4850 0.0154
195.7576 4900 0.0162
197.7576 4950 0.0143
199.7576 5000 0.0156

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 3.4.1
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1
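
To approximate this environment, the listed package versions can be pinned at install time (PyTorch 2.6.0 was built against CUDA 12.4 here; install the wheel matching your platform):

pip install sentence-transformers==3.4.1 transformers==4.51.3 torch==2.6.0 accelerate==1.5.2 datasets==3.5.0 tokenizers==0.21.1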

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CustomBatchAllTripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}