SentenceTransformer based on sentence-transformers/all-MiniLM-L12-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L12-v2 on the pre-finetune dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L12-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Training Dataset: pre-finetune

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
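
The Pooling and Normalize modules listed above can be illustrated with a small NumPy sketch. This is not the library's implementation; it only mirrors the configuration shown (mean over tokens, then L2 normalization). The function name `mean_pool_and_normalize` and the random toy inputs are assumptions made for illustration.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Average token vectors over non-padding positions, then L2-normalize."""
    # Expand the mask to (batch, seq, 1) so it broadcasts over the hidden dimension.
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)      # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)      # guard against division by zero
    pooled = summed / counts                            # mean over real tokens only
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)         # unit-length sentence vectors

# Toy stand-in for the transformer output: 2 sequences, 4 positions, 384 dimensions.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 4, 384))
attention_mask = np.array([[1, 1, 1, 0],    # last position is padding
                           [1, 1, 0, 0]])   # last two positions are padding
sentence_embeddings = mean_pool_and_normalize(token_embeddings, attention_mask)
print(sentence_embeddings.shape)  # (2, 384)
```

Because the output vectors have unit length, a dot product between two of them equals their cosine similarity.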

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("suhwan3/mini-lm-finetuned-step1")
# Run inference
sentences = [
    "The SPDR S&P Health Care Services ETF (XHS) aims to mirror the performance of the S&P Health Care Services Select Industry Index by employing a sampling strategy, investing at least 80% of its assets in the index's securities. This index, part of the S&P Total Market Index, focuses on the U.S. health care services sector, including related industries like medical equipment, pharmaceuticals, and drug retailers. XHS offers unique exposure by equally weighting its holdings, which results in a tilt towards smaller companies and an underweighting of large managed health care firms. The index is rebalanced quarterly, ensuring a dynamic and diversified portfolio.",
    "Castle Biosciences, Inc., a commercial-stage diagnostics company, focuses to provide diagnostic and prognostic testing services for dermatological cancers. Its lead product is DecisionDx-Melanoma, a multi-gene expression profile (GEP) test to identify the risk of metastasis for patients diagnosed with invasive cutaneous melanoma. The company also offers DecisionDx-UM test, a proprietary GEP test that predicts the risk of metastasis for patients with uveal melanoma, a rare eye cancer; DecisionDx-SCC, a proprietary 40-gene expression profile test that uses an individual patient's tumor biology to predict individual risk of squamous cell carcinoma metastasis for patients with one or more risk factors; and DecisionDx DiffDx-Melanoma and myPath Melanoma, a proprietary GEP test to diagnose suspicious pigmented lesions. It offers test services through physicians and their patients. The company was founded in 2007 and is headquartered in Friendswood, Texas.",
    'Invesco Senior Income Trust is a closed ended fixed income mutual fund launched by Invesco Ltd. It is co-managed by Invesco Advisers, Inc., Invesco Asset Management Deutschland GmbH, Invesco Asset Management Limited, Invesco Asset Management (Japan) Limited, Invesco Australia Limited, Invesco Hong Kong Limited, Invesco Senior Secured Management, Inc., and Invesco Canada Ltd. The fund invests in the fixed income markets of the United States. It primarily invests in a portfolio of interests in floating or variable rate senior loans to corporations, partnerships, and other entities which operate in a variety of industries and geographical regions. The fund typically employs fundamental analysis with a bottom up stock picking approach to create its portfolio. It benchmarks the performance of its portfolio against the Credit Suisse Leveraged Loan Index. The fund was formerly known as Invesco Van Kampen Senior Income Trust and Van Kampen Senior Income Trust. Invesco Senior Income Trust was formed on June 23, 1998 and is domiciled in the United States.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
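
For semantic search, the embeddings can be ranked against a query vector. The sketch below assumes the embeddings are unit-length (as the Normalize module in the architecture suggests), so a dot product serves as cosine similarity. The helper `rank_by_similarity` and the 2-dimensional toy vectors are hypothetical stand-ins for real `model.encode` output.

```python
import numpy as np

def rank_by_similarity(query_embedding, corpus_embeddings, top_k=3):
    """Return (index, score) pairs for the top_k most similar corpus rows."""
    # For unit-length vectors, the dot product equals cosine similarity.
    scores = corpus_embeddings @ query_embedding
    order = np.argsort(-scores)[:top_k]          # highest score first
    return [(int(i), float(scores[i])) for i in order]

# Toy unit vectors standing in for encoded documents and a query.
corpus = np.array([[1.0, 0.0],
                   [0.6, 0.8],
                   [0.0, 1.0]])
query = np.array([0.8, 0.6])

for index, score in rank_by_similarity(query, corpus, top_k=2):
    print(index, round(score, 4))
```

With real data, `corpus` would be the array returned by `model.encode(documents)` and `query` a single row from `model.encode([question])`.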

Training Details

Training Dataset

pre-finetune

  • Dataset: pre-finetune at 5e8c10c
  • Size: 14,388 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
                anchor                  positive
    type        string                  string
    details     min: 120 tokens         min: 21 tokens
                mean: 127.97 tokens     mean: 122.36 tokens
                max: 128 tokens         max: 128 tokens
  • Samples:
    Sample 1
      anchor: The ETF Series Solutions AAM Transf (TRFM) employs a passive management strategy to track the Pence Transformers Index, focusing on U.S.-listed equities, including ADRs, that are set to benefit from significant disruptions in consumer behavior and technological innovation. The index is rules-based and modified equal-weighted, emphasizing companies with substantial R&D spending. It targets sectors like autonomous driving, electric vehicles, the digital economy, 5G, low-carbon technologies, and renewable energy. Eligible companies must have a market cap of at least $2 billion and meet analyst rating criteria. The portfolio is tiered by market cap and domicile, with 75% allocated to U.S. companies, and is reconstituted quarterly.
      positive: Nova Ltd. designs, develops, produces, and sells process control systems used in the manufacture of semiconductors in Israel, Taiwan, the United States, China, Korea, and internationally. Its product portfolio includes a set of metrology platforms for dimensional, films, and materials and chemical metrology measurements for process control for various semiconductor manufacturing process steps, including lithography, etch, chemical mechanical planarization, deposition, electrochemical plating, and advanced packaging. The company serves various sectors of the integrated circuit manufacturing industry, including logic, foundries, and memory manufacturers, as well as process equipment manufacturers. Nova Ltd. was formerly known as Nova Measuring Instruments Ltd. and changed its name to Nova Ltd. in July 2021. The company was incorporated in 1993 and is headquartered in Rehovot, Israel.
    Sample 2
      anchor: The U.S. Global Jets ETF (JETS) employs a passive management strategy to track the U.S. Global Jets Index, focusing on U.S. and international airline companies, including passenger airlines, aircraft manufacturers, and airport services. The fund is non-diversified and uses a tiered weighting scheme primarily based on market cap and passenger load. Approximately 70% of its portfolio is allocated to large-cap U.S. passenger airlines, with the top four companies receiving 10% each. The next five largest U.S. or Canadian airlines receive 4% each, while other companies meeting trading and liquidity criteria are weighted based on fundamental factors like cash flow return on capital and sales growth.
      positive: United Airlines Holdings, Inc., through its subsidiaries, provides air transportation services in North America, Asia, Europe, Africa, the Pacific, the Middle East, and Latin America. The company transports people and cargo through its mainline and regional fleets. It also offers catering, ground handling, training, and maintenance services for third parties. The company was formerly known as United Continental Holdings, Inc. and changed its name to United Airlines Holdings, Inc. in June 2019. United Airlines Holdings, Inc. was incorporated in 1968 and is headquartered in Chicago, Illinois.
    Sample 3
      anchor: The SPDR S&P Bank ETF (KBE) aims to deliver investment results that correspond to the total return performance of the S&P Banks Select Industry Index, which is part of the S&P Total Market Index tracking the broad U.S. equity market. KBE invests at least 80% of its total assets in securities within this index, focusing on the bank segment, including sub-industries like Asset Management & Custody Banks, Diversified Banks, Regional Banks, Other Diversified Financial Services, and Thrifts & Mortgage Finance. The fund employs an equal-weighted strategy, rebalancing quarterly to ensure equal emphasis on both large and small banking firms, thus providing diversified exposure across the banking sector. Additionally, KBE may hold equity securities outside the index, cash, and money market instruments to maintain liquidity and flexibility.
      positive: MGIC Investment Corporation, through its subsidiaries, provides private mortgage insurance, other mortgage credit risk management solutions, and ancillary services to lenders and government sponsored entities in the United States, Puerto Rico, and Guam. The company offers primary mortgage insurance that provides mortgage default protection on individual loans, as well as covers unpaid loan principal, delinquent interest, and various expenses associated with the default and subsequent foreclosure. It also provides contract underwriting services, as well as reinsurance. The company serves originators of residential mortgage loans, including savings institutions, commercial banks, mortgage brokers, credit unions, mortgage bankers, and other lenders. MGIC Investment Corporation was founded in 1957 and is headquartered in Milwaukee, Wisconsin.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
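
The loss configuration above (scale 20.0, cosine similarity) can be illustrated with a NumPy sketch of in-batch negatives ranking. This is a simplified illustration, not the library's implementation: each anchor's own positive is treated as the correct class, and every other positive in the batch acts as a negative. The function name `mnr_loss` and the toy inputs are assumptions.

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities with in-batch negatives."""
    # Normalize rows so the dot product below is a cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                           # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The matching positive for anchor i sits on the diagonal (column i).
    return float(-np.mean(np.diag(log_probs)))

# Toy batch: three mutually orthogonal anchors.
anchors = np.eye(3)
aligned = np.eye(3)              # each positive matches its anchor
shuffled = np.eye(3)[[1, 2, 0]]  # each positive matches a different anchor

print("aligned pairs: ", mnr_loss(anchors, aligned))
print("shuffled pairs:", mnr_loss(anchors, shuffled))
```

The aligned batch yields a loss near zero and the shuffled batch a much larger one, which is the behavior the ranking objective rewards.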
    

Evaluation Dataset

pre-finetune

  • Dataset: pre-finetune at 5e8c10c
  • Size: 3,597 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
                anchor                  positive
    type        string                  string
    details     min: 120 tokens         min: 22 tokens
                mean: 127.99 tokens     mean: 122.47 tokens
                max: 128 tokens         max: 128 tokens
  • Samples:
    Sample 1
      anchor: The Goldman Sachs Future Health Care ETF (GDOC) seeks long-term capital growth by investing at least 80% of its net assets in equity investments of U.S. and non-U.S. healthcare companies. This actively managed, non-diversified fund targets innovators and disruptors in the healthcare sector, focusing on key themes such as genomics, precision medicine, technology-enabled procedures, and digital healthcare. GDOC may invest in companies of any market capitalization and may use derivatives like futures and options to achieve its investment goals. The fund's adviser employs a fundamental investment process that may integrate ESG factors, utilizing company disclosures, third-party research, and engagement to inform decisions. The fund's thematic allocations can vary over time at the adviser's discretion.
      positive: Insulet Corporation develops, manufactures, and sells insulin delivery systems for people with insulin-dependent diabetes. It offers Omnipod System, a self-adhesive disposable tubeless Omnipod device that is worn on the body for up to three days at a time, as well as its wireless companion, the handheld personal diabetes manager. The company sells its products primarily through independent distributors and pharmacy channels, as well as directly in the United States, Canada, Europe, the Middle East, and Australia. Insulet Corporation was incorporated in 2000 and is headquartered in Acton, Massachusetts.
    Sample 2
      anchor: The J.P. Morgan Exchange-Traded Fund (JPRE) aims to provide high total investment return through capital appreciation and current income by investing at least 80% of its net assets in equity securities of real estate investment trusts (REITs), including both equity and mortgage REITs across various market capitalizations. As an actively managed, non-diversified fund, JPRE focuses on U.S. REITs with strong financials, operating revenues, and growth potential. The fund employs a disciplined investment process, evaluating securities based on their ability to generate long-term earnings and growth, while also considering ESG factors. On May 20, 2022, JPRE acquired the assets and liabilities of the JPMorgan Realty Income Fund, which had $2.2 billion in assets, enhancing its investment strategy and historical performance data.
      positive: Extra Space Storage Inc., headquartered in Salt Lake City, Utah, is a self-administered and self-managed REIT and a member of the S&P 500. As of September 30, 2020, the Company owned and/or operated 1,906 self-storage stores in 40 states, Washington, D.C. and Puerto Rico. The Company's stores comprise approximately 1.4 million units and approximately 147.5 million square feet of rentable space. The Company offers customers a wide selection of conveniently located and secure storage units across the country, including boat storage, RV storage and business storage. The Company is the second largest owner and/or operator of self-storage stores in the United States and is the largest self-storage management company in the United States.
    Sample 3
      anchor: The First Trust Indxx Metaverse ETF (ARVR) aims to replicate the performance of the Indxx Metaverse Index, investing at least 80% of its net assets in securities within the index. This non-diversified fund targets companies globally that are integral to the Metaverse, focusing on those generating at least 50% of their revenue from five key sub-themes: IP & Contents, Platforms, Payment, Optics & Display, and Semiconductor, Hardware & 5G. The portfolio, comprising 50 companies selected by market-cap, is weighted using revenue thresholds, favoring firms with higher Metaverse-related revenue. Stocks are equally weighted, capped at 2%, and adjusted for market-cap, with the index rebalanced quarterly and reconstituted semi-annually.
      positive: Adobe Inc. operates as a diversified software company worldwide. It operates through three segments: Digital Media, Digital Experience, and Publishing and Advertising. The Digital Media segment offers products, services, and solutions that enable individuals, teams, and enterprises to create, publish, and promote content; and Document Cloud, a unified cloud-based document services platform. Its flagship product is Creative Cloud, a subscription service that allows members to access its creative products. This segment serves content creators, workers, marketers, educators, enthusiasts, communicators, and consumers. The Digital Experience segment provides an integrated platform and set of applications and services that enable brands and businesses to create, manage, execute, measure, monetize, and optimize customer experiences from analytics to commerce. This segment serves marketers, advertisers, agencies, publishers, merchandisers, merchants, web analysts, data scientists, developers, ...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • num_train_epochs: 10
  • warmup_ratio: 0.1
  • bf16: True
  • dataloader_drop_last: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
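
The hyperparameters above imply a step count and learning-rate curve that can be worked out by hand. The sketch below assumes a single device (consistent with gradient_accumulation_steps: 1) and the usual warmup-then-linear-decay shape for lr_scheduler_type: linear; it is an illustration rather than the trainer's own code, and `lr_at` is a hypothetical helper.

```python
import math

# Values taken from this card.
train_samples = 14_388
batch_size = 8            # per_device_train_batch_size
epochs = 10               # num_train_epochs
peak_lr = 5e-05           # learning_rate
warmup_ratio = 0.1

# dataloader_drop_last=True discards the final partial batch of each epoch.
steps_per_epoch = train_samples // batch_size
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

def lr_at(step):
    """Learning rate at a given optimizer step (assumed schedule shape)."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)          # linear warmup
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

print(steps_per_epoch, total_steps, warmup_steps)
print(round(1800 / steps_per_epoch, 4))   # epoch value at step 1800
for step in (0, 900, warmup_steps, 3800, total_steps):
    print(step, lr_at(step))
```

The steps-per-epoch figure agrees with the Epoch column of the training logs, where step 1800 is reported at epoch 1.0011.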

Training Logs

Epoch Step Training Loss Validation Loss
0.0056 10 2.0017 -
0.0111 20 1.7604 -
0.0167 30 1.7855 -
0.0222 40 2.1169 -
0.0278 50 1.7254 -
0.0334 60 1.3081 -
0.0389 70 1.5951 -
0.0445 80 1.4423 -
0.0501 90 1.1902 -
0.0556 100 1.2449 1.2373
0.0612 110 1.3177 -
0.0667 120 1.2411 -
0.0723 130 1.266 -
0.0779 140 1.2949 -
0.0834 150 1.1601 -
0.0890 160 1.2164 -
0.0945 170 0.9354 -
0.1001 180 1.1337 -
0.1057 190 0.8352 -
0.1112 200 1.0118 1.0049
0.1168 210 0.8274 -
0.1224 220 1.1467 -
0.1279 230 1.0113 -
0.1335 240 0.9029 -
0.1390 250 0.7778 -
0.1446 260 0.7863 -
0.1502 270 0.8369 -
0.1557 280 0.8474 -
0.1613 290 0.8498 -
0.1669 300 0.8299 0.8631
0.1724 310 0.9025 -
0.1780 320 0.6665 -
0.1835 330 1.1485 -
0.1891 340 0.8733 -
0.1947 350 0.8992 -
0.2002 360 0.567 -
0.2058 370 0.9371 -
0.2113 380 0.8934 -
0.2169 390 1.0511 -
0.2225 400 0.6262 0.7888
0.2280 410 0.6581 -
0.2336 420 0.7694 -
0.2392 430 0.7046 -
0.2447 440 0.5984 -
0.2503 450 0.7362 -
0.2558 460 0.6819 -
0.2614 470 0.7147 -
0.2670 480 1.2227 -
0.2725 490 0.694 -
0.2781 500 0.7129 0.7650
0.2836 510 0.592 -
0.2892 520 0.7802 -
0.2948 530 0.6695 -
0.3003 540 0.8442 -
0.3059 550 0.9118 -
0.3115 560 0.8278 -
0.3170 570 0.7366 -
0.3226 580 0.889 -
0.3281 590 0.7323 -
0.3337 600 0.5478 0.7326
0.3393 610 0.5562 -
0.3448 620 0.8333 -
0.3504 630 0.6804 -
0.3560 640 0.68 -
0.3615 650 0.6592 -
0.3671 660 0.7572 -
0.3726 670 0.5261 -
0.3782 680 0.6703 -
0.3838 690 0.7719 -
0.3893 700 0.6809 0.7414
0.3949 710 0.8704 -
0.4004 720 0.5926 -
0.4060 730 0.8478 -
0.4116 740 0.6448 -
0.4171 750 0.8352 -
0.4227 760 0.6417 -
0.4283 770 0.6317 -
0.4338 780 0.8715 -
0.4394 790 0.6437 -
0.4449 800 0.5226 0.7210
0.4505 810 0.7438 -
0.4561 820 0.5888 -
0.4616 830 0.6922 -
0.4672 840 0.5851 -
0.4727 850 0.767 -
0.4783 860 0.7227 -
0.4839 870 0.7196 -
0.4894 880 0.5192 -
0.4950 890 0.7199 -
0.5006 900 0.7474 0.6889
0.5061 910 0.8965 -
0.5117 920 0.6767 -
0.5172 930 0.6318 -
0.5228 940 0.6522 -
0.5284 950 0.6574 -
0.5339 960 0.6544 -
0.5395 970 0.7488 -
0.5451 980 0.5972 -
0.5506 990 0.5109 -
0.5562 1000 0.5295 0.7493
0.5617 1010 0.9111 -
0.5673 1020 0.6716 -
0.5729 1030 0.6971 -
0.5784 1040 0.7311 -
0.5840 1050 0.676 -
0.5895 1060 0.6864 -
0.5951 1070 0.885 -
0.6007 1080 0.668 -
0.6062 1090 0.5427 -
0.6118 1100 0.5875 0.7083
0.6174 1110 0.8703 -
0.6229 1120 0.6143 -
0.6285 1130 0.6069 -
0.6340 1140 0.639 -
0.6396 1150 0.8214 -
0.6452 1160 0.638 -
0.6507 1170 0.692 -
0.6563 1180 0.5953 -
0.6618 1190 0.5384 -
0.6674 1200 0.7248 0.7398
0.6730 1210 0.7493 -
0.6785 1220 0.6966 -
0.6841 1230 0.564 -
0.6897 1240 0.6447 -
0.6952 1250 0.4488 -
0.7008 1260 0.7266 -
0.7063 1270 0.847 -
0.7119 1280 0.5734 -
0.7175 1290 0.5047 -
0.7230 1300 0.7196 0.7221
0.7286 1310 0.7561 -
0.7341 1320 0.5301 -
0.7397 1330 0.8898 -
0.7453 1340 0.9251 -
0.7508 1350 0.5438 -
0.7564 1360 0.7402 -
0.7620 1370 0.7043 -
0.7675 1380 0.7119 -
0.7731 1390 0.6493 -
0.7786 1400 0.6253 0.6853
0.7842 1410 0.7815 -
0.7898 1420 0.6936 -
0.7953 1430 0.5198 -
0.8009 1440 0.7672 -
0.8065 1450 0.5436 -
0.8120 1460 0.6117 -
0.8176 1470 0.7137 -
0.8231 1480 0.7257 -
0.8287 1490 0.9861 -
0.8343 1500 0.7558 0.6728
0.8398 1510 0.7658 -
0.8454 1520 0.6785 -
0.8509 1530 0.6592 -
0.8565 1540 0.5787 -
0.8621 1550 0.5519 -
0.8676 1560 0.5911 -
0.8732 1570 0.5285 -
0.8788 1580 0.8498 -
0.8843 1590 0.5782 -
0.8899 1600 0.7702 0.6698
0.8954 1610 0.6775 -
0.9010 1620 0.6656 -
0.9066 1630 0.8432 -
0.9121 1640 0.5653 -
0.9177 1650 0.9223 -
0.9232 1660 0.5962 -
0.9288 1670 0.8247 -
0.9344 1680 0.5816 -
0.9399 1690 0.4149 -
0.9455 1700 0.7022 0.7110
0.9511 1710 0.8407 -
0.9566 1720 0.6638 -
0.9622 1730 0.584 -
0.9677 1740 0.4661 -
0.9733 1750 0.8718 -
0.9789 1760 0.9301 -
0.9844 1770 0.6969 -
0.9900 1780 0.6779 -
0.9956 1790 0.5245 -
1.0011 1800 0.6074 0.7736
1.0067 1810 0.6787 -
1.0122 1820 0.7032 -
1.0178 1830 0.52 -
1.0234 1840 0.573 -
1.0289 1850 0.892 -
1.0345 1860 0.7932 -
1.0400 1870 0.5999 -
1.0456 1880 0.5743 -
1.0512 1890 0.7808 -
1.0567 1900 0.6154 0.7187
1.0623 1910 0.4507 -
1.0679 1920 0.7064 -
1.0734 1930 0.7717 -
1.0790 1940 0.6801 -
1.0845 1950 0.5516 -
1.0901 1960 0.5035 -
1.0957 1970 0.5313 -
1.1012 1980 0.8015 -
1.1068 1990 0.4896 -
1.1123 2000 0.6729 0.7362
1.1179 2010 0.4016 -
1.1235 2020 0.5297 -
1.1290 2030 0.7291 -
1.1346 2040 0.6016 -
1.1402 2050 0.7842 -
1.1457 2060 0.9177 -
1.1513 2070 0.8202 -
1.1568 2080 0.5088 -
1.1624 2090 0.5693 -
1.1680 2100 0.5345 0.7454
1.1735 2110 0.7902 -
1.1791 2120 0.6566 -
1.1846 2130 0.8788 -
1.1902 2140 0.5827 -
1.1958 2150 0.637 -
1.2013 2160 0.8633 -
1.2069 2170 0.3402 -
1.2125 2180 0.7573 -
1.2180 2190 0.6678 -
1.2236 2200 0.6598 0.6689
1.2291 2210 0.5696 -
1.2347 2220 0.6602 -
1.2403 2230 0.6607 -
1.2458 2240 0.79 -
1.2514 2250 0.6669 -
1.2570 2260 0.6055 -
1.2625 2270 0.6212 -
1.2681 2280 0.8946 -
1.2736 2290 0.552 -
1.2792 2300 0.7008 0.6983
1.2848 2310 0.4716 -
1.2903 2320 0.5656 -
1.2959 2330 0.8129 -
1.3014 2340 0.4394 -
1.3070 2350 0.701 -
1.3126 2360 0.6499 -
1.3181 2370 0.5047 -
1.3237 2380 0.6408 -
1.3293 2390 0.5313 -
1.3348 2400 0.6719 0.6520
1.3404 2410 0.7874 -
1.3459 2420 0.4832 -
1.3515 2430 0.6547 -
1.3571 2440 0.5849 -
1.3626 2450 0.6484 -
1.3682 2460 0.58 -
1.3737 2470 0.7658 -
1.3793 2480 0.6171 -
1.3849 2490 0.6701 -
1.3904 2500 0.5618 0.6657
1.3960 2510 0.6476 -
1.4016 2520 0.63 -
1.4071 2530 0.572 -
1.4127 2540 0.5754 -
1.4182 2550 0.6653 -
1.4238 2560 0.7646 -
1.4294 2570 0.569 -
1.4349 2580 0.7779 -
1.4405 2590 0.5836 -
1.4461 2600 0.6308 0.6516
1.4516 2610 0.6666 -
1.4572 2620 0.6455 -
1.4627 2630 0.6055 -
1.4683 2640 0.7232 -
1.4739 2650 0.6897 -
1.4794 2660 0.5363 -
1.4850 2670 0.6541 -
1.4905 2680 0.4246 -
1.4961 2690 0.7298 -
1.5017 2700 0.7172 0.6607
1.5072 2710 0.7145 -
1.5128 2720 0.7005 -
1.5184 2730 0.5449 -
1.5239 2740 0.7212 -
1.5295 2750 0.7456 -
1.5350 2760 0.6035 -
1.5406 2770 0.522 -
1.5462 2780 0.6602 -
1.5517 2790 0.6164 -
1.5573 2800 0.4539 0.6169
1.5628 2810 0.5992 -
1.5684 2820 0.6953 -
1.5740 2830 0.5285 -
1.5795 2840 0.5541 -
1.5851 2850 0.7905 -
1.5907 2860 0.7597 -
1.5962 2870 0.6202 -
1.6018 2880 0.7864 -
1.6073 2890 0.4652 -
1.6129 2900 0.5419 0.6443
1.6185 2910 0.4241 -
1.6240 2920 0.6315 -
1.6296 2930 0.5556 -
1.6352 2940 0.5154 -
1.6407 2950 0.6229 -
1.6463 2960 0.5244 -
1.6518 2970 0.431 -
1.6574 2980 0.7253 -
1.6630 2990 0.5751 -
1.6685 3000 0.618 0.6336
1.6741 3010 0.4592 -
1.6796 3020 0.6263 -
1.6852 3030 0.7317 -
1.6908 3040 0.6233 -
1.6963 3050 0.6546 -
1.7019 3060 0.6236 -
1.7075 3070 0.6012 -
1.7130 3080 0.5819 -
1.7186 3090 0.4667 -
1.7241 3100 0.5198 0.6339
1.7297 3110 0.6028 -
1.7353 3120 0.7013 -
1.7408 3130 0.6106 -
1.7464 3140 0.5535 -
1.7519 3150 0.5766 -
1.7575 3160 0.5127 -
1.7631 3170 0.786 -
1.7686 3180 0.5813 -
1.7742 3190 0.3937 -
1.7798 3200 0.5797 0.6450
1.7853 3210 0.47 -
1.7909 3220 0.6528 -
1.7964 3230 0.4784 -
1.8020 3240 0.7885 -
1.8076 3250 0.558 -
1.8131 3260 0.5268 -
1.8187 3270 0.5434 -
1.8242 3280 0.5277 -
1.8298 3290 0.6126 -
1.8354 3300 0.6411 0.6487
1.8409 3310 0.6255 -
1.8465 3320 0.5895 -
1.8521 3330 0.6065 -
1.8576 3340 0.7614 -
1.8632 3350 0.6079 -
1.8687 3360 0.8003 -
1.8743 3370 0.5454 -
1.8799 3380 0.6056 -
1.8854 3390 0.6906 -
1.8910 3400 0.4542 0.6413
1.8966 3410 0.6845 -
1.9021 3420 0.5585 -
1.9077 3430 0.5673 -
1.9132 3440 0.4752 -
1.9188 3450 0.5202 -
1.9244 3460 0.6504 -
1.9299 3470 0.6346 -
1.9355 3480 0.4864 -
1.9410 3490 0.529 -
1.9466 3500 0.583 0.6556
1.9522 3510 0.6182 -
1.9577 3520 0.6825 -
1.9633 3530 0.624 -
1.9689 3540 0.6257 -
1.9744 3550 0.6063 -
1.9800 3560 0.6281 -
1.9855 3570 0.4984 -
1.9911 3580 0.4623 -
1.9967 3590 0.37 -
2.0022 3600 0.5525 0.6623
2.0078 3610 0.6398 -
2.0133 3620 0.5049 -
2.0189 3630 0.3842 -
2.0245 3640 0.376 -
2.0300 3650 0.5997 -
2.0356 3660 0.4695 -
2.0412 3670 0.6691 -
2.0467 3680 0.5538 -
2.0523 3690 0.5726 -
2.0578 3700 0.4352 0.6381
2.0634 3710 0.5047 -
2.0690 3720 0.6121 -
2.0745 3730 0.4385 -
2.0801 3740 0.5293 -
2.0857 3750 0.4501 -
2.0912 3760 0.54 -
2.0968 3770 0.6387 -
2.1023 3780 0.5413 -
2.1079 3790 0.4567 -
2.1135 3800 0.6769 0.6179
  • The bold row denotes the saved checkpoint.
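
The log rows above are whitespace-separated, so they are easy to parse for inspection. The sketch below uses a short excerpt copied from the table and finds the evaluated step with the lowest validation loss within that excerpt. The helper `parse_log` is hypothetical, and the result is not a statement about which checkpoint was actually saved.

```python
# A few rows copied from the table: epoch, step, training loss, validation loss.
log_excerpt = """\
0.0501 90 1.1902 -
0.0556 100 1.2449 1.2373
0.5562 1000 0.5295 0.7493
1.1123 2000 0.6729 0.7362
1.5573 2800 0.4539 0.6169
1.6685 3000 0.618 0.6336
2.1135 3800 0.6769 0.6179
"""

def parse_log(text):
    """Turn whitespace-separated log rows into dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        epoch, step, train_loss, val_loss = line.split()
        rows.append({
            "epoch": float(epoch),
            "step": int(step),
            "train_loss": float(train_loss),
            # "-" marks steps where no evaluation was run.
            "val_loss": None if val_loss == "-" else float(val_loss),
        })
    return rows

rows = parse_log(log_excerpt)
evaluated = [r for r in rows if r["val_loss"] is not None]
best = min(evaluated, key=lambda r: r["val_loss"])
print(best["step"], best["val_loss"])  # 2800 0.6169
```

Pasting the full table into `log_excerpt` applies the same check to every evaluation step.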

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 4.0.2
  • Transformers: 4.51.2
  • PyTorch: 2.1.0+cu118
  • Accelerate: 1.6.0
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model Information

  • Model ID: suhwan3/mini-lm-finetuned-step1
  • Model size: 33.4M params (Safetensors)
  • Tensor type: F32