This is a sentence-transformers model finetuned from google/embeddinggemma-300m. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
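To make the data flow through modules (1) to (4) concrete, here is a minimal NumPy sketch of mean pooling, the two bias-free linear layers, and L2 normalization. It is an illustration only: the token embeddings and the weight matrices `w1` and `w2` are random stand-ins, not the model's real values, and the helper names `mean_pool` and `embed` are hypothetical. Only the shapes (768 to 3072 to 768) are taken from the printout above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for the transformer output and the Dense weights.
# Only the shapes come from the architecture printout; the values are made up.
token_embeddings = rng.normal(size=(2, 5, 768))    # (batch, tokens, hidden)
attention_mask = np.array([[1, 1, 1, 0, 0],
                           [1, 1, 1, 1, 1]])        # 1 = real token, 0 = padding
w1 = rng.normal(size=(768, 3072)) * 0.02            # Dense (2): 768 -> 3072, no bias
w2 = rng.normal(size=(3072, 768)) * 0.02            # Dense (3): 3072 -> 768, no bias


def mean_pool(tokens, mask):
    """Average token vectors per sequence, ignoring padded positions."""
    mask = mask[:, :, None].astype(tokens.dtype)
    return (tokens * mask).sum(axis=1) / np.clip(mask.sum(axis=1), 1e-9, None)


def embed(tokens, mask):
    pooled = mean_pool(tokens, mask)                 # (1) Pooling, mean over tokens
    hidden = pooled @ w1                             # (2) Dense, identity activation
    projected = hidden @ w2                          # (3) Dense, identity activation
    norms = np.linalg.norm(projected, axis=1, keepdims=True)
    return projected / norms                         # (4) Normalize to unit length


embeddings = embed(token_embeddings, attention_mask)
print(embeddings.shape)
```

Because the final step normalizes every vector to unit length, the dot product of two embeddings equals their cosine similarity.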
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("yasserrmd/dental-gemma-300m-emb")
# Run inference
queries = [
"What is tooth transposition and how does it occur?\n",
]
documents = [
'Tooth transposition is a condition where two adjacent teeth switch positions in the dental arch. It can be classified as complete transposition when both the crowns and roots of the involved teeth exchange places, or incomplete transposition when only the crowns are transposed. Tooth transposition is more commonly found unilaterally than bilaterally, with a higher prevalence in the maxillary (upper) arch. There is no sex preference for tooth transposition. The condition is not significantly related to other dental anomalies such as missing teeth, peg-shaped or hypoplastic teeth, or impacted teeth. The etiology of tooth transposition is believed to be genetically involved, and it most often occurs at the maxillary canine.',
'Air polishing offers many advantages to clinicians and their patients. It is less time-consuming and effective in heavily stained surfaces, such as smoking and chlorhexidine stain. This minimizes operator and patient fatigue. Additionally, air polishing treatments can provide superior treatment outcomes in terms of speed and efficacy.',
'The main forms of treatment for oral and tooth health in government hospitals and health centers in Turkey are extraction, removable prosthodontics, and routine amalgam and composite restorations. Root canal treatment is not generally preferred due to a lack of endodontic equipment and time.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.7669, 0.0321, 0.2308]])
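For semantic search, the similarity scores are usually turned into a ranking. The sketch below does that with NumPy, using the scores copied from the printed tensor above instead of calling the model; `rank_documents` is a hypothetical helper written for this example, not part of Sentence Transformers.

```python
import numpy as np

# Scores copied from the printed tensor above (one query, three documents).
similarities = np.array([[0.7669, 0.0321, 0.2308]])


def rank_documents(scores, top_k=None):
    """Return (document_index, score) pairs for one query, best match first."""
    order = np.argsort(-scores)          # indices sorted by descending score
    if top_k is not None:
        order = order[:top_k]
    return [(int(i), float(scores[i])) for i in order]


ranking = rank_documents(similarities[0])
print(ranking)
# [(0, 0.7669), (2, 0.2308), (1, 0.0321)]
```

Document 0, the tooth transposition passage, ranks first for the tooth transposition query, as expected.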
Columns: sentence_0 and sentence_1
| sentence_0 | sentence_1 |
---|---|---|
type | string | string |
details | | |
sentence_0 | sentence_1 |
---|---|
How do prosthodontists in Australia determine the need for placing a post in a tooth restoration? | Prosthodontists in Australia consider both the quantity of tooth structure and the type of planned restoration when deciding whether to place a post. The location of the tooth in the arch seems to have less influence on their decision. Molar teeth and mandibular anterior teeth are less likely to receive posts. |
What are some patient-centered outcome measures used to assess the impact of oral health problems on quality of life? | There are several patient-centered outcome measures called 'oral health related quality of life measures' (OHQoL) that have been developed to assess the extent to which oral health problems affect a person's quality of life. Two measures that have received particular attention are the Oral Health Impact Profile (OHIP-14) and the UK Oral Health Related Quality of Life (OHQoL-UK) questionnaires. The OHIP-14 measures the adverse impacts of oral conditions on daily life, while the OHQoL-UK incorporates both negative and positive influences on health. |
How does finite element analysis (FEA) contribute to the understanding and improvement of dental implant procedures? | Finite element analysis (FEA) is a computational method that can be used to simulate the distribution of stress and strain in the mandibular bone and osseointegrated implants. By considering various variables such as material characteristics, types of loads, and individual bio-subjectivity, FEA studies provide valuable insights into stress distribution and geometry evaluation. This information helps in making informed decisions about implant positioning, inclination, and type to ensure the long-term stability and success of dental implants. |
MultipleNegativesRankingLoss with these parameters:
{
  "scale": 20.0,
  "similarity_fct": "cos_sim",
  "gather_across_devices": false
}
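As a rough illustration of what these parameters control, the sketch below re-derives the loss in NumPy: each sentence_0 is scored against every sentence_1 in the batch by cosine similarity, the scores are multiplied by `scale`, and cross-entropy rewards the matching pair on the diagonal. This is a simplified reading of the technique, not the library's PyTorch implementation, and the function name is hypothetical.

```python
import numpy as np


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """In-batch negatives loss: row i should match column i and no other column."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                        # scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))        # cross-entropy on the diagonal


# Toy batch of 4 pairs with orthogonal embeddings.
anchors = np.eye(4)
aligned = multiple_negatives_ranking_loss(anchors, np.eye(4))
misaligned = multiple_negatives_ranking_loss(anchors, np.roll(np.eye(4), 1, axis=0))
print(aligned, misaligned)
```

When every anchor matches its own positive, the loss is close to zero. When the positives are shifted by one row, it is close to the scale value of 20. A larger `scale` sharpens the softmax, so mistakes on in-batch negatives are penalized more heavily.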
Non-default hyperparameters:
per_device_train_batch_size: 4
per_device_eval_batch_size: 4
num_train_epochs: 1
multi_dataset_batch_sampler: round_robin

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: no
prediction_loss_only: True
per_device_train_batch_size: 4
per_device_eval_batch_size: 4
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
router_mapping: {}
learning_rate_mapping: {}

Epoch | Step | Training Loss |
---|---|---|
0.1 | 500 | 0.0147 |
0.2 | 1000 | 0.0142 |
0.3 | 1500 | 0.0154 |
0.4 | 2000 | 0.0085 |
0.5 | 2500 | 0.0052 |
0.6 | 3000 | 0.0071 |
0.7 | 3500 | 0.0025 |
0.8 | 4000 | 0.0028 |
0.9 | 4500 | 0.0056 |
1.0 | 5000 | 0.0045 |
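The log and the hyperparameters together allow two back-of-the-envelope checks: roughly how many training pairs one epoch covered, and what learning rate a linear schedule with no warmup would give at a given step. The sketch below assumes a single training device (`NUM_DEVICES = 1` is not stated in the source) and treats step 5000 as the end of the epoch, although the logged epoch values are rounded. The function names are hypothetical.

```python
BATCH_SIZE = 4        # per_device_train_batch_size
GRAD_ACCUM = 1        # gradient_accumulation_steps
TOTAL_STEPS = 5000    # last logged step, reported at epoch 1.0 (rounded)
BASE_LR = 5e-05       # learning_rate
WARMUP_STEPS = 0      # warmup_steps
NUM_DEVICES = 1       # assumption: device count is not given in the source


def approx_training_pairs():
    """Estimate how many (sentence_0, sentence_1) pairs one epoch covered."""
    return TOTAL_STEPS * BATCH_SIZE * GRAD_ACCUM * NUM_DEVICES


def linear_lr(step):
    """Linear warmup (none here) followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


print(approx_training_pairs())
for step in (0, 2500, 5000):
    print(step, linear_lr(step))
```

Under these assumptions the training set holds on the order of 20,000 pairs, and the learning rate falls from 5e-05 at step 0 to zero at the final step.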
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model: google/embeddinggemma-300m