How to use ChenyuEcho/hospital_emaillevel_newtrainmethod with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("ChenyuEcho/hospital_emaillevel_newtrainmethod")
sentences = [
"Is the palliative care team’s EHR access timeout linked to a permissions sync error introduced by the last update?",
"Subject: Palliative Care Team Access Issues in EHR\nDate: 2025-12-09T10:06:00\nFrom: Angela R. Scott\nParticipants: Christopher P. Brown\n\nBody:\nHi Christopher,\n\nWe've received multiple tickets from the palliative care team indicating intermittent access issues within the EHR system—specifically, they are experiencing timeouts when attempting to document consult notes and orders. Initial logs point to a permissions sync error following last week's update, but I need your assistance in confirming if this is tied to the recent downtime patch. Would your team be able to prioritize a review of the access logs and coordinate with us to implement a temporary fix while we pursue a full-scale resolution?\n\nThanks for your support,\nAngela\n\n--\nAngela Scott | EHR Support",
"Subject: Re: 关于流感疫苗接种活动的具体问题与建议\nDate: 2025-11-12T08:18:00\nFrom: Margaret Collins\nParticipants: Jennifer Rodriguez\n\nBody:\nJennifer,你好。\n\n感谢你及时反馈流感疫苗接种活动中一线团队面临的实际困难。你的建议非常具有建设性,我完全同意需要在保障医疗安全的同时,尊重员工的工作安排和个人意愿。建议本周安排一次小型讨论会,届时我们可邀请感染控制部门共同参与,听取相关人员的具体建议,并尝试制定更具弹性的接种方案。请你根据大家的时间建议定个会议时间,后续如需协调资源请随时告知。\n\n谢谢你的主动沟通!\n\n祝好,\nMargaret Collins",
"Subject: Re: Aktueller Stand der Patientenbettenverfügbarkeit und Auswirkung auf Laborauswertungen\nDate: 2025-10-15T06:26:00\nFrom: Ethan L. Foster\nParticipants: David S. Wilson\n\nBody:\nHallo David,\n\nDanke für deinen Hinweis zu den Verzögerungen bei der Auswertung von Laborproben aufgrund der begrenzten Bettenanzahl. Ich bin grundsätzlich bereit, gemeinsam Optimierungen am Priorisierungsprozess zu erarbeiten, und habe bereits erste Ideen dazu, wie wir kritische Proben besser kennzeichnen und deren Transport beschleunigen können. Ich schlage vor, dass wir kurzfristig ein gemeinsames Meeting mit dem Labor- und Stationspersonal ansetzen, um konkrete Schritte auszuarbeiten und Herausforderungen direkt zu adressieren. Bitte gib mir Bescheid, wann es dir in dieser Woche passen würde, damit wir einen Termin koordinieren können.\n\nBeste Grüße\nEthan"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from Qwen/Qwen3-Embedding-0.6B. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Full model architecture:
SentenceTransformer(
(0): Transformer({'max_seq_length': 32768, 'do_lower_case': False, 'architecture': 'Qwen3Model'})
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
(2): Normalize()
)
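The architecture printout shows last-token pooling (`pooling_mode_lasttoken: True`) followed by `Normalize()`. As a rough illustration of what those two steps do, here is a minimal pure-Python sketch. The helper names `last_token_pool` and `l2_normalize` are hypothetical, the toy embeddings are 2-dimensional rather than 1024-dimensional, and the sketch assumes right-padded input; the library's own pooling module may handle padding differently.

```python
import math

def last_token_pool(token_embeddings, attention_mask):
    """Return the embedding of the last non-padding token.

    Assumes right-padded input, i.e. the mask is a run of 1s followed by 0s.
    """
    last_index = sum(attention_mask) - 1
    return token_embeddings[last_index]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy sequence: three real tokens plus one padding token, 2-dim embeddings.
token_embeddings = [[1.0, 0.0], [0.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
attention_mask = [1, 1, 1, 0]

pooled = last_token_pool(token_embeddings, attention_mask)
sentence_embedding = l2_normalize(pooled)
print(sentence_embedding)  # [0.6, 0.8]
```

Because the output vectors have unit length, cosine similarity between two embeddings reduces to a plain dot product.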
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ChenyuEcho/hospital_emaillevel_newtrainmethod")
# Run inference
queries = [
"What is Patricia Vasquez's availability for a risk mitigation meeting today regarding the OR3 surgical incident?",
]
documents = [
'Subject: Re: URGENT: Incident Report - OR3 Surgical Case\nDate: 2026-01-28T09:16:00\nFrom: David R. Park\nParticipants: Patricia Vasquez\n\nBody:\nDear Patricia,\n\nThank you for your prompt and thorough notification regarding the OR3 surgical incident. Given the seriousness of the situation, I strongly recommend convening a risk mitigation meeting with the core team as soon as possible today. It is essential that we conduct a careful review of all documentation related to the event and ensure our messaging to the patient, family, and regulatory bodies remains consistent and accurate. Please remind all involved staff to refrain from independent written or verbal statements about the incident; all communications should be directed through my office to preserve attorney-client privilege. Additionally, we must protect peer review privilege wherever it may apply.\n\nLet me know your availability for a meeting, and I will coordinate accordingly. I appreciate your diligence as we navigate this sensitive matter together.\n\nBest regards,\nDavid',
'Subject: Re: Cultural Competency Training Compliance Concerns\nDate: 2025-10-23T12:42:00\nFrom: Maria C. Gonzalez\nParticipants: Carlos J. Rodriguez\n\nBody:\nHi Carlos,\n\nThank you for your prompt response and for taking immediate action to reinforce the cultural competency training requirements with your team. I appreciate your commitment to ensuring that all staff members complete the training by the outlined deadline. Please keep me updated if you encounter any challenges or need additional support, as maintaining regulatory compliance is a top priority for our department. Your attention to this matter is greatly appreciated.\n\nBest regards,\nMaria',
'Subject: Fuera de la oficina: Karen M. Phillips\nDate: 2025-09-29T14:19:00\nFrom: Karen M. Phillips\nParticipants: Remitente\n\nBody:\nEstimado/a,\n\nGracias por su correo. Lamentablemente, hoy me encuentro fuera de la oficina debido a una enfermedad imprevista. Mi ausencia se debe a la necesidad de un monitoreo riguroso de mis síntomas y ajustes de medicación, prestando especial atención a posibles interacciones farmacológicas y la dosificación precisa para evitar complicaciones.\n\nSi su consulta es urgente, o requiere revisión inmediata de interacción de medicamentos o recomendaciones sobre dosificación, por favor contacte a mi colega Dr. Samuel Rivera al correo s.rivera@patientsafetyinstitute.org, quien podrá asistirle y revisar cualquier cuestión relativa a protocolos de seguridad farmacológica en mi ausencia.\n\nAgradezco su comprensión ante la situación. Revisaré y responderé su mensaje a mi regreso tan pronto me sea posible.\n\nAtentamente,\nKaren M. Phillips\n\n--\nKaren Phillips, PharmD | Pharmacy Services',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 1024] [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.7148, -0.0698, 0.0708]], dtype=torch.bfloat16)
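For retrieval, the similarity row is usually turned into a ranking of documents. The following is a small sketch of that step in plain Python, using the three scores from the printed tensor; `rank_documents` is a hypothetical helper, not part of the sentence-transformers API.

```python
def rank_documents(scores, top_k=None):
    """Return (document_index, score) pairs sorted by descending score."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]

# Scores copied from the printed tensor (one query, three documents).
scores = [0.7148, -0.0698, 0.0708]
ranking = rank_documents(scores, top_k=2)
print(ranking)  # [(0, 0.7148), (2, 0.0708)]
```

Document 0, the OR3 incident email, ranks first for the query about Patricia Vasquez's availability, and the two unrelated emails score near zero.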
Information retrieval metrics on the val_full_corpus dataset, evaluated with InformationRetrievalEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.8173 |
| cosine_accuracy@3 | 0.887 |
| cosine_accuracy@5 | 0.9203 |
| cosine_accuracy@10 | 0.9651 |
| cosine_precision@1 | 0.8173 |
| cosine_precision@3 | 0.2957 |
| cosine_precision@5 | 0.1841 |
| cosine_precision@10 | 0.0965 |
| cosine_recall@1 | 0.8173 |
| cosine_recall@3 | 0.887 |
| cosine_recall@5 | 0.9203 |
| cosine_recall@10 | 0.9651 |
| cosine_ndcg@10 | 0.8867 |
| cosine_mrr@10 | 0.8623 |
| cosine_map@100 | 0.8644 |
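In the table, accuracy@k equals recall@k at every cutoff, and precision@k is close to accuracy@k divided by k (for example 0.887 / 3 ≈ 0.2957). That pattern is what one would expect if each query has exactly one relevant document, although the card does not state this. Under that assumption, the metrics can be sketched as follows; the function names and the toy rankings are hypothetical and this is not the evaluator's own implementation.

```python
def retrieval_metrics(ranked_ids, relevant_id, k):
    """Metrics at cutoff k for one query with exactly one relevant document."""
    top_k = ranked_ids[:k]
    hit = 1.0 if relevant_id in top_k else 0.0
    rank = top_k.index(relevant_id) + 1 if hit else None
    return {
        "accuracy": hit,            # 1 if the relevant document is in the top k
        "precision": hit / k,       # at most one of the k results is relevant
        "recall": hit,              # one relevant document, so recall equals accuracy
        "mrr": 1.0 / rank if rank else 0.0,
    }

def average_metrics(runs, k):
    """Average the per-query metrics over all (ranking, relevant_id) pairs."""
    per_query = [retrieval_metrics(ranked, relevant, k) for ranked, relevant in runs]
    return {name: sum(m[name] for m in per_query) / len(per_query) for name in per_query[0]}

# Hypothetical rankings for three queries: (ranked document ids, relevant id).
runs = [
    (["d1", "d2", "d3"], "d1"),  # relevant document at rank 1
    (["d4", "d5", "d6"], "d5"),  # relevant document at rank 2
    (["d7", "d8", "d9"], "d0"),  # relevant document not retrieved
]
result = average_metrics(runs, k=3)
print(result)
```

With these toy rankings, accuracy and recall both come out at 2/3, precision at 2/9 (accuracy divided by k), and MRR at 0.5.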
Columns: sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |

Samples:

| sentence_0 | sentence_1 |
|---|---|
| Which specific crash cart items should be prioritized for procurement outreach during this week's supply review? | Subject: Re: Emergency Code Response Drill – Supply Issues Noted |
| Who is responsible for coordinating this week's biannual calibration with Biomedical Engineering for ADL lab equipment (electronic lifts and hand dynamometer) and what is the preferred time window? | Subject: Lab Equipment Calibration Reminder – Request for Coordination |
| Can the phlebotomy slot be moved to before 9:30 AM to resolve the overlap with the CT with IV contrast and blood draw? | Subject: OR Schedule Conflict: Contrast Protocol Coordination Needed |

MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"gather_across_devices": false
}
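To make the `scale` and `similarity_fct` parameters concrete, here is a minimal pure-Python sketch of the idea behind this loss: each query is scored against every document in the batch with cosine similarity, the scores are multiplied by the scale, and cross-entropy is applied with the matching document as the correct class. The function names and toy 2-dimensional embeddings are assumptions for illustration; the library's actual implementation operates on batched tensors.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the match for anchors[i]
    and every other positive in the batch serves as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the scaled similarities.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch: each anchor points in the same direction as its own positive.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[2.0, 0.0], [0.0, 3.0]]

aligned = multiple_negatives_ranking_loss(anchors, positives)
swapped = multiple_negatives_ranking_loss(anchors, positives[::-1])
print(aligned, swapped)
```

The loss is close to zero when every query is most similar to its own document, and close to the scale (20) when the pairs are swapped. Larger batches supply more in-batch negatives per query.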
Non-default hyperparameters:

- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: None
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- enable_jit_checkpoint: False
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- use_cpu: False
- seed: 42
- data_seed: None
- bf16: False
- fp16: False
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: -1
- ddp_backend: None
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_for_metrics: []
- eval_do_concat_batches: True
- auto_find_batch_size: False
- full_determinism: False
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- use_cache: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}

Training logs:

| Epoch | Step | val_full_corpus_cosine_ndcg@10 |
|---|---|---|
| 1.0 | 149 | 0.8776 |
| 2.0 | 298 | 0.8866 |
| 3.0 | 447 | 0.8867 |
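The step counts in the training log (149 steps per epoch, 447 after three epochs) can be related to the training set size with a little arithmetic. The sketch below assumes a single device, which the card does not state, together with the listed settings `per_device_train_batch_size: 16`, `gradient_accumulation_steps: 1` and `dataloader_drop_last: False`. The helper `steps_per_epoch` is hypothetical, and the resulting range is an estimate rather than a reported figure.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Optimizer steps per epoch when the last partial batch is kept
    and there is no gradient accumulation (single-device assumption)."""
    return math.ceil(num_samples / batch_size)

batch_size = 16
observed_steps = 149

# All training set sizes that would give exactly 149 steps per epoch.
candidates = [n for n in range(1, 5000) if steps_per_epoch(n, batch_size) == observed_steps]
print(min(candidates), max(candidates))  # 2369 2384
print(observed_steps * 3)                # 447
```

Under these assumptions the training set would contain between 2369 and 2384 pairs, and three epochs give the 447 total steps shown in the last row of the log.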
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}