How to use GozdeA/tennis-multi-return-mlp-v3 with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("GozdeA/tennis-multi-return-mlp-v3")
sentences = [
"2026 for Djokovic?",
"What is the serve speed for he?",
"momentum for Djokovic?",
"2026 for Sinner?"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
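The architecture listing shows mean-token pooling (`pooling_mode_mean_tokens: True`) followed by `Normalize()`. As a rough illustration of what those two stages do, here is a minimal pure-Python sketch on toy vectors. The helper names `mean_pool` and `l2_normalize` are hypothetical and are not the library's internal functions; the real modules operate on batched tensors.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            for i, x in enumerate(vec):
                total[i] += x
            count += 1
    count = max(count, 1)  # guard against an all-padding input
    return [x / count for x in total]

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]

# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)            # padding vector is ignored
sentence_embedding = l2_normalize(pooled)   # unit-length sentence vector
print(pooled, sentence_embedding)
```

Because the final vectors have unit length, the dot product of two embeddings equals their cosine similarity.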
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("GozdeA/tennis-multi-return-mlp-v3")
# Run inference
sentences = [
'What is the break point conversion for Sinner?',
'Show me how many winners',
'service for Sinner?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5908, 0.2891],
# [0.5908, 1.0000, 0.4187],
# [0.2891, 0.4187, 1.0000]])
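One common way to use such a similarity matrix is to pick, for each sentence, its closest neighbour among the others. The sketch below reuses the scores printed above as plain Python lists, so it runs without downloading the model; `nearest_neighbor` is a hypothetical helper, not part of sentence-transformers.

```python
# Similarity scores copied from the printed output above.
similarities = [
    [1.0000, 0.5908, 0.2891],
    [0.5908, 1.0000, 0.4187],
    [0.2891, 0.4187, 1.0000],
]
sentences = [
    "What is the break point conversion for Sinner?",
    "Show me how many winners",
    "service for Sinner?",
]

def nearest_neighbor(scores, i):
    """Return the index of the most similar sentence other than i itself."""
    candidates = [(score, j) for j, score in enumerate(scores[i]) if j != i]
    return max(candidates)[1]

neighbors = [nearest_neighbor(similarities, i) for i in range(len(sentences))]
for i, j in enumerate(neighbors):
    print(f"{sentences[i]!r} -> {sentences[j]!r} ({similarities[i][j]:.4f})")
```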
Columns: anchor, positive, and negative

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | | | |

| anchor | positive | negative |
|---|---|---|
| What about he's odds? | momentum shift? | What happened to he? |
| How far has Nardi advanced at Wimbledon in his best run? | how many titles? | What is the what court for he? |
| How effective is Swiatek's return in the match? | How effective is he's return in the match? | How effective is his return in the game |

MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
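To make the two parameters concrete, here is a minimal pure-Python sketch of an in-batch-negatives ranking loss of this kind: each anchor is scored against every positive and negative in the batch with cosine similarity, the scores are multiplied by `scale`, and a cross-entropy loss rewards ranking the anchor's own positive first. This is an illustration under simplifying assumptions (toy 2-dimensional vectors, hypothetical function names), not the library's implementation.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def ranking_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i must rank positive i above all others."""
    candidates = positives + negatives  # all positives, then all negatives
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        top = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(z - top) for z in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy embeddings (assumed values for illustration only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

aligned = ranking_loss(anchors, positives, negatives)
swapped = ranking_loss(anchors, positives[::-1], negatives)  # wrong pairing
print(aligned, swapped)
```

The loss is near zero when each anchor sits closest to its own positive and grows sharply when the pairing is swapped; a larger `scale` sharpens that contrast.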
Columns: anchor, positive, and negative

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | | | |

| anchor | positive | negative |
|---|---|---|
| what venue | Show me what venue | venue time? |
| 2025 for he? | how many titles? | Show me which court |
| What about Djokovic's debut? | What about he's debut? | What about Djokovic's momentum? |

MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
Non-default hyperparameters:
per_device_train_batch_size: 16
learning_rate: 2e-05
num_train_epochs: 15
warmup_ratio: 0.1
fp16: True

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: no
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 8
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 15
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: None
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}

| Epoch | Step | Training Loss |
|---|---|---|
| 0.0687 | 50 | 5.002 |
| 0.1374 | 100 | 4.122 |
| 0.2060 | 150 | 3.3282 |
| 0.2747 | 200 | 2.5309 |
| 0.3434 | 250 | 1.9021 |
| 0.4121 | 300 | 1.7012 |
| 0.4808 | 350 | 1.4657 |
| 0.5495 | 400 | 1.433 |
| 0.6181 | 450 | 1.5156 |
| 0.6868 | 500 | 1.3941 |
| 0.7555 | 550 | 1.2544 |
| 0.8242 | 600 | 1.1585 |
| 0.8929 | 650 | 1.0916 |
| 0.9615 | 700 | 0.9743 |
| 1.0302 | 750 | 1.0443 |
| 1.0989 | 800 | 0.9942 |
| 1.1676 | 850 | 1.0508 |
| 1.2363 | 900 | 0.9211 |
| 1.3049 | 950 | 0.9522 |
| 1.3736 | 1000 | 0.804 |
| 1.4423 | 1050 | 0.8645 |
| 1.5110 | 1100 | 0.8335 |
| 1.5797 | 1150 | 0.7337 |
| 1.6484 | 1200 | 0.7857 |
| 1.7170 | 1250 | 0.8482 |
| 1.7857 | 1300 | 0.7211 |
| 1.8544 | 1350 | 0.7442 |
| 1.9231 | 1400 | 0.7557 |
| 1.9918 | 1450 | 0.7323 |
| 2.0604 | 1500 | 0.677 |
| 2.1291 | 1550 | 0.6635 |
| 2.1978 | 1600 | 0.71 |
| 2.2665 | 1650 | 0.6193 |
| 2.3352 | 1700 | 0.6792 |
| 2.4038 | 1750 | 0.7151 |
| 2.4725 | 1800 | 0.6825 |
| 2.5412 | 1850 | 0.6452 |
| 2.6099 | 1900 | 0.666 |
| 2.6786 | 1950 | 0.5733 |
| 2.7473 | 2000 | 0.5546 |
| 2.8159 | 2050 | 0.6443 |
| 2.8846 | 2100 | 0.6835 |
| 2.9533 | 2150 | 0.6499 |
| 3.0220 | 2200 | 0.6229 |
| 3.0907 | 2250 | 0.6151 |
| 3.1593 | 2300 | 0.539 |
| 3.2280 | 2350 | 0.5997 |
| 3.2967 | 2400 | 0.571 |
| 3.3654 | 2450 | 0.6257 |
| 3.4341 | 2500 | 0.6222 |
| 3.5027 | 2550 | 0.6102 |
| 3.5714 | 2600 | 0.6575 |
| 3.6401 | 2650 | 0.5844 |
| 3.7088 | 2700 | 0.5439 |
| 3.7775 | 2750 | 0.5528 |
| 3.8462 | 2800 | 0.5894 |
| 3.9148 | 2850 | 0.6576 |
| 3.9835 | 2900 | 0.6063 |
| 4.0522 | 2950 | 0.5556 |
| 4.1209 | 3000 | 0.5872 |
| 4.1896 | 3050 | 0.544 |
| 4.2582 | 3100 | 0.5114 |
| 4.3269 | 3150 | 0.587 |
| 4.3956 | 3200 | 0.5392 |
| 4.4643 | 3250 | 0.5846 |
| 4.5330 | 3300 | 0.6077 |
| 4.6016 | 3350 | 0.6597 |
| 4.6703 | 3400 | 0.5425 |
| 4.7390 | 3450 | 0.5493 |
| 4.8077 | 3500 | 0.5291 |
| 4.8764 | 3550 | 0.5145 |
| 4.9451 | 3600 | 0.5534 |
| 5.0137 | 3650 | 0.5018 |
| 5.0824 | 3700 | 0.4948 |
| 5.1511 | 3750 | 0.553 |
| 5.2198 | 3800 | 0.5772 |
| 5.2885 | 3850 | 0.5264 |
| 5.3571 | 3900 | 0.5516 |
| 5.4258 | 3950 | 0.5303 |
| 5.4945 | 4000 | 0.5213 |
| 5.5632 | 4050 | 0.5558 |
| 5.6319 | 4100 | 0.4956 |
| 5.7005 | 4150 | 0.6035 |
| 5.7692 | 4200 | 0.5706 |
| 5.8379 | 4250 | 0.4922 |
| 5.9066 | 4300 | 0.5965 |
| 5.9753 | 4350 | 0.5143 |
| 6.0440 | 4400 | 0.5798 |
| 6.1126 | 4450 | 0.5219 |
| 6.1813 | 4500 | 0.5803 |
| 6.25 | 4550 | 0.5035 |
| 6.3187 | 4600 | 0.5534 |
| 6.3874 | 4650 | 0.546 |
| 6.4560 | 4700 | 0.525 |
| 6.5247 | 4750 | 0.4751 |
| 6.5934 | 4800 | 0.5085 |
| 6.6621 | 4850 | 0.5282 |
| 6.7308 | 4900 | 0.5845 |
| 6.7995 | 4950 | 0.5153 |
| 6.8681 | 5000 | 0.5399 |
| 6.9368 | 5050 | 0.5532 |
| 7.0055 | 5100 | 0.5005 |
| 7.0742 | 5150 | 0.5273 |
| 7.1429 | 5200 | 0.5212 |
| 7.2115 | 5250 | 0.5245 |
| 7.2802 | 5300 | 0.5075 |
| 7.3489 | 5350 | 0.5687 |
| 7.4176 | 5400 | 0.4674 |
| 7.4863 | 5450 | 0.5115 |
| 7.5549 | 5500 | 0.4938 |
| 7.6236 | 5550 | 0.5059 |
| 7.6923 | 5600 | 0.5065 |
| 7.7610 | 5650 | 0.5252 |
| 7.8297 | 5700 | 0.4852 |
| 7.8984 | 5750 | 0.48 |
| 7.9670 | 5800 | 0.5503 |
| 8.0357 | 5850 | 0.5164 |
| 8.1044 | 5900 | 0.5756 |
| 8.1731 | 5950 | 0.5175 |
| 8.2418 | 6000 | 0.5033 |
| 8.3104 | 6050 | 0.4992 |
| 8.3791 | 6100 | 0.5299 |
| 8.4478 | 6150 | 0.4862 |
| 8.5165 | 6200 | 0.548 |
| 8.5852 | 6250 | 0.454 |
| 8.6538 | 6300 | 0.4941 |
| 8.7225 | 6350 | 0.5088 |
| 8.7912 | 6400 | 0.5065 |
| 8.8599 | 6450 | 0.4921 |
| 8.9286 | 6500 | 0.4756 |
| 8.9973 | 6550 | 0.5258 |
| 9.0659 | 6600 | 0.4658 |
| 9.1346 | 6650 | 0.4894 |
| 9.2033 | 6700 | 0.5097 |
| 9.2720 | 6750 | 0.493 |
| 9.3407 | 6800 | 0.5311 |
| 9.4093 | 6850 | 0.5157 |
| 9.4780 | 6900 | 0.5142 |
| 9.5467 | 6950 | 0.4664 |
| 9.6154 | 7000 | 0.528 |
| 9.6841 | 7050 | 0.5645 |
| 9.7527 | 7100 | 0.5214 |
| 9.8214 | 7150 | 0.4777 |
| 9.8901 | 7200 | 0.5449 |
| 9.9588 | 7250 | 0.492 |
| 10.0275 | 7300 | 0.4591 |
| 10.0962 | 7350 | 0.4576 |
| 10.1648 | 7400 | 0.4692 |
| 10.2335 | 7450 | 0.5415 |
| 10.3022 | 7500 | 0.4803 |
| 10.3709 | 7550 | 0.5487 |
| 10.4396 | 7600 | 0.5706 |
| 10.5082 | 7650 | 0.4815 |
| 10.5769 | 7700 | 0.4585 |
| 10.6456 | 7750 | 0.4861 |
| 10.7143 | 7800 | 0.4247 |
| 10.7830 | 7850 | 0.4906 |
| 10.8516 | 7900 | 0.5371 |
| 10.9203 | 7950 | 0.5393 |
| 10.9890 | 8000 | 0.4788 |
| 11.0577 | 8050 | 0.5038 |
| 11.1264 | 8100 | 0.4838 |
| 11.1951 | 8150 | 0.515 |
| 11.2637 | 8200 | 0.5299 |
| 11.3324 | 8250 | 0.5044 |
| 11.4011 | 8300 | 0.5045 |
| 11.4698 | 8350 | 0.465 |
| 11.5385 | 8400 | 0.5253 |
| 11.6071 | 8450 | 0.4517 |
| 11.6758 | 8500 | 0.5048 |
| 11.7445 | 8550 | 0.4733 |
| 11.8132 | 8600 | 0.47 |
| 11.8819 | 8650 | 0.4552 |
| 11.9505 | 8700 | 0.4203 |
| 12.0192 | 8750 | 0.395 |
| 12.0879 | 8800 | 0.5411 |
| 12.1566 | 8850 | 0.4911 |
| 12.2253 | 8900 | 0.4641 |
| 12.2940 | 8950 | 0.4608 |
| 12.3626 | 9000 | 0.4839 |
| 12.4313 | 9050 | 0.4491 |
| 12.5 | 9100 | 0.517 |
| 12.5687 | 9150 | 0.5031 |
| 12.6374 | 9200 | 0.4869 |
| 12.7060 | 9250 | 0.4856 |
| 12.7747 | 9300 | 0.4754 |
| 12.8434 | 9350 | 0.5167 |
| 12.9121 | 9400 | 0.5004 |
| 12.9808 | 9450 | 0.5293 |
| 13.0495 | 9500 | 0.4566 |
| 13.1181 | 9550 | 0.477 |
| 13.1868 | 9600 | 0.4501 |
| 13.2555 | 9650 | 0.4791 |
| 13.3242 | 9700 | 0.4746 |
| 13.3929 | 9750 | 0.4702 |
| 13.4615 | 9800 | 0.469 |
| 13.5302 | 9850 | 0.5046 |
| 13.5989 | 9900 | 0.4895 |
| 13.6676 | 9950 | 0.5223 |
| 13.7363 | 10000 | 0.4245 |
| 13.8049 | 10050 | 0.4701 |
| 13.8736 | 10100 | 0.4548 |
| 13.9423 | 10150 | 0.4998 |
| 14.0110 | 10200 | 0.4345 |
| 14.0797 | 10250 | 0.4371 |
| 14.1484 | 10300 | 0.5009 |
| 14.2170 | 10350 | 0.4816 |
| 14.2857 | 10400 | 0.4665 |
| 14.3544 | 10450 | 0.5047 |
| 14.4231 | 10500 | 0.5132 |
| 14.4918 | 10550 | 0.473 |
| 14.5604 | 10600 | 0.4387 |
| 14.6291 | 10650 | 0.4775 |
| 14.6978 | 10700 | 0.4522 |
| 14.7665 | 10750 | 0.4807 |
| 14.8352 | 10800 | 0.482 |
| 14.9038 | 10850 | 0.4625 |
| 14.9725 | 10900 | 0.5052 |
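
The training log can be related back to the listed hyperparameters (`learning_rate: 2e-05`, `num_train_epochs: 15`, `warmup_ratio: 0.1`, `lr_scheduler_type: linear`). The sketch below assumes 728 optimizer steps per epoch, a figure inferred from the log rows (for example, step 4550 at epoch 6.25) rather than stated in the card, and models a generic linear warmup followed by linear decay. The trainer's exact rounding of the warmup step count may differ.

```python
LEARNING_RATE = 2e-05
NUM_EPOCHS = 15
WARMUP_RATIO = 0.1
STEPS_PER_EPOCH = 728  # assumption inferred from the log: 4550 / 6.25

total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
warmup_steps = round(total_steps * WARMUP_RATIO)  # rounding choice is an assumption

def linear_lr(step):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - warmup_steps)

print(total_steps, warmup_steps)
for step in (0, 50, warmup_steps, 5000, total_steps):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at roughly the end of epoch 1.5 and then decays for the remaining epochs.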
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
nreimers/MiniLM-L6-H384-uncased