language: []
library_name: sentence-transformers
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1115700
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: nomic-ai/nomic-embed-text-v1.5
datasets: []
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
- source_sentence: Ndege mwenye mdomo mrefu katikati ya ndege.
- Panya anayekimbia juu ya gurudumu.
- Mtu anashindana katika mashindano ya mbio.
- Ndege anayeruka.
- source_sentence: Msichana mchanga mwenye nywele nyeusi anakabili kamera na kushikilia
mfuko wa karatasi wakati amevaa shati la machungwa na mabawa ya kipepeo yenye
rangi nyingi.
- Mwanamke mzee anakataa kupigwa picha.
- mtu akila na mvulana mdogo kwenye kijia cha jiji
- Msichana mchanga anakabili kamera.
- source_sentence: Wanawake na watoto wameketi nje katika kivuli wakati kikundi cha
watoto wadogo wameketi ndani katika kivuli.
- Mwanamke na watoto na kukaa chini.
- Mwanamke huyo anakimbia.
- Watu wanasafiri kwa baiskeli.
- source_sentence: Mtoto mdogo anaruka mikononi mwa mwanamke aliyevalia suti nyeusi
ya kuogelea akiwa kwenye dimbwi.
- Mtoto akiruka mikononi mwa mwanamke aliyevalia suti ya kuogelea kwenye dimbwi.
- Someone is holding oranges and walking
- Mama na binti wakinunua viatu.
- source_sentence: Mwanamume na mwanamke wachanga waliovaa mikoba wanaweka au kuondoa
kitu kutoka kwenye mti mweupe wa zamani, huku watu wengine wamesimama au wameketi
- tai huruka
- mwanamume na mwanamke wenye mikoba
- Wanaume wawili wameketi karibu na mwanamke.
pipeline_tag: sentence-similarity
- name: SentenceTransformer based on nomic-ai/nomic-embed-text-v1.5
- task:
type: semantic-similarity
name: Semantic Similarity
name: sts test 768
type: sts-test-768
- type: pearson_cosine
value: 0.6944960057464138
name: Pearson Cosine
- type: spearman_cosine
value: 0.6872396378196957
name: Spearman Cosine
- type: pearson_manhattan
value: 0.7086043588614903
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.7136479613274518
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.7084460037709435
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.7128357831285198
name: Spearman Euclidean
- type: pearson_dot
value: 0.481902874304561
name: Pearson Dot
- type: spearman_dot
value: 0.46588918379526945
name: Spearman Dot
- type: pearson_max
value: 0.7086043588614903
name: Pearson Max
- type: spearman_max
value: 0.7136479613274518
name: Spearman Max
- task:
type: semantic-similarity
name: Semantic Similarity
name: sts test 512
type: sts-test-512
- type: pearson_cosine
value: 0.6925787246105148
name: Pearson Cosine
- type: spearman_cosine
value: 0.6859479129419207
name: Spearman Cosine
- type: pearson_manhattan
value: 0.7087290093387656
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.7127968133455542
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.7088805484816247
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.7123606046721803
name: Spearman Euclidean
- type: pearson_dot
value: 0.4684333245586192
name: Pearson Dot
- type: spearman_dot
value: 0.45257836578849003
name: Spearman Dot
- type: pearson_max
value: 0.7088805484816247
name: Pearson Max
- type: spearman_max
value: 0.7127968133455542
name: Spearman Max
- task:
type: semantic-similarity
name: Semantic Similarity
name: sts test 256
type: sts-test-256
- type: pearson_cosine
value: 0.6876956481856266
name: Pearson Cosine
- type: spearman_cosine
value: 0.6814892249857147
name: Spearman Cosine
- type: pearson_manhattan
value: 0.7083882582081078
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.7097524143994903
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.7094190252305796
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.7104287347206688
name: Spearman Euclidean
- type: pearson_dot
value: 0.4438925722484721
name: Pearson Dot
- type: spearman_dot
value: 0.4255299982188107
name: Spearman Dot
- type: pearson_max
value: 0.7094190252305796
name: Pearson Max
- type: spearman_max
value: 0.7104287347206688
name: Spearman Max
- task:
type: semantic-similarity
name: Semantic Similarity
name: sts test 128
type: sts-test-128
- type: pearson_cosine
value: 0.6708560165075523
name: Pearson Cosine
- type: spearman_cosine
value: 0.6669935075512006
name: Spearman Cosine
- type: pearson_manhattan
value: 0.7041961281711793
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.7000807688296651
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.7055061381768357
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.7022686907818495
name: Spearman Euclidean
- type: pearson_dot
value: 0.37855771167572094
name: Pearson Dot
- type: spearman_dot
value: 0.35930717422088765
name: Spearman Dot
- type: pearson_max
value: 0.7055061381768357
name: Pearson Max
- type: spearman_max
value: 0.7022686907818495
name: Spearman Max
- task:
type: semantic-similarity
name: Semantic Similarity
name: sts test 64
type: sts-test-64
- type: pearson_cosine
value: 0.6533817775144477
name: Pearson Cosine
- type: spearman_cosine
value: 0.6523997361414113
name: Spearman Cosine
- type: pearson_manhattan
value: 0.6919834348567717
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.6857245312336051
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.6950438027503257
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.6899151458827059
name: Spearman Euclidean
- type: pearson_dot
value: 0.33502302384042637
name: Pearson Dot
- type: spearman_dot
value: 0.3097469345046609
name: Spearman Dot
- type: pearson_max
value: 0.6950438027503257
name: Pearson Max
- type: spearman_max
value: 0.6899151458827059
name: Spearman Max
# SentenceTransformer based on nomic-ai/nomic-embed-text-v1.5
This is a sentence-transformers model finetuned from nomic-ai/nomic-embed-text-v1.5 on the Mollel/swahili-n_li-triplet-swh-eng dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. It was trained with MatryoshkaLoss and is evaluated at 768, 512, 256, 128, and 64 dimensions.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** nomic-ai/nomic-embed-text-v1.5 <!-- at revision b0753ae76394dd36bcfb912a46018088bca48be0 -->
- **Maximum Sequence Length:** 8192 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
- Mollel/swahili-n_li-triplet-swh-eng
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
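
The similarity function listed above is cosine similarity. As a minimal sketch of what that computes, the NumPy snippet below scores every pair of rows in two embedding matrices. The random `toy_embeddings` array is a hypothetical stand-in for real model output, and `cosine_similarity_matrix` is an illustrative helper, not part of the Sentence Transformers API.

```python
import numpy as np


def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


# Hypothetical stand-in for the (3, 768) array returned by model.encode(...)
rng = np.random.default_rng(0)
toy_embeddings = rng.normal(size=(3, 768))

scores = cosine_similarity_matrix(toy_embeddings, toy_embeddings)
print(scores.shape)
```

Unlike the raw dot product, the cosine score ignores vector length, so it only reflects the angle between two embeddings.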
### Model Sources
- **Documentation:** Sentence Transformers Documentation
- **Repository:** Sentence Transformers on GitHub
- **Hugging Face:** Sentence Transformers on Hugging Face
### Full Model Architecture
    SentenceTransformer(
      (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NomicBertModel
      (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
    )
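
The pooling module above has only `pooling_mode_mean_tokens` enabled, meaning the sentence embedding is the average of the token embeddings. The sketch below illustrates that idea in NumPy under the assumption that padding tokens are excluded through the attention mask. `mean_pool` and the toy arrays are hypothetical and are not the library's implementation.

```python
import numpy as np


def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings per sentence, ignoring padded positions.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # guard against empty rows
    return summed / counts


# One toy sentence with two real tokens and one padding token
token_embeddings = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
attention_mask = np.array([[1, 1, 0]])

pooled = mean_pool(token_embeddings, attention_mask)
print(pooled)
```

The padded position carries large values on purpose: it has no effect on the result because the mask zeroes it out before averaging.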
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
    pip install -U sentence-transformers
Then you can load this model and run inference.
    from sentence_transformers import SentenceTransformer

    # Download from the 🤗 Hub.
    # The NomicBert base architecture ships custom modeling code, so loading
    # typically requires trust_remote_code=True.
    model = SentenceTransformer(
        "Mollel/MultiLinguSwahili-nomic-embed-text-v1.5-nli-matryoshka",
        trust_remote_code=True,
    )
    # Run inference
    sentences = [
        'Mwanamume na mwanamke wachanga waliovaa mikoba wanaweka au kuondoa kitu kutoka kwenye mti mweupe wa zamani, huku watu wengine wamesimama au wameketi nyuma.',
        'mwanamume na mwanamke wenye mikoba',
        'tai huruka',
    ]
    embeddings = model.encode(sentences)
    print(embeddings.shape)
    # [3, 768]

    # Get the similarity scores for the embeddings
    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # [3, 3]
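
Because the model was trained with MatryoshkaLoss and is evaluated at 768, 512, 256, 128, and 64 dimensions, shorter embeddings can be obtained by keeping only the leading components. The following is a minimal sketch of that truncation, assuming the vectors are re-normalized afterwards for cosine similarity. `truncate_embeddings` is a hypothetical helper and the random `full` array stands in for real `model.encode` output.

```python
import numpy as np


def truncate_embeddings(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of each row and rescale to unit length."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)


# Hypothetical stand-in for the (3, 768) array returned by model.encode(...)
rng = np.random.default_rng(42)
full = rng.normal(size=(3, 768))

for dim in (768, 512, 256, 128, 64):
    small = truncate_embeddings(full, dim)
    print(dim, small.shape)
```

Recent Sentence Transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which performs the slicing step for you. Whether it is available depends on the installed version.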
### Downstream Usage (Sentence Transformers)
You can finetune this model on your own dataset.
## Evaluation
### Metrics
#### Semantic Similarity
* Dataset: `sts-test-768`
* Evaluated with <code>EmbeddingSimilarityEvaluator</code>

| Metric | Value |
|:---|:---|
| pearson_cosine | 0.6945 |
| **spearman_cosine** | **0.6872** |
| pearson_manhattan | 0.7086 |
| spearman_manhattan | 0.7136 |
| pearson_euclidean | 0.7084 |
| spearman_euclidean | 0.7128 |
| pearson_dot | 0.4819 |
| spearman_dot | 0.4659 |
| pearson_max | 0.7086 |
| spearman_max | 0.7136 |
#### Semantic Similarity
* Dataset: `sts-test-512`
* Evaluated with <code>EmbeddingSimilarityEvaluator</code>

| Metric | Value |
|:---|:---|
| pearson_cosine | 0.6926 |
| **spearman_cosine** | **0.6859** |
| pearson_manhattan | 0.7087 |
| spearman_manhattan | 0.7128 |
| pearson_euclidean | 0.7089 |
| spearman_euclidean | 0.7124 |
| pearson_dot | 0.4684 |
| spearman_dot | 0.4526 |
| pearson_max | 0.7089 |
| spearman_max | 0.7128 |
#### Semantic Similarity
* Dataset: `sts-test-256`
* Evaluated with <code>EmbeddingSimilarityEvaluator</code>

| Metric | Value |
|:---|:---|
| pearson_cosine | 0.6877 |
| **spearman_cosine** | **0.6815** |
| pearson_manhattan | 0.7084 |
| spearman_manhattan | 0.7098 |
| pearson_euclidean | 0.7094 |
| spearman_euclidean | 0.7104 |
| pearson_dot | 0.4439 |
| spearman_dot | 0.4255 |
| pearson_max | 0.7094 |
| spearman_max | 0.7104 |
#### Semantic Similarity
* Dataset: `sts-test-128`
* Evaluated with <code>EmbeddingSimilarityEvaluator</code>

| Metric | Value |
|:---|:---|
| pearson_cosine | 0.6709 |
| **spearman_cosine** | **0.667** |
| pearson_manhattan | 0.7042 |
| spearman_manhattan | 0.7001 |
| pearson_euclidean | 0.7055 |
| spearman_euclidean | 0.7023 |
| pearson_dot | 0.3786 |
| spearman_dot | 0.3593 |
| pearson_max | 0.7055 |
| spearman_max | 0.7023 |
#### Semantic Similarity
* Dataset: `sts-test-64`
* Evaluated with <code>EmbeddingSimilarityEvaluator</code>

| Metric | Value |
|:---|:---|
| pearson_cosine | 0.6534 |
| **spearman_cosine** | **0.6524** |
| pearson_manhattan | 0.692 |
| spearman_manhattan | 0.6857 |
| pearson_euclidean | 0.695 |
| spearman_euclidean | 0.6899 |
| pearson_dot | 0.335 |
| spearman_dot | 0.3097 |
| pearson_max | 0.695 |
| spearman_max | 0.6899 |
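
The tables above report Pearson and Spearman correlations between the model's similarity scores and gold labels, plus a `*_max` row that matches the largest value among the cosine, Manhattan, Euclidean, and dot variants. The sketch below shows how the two correlations differ, using plain NumPy. The `gold` and `predicted` lists are invented toy values, and the rank helper assumes there are no ties.

```python
import numpy as np


def pearson(x, y) -> float:
    """Linear correlation between two equally long sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def ranks(values) -> np.ndarray:
    """Rank positions 0..n-1 (assumes no tied values)."""
    order = np.argsort(values)
    result = np.empty(len(values), dtype=float)
    result[order] = np.arange(len(values))
    return result


def spearman(x, y) -> float:
    """Pearson correlation computed on ranks instead of raw values."""
    return pearson(ranks(x), ranks(y))


# Invented toy data: gold similarity labels and model scores
gold = [0.0, 1.0, 2.5, 4.0, 5.0]
predicted = [0.10, 0.20, 0.40, 0.95, 0.90]
print(pearson(gold, predicted), spearman(gold, predicted))

# Values copied from the sts-test-768 table
pearson_768 = {"cosine": 0.6945, "manhattan": 0.7086, "euclidean": 0.7084, "dot": 0.4819}
print(max(pearson_768.values()))
```

Spearman only looks at ordering, so it is less sensitive to the scale of the similarity scores than Pearson.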
## Training Details
### Training Dataset
#### Mollel/swahili-n_li-triplet-swh-eng
* Dataset: Mollel/swahili-n_li-triplet-swh-eng
* Size: 1,115,700 training samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative |
|:---|:---|:---|:---|
| type | string | string | string |
| details | <ul><li>min: 7 tokens</li><li>mean: 15.18 tokens</li><li>max: 80 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 18.53 tokens</li><li>max: 52 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 17.8 tokens</li><li>max: 53 tokens</li></ul> |
* Samples:
| anchor | positive | negative |
|:---|:---|:---|
| <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is outdoors, on a horse.</code> | <code>A person is at a diner, ordering an omelette.</code> |
| <code>Mtu aliyepanda farasi anaruka juu ya ndege iliyovunjika.</code> | <code>Mtu yuko nje, juu ya farasi.</code> | <code>Mtu yuko kwenye mkahawa, akiagiza omelette.</code> |
| <code>Children smiling and waving at camera</code> | <code>There are children present</code> | <code>The kids are frowning</code> |
* Loss: <code>MatryoshkaLoss</code> with these parameters:

      {
          "loss": "MultipleNegativesRankingLoss",
          "matryoshka_dims": [...],
          "matryoshka_weights": [...],
          "n_dims_per_step": -1
      }
### Evaluation Dataset
#### Mollel/swahili-n_li-triplet-swh-eng
* Dataset: Mollel/swahili-n_li-triplet-swh-eng
* Size: 13,168 evaluation samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative |
|:---|:---|:---|:---|
| type | string | string | string |
| details | <ul><li>min: 6 tokens</li><li>mean: 26.43 tokens</li><li>max: 94 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 13.37 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 14.7 tokens</li><li>max: 54 tokens</li></ul> |
* Samples:
| anchor | positive | negative |
|:---|:---|:---|
| <code>Two women are embracing while holding to go packages.</code> | <code>Two woman are holding packages.</code> | <code>The men are fighting outside a deli.</code> |
| <code>Wanawake wawili wanakumbatiana huku wakishikilia vifurushi vya kwenda.</code> | <code>Wanawake wawili wanashikilia vifurushi.</code> | <code>Wanaume hao wanapigana nje ya duka la vyakula vitamu.</code> |
| <code>Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink.</code> | <code>Two kids in numbered jerseys wash their hands.</code> | <code>Two kids in jackets walk to school.</code> |
* Loss: <code>MatryoshkaLoss</code> with these parameters:

      {
          "loss": "MultipleNegativesRankingLoss",
          "matryoshka_dims": [...],
          "matryoshka_weights": [...],
          "n_dims_per_step": -1
      }
### Training Hyperparameters
#### Non-Default Hyperparameters
- `per_device_train_batch_size`: 24
- `per_device_eval_batch_size`: 24
- `learning_rate`: 2e-05
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `bf16`: True
- `batch_sampler`: no_duplicates
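
To make the learning-rate settings above concrete, here is a small sketch of a linear schedule with warmup, using `learning_rate` 2e-05 and `warmup_ratio` 0.1. The total of 23,244 steps is taken from the last row of the training logs. Rounding the warmup length up with `math.ceil` is an assumption about how the trainer derives it, and `linear_schedule_with_warmup` is a hypothetical helper rather than a library function.

```python
import math


def linear_schedule_with_warmup(step: int, base_lr: float, total_steps: int, warmup_ratio: float) -> float:
    """Learning rate at `step`: linear ramp up during warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumed rounding rule
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 23244  # final step reported in the training logs
BASE_LR = 2e-5
WARMUP_RATIO = 0.1

for step in (0, 1000, 2325, 11622, 23244):
    print(step, linear_schedule_with_warmup(step, BASE_LR, TOTAL_STEPS, WARMUP_RATIO))
```

Under these assumptions the rate peaks at step 2,325 and then falls linearly until the end of the single epoch.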
#### All Hyperparameters
<details><summary>Click to expand</summary>
- `overwrite_output_dir`: False
- `do_predict`: False
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 24
- `per_device_eval_batch_size`: 24
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>
### Training Logs
<details><summary>Click to expand</summary>
| Epoch | Step | Training Loss | sts-test-128_spearman_cosine | sts-test-256_spearman_cosine | sts-test-512_spearman_cosine | sts-test-64_spearman_cosine | sts-test-768_spearman_cosine |
|:---|:---|:---|:---|:---|:---|:---|:---|
| 0.0043 | 100 | 10.0627 | - | - | - | - | - |
| 0.0086 | 200 | 8.2355 | - | - | - | - | - |
| 0.0129 | 300 | 6.7233 | - | - | - | - | - |
| 0.0172 | 400 | 6.5832 | - | - | - | - | - |
| 0.0215 | 500 | 6.7512 | - | - | - | - | - |
| 0.0258 | 600 | 6.7634 | - | - | - | - | - |
| 0.0301 | 700 | 6.5592 | - | - | - | - | - |
| 0.0344 | 800 | 5.0689 | - | - | - | - | - |
| 0.0387 | 900 | 4.7079 | - | - | - | - | - |
| 0.0430 | 1000 | 4.6359 | - | - | - | - | - |
| 0.0473 | 1100 | 4.4513 | - | - | - | - | - |
| 0.0516 | 1200 | 4.2328 | - | - | - | - | - |
| 0.0559 | 1300 | 3.7454 | - | - | - | - | - |
| 0.0602 | 1400 | 3.9198 | - | - | - | - | - |
| 0.0645 | 1500 | 4.0727 | - | - | - | - | - |
| 0.0688 | 1600 | 3.8923 | - | - | - | - | - |
| 0.0731 | 1700 | 3.8137 | - | - | - | - | - |
| 0.0774 | 1800 | 4.1512 | - | - | - | - | - |
| 0.0817 | 1900 | 4.1304 | - | - | - | - | - |
| 0.0860 | 2000 | 4.0195 | - | - | - | - | - |
| 0.0903 | 2100 | 3.6836 | - | - | - | - | - |
| 0.0946 | 2200 | 2.9968 | - | - | - | - | - |
| 0.0990 | 2300 | 2.8909 | - | - | - | - | - |
| 0.1033 | 2400 | 3.0884 | - | - | - | - | - |
| 0.1076 | 2500 | 3.3081 | - | - | - | - | - |
| 0.1119 | 2600 | 3.6266 | - | - | - | - | - |
| 0.1162 | 2700 | 4.3754 | - | - | - | - | - |
| 0.1205 | 2800 | 4.0218 | - | - | - | - | - |
| 0.1248 | 2900 | 3.7167 | - | - | - | - | - |
| 0.1291 | 3000 | 3.4815 | - | - | - | - | - |
| 0.1334 | 3100 | 3.6446 | - | - | - | - | - |
| 0.1377 | 3200 | 3.44 | - | - | - | - | - |
| 0.1420 | 3300 | 3.6725 | - | - | - | - | - |
| 0.1463 | 3400 | 3.4699 | - | - | - | - | - |
| 0.1506 | 3500 | 3.076 | - | - | - | - | - |
| 0.1549 | 3600 | 3.1179 | - | - | - | - | - |
| 0.1592 | 3700 | 3.1704 | - | - | - | - | - |
| 0.1635 | 3800 | 3.4614 | - | - | - | - | - |
| 0.1678 | 3900 | 4.1157 | - | - | - | - | - |
| 0.1721 | 4000 | 4.1584 | - | - | - | - | - |
| 0.1764 | 4100 | 4.5602 | - | - | - | - | - |
| 0.1807 | 4200 | 3.6875 | - | - | - | - | - |
| 0.1850 | 4300 | 4.1521 | - | - | - | - | - |
| 0.1893 | 4400 | 3.5475 | - | - | - | - | - |
| 0.1936 | 4500 | 3.4036 | - | - | - | - | - |
| 0.1979 | 4600 | 3.0564 | - | - | - | - | - |
| 0.2022 | 4700 | 3.7761 | - | - | - | - | - |
| 0.2065 | 4800 | 3.6857 | - | - | - | - | - |
| 0.2108 | 4900 | 3.3534 | - | - | - | - | - |
| 0.2151 | 5000 | 4.1137 | - | - | - | - | - |
| 0.2194 | 5100 | 3.5239 | - | - | - | - | - |
| 0.2237 | 5200 | 4.1297 | - | - | - | - | - |
| 0.2280 | 5300 | 3.5339 | - | - | - | - | - |
| 0.2323 | 5400 | 3.9294 | - | - | - | - | - |
| 0.2366 | 5500 | 3.717 | - | - | - | - | - |
| 0.2409 | 5600 | 3.3346 | - | - | - | - | - |
| 0.2452 | 5700 | 4.0495 | - | - | - | - | - |
| 0.2495 | 5800 | 3.7869 | - | - | - | - | - |
| 0.2538 | 5900 | 3.9533 | - | - | - | - | - |
| 0.2581 | 6000 | 4.1135 | - | - | - | - | - |
| 0.2624 | 6100 | 3.6655 | - | - | - | - | - |
| 0.2667 | 6200 | 3.9111 | - | - | - | - | - |
| 0.2710 | 6300 | 3.8582 | - | - | - | - | - |
| 0.2753 | 6400 | 3.7712 | - | - | - | - | - |
| 0.2796 | 6500 | 3.6536 | - | - | - | - | - |
| 0.2839 | 6600 | 3.4516 | - | - | - | - | - |
| 0.2882 | 6700 | 3.7151 | - | - | - | - | - |
| 0.2925 | 6800 | 3.7659 | - | - | - | - | - |
| 0.2969 | 6900 | 3.3159 | - | - | - | - | - |
| 0.3012 | 7000 | 3.5753 | - | - | - | - | - |
| 0.3055 | 7100 | 4.2095 | - | - | - | - | - |
| 0.3098 | 7200 | 3.718 | - | - | - | - | - |
| 0.3141 | 7300 | 4.0709 | - | - | - | - | - |
| 0.3184 | 7400 | 3.8079 | - | - | - | - | - |
| 0.3227 | 7500 | 3.3735 | - | - | - | - | - |
| 0.3270 | 7600 | 3.7303 | - | - | - | - | - |
| 0.3313 | 7700 | 3.2693 | - | - | - | - | - |
| 0.3356 | 7800 | 3.6564 | - | - | - | - | - |
| 0.3399 | 7900 | 3.6702 | - | - | - | - | - |
| 0.3442 | 8000 | 3.7274 | - | - | - | - | - |
| 0.3485 | 8100 | 3.8536 | - | - | - | - | - |
| 0.3528 | 8200 | 3.9516 | - | - | - | - | - |
| 0.3571 | 8300 | 3.7351 | - | - | - | - | - |
| 0.3614 | 8400 | 3.649 | - | - | - | - | - |
| 0.3657 | 8500 | 3.5913 | - | - | - | - | - |
| 0.3700 | 8600 | 3.7733 | - | - | - | - | - |
| 0.3743 | 8700 | 3.6359 | - | - | - | - | - |
| 0.3786 | 8800 | 4.2983 | - | - | - | - | - |
| 0.3829 | 8900 | 3.6692 | - | - | - | - | - |
| 0.3872 | 9000 | 3.7309 | - | - | - | - | - |
| 0.3915 | 9100 | 3.8886 | - | - | - | - | - |
| 0.3958 | 9200 | 3.8999 | - | - | - | - | - |
| 0.4001 | 9300 | 3.5528 | - | - | - | - | - |
| 0.4044 | 9400 | 3.6309 | - | - | - | - | - |
| 0.4087 | 9500 | 4.2475 | - | - | - | - | - |
| 0.4130 | 9600 | 3.793 | - | - | - | - | - |
| 0.4173 | 9700 | 3.6575 | - | - | - | - | - |
| 0.4216 | 9800 | 3.84 | - | - | - | - | - |
| 0.4259 | 9900 | 3.3721 | - | - | - | - | - |
| 0.4302 | 10000 | 4.3743 | - | - | - | - | - |
| 0.4345 | 10100 | 3.5054 | - | - | - | - | - |
| 0.4388 | 10200 | 3.54 | - | - | - | - | - |
| 0.4431 | 10300 | 3.6197 | - | - | - | - | - |
| 0.4474 | 10400 | 3.7567 | - | - | - | - | - |
| 0.4517 | 10500 | 3.9814 | - | - | - | - | - |
| 0.4560 | 10600 | 3.6277 | - | - | - | - | - |
| 0.4603 | 10700 | 3.5071 | - | - | - | - | - |
| 0.4646 | 10800 | 3.8348 | - | - | - | - | - |
| 0.4689 | 10900 | 3.8674 | - | - | - | - | - |
| 0.4732 | 11000 | 3.0325 | - | - | - | - | - |
| 0.4775 | 11100 | 3.7262 | - | - | - | - | - |
| 0.4818 | 11200 | 3.6921 | - | - | - | - | - |
| 0.4861 | 11300 | 3.4946 | - | - | - | - | - |
| 0.4904 | 11400 | 3.7541 | - | - | - | - | - |
| 0.4948 | 11500 | 3.6751 | - | - | - | - | - |
| 0.4991 | 11600 | 3.8765 | - | - | - | - | - |
| 0.5034 | 11700 | 3.5058 | - | - | - | - | - |
| 0.5077 | 11800 | 3.5135 | - | - | - | - | - |
| 0.5120 | 11900 | 3.8052 | - | - | - | - | - |
| 0.5163 | 12000 | 3.3015 | - | - | - | - | - |
| 0.5206 | 12100 | 3.5389 | - | - | - | - | - |
| 0.5249 | 12200 | 3.5226 | - | - | - | - | - |
| 0.5292 | 12300 | 3.6715 | - | - | - | - | - |
| 0.5335 | 12400 | 3.2256 | - | - | - | - | - |
| 0.5378 | 12500 | 3.3447 | - | - | - | - | - |
| 0.5421 | 12600 | 3.6315 | - | - | - | - | - |
| 0.5464 | 12700 | 3.8674 | - | - | - | - | - |
| 0.5507 | 12800 | 3.4066 | - | - | - | - | - |
| 0.5550 | 12900 | 3.7356 | - | - | - | - | - |
| 0.5593 | 13000 | 3.5742 | - | - | - | - | - |
| 0.5636 | 13100 | 3.7676 | - | - | - | - | - |
| 0.5679 | 13200 | 3.7907 | - | - | - | - | - |
| 0.5722 | 13300 | 3.8089 | - | - | - | - | - |
| 0.5765 | 13400 | 3.4742 | - | - | - | - | - |
| 0.5808 | 13500 | 3.6536 | - | - | - | - | - |
| 0.5851 | 13600 | 3.7736 | - | - | - | - | - |
| 0.5894 | 13700 | 3.9072 | - | - | - | - | - |
| 0.5937 | 13800 | 3.7386 | - | - | - | - | - |
| 0.5980 | 13900 | 3.3387 | - | - | - | - | - |
| 0.6023 | 14000 | 3.5509 | - | - | - | - | - |
| 0.6066 | 14100 | 3.7056 | - | - | - | - | - |
| 0.6109 | 14200 | 3.7283 | - | - | - | - | - |
| 0.6152 | 14300 | 3.7301 | - | - | - | - | - |
| 0.6195 | 14400 | 3.8027 | - | - | - | - | - |
| 0.6238 | 14500 | 3.5606 | - | - | - | - | - |
| 0.6281 | 14600 | 3.9467 | - | - | - | - | - |
| 0.6324 | 14700 | 3.3394 | - | - | - | - | - |
| 0.6367 | 14800 | 4.1254 | - | - | - | - | - |
| 0.6410 | 14900 | 3.7121 | - | - | - | - | - |
| 0.6453 | 15000 | 3.9167 | - | - | - | - | - |
| 0.6496 | 15100 | 3.8084 | - | - | - | - | - |
| 0.6539 | 15200 | 3.7794 | - | - | - | - | - |
| 0.6582 | 15300 | 3.7664 | - | - | - | - | - |
| 0.6625 | 15400 | 3.4378 | - | - | - | - | - |
| 0.6668 | 15500 | 3.6632 | - | - | - | - | - |
| 0.6711 | 15600 | 3.8493 | - | - | - | - | - |
| 0.6754 | 15700 | 4.1475 | - | - | - | - | - |
| 0.6797 | 15800 | 3.5782 | - | - | - | - | - |
| 0.6840 | 15900 | 3.4341 | - | - | - | - | - |
| 0.6883 | 16000 | 3.3295 | - | - | - | - | - |
| 0.6927 | 16100 | 3.8165 | - | - | - | - | - |
| 0.6970 | 16200 | 3.9702 | - | - | - | - | - |
| 0.7013 | 16300 | 3.6555 | - | - | - | - | - |
| 0.7056 | 16400 | 3.6946 | - | - | - | - | - |
| 0.7099 | 16500 | 3.8027 | - | - | - | - | - |
| 0.7142 | 16600 | 3.4523 | - | - | - | - | - |
| 0.7185 | 16700 | 3.461 | - | - | - | - | - |
| 0.7228 | 16800 | 3.4403 | - | - | - | - | - |
| 0.7271 | 16900 | 3.6398 | - | - | - | - | - |
| 0.7314 | 17000 | 3.8443 | - | - | - | - | - |
| 0.7357 | 17100 | 3.6012 | - | - | - | - | - |
| 0.7400 | 17200 | 3.6645 | - | - | - | - | - |
| 0.7443 | 17300 | 3.4899 | - | - | - | - | - |
| 0.7486 | 17400 | 3.7186 | - | - | - | - | - |
| 0.7529 | 17500 | 3.6199 | - | - | - | - | - |
| 0.7572 | 17600 | 4.4274 | - | - | - | - | - |
| 0.7615 | 17700 | 4.0262 | - | - | - | - | - |
| 0.7658 | 17800 | 3.9325 | - | - | - | - | - |
| 0.7701 | 17900 | 3.6338 | - | - | - | - | - |
| 0.7744 | 18000 | 3.6136 | - | - | - | - | - |
| 0.7787 | 18100 | 3.4514 | - | - | - | - | - |
| 0.7830 | 18200 | 3.4427 | - | - | - | - | - |
| 0.7873 | 18300 | 3.3601 | - | - | - | - | - |
| 0.7916 | 18400 | 3.313 | - | - | - | - | - |
| 0.7959 | 18500 | 3.4062 | - | - | - | - | - |
| 0.8002 | 18600 | 3.098 | - | - | - | - | - |
| 0.8045 | 18700 | 3.183 | - | - | - | - | - |
| 0.8088 | 18800 | 3.1482 | - | - | - | - | - |
| 0.8131 | 18900 | 3.0122 | - | - | - | - | - |
| 0.8174 | 19000 | 3.0828 | - | - | - | - | - |
| 0.8217 | 19100 | 3.063 | - | - | - | - | - |
| 0.8260 | 19200 | 2.9688 | - | - | - | - | - |
| 0.8303 | 19300 | 3.0425 | - | - | - | - | - |
| 0.8346 | 19400 | 3.2018 | - | - | - | - | - |
| 0.8389 | 19500 | 2.9111 | - | - | - | - | - |
| 0.8432 | 19600 | 2.9516 | - | - | - | - | - |
| 0.8475 | 19700 | 2.9115 | - | - | - | - | - |
| 0.8518 | 19800 | 2.9323 | - | - | - | - | - |
| 0.8561 | 19900 | 2.8753 | - | - | - | - | - |
| 0.8604 | 20000 | 2.8344 | - | - | - | - | - |
| 0.8647 | 20100 | 2.7665 | - | - | - | - | - |
| 0.8690 | 20200 | 2.7732 | - | - | - | - | - |
| 0.8733 | 20300 | 2.8622 | - | - | - | - | - |
| 0.8776 | 20400 | 2.8749 | - | - | - | - | - |
| 0.8819 | 20500 | 2.8534 | - | - | - | - | - |
| 0.8863 | 20600 | 2.9254 | - | - | - | - | - |
| 0.8906 | 20700 | 2.7366 | - | - | - | - | - |
| 0.8949 | 20800 | 2.7287 | - | - | - | - | - |
| 0.8992 | 20900 | 2.9469 | - | - | - | - | - |
| 0.9035 | 21000 | 2.9052 | - | - | - | - | - |
| 0.9078 | 21100 | 2.7256 | - | - | - | - | - |
| 0.9121 | 21200 | 2.8469 | - | - | - | - | - |
| 0.9164 | 21300 | 2.6626 | - | - | - | - | - |
| 0.9207 | 21400 | 2.6796 | - | - | - | - | - |
| 0.9250 | 21500 | 2.6927 | - | - | - | - | - |
| 0.9293 | 21600 | 2.7125 | - | - | - | - | - |
| 0.9336 | 21700 | 2.6734 | - | - | - | - | - |
| 0.9379 | 21800 | 2.7199 | - | - | - | - | - |
| 0.9422 | 21900 | 2.6635 | - | - | - | - | - |
| 0.9465 | 22000 | 2.5218 | - | - | - | - | - |
| 0.9508 | 22100 | 2.7595 | - | - | - | - | - |
| 0.9551 | 22200 | 2.6821 | - | - | - | - | - |
| 0.9594 | 22300 | 2.6578 | - | - | - | - | - |
| 0.9637 | 22400 | 2.568 | - | - | - | - | - |
| 0.9680 | 22500 | 2.5527 | - | - | - | - | - |
| 0.9723 | 22600 | 2.6857 | - | - | - | - | - |
| 0.9766 | 22700 | 2.6637 | - | - | - | - | - |
| 0.9809 | 22800 | 2.6311 | - | - | - | - | - |
| 0.9852 | 22900 | 2.4635 | - | - | - | - | - |
| 0.9895 | 23000 | 2.6239 | - | - | - | - | - |
| 0.9938 | 23100 | 2.6873 | - | - | - | - | - |
| 0.9981 | 23200 | 2.5138 | - | - | - | - | - |
| 1.0 | 23244 | - | 0.6670 | 0.6815 | 0.6859 | 0.6524 | 0.6872 |

</details>
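
The Training Loss column comes from MatryoshkaLoss wrapped around MultipleNegativesRankingLoss. The NumPy sketch below illustrates that combination: an in-batch ranking loss over scaled cosine similarities, summed over truncated embedding sizes. It is a simplified illustration, not the library code. The dimension list (taken from the evaluation names), the equal weights, and the scale of 20 are assumptions, and the toy triplets are random vectors.

```python
import numpy as np


def normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def mnr_loss(anchors, positives, negatives, scale=20.0) -> float:
    """Cross-entropy where anchor i must rank positive i above all other candidates."""
    candidates = np.concatenate([positives, negatives], axis=0)
    scores = scale * (normalize(anchors) @ normalize(candidates).T)  # (batch, 2 * batch)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    idx = np.arange(len(anchors))
    return float(-log_probs[idx, idx].mean())


def matryoshka_mnr_loss(anchors, positives, negatives, dims, weights) -> float:
    """Weighted sum of the ranking loss on the first d components, for each d."""
    return sum(
        w * mnr_loss(anchors[:, :d], positives[:, :d], negatives[:, :d])
        for d, w in zip(dims, weights)
    )


DIMS = (768, 512, 256, 128, 64)  # assumed from the evaluation set names
WEIGHTS = (1.0, 1.0, 1.0, 1.0, 1.0)  # assumed equal weighting

rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 768))
positives = anchors + 0.01 * rng.normal(size=(4, 768))  # near-duplicates of the anchors
negatives = rng.normal(size=(4, 768))

aligned = matryoshka_mnr_loss(anchors, positives, negatives, DIMS, WEIGHTS)
shuffled = matryoshka_mnr_loss(anchors, positives[::-1], negatives, DIMS, WEIGHTS)
print(aligned < shuffled)
```

Pairing each anchor with the wrong positive (the `shuffled` case) makes the loss much larger, which is the signal the training objective uses to pull matching sentences together at every truncation length.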
### Framework Versions
- Python: 3.11.9
- Sentence Transformers: 3.0.1
- Transformers: 4.40.1
- PyTorch: 2.3.0+cu121
- Accelerate: 0.29.3
- Datasets: 2.19.0
- Tokenizers: 0.19.1
## Citation
### BibTeX
#### Sentence Transformers
    @inproceedings{reimers-2019-sentence-bert,
        title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
        author = "Reimers, Nils and Gurevych, Iryna",
        booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
        month = "11",
        year = "2019",
        publisher = "Association for Computational Linguistics",
    }
#### MatryoshkaLoss
    @misc{kusupati2024matryoshka,
        title={Matryoshka Representation Learning},
        author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    }
#### MultipleNegativesRankingLoss
    @misc{henderson2017efficient,
        title={Efficient Natural Language Response Suggestion for Smart Reply},
        author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    }