SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-small-en-v1.5. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
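
Read top to bottom, the three modules run the BERT encoder, keep the first ([CLS]) token's vector as the sentence embedding, and scale it to unit L2 length. The sketch below illustrates the last two steps in plain Python. The token vectors are made up (4 dimensions instead of 384), and `cls_pool` and `l2_normalize` are illustrative names, not library functions.

```python
import math

def cls_pool(token_embeddings):
    """Return the vector at the first token position ([CLS])."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy stand-in for encoder output: 3 tokens x 4 dimensions (assumed values).
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS] position
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output has unit length, the cosine similarity of two embeddings reduces to their dot product.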

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Give me a quick summary of [NEWSLETTER_NAME_1]',
    '[{"newsletter_search([\'<NEWSLETTER_NAME_1>\'],None,None,\'<DATES>\',True)": "newsletter_chunks"}]',
    '[{"get_portfolio([\'type\', \'marketValue\'],True,<PORTFOLIO_NAME_1>)": "portfolio"}, {"filter(\'portfolio\',\'type\',\'==\',\'CASH\')": "portfolio"}, {"aggregate(\'portfolio\',\'ticker\',\'marketValue\',\'sum\',None)": "buying_power"}]',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
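
In a retrieval setting, the query embedding would typically be scored against each candidate tool-call string and the highest-scoring candidate kept. The following is a minimal sketch of that selection step, using toy unit vectors rather than real model output. `rank_candidates` is a hypothetical helper, not part of Sentence Transformers.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))

def rank_candidates(query_vec, candidate_vecs):
    """Return (candidate_index, score) pairs sorted from best to worst match."""
    scored = [(i, dot(query_vec, c)) for i, c in enumerate(candidate_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 2-dimensional unit vectors (assumed values, for illustration only).
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [0.6, 0.8], [1.0, 0.0]]

ranking = rank_candidates(query, candidates)
print(ranking[0])  # (2, 1.0)
```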

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.6571
cosine_accuracy@3 0.9006
cosine_accuracy@5 0.9487
cosine_accuracy@10 0.9808
cosine_precision@1 0.6571
cosine_precision@3 0.3002
cosine_precision@5 0.1897
cosine_precision@10 0.0981
cosine_recall@1 0.0183
cosine_recall@3 0.025
cosine_recall@5 0.0264
cosine_recall@10 0.0272
cosine_ndcg@10 0.1826
cosine_mrr@10 0.7801
cosine_map@100 0.0217
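
The combination of high accuracy@k with very low recall@k is consistent with each query having many relevant documents, of which roughly one shows up in the top k. The card does not state this, so treat it as one possible reading. The sketch below computes the per-query metrics as they are conventionally defined, on a hypothetical ranking with 40 relevant documents; the document ids are invented.

```python
def metrics_at_k(ranked_ids, relevant_ids, k):
    """Per-query retrieval metrics over the top-k ranked documents."""
    top_k = ranked_ids[:k]
    hits = [doc_id for doc_id in top_k if doc_id in relevant_ids]
    # 1-based rank of the first relevant document in the top k, if any.
    first_hit_rank = next(
        (rank for rank, doc_id in enumerate(top_k, start=1) if doc_id in relevant_ids),
        None,
    )
    return {
        "accuracy": 1.0 if hits else 0.0,         # any relevant doc in the top k
        "precision": len(hits) / k,               # share of the top k that is relevant
        "recall": len(hits) / len(relevant_ids),  # share of all relevant docs retrieved
        "mrr": 1.0 / first_hit_rank if first_hit_rank else 0.0,
    }

# Hypothetical query: 40 relevant ids, exactly one of them retrieved, at rank 2.
relevant = set(range(100, 140))
ranked = [7, 100, 8, 9, 10, 11, 12, 13, 14, 15]

result = metrics_at_k(ranked, relevant, k=10)
print(result)  # {'accuracy': 1.0, 'precision': 0.1, 'recall': 0.025, 'mrr': 0.5}
```

With a single hit among 40 relevant documents, accuracy@10 is 1.0 while recall@10 is only 0.025, the same order of magnitude as the values reported above.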

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,024 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 4 tokens, mean 13.46 tokens, max 27 tokens
    • sentence_1 (string): min 20 tokens, mean 90.37 tokens, max 280 tokens
  • Samples:
    sentence_0: how are my holdings doing [DATES]?
    sentence_1: [{"get_portfolio(None, True, None)": "portfolio"}, {"get_attribute('portfolio',['gains'],'')": "portfolio"}, {"sort('portfolio','gains','desc')": "portfolio"}]

    sentence_0: how much did I earn [DATES]
    sentence_1: [{"get_portfolio(None, True, None)": "portfolio"}, {"get_attribute('portfolio',['gains'],'')": "portfolio"}, {"sort('portfolio','gains','desc')": "portfolio"}]

    sentence_0: how am i doing [DATES]?
    sentence_1: [{"get_portfolio(None, True, None)": "portfolio"}, {"get_attribute('portfolio',['gains'],'')": "portfolio"}, {"sort('portfolio','gains','desc')": "portfolio"}]
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
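
As a rough illustration of what this loss computes: for each sentence_0 in a batch, its paired sentence_1 is the target and every other sentence_1 in the batch acts as a negative. Cosine similarities are multiplied by the scale (20.0) and passed through softmax cross-entropy. The sketch below is a plain-Python approximation on toy 2-dimensional vectors, not the library implementation. `mnr_loss` is an illustrative name.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and all other positives in the batch serve as negatives."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two pairs (assumed values).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]  # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]  # each anchor matches the other pair's positive

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned < loss_swapped)  # True
```

Under this formulation, two pairs in the same batch that share an identical sentence_1 (as the samples above do) would count each other as negatives.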
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 6
  • multi_dataset_batch_sampler: round_robin
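
These settings determine the step counts in the training log. The arithmetic below assumes a single device, gradient_accumulation_steps of 1, and dataloader_drop_last set to False, as listed under All Hyperparameters.

```python
import math

train_samples = 1024  # from the Training Dataset section
batch_size = 10       # per_device_train_batch_size
epochs = 6            # num_train_epochs

# The last, partial batch is kept, so round up.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 103 618

# Fractional epoch reached after 2 optimizer steps.
epoch_at_step_2 = round(2 / steps_per_epoch, 4)
print(epoch_at_step_2)  # 0.0194
```

This agrees with the training log, where step 2 is recorded at epoch 0.0194 and step 103 at epoch 1.0.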

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 6
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
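
With lr_scheduler_type linear, learning_rate 5e-05 and no warmup, the learning rate would decay in a straight line from 5e-05 to 0 over the run. The sketch below shows that schedule, assuming 618 total steps (ceil(1024 / 10) steps per epoch times 6 epochs on a single device). `linear_lr` is an illustrative function, not a library call.

```python
def linear_lr(step, total_steps=618, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))

print(linear_lr(0))    # 5e-05
print(linear_lr(309))  # 2.5e-05
print(linear_lr(618))  # 0.0
```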

Training Logs

Epoch Step Training Loss cosine_ndcg@10
0.0194 2 - 0.0744
0.0388 4 - 0.0750
0.0583 6 - 0.0769
0.0777 8 - 0.0794
0.0971 10 - 0.0819
0.1165 12 - 0.0861
0.1359 14 - 0.0878
0.1553 16 - 0.0923
0.1748 18 - 0.0976
0.1942 20 - 0.1006
0.2136 22 - 0.1046
0.2330 24 - 0.1071
0.2524 26 - 0.1093
0.2718 28 - 0.1115
0.2913 30 - 0.1141
0.3107 32 - 0.1160
0.3301 34 - 0.1176
0.3495 36 - 0.1204
0.3689 38 - 0.1222
0.3883 40 - 0.1249
0.4078 42 - 0.1264
0.4272 44 - 0.1281
0.4466 46 - 0.1302
0.4660 48 - 0.1328
0.4854 50 - 0.1356
0.5049 52 - 0.1380
0.5243 54 - 0.1405
0.5437 56 - 0.1431
0.5631 58 - 0.1471
0.5825 60 - 0.1485
0.6019 62 - 0.1494
0.6214 64 - 0.1494
0.6408 66 - 0.1511
0.6602 68 - 0.1523
0.6796 70 - 0.1520
0.6990 72 - 0.1523
0.7184 74 - 0.1534
0.7379 76 - 0.1531
0.7573 78 - 0.1532
0.7767 80 - 0.1534
0.7961 82 - 0.1544
0.8155 84 - 0.1549
0.8350 86 - 0.1567
0.8544 88 - 0.1573
0.8738 90 - 0.1570
0.8932 92 - 0.1572
0.9126 94 - 0.1569
0.9320 96 - 0.1570
0.9515 98 - 0.1577
0.9709 100 - 0.1570
0.9903 102 - 0.1578
1.0 103 - 0.1585
1.0097 104 - 0.1588
1.0291 106 - 0.1584
1.0485 108 - 0.1592
1.0680 110 - 0.1598
1.0874 112 - 0.1598
1.1068 114 - 0.1606
1.1262 116 - 0.1615
1.1456 118 - 0.1621
1.1650 120 - 0.1624
1.1845 122 - 0.1619
1.2039 124 - 0.1627
1.2233 126 - 0.1626
1.2427 128 - 0.1632
1.2621 130 - 0.1644
1.2816 132 - 0.1649
1.3010 134 - 0.1651
1.3204 136 - 0.1653
1.3398 138 - 0.1651
1.3592 140 - 0.1652
1.3786 142 - 0.1654
1.3981 144 - 0.1656
1.4175 146 - 0.1656
1.4369 148 - 0.1659
1.4563 150 - 0.1661
1.4757 152 - 0.1658
1.4951 154 - 0.1664
1.5146 156 - 0.1658
1.5340 158 - 0.1659
1.5534 160 - 0.1656
1.5728 162 - 0.1655
1.5922 164 - 0.1669
1.6117 166 - 0.1679
1.6311 168 - 0.1682
1.6505 170 - 0.1687
1.6699 172 - 0.1692
1.6893 174 - 0.1703
1.7087 176 - 0.1713
1.7282 178 - 0.1706
1.7476 180 - 0.1713
1.7670 182 - 0.1718
1.7864 184 - 0.1723
1.8058 186 - 0.1715
1.8252 188 - 0.1715
1.8447 190 - 0.1717
1.8641 192 - 0.1717
1.8835 194 - 0.1725
1.9029 196 - 0.1735
1.9223 198 - 0.1743
1.9417 200 - 0.1742
1.9612 202 - 0.1747
1.9806 204 - 0.1751
2.0 206 - 0.1753
2.0194 208 - 0.1757
2.0388 210 - 0.1758
2.0583 212 - 0.1753
2.0777 214 - 0.1749
2.0971 216 - 0.1749
2.1165 218 - 0.1749
2.1359 220 - 0.1749
2.1553 222 - 0.1751
2.1748 224 - 0.1745
2.1942 226 - 0.1738
2.2136 228 - 0.1742
2.2330 230 - 0.1741
2.2524 232 - 0.1735
2.2718 234 - 0.1732
2.2913 236 - 0.1735
2.3107 238 - 0.1735
2.3301 240 - 0.1731
2.3495 242 - 0.1735
2.3689 244 - 0.1736
2.3883 246 - 0.1741
2.4078 248 - 0.1742
2.4272 250 - 0.1745
2.4466 252 - 0.1750
2.4660 254 - 0.1758
2.4854 256 - 0.1757
2.5049 258 - 0.1761
2.5243 260 - 0.1758
2.5437 262 - 0.1761
2.5631 264 - 0.1761
2.5825 266 - 0.1762
2.6019 268 - 0.1761
2.6214 270 - 0.1759
2.6408 272 - 0.1759
2.6602 274 - 0.1768
2.6796 276 - 0.1772
2.6990 278 - 0.1773
2.7184 280 - 0.1769
2.7379 282 - 0.1767
2.7573 284 - 0.1766
2.7767 286 - 0.1769
2.7961 288 - 0.1770
2.8155 290 - 0.1763
2.8350 292 - 0.1767
2.8544 294 - 0.1768
2.8738 296 - 0.1773
2.8932 298 - 0.1772
2.9126 300 - 0.1765
2.9320 302 - 0.1769
2.9515 304 - 0.1769
2.9709 306 - 0.1771
2.9903 308 - 0.1771
3.0 309 - 0.1770
3.0097 310 - 0.1769
3.0291 312 - 0.1770
3.0485 314 - 0.1768
3.0680 316 - 0.1770
3.0874 318 - 0.1767
3.1068 320 - 0.1765
3.1262 322 - 0.1768
3.1456 324 - 0.1771
3.1650 326 - 0.1768
3.1845 328 - 0.1770
3.2039 330 - 0.1776
3.2233 332 - 0.1785
3.2427 334 - 0.1784
3.2621 336 - 0.1784
3.2816 338 - 0.1790
3.3010 340 - 0.1791
3.3204 342 - 0.1791
3.3398 344 - 0.1789
3.3592 346 - 0.1789
3.3786 348 - 0.1789
3.3981 350 - 0.1790
3.4175 352 - 0.1792
3.4369 354 - 0.1785
3.4563 356 - 0.1780
3.4757 358 - 0.1779
3.4951 360 - 0.1786
3.5146 362 - 0.1791
3.5340 364 - 0.1792
3.5534 366 - 0.1791
3.5728 368 - 0.1793
3.5922 370 - 0.1789
3.6117 372 - 0.1792
3.6311 374 - 0.1791
3.6505 376 - 0.1797
3.6699 378 - 0.1795
3.6893 380 - 0.1794
3.7087 382 - 0.1797
3.7282 384 - 0.1797
3.7476 386 - 0.1799
3.7670 388 - 0.1796
3.7864 390 - 0.1800
3.8058 392 - 0.1802
3.8252 394 - 0.1795
3.8447 396 - 0.1794
3.8641 398 - 0.1790
3.8835 400 - 0.1789
3.9029 402 - 0.1789
3.9223 404 - 0.1789
3.9417 406 - 0.1787
3.9612 408 - 0.1790
3.9806 410 - 0.1796
4.0 412 - 0.1799
4.0194 414 - 0.1793
4.0388 416 - 0.1797
4.0583 418 - 0.1792
4.0777 420 - 0.1795
4.0971 422 - 0.1803
4.1165 424 - 0.1804
4.1359 426 - 0.1805
4.1553 428 - 0.1805
4.1748 430 - 0.1808
4.1942 432 - 0.1808
4.2136 434 - 0.1811
4.2330 436 - 0.1814
4.2524 438 - 0.1811
4.2718 440 - 0.1813
4.2913 442 - 0.1813
4.3107 444 - 0.1814
4.3301 446 - 0.1814
4.3495 448 - 0.1815
4.3689 450 - 0.1814
4.3883 452 - 0.1812
4.4078 454 - 0.1813
4.4272 456 - 0.1813
4.4466 458 - 0.1818
4.4660 460 - 0.1814
4.4854 462 - 0.1813
4.5049 464 - 0.1815
4.5243 466 - 0.1812
4.5437 468 - 0.1812
4.5631 470 - 0.1812
4.5825 472 - 0.1812
4.6019 474 - 0.1811
4.6214 476 - 0.1812
4.6408 478 - 0.1810
4.6602 480 - 0.1810
4.6796 482 - 0.1808
4.6990 484 - 0.1807
4.7184 486 - 0.1810
4.7379 488 - 0.1810
4.7573 490 - 0.1806
4.7767 492 - 0.1806
4.7961 494 - 0.1807
4.8155 496 - 0.1812
4.8350 498 - 0.1811
4.8544 500 0.3558 0.1811
4.8738 502 - 0.1815
4.8932 504 - 0.1814
4.9126 506 - 0.1817
4.9320 508 - 0.1814
4.9515 510 - 0.1816
4.9709 512 - 0.1816
4.9903 514 - 0.1814
5.0 515 - 0.1814
5.0097 516 - 0.1817
5.0291 518 - 0.1815
5.0485 520 - 0.1815
5.0680 522 - 0.1820
5.0874 524 - 0.1817
5.1068 526 - 0.1817
5.1262 528 - 0.1818
5.1456 530 - 0.1818
5.1650 532 - 0.1818
5.1845 534 - 0.1818
5.2039 536 - 0.1818
5.2233 538 - 0.1818
5.2427 540 - 0.1818
5.2621 542 - 0.1818
5.2816 544 - 0.1821
5.3010 546 - 0.1820
5.3204 548 - 0.1821
5.3398 550 - 0.1821
5.3592 552 - 0.1824
5.3786 554 - 0.1824
5.3981 556 - 0.1822
5.4175 558 - 0.1819
5.4369 560 - 0.1819
5.4563 562 - 0.1820
5.4757 564 - 0.1818
5.4951 566 - 0.1819
5.5146 568 - 0.1819
5.5340 570 - 0.1819
5.5534 572 - 0.1820
5.5728 574 - 0.1820
5.5922 576 - 0.1820
5.6117 578 - 0.1821
5.6311 580 - 0.1821
5.6505 582 - 0.1821
5.6699 584 - 0.1818
5.6893 586 - 0.1820
5.7087 588 - 0.1823
5.7282 590 - 0.1825
5.7476 592 - 0.1822
5.7670 594 - 0.1825
5.7864 596 - 0.1825
5.8058 598 - 0.1825
5.8252 600 - 0.1826
5.8447 602 - 0.1826
5.8641 604 - 0.1826

Framework Versions

  • Python: 3.12.2
  • Sentence Transformers: 3.4.1
  • Transformers: 4.50.0
  • PyTorch: 2.6.0
  • Accelerate: 1.5.2
  • Datasets: 3.4.1
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 33.4M params (Safetensors, tensor type F32)

Model: magnifi/bge-small-en-v1-5-ft-orc-093-hce, finetuned from BAAI/bge-small-en-v1.5