SentenceTransformer based on BAAI/bge-small-en

This is a sentence-transformers model finetuned from BAAI/bge-small-en. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
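
The stack is a BERT encoder followed by CLS-token pooling and L2 normalization. As a rough sketch of what these three modules compute, the same embedding can be reproduced with the raw transformers API (shown here against the BAAI/bge-small-en base model; substitute the finetuned checkpoint id as appropriate):

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-small-en")
encoder = AutoModel.from_pretrained("BAAI/bge-small-en")

batch = tokenizer(["An example sentence"], padding=True, truncation=True,
                  max_length=512, return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state  # [batch, seq_len, 384]

# (1) Pooling with pooling_mode_cls_token=True: keep only the [CLS] token
cls_embedding = token_embeddings[:, 0]                     # [batch, 384]
# (2) Normalize: unit-length vectors, so dot product equals cosine similarity
embedding = F.normalize(cls_embedding, p=2, dim=1)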

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("nanos-hpe/bge-small-qs")
# Run inference
sentences = [
    'What is the website to find services for customers purchasing from a commercial reseller?',
    'Parts and Materials\nHPE will provide HPE-supported replacement parts and materials necessary to maintain the covered hardware\nproduct in operating condition, including parts and materials for available and recommended engineering\nimprovements. \xa0\nParts and components that have reached their maximum supported lifetime and/or the maximum usage\nlimitations as set forth in the manufacturer\'s operating manual, product quick-specs, or the technical product\ndata sheet will not be provided, repaired, or replaced as part of these services.\n\xa0\nHow to Purchase Services\nServices are sold by Hewlett Packard Enterprise and Hewlett Packard Enterprise Authorized Service Partners:\nServices for customers purchasing from HPE or an enterprise reseller are quoted using HPE order\nconfiguration tools.\nCustomers purchasing from a commercial reseller can find services at\nhttps://ssc.hpe.com/portal/site/ssc/\n\xa0\nAI Powered and Digitally Enabled Support Experience\nAchieve faster time to resolution with access to product-specific resources and expertise through a digital and\ndata driven customer experience \xa0\nSign into the HPE Support Center experience, featuring streamlined self-serve case creation and\nmanagement capabilities with inline knowledge recommendations. You will also find personalized task alerts\nand powerful troubleshooting support through an intelligent virtual agent with seamless transition when needed\nto a live support agent. \xa0\nhttps://support.hpe.com/hpesc/public/home/signin\nConsume IT On Your Terms\nHPE GreenLake  edge-to-cloud platform brings the cloud experience directly to your apps and data wherever\nthey are-the edge, colocations, or your data center. It delivers cloud services for on-premises IT infrastructure\nspecifically tailored to your most demanding workloads. With a pay-per-use, scalable, point-and-click self-\nservice experience that is managed for you, HPE GreenLake edge-to-cloud platform accelerates digital\ntransformation in a distributed, edge-to-cloud world.\nGet faster time to market\nSave on TCO, align costs to business\nScale quickly, meet unpredictable demand\nSimplify IT operations across your data centers and clouds\nTo learn more about HPE Services, please contact your Hewlett Packard Enterprise sales representative or\nHewlett Packard Enterprise Authorized Channel Partner. \xa0 Contact information for a representative in your area\ncan be found at "Contact HPE"  https://www.hpe.com/us/en/contact-hpe.html \xa0\nFor more information\nhttp://www.hpe.com/services\nQuickSpecs\nHPE Cray XD675\nService and Support\nDA - 17239\xa0\xa0\xa0Worldwide QuickSpecs — Version 4 — 8/19/2024\nPage\xa0 13',
    'HPE Cray XD675 Server Top View\nItem Description \xa0 \xa0\n1. 8x AMD MI300X OAM Accelerator \xa0 \xa0\n\xa0\nQuickSpecs\nHPE Cray XD675\nOverview\nDA - 17239\xa0\xa0\xa0Worldwide QuickSpecs — Version 4 — 8/19/2024\nPage\xa0 2',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
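
Because the first sentence above is a query and the other two are document chunks, the same similarity matrix can also rank the chunks for retrieval; a small follow-on sketch:

# Rank the two passages against the query (row 0 of the similarity matrix)
query_scores = similarities[0, 1:]
best = int(query_scores.argmax())
print(f"Best passage: sentences[{best + 1}] (score {query_scores[best]:.4f})")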

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.4857
cosine_accuracy@3 0.8048
cosine_accuracy@5 0.8619
cosine_accuracy@10 0.9095
cosine_precision@1 0.4857
cosine_precision@3 0.2683
cosine_precision@5 0.1724
cosine_precision@10 0.091
cosine_recall@1 0.4857
cosine_recall@3 0.8048
cosine_recall@5 0.8619
cosine_recall@10 0.9095
cosine_ndcg@10 0.7184
cosine_mrr@10 0.6552
cosine_map@100 0.6599
dot_accuracy@1 0.4857
dot_accuracy@3 0.8048
dot_accuracy@5 0.8619
dot_accuracy@10 0.9095
dot_precision@1 0.4857
dot_precision@3 0.2683
dot_precision@5 0.1724
dot_precision@10 0.091
dot_recall@1 0.4857
dot_recall@3 0.8048
dot_recall@5 0.8619
dot_recall@10 0.9095
dot_ndcg@10 0.7184
dot_mrr@10 0.6552
dot_map@100 0.6599

Note: the dot-product rows match the cosine rows exactly because the final Normalize module produces unit-length embeddings, for which dot product and cosine similarity coincide.
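
Tables like the one above are typically produced with the library's InformationRetrievalEvaluator. A minimal sketch, assuming a hand-built query/corpus mapping (the ids and texts below are placeholders):

from sentence_transformers.evaluation import InformationRetrievalEvaluator

queries = {"q1": "What is the website to find services for customers purchasing from a commercial reseller?"}
corpus = {"d1": "Parts and Materials ...", "d2": "HPE Cray XD675 Server Top View ..."}
relevant_docs = {"q1": {"d1"}}  # corpus ids that are relevant to each query

ir_evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="ir-eval")
metrics = ir_evaluator(model)  # dict of accuracy@k, precision@k, recall@k, ndcg@10, mrr@10, map@100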

Training Details

Training Dataset

Unnamed Dataset

  • Size: 3,221 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0: string (min: 3 tokens, mean: 22.72 tokens, max: 80 tokens)
    • sentence_1: string (min: 36 tokens, mean: 328.94 tokens, max: 512 tokens)
  • Samples:
    Sample 1
    sentence_0: What is the maximum number of Apollo n2X00 series chassis that can fit in a 42U rack?
    sentence_1: HPE Apollo 2000 Gen10 Plus System
    HPE is bringing the power of supercomputing to datacenters of any size with the Apollo 2000 Gen10 Plus system.
    The HPE Apollo 2000 Gen10 Plus System is a dense, multi-server platform that packs incredible performance and workload flexibility into a small datacenter space, while delivering the efficiencies of a shared infrastructure. It is designed to provide a bridge to scale-out architecture for traditional data centers, so enterprise and SME customers can achieve the space-saving value of density-optimized infrastructure in a cost-effective and non-disruptive manner.
    The Apollo 2000 Gen10 Plus offers a density-optimized, shared infrastructure with a flexible scale-out architecture to support a variety of workloads, from remote site systems to large HPC clusters and everything in between. HPE iLO5 provides built-in firmware-level server security with silicon root of trust. It can be deployed cost-effectively starting with a single 2U, shared infrastructure chassis and configured with a variety of storage options to meet the configuration needs of a wide variety of scale-out workloads.
    The Apollo 2000 Gen10 Plus System delivers up to four times the density of a traditional rack mount server, with up to four ProLiant Gen10 Plus independent servers per 2U mounted in standard racks with rear-aisle serviceability access. A 42U rack fits up to 20 Apollo n2X00 series chassis, accommodating up to 80 servers per rack.
    What's New
    Support for up to four Xilinx Alveo U50 single-wide GPUs in the XL290n node.
    Enables a robust stack of Intel 3rd generation Xeon Scalable Processors to increase your power density and datacenter efficiency. The Intel AVX-512* feature increases memory bandwidth and improves frequency management to enable greater performance, and Speed Select Technology (SST) allows core count and frequency flexibility.*
    The Direct Liquid Cooling (DLC) option for the Apollo 2000 Gen10 Plus System comes ready to plug and play. Choose from either CPU-only or CPU-plus-memory cooling options.
    Enables flexible choices with Intel 3rd Generation Xeon Scalable Processors and AMD 2nd and 3rd generation EPYC Processors.
    New flexible infrastructure offers multiple storage options, 8 memory channels and 3200 MT/s memory, PCIe Gen4, and support for processors over 250W for improved application performance.
    Complete software portfolio for all customer workloads, from node to rack management, including comprehensive integrated cluster management software.
    Secure from the start with firmware anchored into silicon with iLO5 and silicon root of trust for the highest level of system security.
    Notes: *Available on select processors
    QuickSpecs / HPE Apollo 2000 Gen10 Plus System / Overview / DA - 16526 Worldwide QuickSpecs — Version 41 — 7/1/2024 / Page 1

    Sample 2
    sentence_0: What is the maximum number of independent servers that can be mounted in a single 2U Apollo 2000 Gen10 Plus System chassis?
    sentence_1: (identical to the passage in Sample 1)

    Sample 3
    sentence_0: What is the processor type supported by the HPE Apollo n2800 Gen10 Plus 24 SFF Flexible CTO chassis?
    sentence_1: HPE Apollo n2600 Gen10 Plus SFF CTO Chassis supports both Intel and AMD based server nodes.
    HPE Apollo n2800 Gen10 Plus 24 SFF Flexible CTO chassis supports Intel based server nodes.
    Backplane selection determines the number and type of drives supported.
    Items: 1. SFF hot-plug drives; 2. Serial number/iLO information pull tab; 3. Health LED; 4. UID button LED
    Chassis Rear Panel Components (4 x 1U nodes): 1. Server 3 & 4; 2. HPE Apollo Platform Manager (APM) 2.0 port; 3. Power supply 1 & 2; 4. iLO Ports; 5. Server 1 & 2; 6. Optional Rack Consolidation Module (RCM)
    QuickSpecs / HPE Apollo 2000 Gen10 Plus System / Overview / DA - 16526 Worldwide QuickSpecs — Version 41 — 7/1/2024 / Page 2
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
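
MultipleNegativesRankingLoss treats every other passage in a batch as a negative for each query, which is why the large batch size in the hyperparameters below matters. A minimal sketch of wiring up this loss, assuming (sentence_0, sentence_1) pairs like the samples above:

from sentence_transformers import SentenceTransformer, losses
from datasets import Dataset

model = SentenceTransformer("BAAI/bge-small-en")
# Placeholder rows; the real dataset has 3,221 (sentence_0, sentence_1) pairs
train_dataset = Dataset.from_dict({
    "sentence_0": ["What is the maximum number of Apollo n2X00 series chassis that can fit in a 42U rack?"],
    "sentence_1": ["HPE Apollo 2000 Gen10 Plus System ..."],
})
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)  # cos_sim is the default similarity_fct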
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • num_train_epochs: 20
  • multi_dataset_batch_sampler: round_robin
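
These settings map directly onto the sentence-transformers 3.x Trainer API. A hedged sketch reusing the model, train_dataset, and loss from the previous snippet (output_dir is a placeholder):

from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="bge-small-qs",  # placeholder output path
    eval_strategy="steps",
    per_device_train_batch_size=256,
    per_device_eval_batch_size=256,
    num_train_epochs=20,
    multi_dataset_batch_sampler="round_robin",
)
trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
    evaluator=ir_evaluator,  # from the evaluation sketch above, run at each eval step
)
trainer.train()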

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 256
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 20
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step cosine_map@100
1.0 7 0.4864
2.0 14 0.5209
3.0 21 0.5131
4.0 28 0.5047
5.0 35 0.5480
6.0 42 0.5808
7.0 49 0.5950
7.1429 50 0.5975
8.0 56 0.6145
9.0 63 0.6268
10.0 70 0.6292
11.0 77 0.6385
12.0 84 0.6445
13.0 91 0.6279
14.0 98 0.6296
14.2857 100 0.6321
15.0 105 0.6317
16.0 112 0.6401
17.0 119 0.6590
18.0 126 0.6562
19.0 133 0.6599

Framework Versions

  • Python: 3.11.8
  • Sentence Transformers: 3.1.1
  • Transformers: 4.45.2
  • PyTorch: 2.2.2+cu121
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}