BGE base Fast-DDS summaries

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
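
In code, this pipeline is: (0) a BERT encoder with a 512-token limit, (1) CLS-token pooling, and (2) L2 normalization. Below is a rough manual equivalent using the transformers library directly, a sketch for illustration only (it loads the base checkpoint; the SentenceTransformer API under Usage is the supported path):

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-base-en-v1.5")
encoder = AutoModel.from_pretrained("BAAI/bge-base-en-v1.5")

def embed(texts):
    # (0) Transformer: tokenize (lowercased) and truncate to max_seq_length=512
    batch = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        output = encoder(**batch)
    # (1) Pooling: keep only the [CLS] token (pooling_mode_cls_token=True)
    cls = output.last_hidden_state[:, 0]
    # (2) Normalize: L2-normalize so dot products equal cosine similarities
    return F.normalize(cls, p=2, dim=1)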

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("juanlofer/bge-base-fastdds-summaries-20epochs-666seed")
# Run inference
sentences = [
    'The transport layer provides communication services between DDS entities, using UDPv4, UDPv6, TCPv4, TCPv6, and SHM transports.',
    '* **TCPv4**: TCP communication over IPv4 (see TCP Transport).',
    'The following table shows the supported primitive types and their\ncorresponding "TypeKind". The "TypeKind" is used to query the\nDynamicTypeBuilderFactory for the specific primitive DynamicType.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
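
Because the embeddings are L2-normalized and scored with cosine similarity, ranking a corpus against a query reduces to a single matrix product. A minimal semantic-search sketch; the query and corpus strings are hypothetical stand-ins for Fast-DDS documentation snippets:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("juanlofer/bge-base-fastdds-summaries-20epochs-666seed")

# Hypothetical query and corpus
query = "Which transports can Fast DDS use?"
corpus = [
    "* **TCPv4**: TCP communication over IPv4 (see TCP Transport).",
    'The "TypeKind" is used to query the DynamicTypeBuilderFactory.',
]

query_embedding = model.encode([query])
corpus_embeddings = model.encode(corpus)

# Cosine scores, shape (1, len(corpus))
scores = model.similarity(query_embedding, corpus_embeddings)
best = int(scores.argmax())
print(f"{scores[0, best].item():.3f}  {corpus[best]}")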

Evaluation

Metrics

The five Information Retrieval tables below report the same metrics at each Matryoshka embedding dimension: 768, 512, 256, 128, and 64 (their cosine_map@100 values match the final dim_* columns in the Training Logs).

Information Retrieval (dim_768)

Metric Value
cosine_accuracy@1 0.3341
cosine_accuracy@3 0.4455
cosine_accuracy@5 0.5035
cosine_accuracy@10 0.5661
cosine_precision@1 0.3341
cosine_precision@3 0.1485
cosine_precision@5 0.1007
cosine_precision@10 0.0566
cosine_recall@1 0.3341
cosine_recall@3 0.4455
cosine_recall@5 0.5035
cosine_recall@10 0.5661
cosine_ndcg@10 0.4437
cosine_mrr@10 0.4054
cosine_map@100 0.416

Information Retrieval (dim_512)

Metric Value
cosine_accuracy@1 0.3364
cosine_accuracy@3 0.4478
cosine_accuracy@5 0.4965
cosine_accuracy@10 0.5777
cosine_precision@1 0.3364
cosine_precision@3 0.1493
cosine_precision@5 0.0993
cosine_precision@10 0.0578
cosine_recall@1 0.3364
cosine_recall@3 0.4478
cosine_recall@5 0.4965
cosine_recall@10 0.5777
cosine_ndcg@10 0.4463
cosine_mrr@10 0.4057
cosine_map@100 0.4154

Information Retrieval (dim_256)

Metric Value
cosine_accuracy@1 0.3271
cosine_accuracy@3 0.4478
cosine_accuracy@5 0.4988
cosine_accuracy@10 0.5754
cosine_precision@1 0.3271
cosine_precision@3 0.1493
cosine_precision@5 0.0998
cosine_precision@10 0.0575
cosine_recall@1 0.3271
cosine_recall@3 0.4478
cosine_recall@5 0.4988
cosine_recall@10 0.5754
cosine_ndcg@10 0.4414
cosine_mrr@10 0.3997
cosine_map@100 0.4105

Information Retrieval (dim_128)

Metric Value
cosine_accuracy@1 0.3155
cosine_accuracy@3 0.4292
cosine_accuracy@5 0.4803
cosine_accuracy@10 0.5754
cosine_precision@1 0.3155
cosine_precision@3 0.1431
cosine_precision@5 0.0961
cosine_precision@10 0.0575
cosine_recall@1 0.3155
cosine_recall@3 0.4292
cosine_recall@5 0.4803
cosine_recall@10 0.5754
cosine_ndcg@10 0.4328
cosine_mrr@10 0.389
cosine_map@100 0.3994

Information Retrieval (dim_64)

Metric Value
cosine_accuracy@1 0.2854
cosine_accuracy@3 0.4153
cosine_accuracy@5 0.4687
cosine_accuracy@10 0.5568
cosine_precision@1 0.2854
cosine_precision@3 0.1384
cosine_precision@5 0.0937
cosine_precision@10 0.0557
cosine_recall@1 0.2854
cosine_recall@3 0.4153
cosine_recall@5 0.4687
cosine_recall@10 0.5568
cosine_ndcg@10 0.4098
cosine_mrr@10 0.3641
cosine_map@100 0.3744
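
Each of the tables above can be reproduced with sentence-transformers' InformationRetrievalEvaluator, truncating embeddings to the corresponding Matryoshka dimension (the truncate_dim argument requires sentence-transformers ≥ 2.7). The queries, corpus, and relevance judgments below are hypothetical toy data; the real evaluation used held-out Fast-DDS query/summary pairs:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("juanlofer/bge-base-fastdds-summaries-20epochs-666seed")

# Hypothetical toy data; keys are arbitrary query/document IDs
queries = {"q1": "Which transports can Fast DDS use?"}
corpus = {
    "d1": "The transport layer provides communication services using UDPv4, UDPv6, TCPv4, TCPv6, and SHM.",
    "d2": 'The "TypeKind" is used to query the DynamicTypeBuilderFactory.',
}
relevant_docs = {"q1": {"d1"}}

for dim in (768, 512, 256, 128, 64):
    evaluator = InformationRetrievalEvaluator(
        queries,
        corpus,
        relevant_docs,
        truncate_dim=dim,  # score using only the first `dim` embedding components
        name=f"dim_{dim}",
    )
    print(evaluator(model))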

Training Details

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 20
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • fp16: True
  • tf32: False
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
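
These settings give an effective batch size of 256 (16 per device × 16 accumulation steps). The sketch below reconstructs the training setup under stated assumptions: the loss is the MatryoshkaLoss-wrapped MultipleNegativesRankingLoss named in the Citation section, with dimensions taken from the dim_* columns of the Training Logs, and train_dataset is a hypothetical placeholder for the Fast-DDS (query, summary) pairs:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# Hypothetical placeholder for the Fast-DDS (query, summary) training pairs
train_dataset = Dataset.from_dict({
    "anchor": ["Which transports can Fast DDS use?"],
    "positive": ["The transport layer provides communication services using UDPv4, UDPv6, TCPv4, TCPv6, and SHM."],
})

# MatryoshkaLoss wraps MultipleNegativesRankingLoss (see the Citation section);
# the dimensions mirror the dim_* columns in the Training Logs
loss = MatryoshkaLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[768, 512, 256, 128, 64],
)

args = SentenceTransformerTrainingArguments(
    output_dir="bge-base-fastdds-summaries",  # hypothetical path
    num_train_epochs=20,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="epoch",
    save_strategy="epoch",  # assumed: load_best_model_at_end requires matching strategies
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=train_dataset,  # placeholder; a held-out split was used in practice
    loss=loss,
)
trainer.train()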

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 20
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: False
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss dim_128_cosine_map@100 dim_256_cosine_map@100 dim_512_cosine_map@100 dim_64_cosine_map@100 dim_768_cosine_map@100
0.6584 10 5.9441 - - - - -
0.9877 15 - 0.3686 0.3792 0.3819 0.3414 0.3795
1.3128 20 4.7953 - - - - -
1.9712 30 3.77 0.3854 0.3963 0.3962 0.3682 0.3995
2.6255 40 2.9211 - - - - -
2.9547 45 - 0.3866 0.3919 0.3958 0.3759 0.3963
3.2798 50 2.4548 - - - - -
3.9383 60 2.0513 - - - - -
4.0041 61 - 0.3808 0.4018 0.3980 0.3647 0.3962
4.5926 70 1.5898 - - - - -
4.9877 76 - 0.3829 0.4029 0.4035 0.3625 0.4014
5.2469 80 1.4677 - - - - -
5.9053 90 1.1974 - - - - -
5.9712 91 - 0.3918 0.4006 0.4041 0.3654 0.4033
6.5597 100 0.9285 - - - - -
6.9547 106 - 0.3914 0.4019 0.4033 0.3678 0.4014
7.2140 110 0.9214 - - - - -
7.8724 120 0.8141 - - - - -
8.0041 122 - 0.3914 0.3993 0.4071 0.3670 0.4027
8.5267 130 0.6706 - - - - -
8.9877 137 - 0.3903 0.4033 0.4060 0.3721 0.4060
9.1811 140 0.6388 - - - - -
9.8395 150 0.5466 - - - - -
9.9712 152 - 0.3915 0.4020 0.4079 0.3673 0.4046
10.4938 160 0.466 - - - - -
10.9547 167 - 0.3963 0.4069 0.4112 0.3697 0.4078
11.1481 170 0.4709 - - - - -
11.8066 180 0.437 - - - - -
12.0041 183 - 0.4003 0.4051 0.4096 0.3701 0.4059
12.4609 190 0.3678 - - - - -
12.9877 198 - 0.3976 0.4075 0.4088 0.3713 0.4080
13.1152 200 0.3944 - - - - -
13.7737 210 0.361 - - - - -
13.9712 213 - 0.3966 0.4091 0.4096 0.3724 0.4107
14.4280 220 0.2977 - - - - -
14.9547 228 - 0.3979 0.4102 0.4149 0.3744 0.4143
15.0823 230 0.3306 - - - - -
15.7407 240 0.3075 - - - - -
16.0041 244 - 0.3991 0.4102 0.4156 0.3726 0.4148
16.3951 250 0.2777 - - - - -
16.9877 259 - 0.3990 0.4101 0.4154 0.3743 0.4167
17.0494 260 0.3044 - - - - -
17.7078 270 0.2885 - - - - -
17.9712 274 - 0.3991 0.4099 0.4153 0.3746 0.4167
18.3621 280 0.2862 - - - - -
18.9547 289 - 0.3994 0.4105 0.4154 0.3743 0.4156
19.0165 290 0.2974 - - - - -
19.6749 300 0.2648 0.3994 0.4105 0.4154 0.3744 0.4160
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.13
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.1.2
  • Accelerate: 0.30.1
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning}, 
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}