How to use agraharr/telecom-bge-base-matryoshka with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("agraharr/telecom-bge-base-matryoshka")
sentences = [
"Why is there a specific requirement for maximum uplink transmission timing difference in asynchronous EN-DC configurations?",
"The bandwidth combination set defines the specific arrangements and allowances for channel bandwidths across different NR bands in carrier aggregation configurations. For example, in the case of the CA_n2A-n5A-n48A configuration, the bandwidth combination set is set to 0, indicating that only the specified bandwidths of 5, 10, 15, and 20 MHz can be utilized without any further combinations. This structure ensures that operators can optimize their resources while adhering to technical constraints, leading to improved network efficiency and performance.<|im_end|>",
"The specific requirement for maximum uplink transmission timing difference in asynchronous EN-DC configurations is crucial for ensuring reliable and efficient communication between the User Equipment (UE) and the network. Variations in timing can lead to synchronization issues, impacting the quality of service and data transmission. By defining these timing differences based on various sub-carrier spacings, the specification aims to optimize the performance of the network while accommodating different configurations, ensuring that the UE can handle the complexities of multi-band operation without significant degradation in service quality.<|im_end|>",
"2. Integrity protection"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for retrieval.
SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
  (1): Pooling({'embedding_dimension': 768, 'pooling_mode': 'cls', 'include_prompt': True})
  (2): Normalize({})
)
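The last two modules in the architecture above (CLS pooling followed by normalization) can be illustrated with a minimal NumPy sketch. It uses random numbers in place of real BERT token embeddings, and the function name `cls_pool_and_normalize` is hypothetical, not part of the library.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Map (batch, seq_len, hidden) token embeddings to (batch, hidden) unit vectors."""
    # CLS pooling: keep only the embedding of the first token of each sequence.
    cls = token_embeddings[:, 0, :]
    # Normalize: scale every sentence vector to unit L2 length.
    norms = np.linalg.norm(cls, axis=1, keepdims=True)
    return cls / np.clip(norms, 1e-12, None)

# Stand-in for the Transformer output: 2 sentences, 5 tokens each, 768 dimensions.
rng = np.random.default_rng(0)
fake_token_embeddings = rng.normal(size=(2, 5, 768))

sentence_embeddings = cls_pool_and_normalize(fake_token_embeddings)
print(sentence_embeddings.shape)
print(np.linalg.norm(sentence_embeddings, axis=1))
```

Because the outputs have unit length, the cosine similarity between two embeddings reduces to their dot product.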
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("agraharr/telecom-bge-base-matryoshka")
# Run inference
sentences = [
'Which aspects of the repeater’s operation does the input intermodulation requirement apply to?',
'The input intermodulation requirement applies to both the uplink and downlink operations of the NR repeater. Specifically, if the base station (BS) side is confirmed to meet co-location requirements, it must also adhere to the input intermodulation co-location requirements for the downlink. Similarly, if the user equipment (UE) side meets co-location requirements, it must comply with the input intermodulation co-location requirements for the uplink. This dual applicability ensures comprehensive protection against intermodulation interference.<|im_end|>',
'The NG_RAN_PRN-Core feature enables User Equipment (UE) to acquire NPN-relevant Cell Global Identity (CGI) information from neighboring intra-frequency or inter-frequency NR NPN cells. This is achieved by the UE reading the System Information (SI) of the neighboring cell and subsequently reporting the acquired CGI information back to the network. The feature is defined in the technical specification TS 38.331, which outlines the procedures and requirements for this capability. Notably, this feature is conditionally mandatory; if the UE supports Non-Public Networks (NPN), it must also support the NG_RAN_PRN-Core feature. This capability is crucial for ensuring that the UE can effectively communicate and interact with NPN environments, enhancing overall network connectivity and performance.<|im_end|>',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7323, 0.0072],
# [0.7323, 1.0000, 0.0721],
# [0.0072, 0.0721, 1.0000]])
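
For retrieval, one row of the similarity matrix is usually read as the scores of one query against the other texts. The sketch below reuses the values printed above rather than calling the model; treating index 0 as the query is an assumption made for illustration.

```python
import numpy as np

# Similarity matrix copied from the printed output above.
similarities = np.array([
    [1.0000, 0.7323, 0.0072],
    [0.7323, 1.0000, 0.0721],
    [0.0072, 0.0721, 1.0000],
])

query_index = 0  # assumption: the first sentence is the query
scores = similarities[query_index]

# Rank every other text by its similarity to the query, highest first.
candidates = [i for i in range(len(scores)) if i != query_index]
ranking = sorted(candidates, key=lambda i: scores[i], reverse=True)

for rank, i in enumerate(ranking, start=1):
    print(rank, i, scores[i])
```

Here the matching answer (index 1) ranks ahead of the unrelated passage (index 2).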
Evaluated on telecom-eval with TripletEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy | 0.9309 |
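
As a rough guide to reading this metric: cosine accuracy is commonly computed as the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. The sketch below assumes that definition with a strict comparison and no margin; the toy 2-dimensional vectors are invented for illustration.

```python
import numpy as np

def cosine_accuracy(anchors, positives, negatives):
    """Fraction of triplets with cos(anchor, positive) > cos(anchor, negative)."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    a, p, n = normalize(anchors), normalize(positives), normalize(negatives)
    pos_sim = np.sum(a * p, axis=1)  # row-wise cosine similarity
    neg_sim = np.sum(a * n, axis=1)
    return float(np.mean(pos_sim > neg_sim))

# Four toy triplets; the last one is deliberately ranked the wrong way round.
anchors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]])
positives = np.array([[0.9, 0.1], [0.1, 0.9], [1.0, 0.9], [0.0, 1.0]])
negatives = np.array([[0.0, 1.0], [1.0, 0.0], [-1.0, 0.5], [1.0, 0.1]])

accuracy = cosine_accuracy(anchors, positives, negatives)
print(accuracy)
```

Under this reading, a value of 0.9309 means roughly 93% of the evaluation triplets are ordered correctly.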
Columns: sentence_0, sentence_1, and sentence_2

| | sentence_0 | sentence_1 | sentence_2 |
|---|---|---|---|
| type | string | string | string |

Samples:

| sentence_0 | sentence_1 | sentence_2 |
|---|---|---|
| What is the purpose of Wired Equivalent Privacy (WEP) in wireless networks? [IEEE 802.11] | To provide privacy and encryption for data | To terminate the authentication process |
| What is the purpose of an optical hybrid in coherent communications? | Combine two optical signals | Convert optical signals to electrical signals |
| Which of the following EVPN services is used for point-to-point Layer 2 communication without MAC learning? | 3. EVPN E-Line | 4. EVPN VLAN-based |

Loss: MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false,
    "directions": [
        "query_to_doc"
    ],
    "partition_mode": "joint",
    "hardness_mode": null,
    "hardness_strength": 0.0
}
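
To make the `scale` and `similarity_fct` parameters concrete, here is a simplified NumPy sketch of a multiple-negatives ranking objective in the query-to-document direction: each anchor must pick its own positive out of all candidates in the batch, using cross-entropy over scaled cosine similarities. It is an illustration under those assumptions, not the library's implementation; `mnrl_loss` and the toy vectors are hypothetical, and the `partition_mode` and hardness options are ignored.

```python
import numpy as np

def mnrl_loss(anchors, positives, negatives=None, scale=20.0):
    """Cross-entropy over scaled cosine similarities with in-batch negatives."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    # Candidates are all positives in the batch, plus explicit negatives if given.
    candidates = positives if negatives is None else np.vstack([positives, negatives])
    logits = scale * (normalize(anchors) @ normalize(candidates).T)

    # Numerically stable log-softmax along each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The correct candidate for anchor i is positive i (column i).
    idx = np.arange(len(anchors))
    return float(-np.mean(log_probs[idx, idx]))

anchors = np.eye(3)
aligned_loss = mnrl_loss(anchors, np.eye(3))                      # positives match anchors
shuffled_loss = mnrl_loss(anchors, np.roll(np.eye(3), 1, axis=0))  # positives misassigned
print(aligned_loss, shuffled_loss)
```

The loss is near zero when every anchor is closest to its own positive and grows to about `scale` when the pairing is wrong, which is why a larger `scale` sharpens the ranking signal.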
Non-default hyperparameters:

- per_device_train_batch_size: 16
- disable_tqdm: True
- per_device_eval_batch_size: 16
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- per_device_train_batch_size: 16
- num_train_epochs: 3
- max_steps: -1
- learning_rate: 5e-05
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_steps: 0
- optim: adamw_torch
- optim_args: None
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- optim_target_modules: None
- gradient_accumulation_steps: 1
- average_tokens_across_devices: True
- max_grad_norm: 1
- label_smoothing_factor: 0.0
- bf16: False
- fp16: False
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- use_liger_kernel: False
- liger_kernel_config: None
- use_cache: False
- neftune_noise_alpha: None
- torch_empty_cache_steps: None
- auto_find_batch_size: False
- log_on_each_node: True
- logging_nan_inf_filter: True
- include_num_input_tokens_seen: no
- log_level: passive
- log_level_replica: warning
- disable_tqdm: True
- project: huggingface
- trackio_space_id: None
- trackio_bucket_id: None
- trackio_static_space_id: None
- per_device_eval_batch_size: 16
- prediction_loss_only: True
- eval_on_start: False
- eval_do_concat_batches: True
- eval_use_gather_object: False
- eval_accumulation_steps: None
- include_for_metrics: []
- batch_eval_metrics: False
- save_only_model: False
- save_on_each_node: False
- enable_jit_checkpoint: False
- push_to_hub: False
- hub_private_repo: None
- hub_model_id: None
- hub_strategy: every_save
- hub_always_push: False
- hub_revision: None
- load_best_model_at_end: False
- ignore_data_skip: False
- restore_callback_states_from_checkpoint: False
- full_determinism: False
- seed: 42
- data_seed: None
- use_cpu: False
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- dataloader_prefetch_factor: None
- remove_unused_columns: True
- label_names: None
- train_sampling_strategy: random
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- ddp_static_graph: None
- ddp_backend: None
- ddp_timeout: 1800
- fsdp: []
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- deepspeed: None
- debug: []
- skip_memory_metrics: True
- do_predict: False
- resume_from_checkpoint: None
- warmup_ratio: None
- local_rank: -1
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}

Training logs:

| Epoch | Step | Training Loss | telecom-eval_cosine_accuracy |
|---|---|---|---|
| -1 | -1 | - | 0.8999 |
| 0.1949 | 500 | 0.5379 | - |
| 0.3897 | 1000 | 0.3217 | - |
| 0.5846 | 1500 | 0.3156 | - |
| 0.7794 | 2000 | 0.2729 | - |
| 0.9743 | 2500 | 0.2682 | - |
| 1.0 | 2566 | - | 0.9261 |
| 1.1691 | 3000 | 0.1888 | - |
| 1.3640 | 3500 | 0.1704 | - |
| 1.5588 | 4000 | 0.1695 | - |
| 1.7537 | 4500 | 0.1743 | - |
| 1.9486 | 5000 | 0.1710 | - |
| 2.0 | 5132 | - | 0.9261 |
| 2.1434 | 5500 | 0.1257 | - |
| 2.3383 | 6000 | 0.1001 | - |
| 2.5331 | 6500 | 0.1195 | - |
| 2.7280 | 7000 | 0.1105 | - |
| 2.9228 | 7500 | 0.1002 | - |
| 3.0 | 7698 | - | 0.9309 |
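
The Epoch column and the learning rate at each logged step can be recovered from the numbers in this card. The sketch below assumes the usual linear-decay schedule (`lr_scheduler_type: linear`, `warmup_steps: 0`) that falls from the base rate to zero over the whole run; the helper names `epoch_at` and `linear_lr` are hypothetical.

```python
STEPS_PER_EPOCH = 2566  # step logged at epoch 1.0 in the table above
TOTAL_STEPS = 7698      # step logged at epoch 3.0
BASE_LR = 5e-05         # learning_rate
WARMUP_STEPS = 0        # warmup_steps

def epoch_at(step: int) -> float:
    """Fractional epoch reached after `step` optimizer steps."""
    return step / STEPS_PER_EPOCH

def linear_lr(step: int) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)

for step in (500, 2566, 5000, 7698):
    print(step, round(epoch_at(step), 4), linear_lr(step))
```

With a per-device batch size of 16 and no gradient accumulation, 2566 steps per epoch corresponds to roughly 2566 × 16 ≈ 41,000 training samples on a single device.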
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{oord2019representationlearningcontrastivepredictive,
title={Representation Learning with Contrastive Predictive Coding},
author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
year={2019},
eprint={1807.03748},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/1807.03748},
}
Base model: BAAI/bge-base-en-v1.5