SentenceTransformer based on intfloat/multilingual-e5-large-instruct

This is a sentence-transformers model finetuned from intfloat/multilingual-e5-large-instruct. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for retrieval.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: intfloat/multilingual-e5-large-instruct
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Supported Modality: Text

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'embedding_dimension': 1024, 'pooling_mode': 'mean', 'include_prompt': True})
  (2): Normalize({})
)
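
The Pooling and Normalize modules above can be illustrated with a minimal sketch. It uses plain Python lists rather than the library's tensors, and the helper names mean_pool and l2_normalize are hypothetical, chosen only for this illustration. It assumes padded positions are excluded through an attention mask; because include_prompt is True, any prompt tokens would be averaged together with the rest of the text.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask value is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def l2_normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


# Toy input: three 2-dimensional token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 6.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)            # mean of the first two vectors
sentence_embedding = l2_normalize(pooled)   # unit-length sentence vector
```

In the real model the token vectors have 1024 dimensions and come from the XLMRobertaModel encoder; the arithmetic is the same.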

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("MinhPhuc0804/me5-512-docling-checkthat-task1-v1.2")
# Run inference
sentences = [
    'query: It’s been obvious for ages that mRNA vaccines constituted a 3+ dose series. A 3‑dose series is very effective. The fourth dose is still better and ought to be made available. Why does Canada still label partially (2 dose) vaccinated as “fully vaccinated”?',
    'passage: title: Protection against omicron severe disease 0-7 months after BNT162b2 booster\nabstract: Abstract Following a rise in cases due to the delta variant and evidence of waning immunity after 2 doses of the BNT162b2 vaccine, Israel began administering a third BNT162b2 dose (booster) in July 2021. Recent studies showed that the 3rd dose provides a much lower protection against infection with the omicron variant compared to the delta variant and that this protection wanes quickly. In this study, we used data from Israel to estimate the protection of the 3rd dose against severe disease up to 7 months from receiving the booster dose. The analysis shows that protection conferred by the 3rd dose against omicron did not wane over a 7-month period and that a 4th dose further increased protection, with a severe disease rate approximately 3-fold lower than in the 3-dose cohorts.',
    'passage: title: A fourth dose of the mRNA-1273 SARS-CoV-2 vaccine improves serum neutralization against the delta variant in kidney transplant recipients\nabstract: Abstract In immunocompetent subjects, the effectiveness of SARS-CoV-2 vaccines against the delta variant appears three- to five-fold lower than that observed against the alpha variant. Additionally, three doses of SARS-CoV-2 mRNA-based vaccines might be unable to elicit a sufficient immune response against any variant in immunocompromised kidney transplant recipients. This study describes the kinetics of the neutralizing antibody (NAbs) response against the delta strain before and after a fourth dose of a mRNA vaccine in 67 kidney transplant recipients who had experienced a weak antibody response after three doses. While only 16% of patients harbored NAbs against the delta strain prior to the fourth injection – this percentage raised to 66% afterwards. We also found that, after the fourth dose, the NAbs titer increased significantly (p=0.0001) from <7.5 (IQR : <7.5−15.1) to 47.1 (IQR <7.5−284.2). Collectively, our data indicate that a fourth dose of the mRNA-1273 vaccine in kidney transplant recipients with a weak antibody response after three previous doses improves serum neutralization against the delta variant.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8132, 0.2067],
#         [0.8132, 1.0000, 0.1794],
#         [0.2067, 0.1794, 1.0000]])
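
For retrieval, the first row of the similarity matrix is the part that matters: it scores the query against each passage. The sketch below ranks the two example passages by those scores. The helper rank_passages is hypothetical and not part of the library, and the scores are copied from the example output above rather than recomputed.

```python
def rank_passages(scores, passages, top_k=10):
    """Return (passage, score) pairs sorted from most to least similar."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(passages[i], scores[i]) for i in order[:top_k]]


# First row of the similarity matrix printed above: query vs. [query, passage 1, passage 2].
query_row = [1.0000, 0.8132, 0.2067]

passage_titles = [
    "Protection against omicron severe disease 0-7 months after BNT162b2 booster",
    "A fourth dose of the mRNA-1273 SARS-CoV-2 vaccine improves serum "
    "neutralization against the delta variant in kidney transplant recipients",
]

# Skip the first entry, which is the query's similarity with itself.
ranked = rank_passages(query_row[1:], passage_titles)
for title, score in ranked:
    print(f"{score:.4f}  {title}")
```

Note that the example inputs carry the prefixes "query: " and "passage: "; new inputs should presumably follow the same convention.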

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.4748
cosine_accuracy@3 0.6582
cosine_accuracy@5 0.7148
cosine_accuracy@10 0.7782
cosine_precision@1 0.4748
cosine_precision@3 0.2194
cosine_precision@5 0.143
cosine_precision@10 0.0778
cosine_recall@1 0.4748
cosine_recall@3 0.6582
cosine_recall@5 0.7148
cosine_recall@10 0.7782
cosine_ndcg@10 0.6259
cosine_mrr@10 0.5771
cosine_map@100 0.5824
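
In the table above, accuracy@k equals recall@k and precision@k equals accuracy@k divided by k. That pattern is what one would expect if every query has exactly one relevant passage, which is an inference from the numbers and not something the card states. The sketch below computes the per-query metrics under that assumption; ir_metrics and mean_metric are hypothetical helpers, not the library's evaluator.

```python
def ir_metrics(ranked_ids, relevant_id, k):
    """Metrics for one query, assuming exactly one relevant document."""
    top_k = ranked_ids[:k]
    hit = relevant_id in top_k
    return {
        "accuracy": 1.0 if hit else 0.0,
        "precision": (1.0 if hit else 0.0) / k,  # at most one hit among k results
        "recall": 1.0 if hit else 0.0,           # one relevant document in total
        "mrr": 1.0 / (top_k.index(relevant_id) + 1) if hit else 0.0,
    }


def mean_metric(per_query, name):
    """Average one metric over all queries."""
    return sum(m[name] for m in per_query) / len(per_query)


# Toy data: the first query finds its document at rank 2, the second misses it.
results = [
    ir_metrics(["d3", "d1", "d2"], "d1", k=3),
    ir_metrics(["d5", "d6", "d4"], "d9", k=3),
]
accuracy_at_3 = mean_metric(results, "accuracy")
precision_at_3 = mean_metric(results, "precision")
recall_at_3 = mean_metric(results, "recall")
mrr_at_3 = mean_metric(results, "mrr")

# Consistency check against the reported values: precision@k = accuracy@k / k.
reported_accuracy = {3: 0.6582, 5: 0.7148, 10: 0.7782}
derived_precision = {k: round(v / k, 4) for k, v in reported_accuracy.items()}
```

The derived precision values agree with the reported cosine_precision@3, @5 and @10 after rounding to four decimals.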

Training Details

Training Dataset

Unnamed Dataset

  • Size: 17,319 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:
    column      type    min tokens  mean tokens  max tokens
    sentence_0  string  25          58.26        105
    sentence_1  string  28          311.97       512
    sentence_2  string  20          320.42       512
  • Samples:
    Sample 1
      sentence_0: query: I was fact-checked when I covered this topic for @user last year. Since the story back then was, apparently, that damp strips of fabric dangling over people's faces for hours on end couldn't possibly spawn anything nasty - because Science™!!
      sentence_1: passage: title: Bacterial and fungal isolation from face masks under the COVID-19 pandemic abstract: Abstract The COVID-19 pandemic has led people to wear face masks daily in public. Although the effectiveness of face masks against viral transmission has been extensively studied, there have been few reports on potential hygiene issues due to bacteria and fungi attached to the face masks. We aimed to (1) quantify and identify the bacteria and fungi attaching to the masks, and (2) investigate whether the mask-attached microbes could be associated with the types and usage of the masks and individual lifestyles. We surveyed 109 volunteers on their mask usage and lifestyles, and cultured bacteria and fungi from either the face-side or outer-side of their masks. The bacterial colony numbers were greater on the face-side than the outer-side; the fungal colony numbers were fewer on the face-side than the outer-side. A longer mask usage significantly increased the fungal colony numbers but not ...
      sentence_2: passage: is very low. title: Do facemasks protect against COVID‐19? Symptomatic health-care workers should not return to work until they have been tested and found to be negative for COVID-19. The public might wear masks to avoid infection or to protect others. During the 2009 pandemic of H1N1 influenza (swine flu), encouraging the public to wash their hands reduced the incidence of infection significantly whereas wearing facemasks did not.5 There is no good evidence that facemasks protect the public against infection with respiratory viruses, including COVID-19.6 However, absence of proof of an effect is not the same as proof of absence of an effect. During the pandemics caused by swine flu and by the coronaviruses which caused SARS and MERS, many people in Asia and elsewhere walked around wearing surgical or homemade cotton masks to protect themselves. One danger of doing this is the illusion of protection. Surgical facemasks are designed to be discarded after single use....

    Sample 2
      sentence_0: query: @user If just the US government had some National Institution of Health entity which could’ve been showcasing, studying and verifying this type of advantage from data years earlier .. to apply immediately without political spin
      sentence_1: passage: title: Chloroquine is a potent inhibitor of SARS coronavirus infection and spread abstract: Abstract Background Severe acute respiratory syndrome (SARS) is caused by a newly discovered coronavirus (SARS-CoV). No effective prophylactic or post-exposure therapy is currently available. Results We report, however, that chloroquine has strong antiviral effects on SARS-CoV infection of primate cells. These inhibitory effects are observed when the cells are treated with the drug either before or after exposure to the virus, suggesting both prophylactic and therapeutic advantage. In addition to the well-known functions of chloroquine such as elevations of endosomal pH, the drug appears to interfere with terminal glycosylation of the cellular receptor, angiotensin-converting enzyme 2. This may negatively influence the virus-receptor binding and abrogate the infection, with further ramifications by the elevation of vesicular pH, resulting in the inhibition of infection and spread of SAR...
      sentence_2: passage: title: A National Medical Response to Crisis — The Legacy of World War II abstract: A National Medical Response to Crisis World War II’s massive casualties were mitigated by lives saved as a result of medical care. Many of the advances made would persist long after the war conclud...

    Sample 3
      sentence_0: query: UNDENIABLE EVIDENCE OF MY SPIKE PROTEIN TRIGGERED WIDESPREAD AMYLOIDOSES THEORY. IT. IS. OCCURRING.
      sentence_1: passage: title: Amyloidogenesis of SARS-CoV-2 Spike Protein abstract: ABSTRACT SARS-CoV-2 infection is associated with a surprising number of morbidities. Uncanny similarities with amyloid-disease associated blood coagulation and fibrinolytic disturbances together with neurologic and cardiac problems led us to investigate the amyloidogenicity of the SARS-CoV-2 Spike protein (S-protein). Amyloid fibril assays of peptide library mixtures and theoretical predictions identified seven amyloidogenic sequences within the S-protein. All seven peptides in isolation formed aggregates during incubation at 37°C. Three 20-amino acid long synthetic Spike peptides (sequence 191-210, 599-618, 1165-1184) fulfilled three amyloid fibril criteria: nucleation dependent polymerization kinetics by ThT, Congo red positivity and ultrastructural fibrillar morphology. Full-length folded S-protein did not form amyloid fibrils, but amyloid-like fibrils with evident branching were formed during 24 hours of S-protei...
      sentence_2: passage: title: Amyloidogenesis of SARS-CoV-2 Spike Protein abstract: SARS-CoV-2 infection is associated with a surprising number of morbidities. Uncanny similarities with amyloid-disease associated blood coagulation and fibrinolytic disturbances together with neurologic and cardiac problems led us to investigate the amyloidogenicity of the SARS-CoV-2 spike protein (S-protein). Amyloid fibril assays of peptide library mixtures and theoretical predictions identified seven amyloidogenic sequences within the S-protein. All seven peptides in isolation formed aggregates during incubation at 37 °C. Three 20-amino acid long synthetic spike peptides (sequence 192–211, 601–620, 1166–1185) fulfilled three amyloid fibril criteria: nucleation dependent polymerization kinetics by ThT, Congo red positivity, and ultrastructural fibrillar morphology. Full-length folded S-protein did not form amyloid fibrils, but amyloid-like fibrils with evident branching were formed during 24 h of S-protein coincubat...
  • Loss: main.TripletMNRLCombinedLoss

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 48
  • per_device_eval_batch_size: 48
  • num_train_epochs: 20
  • fp16: True
  • multi_dataset_batch_sampler: round_robin
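
As a rough check, the batch size and epoch count can be related to the step counts in the training logs. The sketch assumes a single device, no gradient accumulation (gradient_accumulation_steps is 1) and that the last partial batch is kept (dataloader_drop_last is False). The variable names are illustrative only.

```python
import math

train_samples = 17_319   # size of the training dataset
batch_size = 48          # per_device_train_batch_size
epochs = 20              # num_train_epochs

# The last partial batch still counts as a step, so round up.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

# Fractional epoch reached at optimizer step 500, where the loss is first logged.
epoch_at_step_500 = 500 / steps_per_epoch

print(steps_per_epoch, total_steps, round(epoch_at_step_500, 4))
```

Under these assumptions the result is 361 steps per epoch and 7220 steps in total, which agrees with the step column of the training logs.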

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • prediction_loss_only: True
  • per_device_train_batch_size: 48
  • per_device_eval_batch_size: 48
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 20
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss 10-percent-dev-split_cosine_ndcg@10
1.0 361 - 0.6980
1.3850 500 1.6273 -
2.0 722 - 0.7033
2.7701 1000 0.9528 -
3.0 1083 - 0.7110
4.0 1444 - 0.6994
4.1551 1500 0.6268 -
5.0 1805 - 0.6933
5.5402 2000 0.4279 -
6.0 2166 - 0.6883
6.9252 2500 0.3117 -
7.0 2527 - 0.6620
8.0 2888 - 0.6707
8.3102 3000 0.2262 -
9.0 3249 - 0.6671
9.6953 3500 0.1799 -
10.0 3610 - 0.6579
11.0 3971 - 0.6470
11.0803 4000 0.139 -
12.0 4332 - 0.6469
12.4654 4500 0.1094 -
13.0 4693 - 0.6415
13.8504 5000 0.0911 -
14.0 5054 - 0.6439
15.0 5415 - 0.6284
15.2355 5500 0.0755 -
16.0 5776 - 0.6272
16.6205 6000 0.0664 -
17.0 6137 - 0.6290
18.0 6498 - 0.6253
18.0055 6500 0.0573 -
19.0 6859 - 0.6275
19.3906 7000 0.052 -
20.0 7220 - 0.6259
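
The log shows the training loss falling steadily while the dev-split nDCG@10 peaks early and then declines. The sketch below locates the best epoch from the per-epoch values copied from the table above. Since load_best_model_at_end is False, the published weights presumably come from the final step, not from the best-scoring epoch; that is an inference from the hyperparameters, not a statement from the card.

```python
# Dev-split cosine nDCG@10 after each epoch, copied from the training log.
dev_ndcg_at_10 = {
    1: 0.6980, 2: 0.7033, 3: 0.7110, 4: 0.6994, 5: 0.6933,
    6: 0.6883, 7: 0.6620, 8: 0.6707, 9: 0.6671, 10: 0.6579,
    11: 0.6470, 12: 0.6469, 13: 0.6415, 14: 0.6439, 15: 0.6284,
    16: 0.6272, 17: 0.6290, 18: 0.6253, 19: 0.6275, 20: 0.6259,
}

best_epoch = max(dev_ndcg_at_10, key=dev_ndcg_at_10.get)
best_score = dev_ndcg_at_10[best_epoch]
final_score = dev_ndcg_at_10[max(dev_ndcg_at_10)]
gap = round(best_score - final_score, 4)

print(f"best epoch: {best_epoch} ({best_score:.4f}), final: {final_score:.4f}, gap: {gap:.4f}")
```

The final-epoch value matches the cosine_ndcg@10 reported in the Evaluation section. One option worth considering when retraining is to evaluate and save per epoch and keep the best checkpoint.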

Training Time

  • Training: 2.2 hours

Framework Versions

  • Python: 3.12.6
  • Sentence Transformers: 5.4.1
  • Transformers: 4.56.0
  • PyTorch: 2.8.0+cu129
  • Accelerate: 1.10.1
  • Datasets: 4.8.5
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}