SentenceTransformer based on pankajrajdeo/UMLS-Pubmed-ST-TCE-Epoch-1-QA_10K-BioASQ

This is a sentence-transformers model fine-tuned from pankajrajdeo/UMLS-Pubmed-ST-TCE-Epoch-1-QA_10K-BioASQ. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

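Model Description

  • Model Type: Sentence Transformer
  • Base model: pankajrajdeo/UMLS-Pubmed-ST-TCE-Epoch-1-QA_10K-BioASQ
  • Maximum Sequence Length: 1024 tokens
  • Output Dimensionality: 384 dimensions
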
Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
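
The Pooling module above enables only pooling_mode_mean_tokens, so each sentence vector is the average of its token vectors. The following is a minimal NumPy sketch of that idea, assuming padded positions are excluded through an attention mask; the function name mean_pool and the toy shapes are illustrative and not part of the library.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sentence, ignoring padded positions.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # guard against empty rows
    return summed / counts

# Toy input: 2 sentences, 4 token slots, 384 dimensions
# (384 matches word_embedding_dimension in the architecture above).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 4, 384))
mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]])
pooled = mean_pool(tokens, mask)
print(pooled.shape)  # (2, 384)
```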

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("pankajrajdeo/UMLS-Pubmed-ST-TCE-Epoch-1-QA_10K-BioASQ-PQA")
# Run inference
sentences = [
    'Are circulating microparticles elevated in carriers of factor V Leiden?',
    'This is the first study on circulating MP levels in subjects who are heterozygote for factor V Leiden. We report that circulating platelet and leukocyte MP are elevated in carriers of this mutation and may be important contributors to risk of thrombosis.',
    'Isovolemic hemodilution (approximately 5% hematocrit) with albumin, pentastarch, or hetastarch solutions does not result in significant hepatic ischemia or injury assessed by histology.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
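
To turn similarity scores into a ranking, one option is to sort candidates by their cosine score against a query. The sketch below uses small hand-made vectors in place of real embeddings so that it runs without downloading the model. The helper names cosine_scores and rank_by_similarity are hypothetical, and cosine similarity is assumed here because the training loss uses cos_sim.

```python
import numpy as np

def cosine_scores(query: np.ndarray, candidates: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and each candidate row."""
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return c @ q

def rank_by_similarity(query: np.ndarray, candidates: np.ndarray):
    """Return (candidate_index, score) pairs, best match first."""
    scores = cosine_scores(query, candidates)
    order = np.argsort(-scores)
    return [(int(i), float(scores[i])) for i in order]

# Toy stand-ins for embeddings; real ones would have 384 dimensions.
query = np.array([1.0, 0.0, 1.0])
candidates = np.array([
    [0.9, 0.1, 1.1],   # nearly parallel to the query
    [-1.0, 0.5, 0.0],  # points away from the query
    [0.0, 1.0, 0.0],   # orthogonal to the query
])
ranking = rank_by_similarity(query, candidates)
print([idx for idx, _ in ranking])  # [0, 2, 1]
```

With the real model, query and candidates would be rows of the array returned by model.encode(...).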

Training Details

Training Dataset

Unnamed Dataset

  • Size: 246,166 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
              anchor          positive
    type      string          string
    min       6 tokens        9 tokens
    mean      21.28 tokens    50.22 tokens
    max       48 tokens       241 tokens
  • Samples:
    anchor: Survival of women with gestational trophoblastic neoplasia and liver metastases: is it improving?
    positive: The prognosis of patients with liver metastases from GTN has improved. Outcome may be best in those patients presenting within 2.8 years of the causative pregnancy and without very large volumes of disease.
    anchor: Do serum nitrites predict the response to prostaglandin-induced delivery at term?
    positive: A reduced level of NOx is associated with a prompt clinical response to PGE-induced labor. Provided we do not know the origin of NOx in the general circulation, these data indicate NOx levels as predictors of the response to PGE-induced delivery at term and support the hypothesis that labor onset is modulated by the endogenous NO activity.
    anchor: Is sleep deprivation an additional stress for parents staying in hospital?
    positive: Parental sleep deprivation needs to be acknowledged and accommodated when nurses and parents negotiate the care of children in hospital.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
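
The loss above treats every other positive in the batch as a negative for a given anchor: cosine similarities are multiplied by the scale (20.0) and passed through a cross-entropy whose correct class is the matching positive. The NumPy sketch below illustrates that idea under those assumptions. It is not the library's implementation, and the function name and toy data are illustrative.

```python
import numpy as np

def multiple_negatives_ranking_loss(anchors: np.ndarray,
                                    positives: np.ndarray,
                                    scale: float = 20.0) -> float:
    """In-batch-negatives loss: row i of `positives` is the match for row i of `anchors`."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                            # (batch, batch) scaled cosine scores
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))            # cross-entropy on the diagonal

rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 8))
aligned = anchors + 0.01 * rng.normal(size=(4, 8))  # each positive sits next to its anchor
shuffled = aligned[::-1]                            # every positive paired with the wrong anchor

loss_aligned = multiple_negatives_ranking_loss(anchors, aligned)
loss_shuffled = multiple_negatives_ranking_loss(anchors, shuffled)
print(loss_aligned < loss_shuffled)  # True
```

Because negatives come from the rest of the batch, larger batches give each anchor more negatives to be ranked against.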
    

Evaluation Dataset

Unnamed Dataset

  • Size: 27,352 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
              anchor          positive
    type      string          string
    min       7 tokens        6 tokens
    mean      21.61 tokens    48.56 tokens
    max       63 tokens       223 tokens
  • Samples:
    anchor: Is dEAD-box protein p68 regulated by β-catenin/transcription factor 4 to maintain a positive feedback loop in control of breast cancer progression?
    positive: Our findings indicate that Wnt/β-catenin signaling plays an important role in breast cancer progression through p68 upregulation.
    anchor: Are obstetric medical emergency teams a step forward in maternal safety?
    positive: In the literature, there is a lack of reporting and probably of implementation of Obstetrics METs. Therefore, there is a need for more standardized experiences and reports on the implementation of various types of Obstetrics METs. We propose here a design for Obstetrics METs to be implemented in developing countries, aiming to reduce maternal mortality and morbidity resulting from obstetric hemorrhage.
    anchor: Is monocyte-Induced Prostate Cancer Cell Invasion Mediated by Chemokine ligand 2 and Nuclear Factor-κB Activity?
    positive: Co-cultures with monocyte-lineage cell lines stimulated increased prostate cancer cell invasion through increased CCL2 expression and increased prostate cancer cell NF-κB activity. CCL2 and NF-κB may be useful therapeutic targets to interfere with inflammation-induced prostate cancer invasion.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 4
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • push_to_hub: True
  • resume_from_checkpoint: True
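
These settings determine the effective batch size and the number of optimizer updates in the single epoch. The arithmetic below is a sketch that assumes training on one device (the card does not state the device count) and that a trailing partial accumulation group is dropped. Under those assumptions the result agrees with the final step number, 3846, reported in the Training Logs.

```python
import math

train_samples = 246_166   # size of the training dataset
per_device_batch = 16     # per_device_train_batch_size
grad_accum = 4            # gradient_accumulation_steps
num_devices = 1           # assumption: single device, not stated in the card

# Samples contributing to one optimizer update.
effective_batch = per_device_batch * grad_accum * num_devices

# Mini-batches per epoch (the last one may be partial).
batches_per_epoch = math.ceil(train_samples / (per_device_batch * num_devices))

# Optimizer updates per epoch, assuming incomplete accumulation groups are dropped.
updates_per_epoch = batches_per_epoch // grad_accum

print(effective_batch, batches_per_epoch, updates_per_epoch)
# 64 15386 3846
```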

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 4
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: True
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
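
With lr_scheduler_type: linear, warmup_ratio: 0.1, and learning_rate: 2e-05, the learning rate ramps up linearly and then decays linearly to zero. The function below is a sketch of that shape, not the trainer's own scheduler code. It assumes 3846 total update steps (the last step in the Training Logs) and that the warmup length is the ceiling of total steps times the ratio. The function name is illustrative.

```python
import math

def linear_warmup_lr(step: int,
                     total_steps: int = 3846,
                     base_lr: float = 2e-5,
                     warmup_ratio: float = 0.1) -> float:
    """Learning rate at a given update step for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumed rounding rule
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Start of training, end of warmup (peak), and end of training.
for s in (0, 385, 3846):
    print(s, linear_warmup_lr(s))
```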

Training Logs

Epoch Step Training Loss Validation Loss
0.0260 100 0.0201 -
0.0520 200 0.0162 -
0.0780 300 0.0118 -
0.1040 400 0.0112 -
0.1300 500 0.0102 -
0.1560 600 0.0101 -
0.1820 700 0.0121 -
0.2080 800 0.0127 -
0.2340 900 0.008 -
0.2600 1000 0.0086 -
0.2860 1100 0.0073 -
0.3120 1200 0.0113 -
0.3380 1300 0.0084 -
0.3640 1400 0.0079 -
0.3900 1500 0.0073 -
0.4160 1600 0.0048 -
0.4420 1700 0.0088 -
0.4680 1800 0.0077 -
0.4940 1900 0.0076 -
0.5200 2000 0.0064 -
0.5460 2100 0.0074 -
0.5719 2200 0.0079 -
0.5979 2300 0.0079 -
0.6239 2400 0.0078 -
0.6499 2500 0.0073 -
0.6759 2600 0.0076 -
0.7019 2700 0.0085 -
0.7279 2800 0.0081 -
0.7539 2900 0.0083 -
0.7799 3000 0.0055 -
0.8059 3100 0.0068 -
0.8319 3200 0.007 -
0.8579 3300 0.0087 -
0.8839 3400 0.007 -
0.9099 3500 0.0068 -
0.9359 3600 0.0069 -
0.9619 3700 0.0109 -
0.9879 3800 0.0053 -
0.9999 3846 - 0.0062

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.46.2
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 41.5M params (Safetensors, tensor type F32)