flucold-ft-v1 / README.md
metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:400
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
base_model: Snowflake/snowflake-arctic-embed-l
widget:
  - source_sentence: >-
      QUESTION #2: What percentage of patients in the study reported
      experiencing "chills" and "feverish discomfort"?
    sentences:
      - >-
        been proven superior. Annual influenza vaccination is recommended for
        all people six months and older who do not have 

        contraindications. ( Am Fam Physician. 2019; 100(12):751-758. Copyright
        © 2019 American Academy of Family Physicians.)

        BEST PRACTICES IN INFECTIOUS DISEASE  

        Recommendations from the Choosing 

        Wisely Campaign

        Recommendation Sponsoring organization

        Do not routinely avoid 

        influenza vaccination in 

        egg-allergic patients.

        American Academy of Allergy, 

        Asthma, and Immunology

        Source:  For more information on the Choosing Wisely Campaign,
      - |-
        Review
        722  Vol 5   November 2005
        accompanied by fever and some subjects have a transient
        fall in body temperature during the early stages of
        common cold. In a study of 272 patients with sore throat
        associated with URTIs, the mean aural temperature was
        36·8ºC and around 35% of these patients said they were
        suffering from “chills” and “feverish discomfort”.49 The
        sensation of chilliness may be unrelated to any change in
        skin or body temperature. In a study of human
        volunteers, a sensation of chill still develops on
        administration of exogenous pyrogen even though the
      - |-
        ered when the results will modify management or when a 
        patient with signs or symptoms of influenza is hospitalized.19 
        TABLE 2
        Complications of Influenza
        Cardiovascular   26
        Cerebrovascular accidents
        Ischemic heart disease
        Myocarditis
        Hematologic  26
        Hemolytic uremic syndrome
        Hemophagocytic syndrome
        Thrombotic thrombocytope -
        nic purpura
        Musculoskeletal  19,26
        Myositis
        Rhabdomyolysis
        Neurologic  26
        Acute disseminated 
        encephalomyelitis
        Encephalitis
        Guillain-Barré syndrome
        Postinfluenza encephalopathy 
        (neurologic symptoms occur -
        ring after resolution but within
  - source_sentence: >-
      How do cytokines interact with the body's systems to influence the
      hypothalamus and affect body temperature?
    sentences:
      - |-
        interleukin 1, interleukin 6, and tumour necrosis factor
        alpha, as well as the anti-inflammatory cytokines
        interleukin-1 receptor antagonist and interleukin 10
        have been investigated for their pyrogenic or antipyretic
        action.17 Interleukin 1 and interleukin 6 are believed to
        be the most important cytokines that induce fever. 55
        Cytokines are believed to cross the blood–brain barrier
        or interact with the vagus nerve endings to signal the
        temperature control centre of the hypothalamus to
        increase the thermal set point.55,56 The hypothalamus
        then initiates shivering, constriction of skin blood
      - |-
        mended human dose; possible 
        risk of embryo-fetal toxicity with 
        continuous intravenous infusion 
        based on limited animal data
        Baloxavir (Xofluza), 
        available as oral 
        tablets
        NA ($160) Adults and children 12 years 
        and older:  
        88 to 174 lb (40 to 79 kg):  
        single dose of 40 mg  
        ≥ 175 lb (80 kg):  single dose 
        of 80 mg
        Treatment of uncom-
        plicated acute 
        influenza in patients 
        12 years and older who 
        have been symptom -
        atic for no more than 
        48 hours
        Contraindicated in people with 
        a history of hypersensitivity to 
        baloxavir or any component of the 
        product
      - >-
        CME  This clinical content conforms to AAFP criteria for con-

        tinuing medical education (CME). See CME Quiz on page 271.

        Author disclosure:     No relevant financial affiliations.

        Patient information:    Handouts on this topic, written by the 

        authors of this article, are available at https://  www.aafp.org/

        afp/2019/0901/p281-s1.html and https://  www.aafp.org/

        afp/2019/0901/p281-s2.html.

        Acute upper respiratory tract infections are extremely common in adults
        and children, but only a few safe and effective treat-
  - source_sentence: >-
      What are the limitations of using adamantanes (amantadine and rimantadine)
      for influenza treatment according to the context?
    sentences:
      - >-
        December 15, 2019 ◆ Volume 100, Number 12 www.aafp.org/afp  American
        Family Physician 755

        INFLUENZA

        Clinicians caring for high-risk patients can also be consid -

        ered for treatment.28

        Four antiviral drugs have been approved for the treat -

        ment of influenza (Table 4):  the NA inhibitors oseltamivir 

        (Tamiflu), zanamivir (Relenza), and peramivir (Rapivab), 

        and the cap-dependent endonuclease inhibitor baloxa -

        vir (Xofluza). 18,37 Any of these agents can be used in age-  

        appropriate, otherwise healthy outpatients with uncom -

        plicated influenza and no contraindications. 18 Baloxavir is
      - >-
        756 American Family Physician www.aafp.org/afp  Volume 100, Number 12 ◆
        December 15, 2019

        INFLUENZA

        the risk of bronchospasm. 18,28 Adamantanes (amantadine 

        and rimantadine [Flumadine]) are approved for influenza 

        treatment but are not currently recommended. These med -

        ications are not active against influenza B, and most influ -

        enza A strains have shown adamantane resistance for the 

        past 10 years.18

        There is no demonstrated benefit to treating patients 

        with more than one antiviral agent or using higher than 

        recommended dosages. 28 However, extended treatment
      - >-
        distress syndrome

        Diffuse alveolar 

        hemorrhage

        Hypoxic respiratory 

        failure

        Primary viral pneumonia

        Secondary bacterial 

        pneumonia

        Renal 26

        Acute kidney injury  

        (e.g., acute tubulo- 

        interstitial nephritis, 

        glomerulonephritis, 

        minimal change disease)

        Multiorgan failure

        Information from references 8, 19, and 25-27.

        SORT:  KEY RECOMMENDATIONS FOR PRACTICE

        Clinical recommendation

        Evidence 

        rating Comments

        Annual influenza vaccination is recommended for all people 6 months and
        older. 15,16 A Reports of expert committees
  - source_sentence: >-
      Which symptoms of colds and flu are now better understood due to new
      knowledge in molecular biology?
    sentences:
      - >-
        mechanisms that generate the familiar symptoms is poor compared with the
        amount of knowledge available on the

        molecular biology of the viruses involved. New knowledge of the effects
        of cytokines in human beings now helps to

        explain some of the symptoms of colds and flu that were previously in the
        realm of folklore rather than medicine—

        eg, fever, anorexia, malaise, chilliness, headache, and muscle aches and
        pains. The mechanisms of symptoms of

        sore throat, rhinorrhoea, sneezing, nasal congestion, cough, watery
        eyes, and sinus pain are discussed, since these
      - |-
        medicines such as ipratropium. These studies have
        demonstrated that nasal secretions in the first 4 days of a
        common cold are inhibited by intranasal administration
        of ipratropium.25 The nasal discharge also consists of a
        protein-rich plasma exudate derived from subepithelial
        capillaries,28 which may explain why anticholinergics
        only partly inhibit nasal discharge associated with
        URTIs.27
        The colour of nasal discharge and sputum is often
        used as a clinical marker to determine whether or not to
        prescribe antibiotics but there is no evidence from the
      - |-
        ing diffuse alveolar hemorrhage in immunocompetent patients:  a state-
        of-the-art review. Lung. 2013; 191(1): 9-18.
         28.  Uyeki TM, Bernstein HH, Bradley JS, et al. Clinical practice guidelines by 
        the Infectious Diseases Society of America:  2018 update on diagnosis, 
        treatment, chemoprophylaxis, and institutional outbreak management 
        of seasonal influenza. Clin Infect Dis. 2019; 68(6): 895-902.
         29.  Ebell MH, Afonso AM, Gonzales R, et al. Development and validation of 
        a clinical decision rule for the diagnosis of influenza. J Am Board Fam 
        Med. 2012; 25(1): 55-62.
  - source_sentence: >-
      QUESTION #2: How does the sneeze centre in the brainstem coordinate the
      actions involved in sneezing?
    sentences:
      - |-
        stroke, seizure disorder, dementia)
        Asthma or other chronic pulmonary disease
        Chronic kidney disease
        Chronic liver disease
        Heart disease (acquired or congenital)
        Immunosuppression (e.g., HIV infection, cancer, transplant 
        recipients, use of immunosuppressive medications)
        Long-term aspirin therapy in patients younger than 19 years
        Metabolic disorders (acquired [e.g., diabetes mellitus] or 
        inherited [e.g., mitochondrial disorders])
        Morbid obesity
        Sickle cell anemia and other hemoglobinopathies
        Special groups
        Adults 65 years and older
        American Indians and Alaska Natives
      - |-
        causes sneezing.23 The trigeminal nerves relay
        information to the sneeze centre in the brainstem and
        cause reflex activation of motor and parasympathetic
        branches of the facial nerve and activate respiratory
        muscles. A model of the sneeze reflex is illustrated in
        figure 1. The sneeze centre coordinates the inspiratory
        and expiratory actions of sneezing via respiratory
        muscles, and lacrimation and nasal congestion via
        parasympathetic branches of the facial nerve. The eyes
        are always closed during sneezing by activation of facial
        muscles, indicating a close relation between the
      - |-
        during experimental rhinovirus infections have not
        been able to find any morphological changes in the
        nasal epithelium of infected volunteers, apart from a
        substantial increase in polymorphonuclear leucocytes
        early in the course of the infection.11 The major cell
        monitoring the host for the invasion of pathogens is
        the macrophage, which has the ability to trigger an
        acute phase response when stimulated with
        components of viruses or bacteria—eg, viral RNA and
        bacterial cell wall components.12 The surface of the
        macrophage exhibits toll-like receptors that combine
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: cosine_accuracy@1
            value: 0.6122448979591837
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.8877551020408163
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.9387755102040817
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9897959183673469
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.6122448979591837
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.29591836734693877
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.1877551020408163
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09897959183673469
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.6122448979591837
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.8877551020408163
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.9387755102040817
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.9897959183673469
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8165441473931409
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.7593091998704244
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.7600380628441854
            name: Cosine Map@100

SentenceTransformer based on Snowflake/snowflake-arctic-embed-l

This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-l. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-l
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
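The Pooling and Normalize stages above can be illustrated with a minimal NumPy sketch. It assumes the Transformer stage yields one 1024-dimensional vector per token with the CLS token at position 0; the random array is only a stand-in for real `BertModel` output, and the helper name `cls_pool_and_normalize` is hypothetical.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Map (batch, seq_len, hidden) token embeddings to (batch, hidden) unit vectors."""
    # Pooling with pooling_mode_cls_token=True: keep only the first token's vector.
    cls = token_embeddings[:, 0, :]
    # Normalize(): scale each sentence vector to unit L2 norm.
    norms = np.linalg.norm(cls, axis=1, keepdims=True)
    return cls / np.clip(norms, 1e-12, None)

rng = np.random.default_rng(0)
dummy_tokens = rng.normal(size=(2, 5, 1024))  # stand-in for transformer output
sentence_vectors = cls_pool_and_normalize(dummy_tokens)
print(sentence_vectors.shape)
```

Because the vectors leave the pipeline with unit norm, a plain dot product between two of them equals their cosine similarity.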

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Gonalb/flucold-ft-v1")
# Run inference
sentences = [
    'QUESTION #2: How does the sneeze centre in the brainstem coordinate the actions involved in sneezing?',
    'causes sneezing.23 The trigeminal nerves relay\ninformation to the sneeze centre in the brainstem and\ncause reflex activation of motor and parasympathetic\nbranches of the facial nerve and activate respiratory\nmuscles. A model of the sneeze reflex is illustrated in\nfigure 1. The sneeze centre coordinates the inspiratory\nand expiratory actions of sneezing via respiratory\nmuscles, and lacrimation and nasal congestion via\nparasympathetic branches of the facial nerve. The eyes\nare always closed during sneezing by activation of facial\nmuscles, indicating a close relation between the',
    'stroke, seizure disorder, dementia)\nAsthma or other chronic pulmonary disease\nChronic kidney disease\nChronic liver disease\nHeart disease (acquired or congenital)\nImmunosuppression (e.g., HIV infection, cancer, transplant \nrecipients, use of immunosuppressive medications)\nLong-term aspirin therapy in patients younger than 19 years\nMetabolic disorders (acquired [e.g., diabetes mellitus] or \ninherited [e.g., mitochondrial disorders])\nMorbid obesity\nSickle cell anemia and other hemoglobinopathies\nSpecial groups\nAdults 65 years and older\nAmerican Indians and Alaska Natives',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
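Since the model was trained with MatryoshkaLoss (see Training Details), shorter prefixes of each embedding may also work for retrieval. The sketch below shows the truncate-then-renormalize step on random stand-in vectors rather than real model output; the helper name is hypothetical and 256 is simply one of the dimensions listed in the loss parameters. Sentence Transformers also accepts a `truncate_dim` argument when constructing `SentenceTransformer`, which may be the more convenient route in practice.

```python
import numpy as np

def truncate_and_renormalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of each row and restore unit L2 norm."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

rng = np.random.default_rng(0)
full = rng.normal(size=(3, 1024))                    # stand-in for model.encode(sentences)
full /= np.linalg.norm(full, axis=1, keepdims=True)  # the model's output is already normalized

small = truncate_and_renormalize(full, 256)
cosine = small @ small.T                             # cosine similarity of unit vectors
print(small.shape, cosine.shape)
```

Renormalizing matters because a truncated unit vector no longer has unit length, so a raw dot product would stop being a cosine similarity.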

Evaluation

Metrics

Information Retrieval

| Metric              | Value  |
|:--------------------|:-------|
| cosine_accuracy@1   | 0.6122 |
| cosine_accuracy@3   | 0.8878 |
| cosine_accuracy@5   | 0.9388 |
| cosine_accuracy@10  | 0.9898 |
| cosine_precision@1  | 0.6122 |
| cosine_precision@3  | 0.2959 |
| cosine_precision@5  | 0.1878 |
| cosine_precision@10 | 0.0990 |
| cosine_recall@1     | 0.6122 |
| cosine_recall@3     | 0.8878 |
| cosine_recall@5     | 0.9388 |
| cosine_recall@10    | 0.9898 |
| cosine_ndcg@10      | 0.8165 |
| cosine_mrr@10       | 0.7593 |
| cosine_map@100      | 0.7600 |
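The reported numbers are consistent with an evaluation set in which each query has exactly one relevant passage: recall@k equals accuracy@k, and precision@k equals accuracy@k divided by k (for example, 0.8878 / 3 ≈ 0.2959). The card does not state this, so treat it as an inference from the values. Likewise, the unrounded accuracy@1 in the metadata equals 60/98, which would fit 98 evaluation queries. Under that single-relevant-passage assumption, the metrics reduce to simple functions of the rank at which the relevant passage is retrieved. The function and sample ranks below are illustrative only.

```python
import math

def retrieval_metrics(ranks, k):
    """Compute metrics at cutoff k.

    ranks: 1-based rank of the single relevant passage per query,
           or None when it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most one relevant passage among k results
        "recall": accuracy,         # only one relevant passage exists per query
        "mrr": sum(1.0 / r for r, h in zip(ranks, hits) if h) / n,
        # With one relevant passage the ideal DCG is 1, so NDCG equals DCG.
        "ndcg": sum(1.0 / math.log2(r + 1) for r, h in zip(ranks, hits) if h) / n,
    }

example_ranks = [1, 2, 1, 4, None]  # made-up ranks for five queries
m = retrieval_metrics(example_ranks, k=3)
print(m)
```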

Training Details

Training Dataset

Unnamed Dataset

  • Size: 400 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 400 samples:
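    |         | sentence_0                                             | sentence_1                                               |
    |:--------|:-------------------------------------------------------|:---------------------------------------------------------|
    | type    | string                                                 | string                                                   |
    | details | min: 14 tokens, mean: 24.87 tokens, max: 53 tokens     | min: 44 tokens, mean: 129.25 tokens, max: 188 tokens     |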
  • Samples:
    | sentence_0 | sentence_1 |
    |:-----------|:-----------|
    | What is the recommended age for annual influenza vaccination according to the context? | recommend annual influenza vaccination for all people six<br>months and older who do not have contraindications. 15,16<br>Vaccination efforts should target people at increased risk of<br>complicated or severe influenza (Table 117-19) and those who<br>care for or live with high-risk individuals, including health<br>care professionals. 15 Two previous FPM articles provided<br>communication strategies and tools for increasing influenza<br>vaccination rates in practice. 20,21<br>Multiple formulations of the influenza vaccine are avail -<br>able, including inactivated influenza vaccines (IIV); a recom- |
    | Who should vaccination efforts specifically target to prevent complicated or severe influenza? | recommend annual influenza vaccination for all people six<br>months and older who do not have contraindications. 15,16<br>Vaccination efforts should target people at increased risk of<br>complicated or severe influenza (Table 117-19) and those who<br>care for or live with high-risk individuals, including health<br>care professionals. 15 Two previous FPM articles provided<br>communication strategies and tools for increasing influenza<br>vaccination rates in practice. 20,21<br>Multiple formulations of the influenza vaccine are avail -<br>able, including inactivated influenza vaccines (IIV); a recom- |
    | What types of studies were included in the search regarding influenza complications and treatment? | enza complications American Indians, influenza treatment, and<br>influenza universal vaccine. The search included meta-analyses,<br>randomized controlled trials, clinical trials, and reviews. Search<br>dates: December 1, 2018, to October 5, 2019.<br>The Authors<br>DAVID Y. GAITONDE, MD, is a core clinical faculty member<br>and chief of endocrinology service at Dwight D. Eisenhower<br>Army Medical Center, Fort Gordon, Ga.<br>CPT. FAITH C. MOORE, USA, MC, is a resident in the Depart -<br>ment of Internal Medicine at Dwight D. Eisenhower Army<br>Medical Center.<br>MAJ. MACKENZIE K. MORGAN, USA, MC, is chief of infec- |
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
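To make these parameters concrete, here is a rough NumPy sketch of how a Matryoshka wrapper around an in-batch-negatives ranking loss can be computed. It is not the library's implementation. The similarity scale of 20 and the use of cosine similarity are assumptions about the inner loss's defaults, the function names are hypothetical, and the inputs are random stand-ins for question and passage embeddings. `n_dims_per_step: -1` is taken to mean that every listed dimension contributes at each step. Note that the listed dimensions stop at 768, below the model's full 1024.

```python
import numpy as np

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities.

    For anchors[i], positives[i] is the target and every other
    positives[j] in the batch acts as a negative.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                            # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(anchors, positives,
                    dims=(768, 512, 256, 128, 64),
                    weights=(1, 1, 1, 1, 1)):
    """Weighted sum of the inner loss over truncated embedding prefixes."""
    return sum(w * in_batch_ranking_loss(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))

rng = np.random.default_rng(0)
questions = rng.normal(size=(10, 1024))                  # batch of 10, as in training
passages = questions + 0.1 * rng.normal(size=(10, 1024)) # near-duplicates as "matching" passages

loss_matched = matryoshka_loss(questions, passages)
loss_mismatched = matryoshka_loss(questions, passages[::-1])  # every pair misaligned
print(loss_matched < loss_mismatched)
```

Applying the same loss to each prefix is what encourages the leading dimensions to carry most of the ranking signal.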
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin
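As a quick sanity check, the step counts in the training logs follow from these settings together with the dataset size. The arithmetic below assumes a single device and `gradient_accumulation_steps: 1`, as listed under All Hyperparameters.

```python
import math

train_samples = 400  # dataset_size from the card
batch_size = 10      # per_device_train_batch_size
epochs = 10          # num_train_epochs

# dataloader_drop_last is False, so a partial final batch would still count.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 40 400
```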

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

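| Epoch | Step | cosine_ndcg@10 |
|:------|:-----|:---------------|
| 1.0   | 40   | 0.8359         |
| 1.25  | 50   | 0.8312         |
| 2.0   | 80   | 0.8304         |
| 2.5   | 100  | 0.8156         |
| 3.0   | 120  | 0.8016         |
| 3.75  | 150  | 0.7952         |
| 4.0   | 160  | 0.7880         |
| 5.0   | 200  | 0.8021         |
| 6.0   | 240  | 0.8215         |
| 6.25  | 250  | 0.8286         |
| 7.0   | 280  | 0.8079         |
| 7.5   | 300  | 0.8043         |
| 8.0   | 320  | 0.8126         |
| 8.75  | 350  | 0.8099         |
| 9.0   | 360  | 0.8126         |
| 10.0  | 400  | 0.8165         |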
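A short script makes it easy to compare the logged checkpoints. The rows are copied from the log above. Since `load_best_model_at_end` is False, the published weights presumably correspond to the final step rather than the best-scoring one, though the card does not say so explicitly.

```python
# (epoch, step, cosine_ndcg@10) rows copied from the training log
training_log = [
    (1.0, 40, 0.8359), (1.25, 50, 0.8312), (2.0, 80, 0.8304), (2.5, 100, 0.8156),
    (3.0, 120, 0.8016), (3.75, 150, 0.7952), (4.0, 160, 0.7880), (5.0, 200, 0.8021),
    (6.0, 240, 0.8215), (6.25, 250, 0.8286), (7.0, 280, 0.8079), (7.5, 300, 0.8043),
    (8.0, 320, 0.8126), (8.75, 350, 0.8099), (9.0, 360, 0.8126), (10.0, 400, 0.8165),
]

best_epoch, best_step, best_ndcg = max(training_log, key=lambda row: row[2])
final_epoch, final_step, final_ndcg = training_log[-1]
print(best_step, best_ndcg, final_step, final_ndcg)  # 40 0.8359 400 0.8165
```

The highest logged NDCG@10 occurs at the end of the first epoch, so with only 400 training pairs fewer epochs might have been enough.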

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}