metadata
base_model: sentence-transformers/LaBSE
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:23999
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      Who led thee through that great and terrible wilderness , wherein were
      fiery serpents , and scorpions , and drought , where there was no water  ;
      who brought thee forth water out of the rock of flint  ;
    sentences:
      - bad u ai ïa ki ha u Aaron bad ki khun shynrang jong u .
      - >-
        U la ïalam ïa phi lyngba ka ri shyiap kaba ïar bad kaba ishyrkhei eh ,
        ha kaba la don ki bseiñ kiba don bih bad ki  ñianglartham . Ha kata ka
        ri kaba tyrkhong bad ka bym don um , u la pynmih um na u mawsiang na ka
        bynta jong phi .
      - >-
        Ki paidbah na ki jait ba na shatei ki phah khot ïa u , bad nangta ma ki
        baroh ki ïaleit lang sha u Rehoboam bad ki ong ha u ,
  - source_sentence: >-
      And , behold , Boaz came from Beth–lehem , and said unto the reapers ,
      The  Lord  be with you . And they answered him , The  Lord  bless thee .
    sentences:
      - >-
        Ko ki briew bymïaineh , to wan noh ; phi long ki jong nga . Ngan shim
        iwei na phi na kawei kawei ka shnong bad ar ngut na kawei kawei ka kur ,
        bad ngan wallam pat ïa phi sha u lum Seïon .
      - >-
        Hadien katto katne por u Boas da lade hi u wan poi na Bethlehem bad u ai
        khublei ïa ki nongtrei .  To U  Trai  un long ryngkat bad phi !  u ong
        .  U  Trai  u kyrkhu ïa phi !  ki jubab .
      - >-
        U Trai u la ong ha u ,  Khreh bad leit sha ‘ Ka Lynti Ba-beit ,’ bad ha
        ka ïing jong u Judas kylli ïa u briew na Tarsos uba kyrteng u Saul .
  - source_sentence: >-
      Jehovah used the prehuman Jesus as his “master worker” in creating all
      other things in heaven and on earth .
    sentences:
      - >-
        Shuwa ba un wan long briew U Jehobah u la pyndonkam ïa u Jisu kum u
        “rangbah nongtrei” ha kaba thaw ïa kiei kiei baroh kiba don ha bneng bad
        ha khyndew .
      - >-
        Shisien la don u briew uba la leit ban bet symbai . Katba u dang bet ïa
        u symbai , katto katne na u , ki la hap ha shi lynter ka lynti ïaid kjat
        , ha kaba ki la shah ïuh , bad ki sim ki la bam lut .
      - >-
        Ngan ïathuh ïa ka shatei ban shah ïa ki ban leit bad ïa ka shathie ban
        ym bat noh ïa ki . Ai ba ki briew jong nga ki wan phai na ki ri bajngai
        , na man la ki bynta baroh jong ka pyrthei .
  - source_sentence: >-
      The like figure whereunto even baptism doth also now save us ( not the
      putting away of the filth of the flesh , but the answer of a good
      conscience toward God , ) by the resurrection of Jesus Christ :
    sentences:
      - >-
        kaba long ka dak kaba kdew sha ka jingpynbaptis , kaba pyllait im ïa phi
        mynta . Kam dei ka jingsait noh ïa ka jakhlia na ka met , hynrei ka
        jingkular ba la pynlong sha U Blei na ka jingïatiplem babha . Ka pynim
        ïa phi da ka jingmihpat jong U Jisu Khrist ,
      - >-
        Ki briew kiba sniew kin ïoh ïa kaei kaba ki dei ban ïoh . Ki briew kiba
        bha kin ïoh bainong na ka bynta ki kam jong ki .
      - >-
        Nangta nga la ïohi ïa ka bneng bathymmai bad ïa ka pyrthei bathymmai .
        Ka bneng banyngkong bad ka pyrthei banyngkong ki la jah noh , bad ka
        duriaw kam don shuh .
  - source_sentence: >-
      On that day they read in the book of Moses in the audience of the people 
      ; and therein was found written , that the Ammonite and the Moabite should
      not come into the congregation of God for ever  ;
    sentences:
      - >-
        U Elisha u la ïap bad la tep ïa u . Man la ka snem ki kynhun jong ki
        Moab ki ju wan tur thma ïa ka ri Israel .
      - >-
        Katba dang pule jam ïa ka Hukum u Moses ha u paidbah , ki poi ha ka
        bynta kaba ong ba ym dei ban shah ïa u nong Amon ne u nong Moab ban
        ïasnohlang bad ki briew jong U Blei .
      - >-
        U angel u la jubab ,  U Mynsiem Bakhuid un sa wan ha pha , bad ka bor
        jong U Blei kan shong halor jong pha . Na kane ka daw , ïa i khunlung
        bakhuid yn khot U Khun U Blei .

SentenceTransformer based on sentence-transformers/LaBSE

This is a sentence-transformers model finetuned from sentence-transformers/LaBSE. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/LaBSE
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
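
These properties can be checked after loading the model; a minimal sketch using the Sentence Transformers 3.x API (the Hub ID is the one used in the Usage section below):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("ABHIiiii1/LaBSE-Fine-Tuned-EN-KHA")
print(model.max_seq_length)                      # 256
print(model.get_sentence_embedding_dimension())  # 768
print(model.similarity_fn_name)                  # cosine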

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 768, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
  (3): Normalize()
)
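
The four modules run in order: the Transformer produces token embeddings, the Pooling layer keeps the CLS-token vector, the Dense layer applies a 768→768 projection with a Tanh activation, and Normalize rescales each embedding to unit length, so dot products and cosine similarities coincide. A minimal sketch to inspect this pipeline (SentenceTransformer subclasses torch.nn.Sequential, so its modules can be iterated directly):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("ABHIiiii1/LaBSE-Fine-Tuned-EN-KHA")
for name, module in model.named_children():
    print(name, module.__class__.__name__)
# 0 Transformer
# 1 Pooling
# 2 Dense
# 3 Normalize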

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ABHIiiii1/LaBSE-Fine-Tuned-EN-KHA")
# Run inference
sentences = [
    'On that day they read in the book of Moses in the audience of the people  ; and therein was found written , that the Ammonite and the Moabite should not come into the congregation of God for ever  ;',
    'Katba dang pule jam ïa ka Hukum u Moses ha u paidbah , ki poi ha ka bynta kaba ong ba ym dei ban shah ïa u nong Amon ne u nong Moab ban ïasnohlang bad ki briew jong U Blei .',
    'U Elisha u la ïap bad la tep ïa u . Man la ka snem ki kynhun jong ki Moab ki ju wan tur thma ïa ka ri Israel .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
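
Since LaBSE is a cross-lingual encoder and this model was fine-tuned on English-Khasi pairs, the same API also supports cross-lingual retrieval. A minimal sketch (reusing sentence pairs from the training data shown below) that matches an English query against Khasi candidates:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("ABHIiiii1/LaBSE-Fine-Tuned-EN-KHA")

query = "And Moses went out from Pharaoh , and entreated the Lord ."
candidates = [
    "U Moses u mihnoh na u Pharaoh , bad u kyrpad ïa U Trai ,",
    "U Elisha u la ïap bad la tep ïa u .",
]

# Embed the query and the candidates, then rank by cosine similarity
query_emb = model.encode([query])
cand_embs = model.encode(candidates)
scores = model.similarity(query_emb, cand_embs)  # shape [1, 2]
best = int(scores.argmax())
print(candidates[best], float(scores[0, best]))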

Training Details

Training Dataset

Unnamed Dataset

  • Size: 23,999 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:

             sentence_0       sentence_1
    type     string           string
    min      6 tokens         8 tokens
    mean     34.89 tokens     51.51 tokens
    max      87 tokens        127 tokens
  • Samples:
    sentence_0: And Moses went out from Pharaoh , and entreated the Lord .
    sentence_1: U Moses u mihnoh na u Pharaoh , bad u kyrpad ïa U Trai ,

    sentence_0: In the ninth year of Hoshea the king of Assyria took Samaria , and carried Israel away into Assyria , and placed them in Halah and in Habor by the river of Gozan , and in the cities of the Medes .
    sentence_1: kaba long ka snem kaba khyndai jong ka jingsynshar u Hoshea , u patsha ka Assyria u kurup ïa ka Samaria , u rah ïa ki Israel sha Assyria kum ki koidi , bad pynsah katto katne ngut na ki ha ka nongbah Halah , katto katne pat hajan ka wah Habor ha ka distrik Gosan , bad katto katne ha ki nongbah jong ka Media .

    sentence_0: And the king said unto Cushi , Is the young man Absalom safe ? And Cushi answered , The enemies of my lord the king , and all that rise against thee to do thee hurt , be as that young man is .
    sentence_1: Hato u samla Absalom u dang im ? u syiem u kylli . U mraw u jubab , Ko Kynrad , nga sngew ba kaei kaba la jia ha u kan jin da la jia ha baroh ki nongshun jong ngi , bad ha baroh kiba ïaleh pyrshah ïa phi .
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
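
MultipleNegativesRankingLoss treats each sentence_1 as the positive for its paired sentence_0 and uses every other in-batch sentence_1 as a negative; cosine similarities are multiplied by scale=20.0 before the softmax cross-entropy step. A minimal sketch of constructing this loss with the parameters above:

from sentence_transformers import SentenceTransformer, losses
from sentence_transformers.util import cos_sim

model = SentenceTransformer("sentence-transformers/LaBSE")
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim)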
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • multi_dataset_batch_sampler: round_robin
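
A hedged reconstruction of the training run with these hyperparameters, using the SentenceTransformerTrainer API from Sentence Transformers 3.x; the dataset construction and output_dir are assumptions for illustration, while the batch size, epoch count, and loss settings come from this card:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("sentence-transformers/LaBSE")

# Assumption: 23,999 English-Khasi pairs with columns sentence_0 / sentence_1,
# shown here with a single illustrative pair from the samples above
train_dataset = Dataset.from_dict({
    "sentence_0": ["And Moses went out from Pharaoh , and entreated the Lord ."],
    "sentence_1": ["U Moses u mihnoh na u Pharaoh , bad u kyrpad ïa U Trai ,"],
})

loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)

args = SentenceTransformerTrainingArguments(
    output_dir="labse-en-kha",  # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()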

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch    Step    Training Loss
0.3333    500    0.542
0.6667   1000    0.135
1.0      1500    0.0926
1.3333   2000    0.0535
1.6667   2500    0.0226
2.0      3000    0.018
2.3333   3500    0.0124
2.6667   4000    0.0057
3.0      4500    0.0053

Framework Versions

  • Python: 3.10.13
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.3
  • PyTorch: 2.1.2
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1
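
To reproduce this environment, the listed versions can be pinned directly (a sketch; any compatible CUDA build of PyTorch works):

pip install sentence-transformers==3.0.1 transformers==4.42.3 torch==2.1.2 accelerate==0.32.1 datasets==2.20.0 tokenizers==0.19.1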

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}