---
base_model: pritamdeka/distilbert-base-multilingual-cased-indicxnli-random-negatives-v1
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_manhattan
  - spearman_manhattan
  - pearson_euclidean
  - spearman_euclidean
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:5749
  - loss:CosineSimilarityLoss
widget:
  - source_sentence: >-
      আমি "... comoving মহাজাগতিক বিশ্ৰাম ফ্ৰেমৰ তুলনাত ... সিংহ নক্ষত্ৰমণ্ডলৰ
      ফালে কিছু 371 কিলোমিটাৰ প্ৰতি ছেকেণ্ডত" আগবাঢ়িছো.
    sentences:
      - বাস্কেটবল খেলুৱৈগৰাকীয়ে নিজৰ দলৰ হৈ পইণ্ট লাভ কৰিবলৈ ওলাইছে।
      - আন কোনো বস্তুৰ লগত আপেক্ষিক নহোৱা কোনো ‘ষ্টিল’ নাই।
      - এজনী ছোৱালীয়ে বতাহ বাদ্যযন্ত্ৰ বজায়।
  - source_sentence: চাৰিটা ল’ৰা-ছোৱালীয়ে ভঁৰালৰ জীৱ-জন্তুবোৰলৈ চাই আছে।
    sentences:
      - ডাইনিং টেবুল এখনৰ চাৰিওফালে বৃদ্ধৰ দল এটাই পোজ দিছে।
      - বিকিনি পিন্ধা চাৰিগৰাকী মহিলাই বিলত ভলীবল খেলি আছে।
      - ল’ৰা-ছোৱালীয়ে ভেড়া চাই।
  - source_sentence: ডালত বহি থকা দুটা টান ঈগল।
    sentences:
      - জাতৰ জেব্ৰা ডানিঅ’ অত্যন্ত কঠোৰ মাছ, ইহঁতক হত্যা কৰাটো প্ৰায় কঠিন।
      - এটা ডালত দুটা ঈগল বহি আছে।
      - >-
        নূন্যতম মজুৰিৰ আইনসমূহে কম দক্ষ, কম উৎপাদনশীল লোকক আটাইতকৈ বেছি আঘাত
        দিয়ে।
  - source_sentence: >-
      "মই আচলতে যি বিচাৰিছো সেয়া হৈছে মুছলমান জনসংখ্যাৰ এটা অনুমান..." @ThanosK
      আৰু @T.E.D., এটা সামগ্ৰিক, সাধাৰণ জনসংখ্যাৰ অনুমান f.e.
    sentences:
      - এগৰাকী মহিলাই সেউজীয়া পিঁয়াজ কাটি আছে।
      - >-
        তলত দিয়া কথাখিনি মোৰ কুকুৰ কাণৰ দৰে কপিৰ পৰা লোৱা হৈছে নিউ পেংগুইন
        এটলাছ অৱ মেডিভেল হিষ্ট্ৰীৰ।
      - আমাৰ দৰে সৌৰজগতৰ কোনো তাৰকাৰাজ্যৰ বাহিৰত থকাটো সম্ভৱ হ’ব পাৰে।
  - source_sentence: ইণ্টাৰনেট কেমেৰাৰ জৰিয়তে এগৰাকী ছোৱালীৰ লগত কথা পাতিলে মানুহজনে।
    sentences:
      - গছৰ শাৰী এটাৰ সন্মুখত পথাৰত ভেড়া চৰিছে।
      - এজন মানুহে গীটাৰ বজাই আছে।
      - ৱেবকেমৰ জৰিয়তে এগৰাকী ছোৱালীৰ সৈতে কথা পাতিছে এজন কিশোৰে।
model-index:
  - name: >-
      SentenceTransformer based on
      pritamdeka/distilbert-base-multilingual-cased-indicxnli-random-negatives-v1
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: pritamdeka/stsb assamese translated dev
          type: pritamdeka/stsb-assamese-translated-dev
        metrics:
          - type: pearson_cosine
            value: 0.8103888874564235
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.808745256408391
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.7856524098322162
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.7931254692762979
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.787635055496797
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.7951615705258325
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.7706254928060731
            name: Pearson Dot
          - type: spearman_dot
            value: 0.7771019257164439
            name: Spearman Dot
          - type: pearson_max
            value: 0.8103888874564235
            name: Pearson Max
          - type: spearman_max
            value: 0.808745256408391
            name: Spearman Max
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: pritamdeka/stsb assamese translated test
          type: pritamdeka/stsb-assamese-translated-test
        metrics:
          - type: pearson_cosine
            value: 0.7701562538442139
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.7660618813636367
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.749425583772647
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.7529158472529595
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.7498757891992801
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.7531339468525071
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.7193336616396375
            name: Pearson Dot
          - type: spearman_dot
            value: 0.7151802549941848
            name: Spearman Dot
          - type: pearson_max
            value: 0.7701562538442139
            name: Pearson Max
          - type: spearman_max
            value: 0.7660618813636367
            name: Spearman Max
---

# SentenceTransformer based on pritamdeka/distilbert-base-multilingual-cased-indicxnli-random-negatives-v1

This is a sentence-transformers model finetuned from pritamdeka/distilbert-base-multilingual-cased-indicxnli-random-negatives-v1. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

  • **Model Type:** Sentence Transformer
  • **Base model:** pritamdeka/distilbert-base-multilingual-cased-indicxnli-random-negatives-v1
  • **Maximum Sequence Length:** 512 tokens
  • **Output Dimensionality:** 768 dimensions
  • **Similarity Function:** Cosine Similarity

### Model Sources

  • **Documentation:** https://www.sbert.net
  • **Repository:** https://github.com/UKPLab/sentence-transformers
  • **Hugging Face:** https://huggingface.co/models?library=sentence-transformers

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
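
The Pooling block above mean-pools DistilBERT's token embeddings into a single 768-dimensional sentence vector. As a rough sketch of what that step computes, here is the equivalent mean pooling written against plain `transformers` (this mirrors the architecture printout rather than the library's exact internals):

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "pritamdeka/distilbert-base-multilingual-cased-indicxnli-random-negatives-v1-sts"
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)

sentences = ["এজন মানুহে গীটাৰ বজাই আছে।"]  # "A man is playing a guitar."
batch = tokenizer(sentences, padding=True, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state  # (batch, seq_len, 768)

# Mean pooling: average the token embeddings, ignoring padding positions.
mask = batch["attention_mask"].unsqueeze(-1).float()  # (batch, seq_len, 1)
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embeddings.shape)  # torch.Size([1, 768])
```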

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("pritamdeka/distilbert-base-multilingual-cased-indicxnli-random-negatives-v1-sts")
# Run inference (Assamese examples; English glosses in comments)
sentences = [
    'ইণ্টাৰনেট কেমেৰাৰ জৰিয়তে এগৰাকী ছোৱালীৰ লগত কথা পাতিলে মানুহজনে।',  # "The man talked to a girl through an internet camera."
    'ৱেবকেমৰ জৰিয়তে এগৰাকী ছোৱালীৰ সৈতে কথা পাতিছে এজন কিশোৰে।',  # "A teenager is talking to a girl over a webcam."
    'এজন মানুহে গীটাৰ বজাই আছে।',  # "A man is playing a guitar."
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
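
Beyond pairwise similarity, the embeddings support the semantic-search use case mentioned in the description. A minimal sketch using `sentence_transformers.util.semantic_search`; the corpus and query below reuse example sentences from this card and are purely illustrative:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("pritamdeka/distilbert-base-multilingual-cased-indicxnli-random-negatives-v1-sts")

corpus = [
    "এজন মানুহে গীটাৰ বজাই আছে।",  # "A man is playing a guitar."
    "গছৰ শাৰী এটাৰ সন্মুখত পথাৰত ভেড়া চৰিছে।",  # "Sheep graze in a field in front of a line of trees."
]
query = "ৱেবকেমৰ জৰিয়তে এগৰাকী ছোৱালীৰ সৈতে কথা পাতিছে এজন কিশোৰে।"  # "A teenager is talking to a girl over a webcam."

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank the corpus by cosine similarity to the query.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], round(hit["score"], 4))
```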

## Evaluation

### Metrics

#### Semantic Similarity

  • Dataset: `pritamdeka/stsb-assamese-translated-dev`

| Metric             | Value  |
|:-------------------|-------:|
| pearson_cosine     | 0.8104 |
| spearman_cosine    | 0.8087 |
| pearson_manhattan  | 0.7857 |
| spearman_manhattan | 0.7931 |
| pearson_euclidean  | 0.7876 |
| spearman_euclidean | 0.7952 |
| pearson_dot        | 0.7706 |
| spearman_dot       | 0.7771 |
| pearson_max        | 0.8104 |
| spearman_max       | 0.8087 |

#### Semantic Similarity

  • Dataset: `pritamdeka/stsb-assamese-translated-test`

| Metric             | Value  |
|:-------------------|-------:|
| pearson_cosine     | 0.7702 |
| spearman_cosine    | 0.7661 |
| pearson_manhattan  | 0.7494 |
| spearman_manhattan | 0.7529 |
| pearson_euclidean  | 0.7499 |
| spearman_euclidean | 0.7531 |
| pearson_dot        | 0.7193 |
| spearman_dot       | 0.7152 |
| pearson_max        | 0.7702 |
| spearman_max       | 0.7661 |
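
Both tables follow the standard STS protocol: encode each sentence pair, score the pair under several similarity functions (cosine, dot, Euclidean, Manhattan), and correlate the scores with the gold labels; the `*_max` rows report the best value across similarity functions. A minimal sketch of re-running the dev evaluation with `EmbeddingSimilarityEvaluator`; the dataset repository name, split, and column names here are assumptions, since the card only records the evaluator names:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("pritamdeka/distilbert-base-multilingual-cased-indicxnli-random-negatives-v1-sts")

# Assumed Hub location and columns (sentence1, sentence2, score in [0, 5]).
dev = load_dataset("pritamdeka/stsb-assamese-translated", split="validation")

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=dev["sentence1"],
    sentences2=dev["sentence2"],
    scores=[s / 5.0 for s in dev["score"]],  # normalize gold scores to [0, 1]
    name="stsb-assamese-translated-dev",
)
print(evaluator(model))  # dict of Pearson/Spearman values per similarity function
```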

## Training Details

### Training Hyperparameters

#### Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • num_train_epochs: 10
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
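
These values map directly onto the Sentence Transformers v3 trainer. Below is a minimal training sketch combining them with the `CosineSimilarityLoss` named in the card metadata; the training-data location and column names are assumptions, since the metadata only records a 5,749-pair training set:

```python
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer(
    "pritamdeka/distilbert-base-multilingual-cased-indicxnli-random-negatives-v1"
)

# Assumed dataset location/columns; CosineSimilarityLoss expects two text
# columns plus a float similarity label normalized to [0, 1].
dataset = load_dataset("pritamdeka/stsb-assamese-translated")

args = SentenceTransformerTrainingArguments(
    output_dir="distilbert-multilingual-assamese-sts",
    num_train_epochs=10,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
    load_best_model_at_end=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    loss=CosineSimilarityLoss(model),
)
trainer.train()
```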

#### All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

### Training Logs

| Epoch | Step | Training Loss | Validation Loss | pritamdeka/stsb-assamese-translated-dev_spearman_cosine | pritamdeka/stsb-assamese-translated-test_spearman_cosine |
|:------:|:---:|:------:|:------:|:------:|:------:|
| 1.1111 | 100 | 0.0386 | 0.0324 | 0.8024 | -      |
| 2.2222 | 200 | 0.0238 | 0.0316 | 0.8095 | -      |
| 3.3333 | 300 | 0.0141 | 0.0316 | 0.8092 | -      |
| 4.4444 | 400 | 0.0086 | 0.0319 | 0.8085 | -      |
| 5.5556 | 500 | 0.0065 | 0.0314 | 0.8107 | -      |
| 6.6667 | 600 | 0.005  | 0.0318 | 0.8088 | -      |
| 7.7778 | 700 | 0.0044 | 0.0320 | 0.8076 | -      |
| 8.8889 | 800 | 0.0038 | 0.0317 | 0.8095 | -      |
| **10.0** | **900** | **0.0035** | **0.0318** | **0.8087** | **0.7661** |
  • The bold row denotes the saved checkpoint.

### Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```