metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:3507
  - loss:GISTEmbedLoss
base_model: BAAI/bge-small-en-v1.5
widget:
  - source_sentence: >-
      Is there an option to use ride-sharing apps like Ola or Uber for travel
      from the Airport to the Mela?
    sentences:
      - >-
        Are there towing services available if my vehicle breaks down in the
        parking lot?
         Yes, towing services are available if your vehicle breaks down in the parking lot.
      - >-
        No, ride-sharing options like Ola or Uber are not available for travel
        from the Airport to the Mela. Pilgrims are encouraged to use other
        transport options like taxis, buses, or dedicated shuttle services
        provided for the event.
      - >-
        Baking bread requires certain key ingredients to achieve a perfect
        texture. Flour, water, and yeast are the base, while salt enhances
        flavor. The dough should be kneaded until smooth, then allowed to rise
        in a warm area. After a proper rise, shaping the loaf is essential for
        even baking in the oven.
  - source_sentence: What is the significance of Akshaywat?
    sentences:
      - >-
        Akshaywat, or the "immortal banyan tree," is a spiritually significant
        site in Prayagraj, especially during the Kumbh Mela. Symbolizing
        immortality and eternal life, the tree is believed to possess divine
        qualities that remain unaffected by creation and destruction cycles.
        Mythologically, it is associated with Lord Brahma, who is said to have
        performed a sacrificial ritual under it, and Lord Vishnu, who is
        believed to have blessed devotees there. Akshaywat is also a sacred spot
        for performing Pind Daan, rituals for deceased ancestors, thought to
        help achieve Moksha (liberation). As a center of spiritual wisdom and
        pilgrimage for thousands of years, it continues to be a powerful symbol
        of divine blessings and spiritual strength for Hindu devotees.
      - >-
        What are the must-visit spiritual sites near Sangam?

        The Sangam area, where the Ganga, Yamuna, and the mystical Saraswati
        rivers converge, is surrounded by revered spiritual sites:


        Bade Hanumanji Temple:Bade Hanumanji Temple, also known as Lete Hanuman
        Mandir, is a unique and revered Hindu shrine located near the Sangam in
        Prayagraj. This temple is distinctive for its reclining idol of Lord
        Hanuman, a one-of-a-kind depiction of the deity. Each year, during the
        monsoon floods, the Ganga river rises to gently wash over the feet of
        Lord Hanuman—a sacred ritual believed to be a divine blessing


        Patalpuri Temple and Akshayavat Tree: Located within the Allahabad Fort,
        the ancient Patalpuri Temple is known for the Akshayavat (Indestructible
        Banyan Tree), considered sacred and a symbol of immortality.


        Mankameshwar Temple: A dedicated Shiva temple located near the Sangam,
        known for its serene atmosphere and the belief that prayers here fulfill
        desires.
      - >-
        The uniqueness of brightly colored seashells lies in their mesmerizing
        patterns. Found along coastlines worldwide, these intricate formations
        tell stories of marine life and geological processes. Each shell serves
        as a protective covering, shielding the delicate organisms within from
        predators and environmental threats. Fishermen and beachcombers alike
        often treasure these natural artifacts, using them for decoration or as
        tools in crafting. The vibrant hues seen in shells, ranging from deep
        blues to vivid oranges, result from pigments produced by the mollusks
        themselves, influenced by their habitat and diet. Collecting seashells
        can foster a deep appreciation for marine ecosystems and the roles
        different species play within them, reminding us of the intricate
        balance of nature.
  - source_sentence: Allahabad Junction ka matlab
    sentences:
      - >-
        Where is Anand Bhavan Museum located?

        Anand Bhawan is located on Jawaharlal Nehru Road, about 5 km from
        Allahabad Junction Railway Station, Prayagraj, Uttar Pradesh.
      - >-
        Aartis are performed both in the mornings and evenings on the riverbanks
        in Prayagraj to honor the divine presence of the sacred rivers—Ganga,
        Yamuna, and mythical Saraswati—and to seek their blessings.  \n The
        morning Aarti symbolizes the beginning of a new day, invoking the divine
        to bestow grace, protection, and spiritual strength upon the devotees. 
        \n The evening Aarti serves as a ritual of gratitude, marking the end of
        the day by thanking the deities for their blessings and guidance.
      - >-
        Where is Khusro Bagh located?

        The garden is located approximately 3 km from Allahabad Junction Railway
        Station, making it easily accessible by local transport. The address is
        near the Lukarganj area, Allahabad, Uttar Pradesh.
  - source_sentence: Do E-Rickshaws have a maximum passenger limit, and what is it?
    sentences:
      - >-
        The ancient art of glassblowing dates back thousands of years. This
        intricate craft requires skill and precision, resulting in beautiful
        works that can be functional or decorative. From vases to intricate
        sculptures, the possibilities are endless.
      - >-
        E-Rickshaws have a maximum passenger limit of 4 people. It is important
        not to exceed this limit to ensure safety.
      - >-
        No, shuttle buses will not have dedicated volunteers specifically, but
        for assistance, you can reach out to the nearest information center.
  - source_sentence: Tourists visit reason
    sentences:
      - >-
        What attractions are closest to the city center?

        Near the city center, you’ll find several attractions within a short
        distance. Anand Bhavan and Swaraj Bhavan are centrally located and offer
        insights into the Nehru family and India’s freedom movement. All Saints’
        Cathedral, a magnificent Gothic-style church also known as the “Patthar
        Girja,” is located in Civil Lines and is one of Prayagraj's
        architectural gems. Company Bagh, a peaceful park, is also close by and
        ideal for a quiet stroll. Chandrashekhar Azad Park and Khusro Bagh are
        both centrally located as well, providing green spaces along with
        historical importance.
      - |-
        When and where was the last Kumbh held?
         The last Mahakumbh was held in Haridwar in 2021.
      - >-
        What is All Saints Cathedral, and why is it architecturally significant?

        All Saints Cathedral, locally known as Patthar Girja (Stone Church), is
        a renowned Anglican Christian Church located on M.G. Marg, Allahabad.
        Built in the late 19th century, it is one of the most beautiful and
        architecturally significant churches in Uttar Pradesh, attracting both
        tourists and pilgrims.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@5
  - cosine_ndcg@10
  - cosine_ndcg@100
  - cosine_mrr@5
  - cosine_mrr@10
  - cosine_mrr@100
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on BAAI/bge-small-en-v1.5
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: val evaluator
          type: val_evaluator
        metrics:
          - type: cosine_accuracy@1
            value: 0.3580387685290764
            name: Cosine Accuracy@1
          - type: cosine_accuracy@5
            value: 0.7092360319270239
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.7993158494868872
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.3580387685290764
            name: Cosine Precision@1
          - type: cosine_precision@5
            value: 0.14184720638540477
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.07993158494868871
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.3580387685290764
            name: Cosine Recall@1
          - type: cosine_recall@5
            value: 0.7092360319270239
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.7993158494868872
            name: Cosine Recall@10
          - type: cosine_ndcg@5
            value: 0.5538539564761136
            name: Cosine Ndcg@5
          - type: cosine_ndcg@10
            value: 0.5832174788373438
            name: Cosine Ndcg@10
          - type: cosine_ndcg@100
            value: 0.6189539076148961
            name: Cosine Ndcg@100
          - type: cosine_mrr@5
            value: 0.5013492968453055
            name: Cosine Mrr@5
          - type: cosine_mrr@10
            value: 0.5136020162530992
            name: Cosine Mrr@10
          - type: cosine_mrr@100
            value: 0.5210085507064763
            name: Cosine Mrr@100
          - type: cosine_map@100
            value: 0.5210085507064769
            name: Cosine Map@100

SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model fine-tuned from BAAI/bge-small-en-v1.5. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
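
The three modules above can be read as a pipeline: the Transformer produces one 384-dimensional vector per token, the Pooling step keeps only the first ([CLS]) token's vector because pooling_mode_cls_token is True, and Normalize rescales that vector to unit length. A minimal pure-Python sketch of the last two steps follows. It uses made-up 4-dimensional token vectors in place of real model outputs, and the function names are illustrative, not part of the library.

```python
import math

def cls_pool(token_embeddings):
    """Keep the vector of the first token ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length, mirroring the Normalize module."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy stand-in for the Transformer output: 3 tokens, 4 dimensions each
# (assumption for illustration; the real model emits 384 dimensions per token).
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS]
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Because the resulting embeddings have unit length, the cosine similarity between two of them reduces to a plain dot product.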

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("himanshu23099/bge_embedding_finetune_v3")
# Run inference
sentences = [
    'Tourists visit reason',
    'What is All Saints Cathedral, and why is it architecturally significant?\nAll Saints Cathedral, locally known as Patthar Girja (Stone Church), is a renowned Anglican Christian Church located on M.G. Marg, Allahabad. Built in the late 19th century, it is one of the most beautiful and architecturally significant churches in Uttar Pradesh, attracting both tourists and pilgrims.',
    "What attractions are closest to the city center?\nNear the city center, you’ll find several attractions within a short distance. Anand Bhavan and Swaraj Bhavan are centrally located and offer insights into the Nehru family and India’s freedom movement. All Saints’ Cathedral, a magnificent Gothic-style church also known as the “Patthar Girja,” is located in Civil Lines and is one of Prayagraj's architectural gems. Company Bagh, a peaceful park, is also close by and ideal for a quiet stroll. Chandrashekhar Azad Park and Khusro Bagh are both centrally located as well, providing green spaces along with historical importance.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
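
For retrieval, the first sentence can be treated as the query and the remaining ones as candidate passages, ranked by their similarity to it. A minimal sketch of that ranking step follows. It uses small hand-made unit vectors in place of `model.encode` output so that it runs without downloading the model, and `rank_passages` is a hypothetical helper, not a library function.

```python
def dot(u, v):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(a * b for a, b in zip(u, v))

def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, score) pairs sorted from most to least similar."""
    scores = [(i, dot(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Invented unit-length vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
passages = [
    [0.6, 0.8, 0.0],
    [0.8, 0.0, 0.6],
    [0.0, 1.0, 0.0],
]

ranking = rank_passages(query, passages)
print(ranking)
```

With the real model, `query` would correspond to `embeddings[0]` and `passages` to `embeddings[1:]`; calling `model.similarity(embeddings[:1], embeddings[1:])` should give the same scores as a tensor.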

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.358
cosine_accuracy@5 0.7092
cosine_accuracy@10 0.7993
cosine_precision@1 0.358
cosine_precision@5 0.1418
cosine_precision@10 0.0799
cosine_recall@1 0.358
cosine_recall@5 0.7092
cosine_recall@10 0.7993
cosine_ndcg@5 0.5539
cosine_ndcg@10 0.5832
cosine_ndcg@100 0.619
cosine_mrr@5 0.5013
cosine_mrr@10 0.5136
cosine_mrr@100 0.521
cosine_map@100 0.521
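
These metrics are all derived from where the relevant passage lands in each query's ranked list. Precision@5 is Accuracy@5 divided by 5 (0.7092 / 5 ≈ 0.1418), Recall@k equals Accuracy@k, and MRR@100 equals MAP@100, which is consistent with each query having a single relevant passage. Under that assumption the metrics can be sketched as follows. The ranks are invented for illustration and the function names are not library APIs.

```python
import math

def accuracy_at_k(ranks, k):
    """Fraction of queries whose relevant passage appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

def precision_at_k(ranks, k):
    """With one relevant passage per query, at most 1 of the k results is relevant."""
    return accuracy_at_k(ranks, k) / k

def mrr_at_k(ranks, k):
    """Mean of 1/rank, counting 0 when the passage falls outside the top k."""
    return sum(1.0 / r if r <= k else 0.0 for r in ranks) / len(ranks)

def ndcg_at_k(ranks, k):
    """With one relevant passage the ideal DCG is 1, so NDCG is 1 / log2(rank + 1)."""
    return sum(1.0 / math.log2(r + 1) if r <= k else 0.0 for r in ranks) / len(ranks)

# 1-based rank of the relevant passage for four made-up queries.
ranks = [1, 3, 7, 20]

print(accuracy_at_k(ranks, 5))
print(precision_at_k(ranks, 5))
print(mrr_at_k(ranks, 10))
print(ndcg_at_k(ranks, 10))
```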

Training Details

Training Dataset

Unnamed Dataset

  • Size: 3,507 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min 4 tokens, mean 11.76 tokens, max 32 tokens
    • positive (string): min 8 tokens, mean 116.82 tokens, max 504 tokens
    • negative (string): min 19 tokens, mean 121.15 tokens, max 424 tokens
  • Samples:
    • anchor: Where are the shuttle bus pickup points located within the Kumbh Mela grounds?
      positive: No, shuttle buses will not have dedicated volunteers specifically, but for assistance, you can reach out to the nearest information center.
      negative: The ancient art of weaving has captivated many cultures worldwide. In some regions, artisans use intricate patterns to tell stories, while others focus on vibrant colors that highlight their heritage. Experimentation with different materials can yield unique textures, adding depth to the final product. Workshops often provide insights into traditional techniques, ensuring these skills are passed down through generations.
    • anchor: Hotel Ilawart start place
      positive: Is hotel pickup and drop-off available for the tours? Fixed pickup points, such as Hotel Ilawart, are provided for all tours. In some cases, pickup and drop-off can be arranged for locations within a 5 km radius of the starting point, but you must confirm this with the tour operator at the time of booking.
      negative: What all is included in the trip package? The trip package typically includes transportation, tour guide services, and breakfast. Meals such as lunch and dinner can be purchased separately. Hotel bookings are usually not included in the package, so you will need to arrange accommodation independently.
    • anchor: Are there food stalls or restaurants at the Railway Junction that cater to dietary restrictions for pilgrims?
      positive: Yes, there are food stalls and restaurants available at the Railway Junction that cater to various dietary needs, including vegetarian and other dietary restrictions suitable for pilgrims.
      negative: The sound of the ocean waves rhythmically crashing against the shore creates a soothing symphony that invites relaxation. Seagulls soar above, occasionally diving down to catch a glimpse of fish beneath the surface. Beachgoers spread out their colorful towels, soaking up the sun's golden rays while children build sandcastles, their laughter mingling with the salty breeze. A distant sailboat glides across the horizon, hinting at adventures beyond the vast expanse of blue. As the sun sets, the sky transforms into a canvas of vibrant hues, signaling the end of another beautiful day by the sea.
  • Loss: GISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
      (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Normalize()
    ), 'temperature': 0.01}
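
GISTEmbedLoss uses the guide model to decide which in-batch candidates are safe to treat as negatives: a candidate that the guide scores as more similar to the anchor than the anchor's own positive is masked out before a temperature-scaled cross-entropy is applied. The following is a simplified pure-Python sketch of that idea for anchor-to-positive scores only (the library implementation also considers anchor-anchor and positive-positive pairs). The similarity matrices are invented and `gist_loss` is a hypothetical name.

```python
import math

def gist_loss(model_sim, guide_sim, temperature=0.01):
    """Guided in-batch contrastive loss over anchor-to-positive similarities.

    model_sim[i][j] and guide_sim[i][j] hold the cosine similarity between
    anchor i and positive j under the trained model and the guide model.
    """
    losses = []
    for i, row in enumerate(model_sim):
        threshold = guide_sim[i][i]  # guide's score for the true pair
        kept = []
        for j, score in enumerate(row):
            if j != i and guide_sim[i][j] > threshold:
                continue  # likely false negative: leave it out of the softmax
            kept.append((j, score / temperature))
        # Numerically stable cross-entropy with the true positive as the target.
        max_logit = max(z for _, z in kept)
        log_sum = max_logit + math.log(sum(math.exp(z - max_logit) for _, z in kept))
        target = next(z for j, z in kept if j == i)
        losses.append(log_sum - target)
    return sum(losses) / len(losses)

# Invented similarities for a batch of three (anchor, positive) pairs.
model_sim = [[0.80, 0.75, 0.10],
             [0.20, 0.90, 0.30],
             [0.15, 0.25, 0.85]]
guide_sim = [[0.70, 0.90, 0.05],  # guide rates positive 1 above the true pair for anchor 0
             [0.10, 0.80, 0.20],
             [0.05, 0.10, 0.75]]

print(gist_loss(model_sim, guide_sim))
```

With a temperature as low as 0.01, small similarity gaps become large logit gaps, so an unmasked near-duplicate such as positive 1 for anchor 0 would otherwise contribute noticeably to the loss.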
    

Evaluation Dataset

Unnamed Dataset

  • Size: 877 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 877 samples:
    • anchor (string): min 5 tokens, mean 12.21 tokens, max 32 tokens
    • positive (string): min 3 tokens, mean 115.93 tokens, max 471 tokens
    • negative (string): min 15 tokens, mean 118.09 tokens, max 422 tokens
  • Samples:
    • anchor: Ganga bath benefit
      positive: What is the ritual of Snan or bathing? Taking bath at the confluence of Ganga, Yamuna and invisible Saraswati during Mahakumbh has special significance. It is believed that by bathing in this holy confluence, all the sins of a person are washed away and he attains salvation. Bathing not only symbolizes personal purification, but it also conveys the message of social harmony and unity, where people from different cultures and communities come together to participate in this sacred ritual. It is considered that in special circumstances, the water of rivers also acquires a special life-giving quality, i.e. nectar, which not only leads to spiritual development along with purification of the mind, but also gives physical benefits by getting health. List of Aliases: [['Snan', 'bathing'], ]
      negative: What benefits will I get by attending the Kumbh Mela? It is believed that bathing in the holy rivers during this time washes away sins and grants liberation from the cycle of life and death. Attending the Kumbh and taking a dip in the sacred rivers provides a unique opportunity for spiritual growth, purification, and selfrealization. ✨
    • anchor: Guide provide what
      positive: What is the guide-to-participant ratio for each tour? Each tour is led by one guide per group, ensuring a personalized experience with ample opportunity for detailed insights and engagement. The guide will provide context, historical background, and answer any questions during the tour, offering a rich, informative experience for participants.
      negative: How many people can join a group tour? Group sizes depend on the type of vehicle selected. For instance, a Dzire accommodates up to 4 people, an Innova is suitable for 5-6 people, and larger groups (minimum 10 people) can travel in a Tempo Traveller. For even larger groups, multiple vehicles can be arranged to ensure everyone can travel together comfortably.
    • anchor: How many rules must a Kalpvasi observe?
      positive: A Kalpvasi must observe 21 rules during Kalpvas, involving disciplines of the mind, speech, and actions.
      negative: The dancing colors of autumn leaves create a tapestry of nature’s beauty, inviting every eye to witness the grandeur of the changing seasons. Every gust of wind carries a whisper of nostalgia as trees shed their vibrant garments.
  • Loss: GISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
      (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Normalize()
    ), 'temperature': 0.01}
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • gradient_accumulation_steps: 2
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 30
  • warmup_ratio: 0.1
  • load_best_model_at_end: True
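
These settings determine the step counts seen in the training logs. Assuming a single device (the card does not state the device count), 3,507 samples at a per-device batch size of 16 give 220 batches per epoch; with gradient accumulation of 2 that is 110 optimizer steps per epoch and 3,300 steps over 30 epochs, which agrees with the logs reaching step 110 at epoch 1.0 and step 3300 at epoch 30.0. A small sketch of the arithmetic follows; the function name is illustrative, and the rounding rules are an assumption about how the trainer counts steps.

```python
import math

def schedule_summary(num_samples, batch_size, grad_accum, epochs, warmup_ratio):
    """Derive step counts from the hyperparameters (single device assumed)."""
    # dataloader_drop_last is False, so the final partial batch is kept.
    batches_per_epoch = math.ceil(num_samples / batch_size)
    # One optimizer step is taken per grad_accum batches.
    steps_per_epoch = max(batches_per_epoch // grad_accum, 1)
    total_steps = steps_per_epoch * epochs
    # Linear warmup covers the first warmup_ratio share of all steps.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "batches_per_epoch": batches_per_epoch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
        "samples_per_optimizer_step": batch_size * grad_accum,
    }

summary = schedule_summary(
    num_samples=3507, batch_size=16, grad_accum=2, epochs=30, warmup_ratio=0.1
)
print(summary)
```

Under these assumptions the learning rate would ramp up over the first 330 steps (three epochs) and then decay linearly for the remaining 2,970.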

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 30
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss val_evaluator_cosine_ndcg@100
0.0909 10 - 1.0916 0.4285
0.1818 20 - 1.0683 0.4295
0.2727 30 - 1.0320 0.4301
0.3636 40 - 0.9845 0.4309
0.4545 50 1.8466 0.9320 0.4340
0.5455 60 - 0.8804 0.4352
0.6364 70 - 0.8284 0.4368
0.7273 80 - 0.7754 0.4420
0.8182 90 - 0.7211 0.4425
0.9091 100 1.4317 0.6711 0.4442
1.0 110 - 0.6193 0.4483
1.0909 120 - 0.5700 0.4555
1.1818 130 - 0.5271 0.4603
1.2727 140 - 0.4892 0.4620
1.3636 150 1.0007 0.4611 0.4651
1.4545 160 - 0.4276 0.4706
1.5455 170 - 0.4005 0.4698
1.6364 180 - 0.3818 0.4728
1.7273 190 - 0.3573 0.4763
1.8182 200 0.7585 0.3321 0.4783
1.9091 210 - 0.3091 0.4806
2.0 220 - 0.2963 0.4833
2.0909 230 - 0.2875 0.4834
2.1818 240 - 0.2793 0.4842
2.2727 250 0.5586 0.2729 0.4879
2.3636 260 - 0.2663 0.4885
2.4545 270 - 0.2576 0.4925
2.5455 280 - 0.2477 0.5006
2.6364 290 - 0.2353 0.5058
2.7273 300 0.4751 0.2278 0.5112
2.8182 310 - 0.2206 0.5096
2.9091 320 - 0.2130 0.5144
3.0 330 - 0.2043 0.5202
3.0909 340 - 0.1973 0.5214
3.1818 350 0.381 0.1964 0.5271
3.2727 360 - 0.1968 0.5325
3.3636 370 - 0.1922 0.5289
3.4545 380 - 0.1869 0.5329
3.5455 390 - 0.1789 0.5391
3.6364 400 0.3886 0.1743 0.5464
3.7273 410 - 0.1730 0.5472
3.8182 420 - 0.1699 0.5479
3.9091 430 - 0.1644 0.5525
4.0 440 - 0.1623 0.5511
4.0909 450 0.2977 0.1600 0.5513
4.1818 460 - 0.1540 0.5519
4.2727 470 - 0.1492 0.5589
4.3636 480 - 0.1450 0.5624
4.4545 490 - 0.1426 0.5644
4.5455 500 0.2496 0.1407 0.5629
4.6364 510 - 0.1390 0.5663
4.7273 520 - 0.1399 0.5695
4.8182 530 - 0.1377 0.5764
4.9091 540 - 0.1357 0.5753
5.0 550 0.2322 0.1364 0.5827
5.0909 560 - 0.1327 0.5804
5.1818 570 - 0.1300 0.5799
5.2727 580 - 0.1307 0.5816
5.3636 590 - 0.1331 0.5868
5.4545 600 0.2219 0.1322 0.5839
5.5455 610 - 0.1332 0.5822
5.6364 620 - 0.1323 0.5817
5.7273 630 - 0.1311 0.5845
5.8182 640 - 0.1282 0.5834
5.9091 650 0.1982 0.1253 0.5870
6.0 660 - 0.1242 0.5880
6.0909 670 - 0.1241 0.5859
6.1818 680 - 0.1265 0.5885
6.2727 690 - 0.1287 0.5964
6.3636 700 0.1613 0.1321 0.5968
6.4545 710 - 0.1332 0.5979
6.5455 720 - 0.1295 0.6016
6.6364 730 - 0.1262 0.6022
6.7273 740 - 0.1242 0.6020
6.8182 750 0.172 0.1238 0.6037
6.9091 760 - 0.1222 0.6036
7.0 770 - 0.1213 0.6038
7.0909 780 - 0.1208 0.6038
7.1818 790 - 0.1200 0.6011
7.2727 800 0.1486 0.1196 0.5979
7.3636 810 - 0.1227 0.6015
7.4545 820 - 0.1225 0.6004
7.5455 830 - 0.1195 0.6045
7.6364 840 - 0.1202 0.6045
7.7273 850 0.1501 0.1208 0.6044
7.8182 860 - 0.1177 0.6038
7.9091 870 - 0.1161 0.6031
8.0 880 - 0.1168 0.6024
8.0909 890 - 0.1175 0.6050
8.1818 900 0.1563 0.1157 0.6063
8.2727 910 - 0.1146 0.6056
8.3636 920 - 0.1152 0.6073
8.4545 930 - 0.1167 0.6077
8.5455 940 - 0.1172 0.6087
8.6364 950 0.1247 0.1169 0.6077
8.7273 960 - 0.1159 0.6056
8.8182 970 - 0.1151 0.6066
8.9091 980 - 0.1161 0.6089
9.0 990 - 0.1187 0.6071
9.0909 1000 0.1497 0.1157 0.6110
9.1818 1010 - 0.1148 0.6086
9.2727 1020 - 0.1134 0.6125
9.3636 1030 - 0.1173 0.6114
9.4545 1040 - 0.1174 0.6118
9.5455 1050 0.1025 0.1159 0.6127
9.6364 1060 - 0.1118 0.6093
9.7273 1070 - 0.1114 0.6103
9.8182 1080 - 0.1128 0.6102
9.9091 1090 - 0.1142 0.6116
10.0 1100 0.128 0.1147 0.6115
10.0909 1110 - 0.1143 0.6095
10.1818 1120 - 0.1134 0.6073
10.2727 1130 - 0.1137 0.6059
10.3636 1140 - 0.1143 0.6049
10.4545 1150 0.1413 0.1145 0.6047
10.5455 1160 - 0.1154 0.6032
10.6364 1170 - 0.1158 0.6044
10.7273 1180 - 0.1151 0.6060
10.8182 1190 - 0.1145 0.6081
10.9091 1200 0.1223 0.1133 0.6084
11.0 1210 - 0.1121 0.6090
11.0909 1220 - 0.1130 0.6129
11.1818 1230 - 0.1134 0.6089
11.2727 1240 - 0.1136 0.6112
11.3636 1250 0.1199 0.1142 0.6134
11.4545 1260 - 0.1128 0.6145
11.5455 1270 - 0.1097 0.6148
11.6364 1280 - 0.1081 0.6122
11.7273 1290 - 0.1074 0.6126
11.8182 1300 0.1143 0.1063 0.6167
11.9091 1310 - 0.1067 0.6163
12.0 1320 - 0.1067 0.6190
12.0909 1330 - 0.1075 0.6193
12.1818 1340 - 0.1092 0.6222
12.2727 1350 0.0974 0.1087 0.6199
12.3636 1360 - 0.1078 0.6183
12.4545 1370 - 0.1072 0.6180
12.5455 1380 - 0.1072 0.6172
12.6364 1390 - 0.1072 0.6209
12.7273 1400 0.1257 0.1056 0.6152
12.8182 1410 - 0.1046 0.6149
12.9091 1420 - 0.1034 0.6142
13.0 1430 - 0.1034 0.6165
13.0909 1440 - 0.1046 0.6165
13.1818 1450 0.0866 0.1064 0.6177
13.2727 1460 - 0.1070 0.6158
13.3636 1470 - 0.1055 0.6151
13.4545 1480 - 0.1040 0.6182
13.5455 1490 - 0.1042 0.6144
13.6364 1500 0.0757 0.1042 0.6151
13.7273 1510 - 0.1056 0.6169
13.8182 1520 - 0.1059 0.6172
13.9091 1530 - 0.1059 0.6181
14.0 1540 - 0.1042 0.6167
14.0909 1550 0.0754 0.1043 0.6198
14.1818 1560 - 0.1044 0.6215
14.2727 1570 - 0.1042 0.6205
14.3636 1580 - 0.1058 0.6196
14.4545 1590 - 0.1076 0.6212
14.5455 1600 0.0901 0.1098 0.6219
14.6364 1610 - 0.1095 0.6247
14.7273 1620 - 0.1084 0.6209
14.8182 1630 - 0.1063 0.6164
14.9091 1640 - 0.1049 0.6170
15.0 1650 0.1034 0.1043 0.6199
15.0909 1660 - 0.1033 0.6216
15.1818 1670 - 0.1035 0.6244
15.2727 1680 - 0.1048 0.6286
15.3636 1690 - 0.1070 0.6239
15.4545 1700 0.0821 0.1084 0.6237
15.5455 1710 - 0.1095 0.6234
15.6364 1720 - 0.1090 0.6221
15.7273 1730 - 0.1089 0.6227
15.8182 1740 - 0.1091 0.6201
15.9091 1750 0.074 0.1089 0.6195
16.0 1760 - 0.1082 0.6205
16.0909 1770 - 0.1076 0.6198
16.1818 1780 - 0.1079 0.6195
16.2727 1790 - 0.1081 0.6238
16.3636 1800 0.083 0.1066 0.6219
16.4545 1810 - 0.1055 0.6201
16.5455 1820 - 0.1045 0.6217
16.6364 1830 - 0.1030 0.6198
16.7273 1840 - 0.1012 0.6192
16.8182 1850 0.0569 0.1012 0.6198
16.9091 1860 - 0.1017 0.6224
17.0 1870 - 0.1024 0.6220
17.0909 1880 - 0.1038 0.6217
17.1818 1890 - 0.1046 0.6231
17.2727 1900 0.1054 0.1056 0.6191
17.3636 1910 - 0.1064 0.6220
17.4545 1920 - 0.1078 0.6213
17.5455 1930 - 0.1077 0.6228
17.6364 1940 - 0.1071 0.6194
17.7273 1950 0.0588 0.1073 0.6227
17.8182 1960 - 0.1073 0.6219
17.9091 1970 - 0.1074 0.6217
18.0 1980 - 0.1073 0.6239
18.0909 1990 - 0.1074 0.6210
18.1818 2000 0.0772 0.1076 0.6226
18.2727 2010 - 0.1081 0.6215
18.3636 2020 - 0.1081 0.6206
18.4545 2030 - 0.1073 0.6229
18.5455 2040 - 0.1069 0.6221
18.6364 2050 0.0669 0.1070 0.6233
18.7273 2060 - 0.1062 0.6233
18.8182 2070 - 0.1051 0.6232
18.9091 2080 - 0.1038 0.6211
19.0 2090 - 0.1028 0.6210
19.0909 2100 0.0638 0.1015 0.6214
19.1818 2110 - 0.1021 0.6208
19.2727 2120 - 0.1029 0.6205
19.3636 2130 - 0.1033 0.6205
19.4545 2140 - 0.1044 0.6206
19.5455 2150 0.0805 0.1030 0.6187
19.6364 2160 - 0.1029 0.6199
19.7273 2170 - 0.1041 0.6214
19.8182 2180 - 0.1050 0.6211
19.9091 2190 - 0.1040 0.6207
20.0 2200 0.0932 0.1028 0.6201
20.0909 2210 - 0.1019 0.6212
20.1818 2220 - 0.1030 0.6202
20.2727 2230 - 0.1034 0.6212
20.3636 2240 - 0.1029 0.6224
20.4545 2250 0.0655 0.1034 0.6203
20.5455 2260 - 0.1030 0.6229
20.6364 2270 - 0.1023 0.6193
20.7273 2280 - 0.1022 0.6185
20.8182 2290 - 0.1017 0.6189
20.9091 2300 0.0879 0.1011 0.6178
21.0 2310 - 0.1015 0.6175
21.0909 2320 - 0.1019 0.6182
21.1818 2330 - 0.1013 0.6198
21.2727 2340 - 0.1014 0.6187
21.3636 2350 0.074 0.1022 0.6205
21.4545 2360 - 0.1038 0.6213
21.5455 2370 - 0.1043 0.6236
21.6364 2380 - 0.1044 0.6231
21.7273 2390 - 0.1045 0.6221
21.8182 2400 0.0768 0.1050 0.6224
21.9091 2410 - 0.1054 0.6222
22.0 2420 - 0.1052 0.6214
22.0909 2430 - 0.1051 0.6186
22.1818 2440 - 0.1055 0.6193
22.2727 2450 0.0741 0.1055 0.6205
22.3636 2460 - 0.1053 0.6208
22.4545 2470 - 0.1052 0.6224
22.5455 2480 - 0.1037 0.6191
22.6364 2490 - 0.1032 0.6189
22.7273 2500 0.0669 0.1034 0.6189
22.8182 2510 - 0.1037 0.6224
22.9091 2520 - 0.1038 0.6226
23.0 2530 - 0.1035 0.6203
23.0909 2540 - 0.1030 0.6198
23.1818 2550 0.0762 0.1029 0.6201
23.2727 2560 - 0.1025 0.6195
23.3636 2570 - 0.1024 0.6215
23.4545 2580 - 0.1028 0.6224
23.5455 2590 - 0.1036 0.6232
23.6364 2600 0.0815 0.1037 0.6227
23.7273 2610 - 0.1039 0.6227
23.8182 2620 - 0.1036 0.6211
23.9091 2630 - 0.1034 0.6192
24.0 2640 - 0.1033 0.6193
24.0909 2650 0.0661 0.1033 0.6178
24.1818 2660 - 0.1027 0.6174
24.2727 2670 - 0.1024 0.6198
24.3636 2680 - 0.1025 0.6184
24.4545 2690 - 0.1020 0.6181
24.5455 2700 0.0679 0.1020 0.6194
24.6364 2710 - 0.1020 0.6185
24.7273 2720 - 0.1027 0.6196
24.8182 2730 - 0.1027 0.6191
24.9091 2740 - 0.1030 0.6196
25.0 2750 0.0713 0.1035 0.6208
25.0909 2760 - 0.1042 0.6187
25.1818 2770 - 0.1049 0.6181
25.2727 2780 - 0.1051 0.6200
25.3636 2790 - 0.1051 0.6204
25.4545 2800 0.0786 0.1048 0.6184
25.5455 2810 - 0.1049 0.6198
25.6364 2820 - 0.1051 0.6200
25.7273 2830 - 0.1051 0.6198
25.8182 2840 - 0.1048 0.6190
25.9091 2850 0.0613 0.1050 0.6196
26.0 2860 - 0.1050 0.6183
26.0909 2870 - 0.1047 0.6198
26.1818 2880 - 0.1046 0.6197
26.2727 2890 - 0.1045 0.6217
26.3636 2900 0.0576 0.1045 0.6208
26.4545 2910 - 0.1047 0.6192
26.5455 2920 - 0.1046 0.6220
26.6364 2930 - 0.1042 0.6189
26.7273 2940 - 0.1039 0.6204
26.8182 2950 0.066 0.1036 0.6215
26.9091 2960 - 0.1032 0.6188
27.0 2970 - 0.1030 0.6209
27.0909 2980 - 0.1027 0.6203
27.1818 2990 - 0.1026 0.6215
27.2727 3000 0.0681 0.1025 0.6212
27.3636 3010 - 0.1026 0.6193
27.4545 3020 - 0.1027 0.6189
27.5455 3030 - 0.1028 0.6195
27.6364 3040 - 0.1030 0.6196
27.7273 3050 0.081 0.1031 0.6187
27.8182 3060 - 0.1032 0.6181
27.9091 3070 - 0.1030 0.6177
28.0 3080 - 0.1029 0.6202
28.0909 3090 - 0.1030 0.6193
28.1818 3100 0.0443 0.1031 0.6195
28.2727 3110 - 0.1031 0.6195
28.3636 3120 - 0.1032 0.6177
28.4545 3130 - 0.1034 0.6187
28.5455 3140 - 0.1035 0.6189
28.6364 3150 0.0646 0.1036 0.6187
28.7273 3160 - 0.1037 0.6199
28.8182 3170 - 0.1038 0.6208
28.9091 3180 - 0.1038 0.6190
29.0 3190 - 0.1038 0.6191
29.0909 3200 0.0692 0.1038 0.6190
29.1818 3210 - 0.1038 0.6201
29.2727 3220 - 0.1038 0.6194
29.3636 3230 - 0.1037 0.6201
29.4545 3240 - 0.1037 0.6189
29.5455 3250 0.084 0.1037 0.6194
29.6364 3260 - 0.1037 0.6189
29.7273 3270 - 0.1038 0.6199
29.8182 3280 - 0.1038 0.6194
29.9091 3290 - 0.1038 0.6191
30.0 3300 0.0598 0.1038 0.6190
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.0
  • Transformers: 4.46.2
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

GISTEmbedLoss

@misc{solatorio2024gistembed,
    title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
    author={Aivin V. Solatorio},
    year={2024},
    eprint={2402.16829},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}