SentenceTransformer based on BAAI/bge-base-en

This is a sentence-transformers model finetuned from BAAI/bge-base-en. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
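
The three modules above amount to: take the [CLS] token vector (`pooling_mode_cls_token: True`), then scale it to unit length (`Normalize()`). The following is a minimal pure-Python sketch of those last two steps, not the library's implementation. The helper names `cls_pool`, `l2_normalize`, and `dot`, and the 4-dimensional toy token vectors, are assumptions for illustration; the real model emits 768-dimensional vectors.

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy token embeddings standing in for the BertModel output (hypothetical values).
sentence_a = [[3.0, 4.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
sentence_b = [[4.0, 3.0, 0.0, 0.0], [1.0, 2.0, 3.0, 4.0]]

emb_a = l2_normalize(cls_pool(sentence_a))
emb_b = l2_normalize(cls_pool(sentence_b))

# Both embeddings have unit length, so the dot product equals the cosine similarity.
print(round(dot(emb_a, emb_b), 4))  # 0.96
```

Because the output is already normalized, cosine similarity and dot product give the same ranking for this model's embeddings.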

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("mavihsrr/bge-final-small-retail-v2")
# Run inference
sentences = [
    'Extra Virgin Coconut Oil. Description :This cold pressed, pure, natural, extra virgin coconut oil because of its high saturated fat content, it is slow to oxidize and, thus, resistant to rancidification, lasting up to six months at 24  DegreeC without spoiling. This is the purest form of coconut oil, which retains all of its goodness.!',
    'Rubber Gloves - Cotton Lined, Soft & Non Slip, Medium. Description :The Super Strong Elbow Grease Rubber Gloves in large protect the hands from bacteria and chemicals during cleaning tasks, ideal for dishwashing, scrubbing task and using harmful chemicals. These high-quality designs are cotton lined, soft & non-slip gloves for ease to use. The Elbow Grease Rubber Gloves are the only gloves you will ever need. Great for domestic or commercial cleaning purpose.!',
    'Wax Candles - Metal, Smokeless, White, CD 05. Description :Enrich the ambience of the place as you place these captivating looking tealight candles. These are white in colour and round in shape. They are filled with wax and wick inside. It is suitable for decorating the house during the festive occasions and parties. These are smokeless candles that do not leave any soot residue behind. Also, the tealight candles burn fully without damaging your furniture or floor. Also, it has 25 pieces.!',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
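
One common use of the similarity matrix is ranking the other items against a chosen one. Below is a small sketch of that step. The function name `rank_by_similarity` is hypothetical, and the matrix holds hand-written toy values rather than real model output, so it runs without downloading the model.

```python
def rank_by_similarity(similarities, query_index):
    """Return (index, score) pairs for every other item, most similar first."""
    row = similarities[query_index]
    candidates = [(i, score) for i, score in enumerate(row) if i != query_index]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

# Toy 3x3 cosine-similarity matrix (illustrative values, not real model output).
toy_similarities = [
    [1.00, 0.21, 0.35],
    [0.21, 1.00, 0.18],
    [0.35, 0.18, 1.00],
]

ranking = rank_by_similarity(toy_similarities, query_index=0)
print(ranking)  # [(2, 0.35), (1, 0.21)]
```

With the real model, passing `model.similarity(embeddings, embeddings).tolist()` as the first argument should give the same kind of nested list.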

Training Details

Training Dataset

Unnamed Dataset

  • Size: 174,064 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
              sentence1      sentence2      score
    type      string         string         float
    min       12 tokens      8 tokens       0.1
    mean      116.54 tokens  111.22 tokens  0.64
    max       512 tokens     512 tokens     0.97
  • Samples:
    Sample 1
      sentence1: Oil Clear Mud Face Pack. Description :Himalaya Oil Clear Mud Face Pack Rejuvenate your dead skin with Himalaya Oil Clear Mud Face Pack. This herbal formulation deep cleanses facial skin and clears clogged pores by absorbing excess oil and removing impurities. It helps maintain the natural pH of the skin and has deep cleansing and detoxifying properties, leaving the skin cleansed and revitalized. Fullers Earth removes deep-seated dirt and pollutants. It absorbs oil, clears clogged pores and blemishes and helps remove dead skin. Fullers Earth also helps lighten tanned skin caused by UV rays.!
      sentence2: Pure White Mineral Clay Anti Pollution Purity Face Wash Foam. Description :Giving your skin an oil overhaul doesn't have to be a drag. Oil stuck in your pores is what makes your skin feel oily again after a wash! POND'S Clay Foam and Mask is the most fun way to say goodbye oil and hello to an all-day matte glow. Made with 100% natural Moroccan clay that has 4x oil absorption power, it sucks out dirt and oil stuck deep within your pores. What's left behind? Skin that's glowing and matte all day long! Pond's Clay Foam is the most enjoyable and effective way to keep your skin oil-free for longer. Revolutionise face washing with the enriching power of Mineral Clay. One of the most efficacious ingredients in deep cleansing. Its enriched with skin-loving minerals to give you a bouncy glow. So, step up your deep cleansing regimen for an oil-free glow. The clay range comes in two exciting formats. The Pond's white beauty mineral clay foam brightens and smoothens your skin for an oil-free glow!...
      score: 0.9511584211850151
    Sample 2
      sentence1: Essence - Butter Scotch. Description :Concentrate Butterscotch Essence For Sauces, Desserts, Baking And Cakes.Butterscotch Adds A Luscious Flavor Note To Mochas, Lattes And Other Hot, Frozen And Chilled Drinks.!
      sentence2: product
        Icing Sugar Icing Sugar. Description :Icing Sugar is finel...
        Icing Sugar Icing Sugar. Description :This finely granulat...
        Name: combined, dtype: object
      score: 0.9643093974992689
    Sample 3
      sentence1: Marie Light Biscuit - Vita Orange. Description :Sunfeast Marie Light orange offer crisp & light biscuits completed with the choicest golden grains of sun-ripened oranges and wheat. It presents the only Marie biscuit in India with a stimulating, delicious orange flavour. Whats more, there is 0% transfat and 0% cholesterol making it an appetisingly vigorous biscuit.!
      sentence2: Premium Wafer Bites - Dark Choco 100 g + Strawberry 100 g + Tiramisu 100 g. Description :Tasties brings you the Delicious Creamy & Crunchy Wafer Bites. Indulge in the taste of 5 wafers and 4 cream layered mini wafer bites with mouth-melting dark chocolate filling.
        Tasties brings you the Delicious Creamy & Crunchy Wafer Bites. Indulge in the taste of 5 wafers and 4 cream layered mini wafer bites with mouth-melting strawberry filling.
        Tasties brings you the Delicious Creamy & Crunchy Wafer Bites. Indulge in the taste of 5 wafers and 4 cream layered mini wafer bites with mouth Tiramisu hazelnut filling.
        Munch on this and say bye to your small hunger pangs.!
      score: 0.8838966912863657
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
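
As a rough illustration of what this loss configuration computes: the cosine similarity of each embedding pair is compared with the gold `score` using mean squared error. The sketch below is a pure-Python approximation under that reading, not the library's code. The function names and the 2-dimensional toy vectors are assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_similarity_mse(pairs, scores):
    """Mean squared error between predicted cosine similarities and gold scores."""
    errors = [
        (cosine_similarity(u, v) - score) ** 2
        for (u, v), score in zip(pairs, scores)
    ]
    return sum(errors) / len(errors)

# Toy embedding pairs (hypothetical values) with their gold similarity scores.
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # same direction, cosine = 1.0
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal, cosine = 0.0
]
toy_scores = [0.9, 0.1]

loss = cosine_similarity_mse(toy_pairs, toy_scores)
print(round(loss, 4))  # 0.01
```

Each toy pair misses its target by 0.1, so the squared error is 0.01 for both and the mean is 0.01.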
    

Evaluation Dataset

Unnamed Dataset

  • Size: 21,759 evaluation samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
              sentence1      sentence2      score
    type      string         string         float
    min       10 tokens      10 tokens      0.1
    mean      121.44 tokens  112.13 tokens  0.61
    max       512 tokens     512 tokens     0.97
  • Samples:
    Sample 1
      sentence1: Rose Glycerin Soap For Clean & Refreshed Skin -Cold Processed, 100% Natural & Organic. Description :Feel fresh, clean and refreshed with Rose which will leave your skin delicately scented with an uplifting rose fragrance. This soap does not use any animal product like milk or honey. It is completely vegan. Rose helps to improve the skin's appearance and to perfume the skin. It contains glycerin that softens the skin. It has no added preservatives and SLS and is 100% natural herbs and essential oils. It is a vegan product and is SLS free.!
      sentence2: product
        Relax Moisturising Hand Wash - Lavender & Ylang-Ylang Relax Moisturising Hand Wash - Lavender & Ylan...
        Relax Moisturising Hand Wash - Lavender & Ylang-Ylang Relax Moisturising Hand Wash - Lavender & Ylan...
        Name: combined, dtype: object
      score: 0.9641479761938232
    Sample 2
      sentence1: Dog Food - Focus Starter, Super Premium. Description :The Drools Focus, Super premium all breed formula for Puppies is formulated with the finest natural ingredients to help your dog live a long and healthy life. The result of exhaustive scientific research carried out over the years, by some of the most experienced veterinarians and nutritionists. Just like the rest of the Drools products, this one too is manufactured with a keen eye for detail and utmost care at Asias largest and most modern plant.!
      sentence2: Erina - Coat Cleanser. Description :Action : Dandruff control : Erina prevents the formation of dandruff on your pets skin and hair coat. Antimicrobial : Its antiseptic and antibacterial cleansing eliminates germs and improves overall skin hygiene. Erina protects the body against commonly found pathogens that cause itching and bacterial infections. Deodorant : Erinas deodorizing properties eliminate foul odor. Indications : For controlling dandruff in the hair coat. Prevention and management of pruritus (itching) and pyoder(superficial bacterial infection). Used in routine bathing as a cleanser to maintain a healthy coat.!
      score: 0.9112330093194662
    Sample 3
      sentence1: Fruit & Food Nibbler With Silicone Sack - Green. Description :Introducing new foods to your babys diet can be a fun learning experience as it provides him or her with new varying tastes and flavours. With Mee Mee fruit and food nibbler, your child can safely enjoy fruit and other kinds of whole foods, without the risk of choking or hurting his or her mouth.!
      sentence2: Trendy Stainless Steel Bottle With Sipper Cap - Steel Matt Finish, PXP 1002 DQ. Description :Now free your environment, and yourself from the unhealthy plastic bottles and get a healthier one-time product for all your needs. These high-grade stainless steel bottles are here to enhance your dining and travelling experience, saving you from the negative effects of plastic. The single-walled steel bottles are perfect add-ons to your kitchen collection if you are looking for light-weighed, durable, classy looking product. The bottle comes with sipper & wide mouth steel cap, catering to double usage. Be it going to the gym, or sending it with the kids to school, the colourful sipper can always make it a very convenient, handy and more importantly a style-statement product. You can take it to the office or just keep on the dinner table. Open the wide mouth lid and use at ease. The bottles come with the major USP of inter-changeable lid facility. Now you can make the same steel cap bottle as ...
      score: 0.14806349984585232
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • learning_rate: 2e-05
  • warmup_ratio: 0.1
  • bf16: True
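
To make these settings concrete, the sketch below derives the step counts they imply and a linear warmup and decay learning-rate curve. It assumes a single device, `gradient_accumulation_steps` of 1, and a final partial batch that is kept. The `linear_schedule_lr` helper is a hypothetical illustration of a linear scheduler, not the trainer's own code.

```python
import math

TRAIN_SAMPLES = 174_064  # size of the training dataset
BATCH_SIZE = 64          # per_device_train_batch_size
EPOCHS = 3               # num_train_epochs
BASE_LR = 2e-05          # learning_rate
WARMUP_RATIO = 0.1       # warmup_ratio

# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

def linear_schedule_lr(step):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / (total_steps - warmup_steps)

print(steps_per_epoch, total_steps, warmup_steps)  # 2720 8160 816
```

The 2720 steps per epoch agree with the training logs, where step 2700 is reported at epoch 0.9926.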

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss
0.0460 500 0.1008 -
0.0092 100 0.0515 0.0453
0.0184 200 0.0532 -
0.0368 100 0.0491 0.0393
0.0735 200 0.0427 0.0333
0.1103 300 0.0373 0.0257
0.1471 400 0.0294 0.0188
0.1838 500 0.0212 0.0169
0.2206 600 0.0174 0.0131
0.2574 700 0.0145 0.0123
0.2941 800 0.0125 0.0094
0.3309 900 0.0109 0.0103
0.3676 1000 0.0102 0.0086
0.4044 1100 0.0075 0.0088
0.4412 1200 0.0077 0.0076
0.4779 1300 0.0071 0.0070
0.5147 1400 0.007 0.0072
0.5515 1500 0.0065 0.0068
0.5882 1600 0.0058 0.0073
0.625 1700 0.0064 0.0075
0.6618 1800 0.0057 0.0062
0.6985 1900 0.0055 0.0060
0.7353 2000 0.0054 0.0071
0.7721 2100 0.0055 0.0062
0.8088 2200 0.005 0.0065
0.8456 2300 0.0064 0.0061
0.8824 2400 0.0046 0.0056
0.9191 2500 0.0045 0.0051
0.9559 2600 0.0042 0.0051
0.9926 2700 0.0046 0.0055
1.0294 2800 0.0041 0.0053
1.0662 2900 0.005 0.0057
1.1029 3000 0.0033 0.0055
1.1397 3100 0.0037 0.0054
1.1765 3200 0.004 0.0052
1.2132 3300 0.0038 0.0049
1.25 3400 0.0038 0.0047
1.2868 3500 0.0035 0.0052
1.3235 3600 0.0034 0.0048
1.3603 3700 0.0035 0.0049
1.3971 3800 0.0034 0.0045
1.4338 3900 0.0037 0.0048
1.4706 4000 0.0036 0.0047
1.5074 4100 0.0031 0.0046
1.5441 4200 0.0039 0.0045
1.5809 4300 0.0033 0.0046
1.6176 4400 0.0033 0.0047
1.6544 4500 0.0035 0.0047
1.6912 4600 0.0029 0.0047
1.7279 4700 0.0035 0.0046
1.7647 4800 0.0033 0.0046
1.8015 4900 0.003 0.0046
1.8382 5000 0.0027 0.0045
1.875 5100 0.003 0.0043
1.9118 5200 0.0031 0.0046
1.9485 5300 0.0029 0.0045
1.9853 5400 0.003 0.0044
2.0221 5500 0.0031 0.0044
2.0588 5600 0.0028 0.0044
2.0956 5700 0.0032 0.0044
2.1324 5800 0.0027 0.0043
2.1691 5900 0.0032 0.0043
2.2059 6000 0.0029 0.0043
2.2426 6100 0.0028 0.0043
2.2794 6200 0.0028 0.0045
2.3162 6300 0.0032 0.0043
2.3529 6400 0.0026 0.0043
2.3897 6500 0.0026 0.0043
2.4265 6600 0.0024 0.0044
2.4632 6700 0.0024 0.0042
2.5 6800 0.0028 0.0043
2.5368 6900 0.0026 0.0043
2.5735 7000 0.0028 0.0042
2.6103 7100 0.0024 0.0043
2.6471 7200 0.0023 0.0042
2.6838 7300 0.0027 0.0041
2.7206 7400 0.0024 0.0041
2.7574 7500 0.003 0.0041
2.7941 7600 0.003 0.0041
2.8309 7700 0.0028 0.0041
2.8676 7800 0.0029 0.0041
2.9044 7900 0.0026 0.0041
2.9412 8000 0.0022 0.0041
2.9779 8100 0.0023 0.0041

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.1.0+cu118
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}