---
base_model: sentence-transformers/all-mpnet-base-v2
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:2400
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: Are there any furniture stores? (variation 536)
  sentences:
  - >-
    Event tickets can be purchased at the customer service desk or online
    through the mall's website.
  - The Apple Store is located on the second floor near the food court.
  - >-
    Yes, there are furniture stores including IKEA and Ashley Furniture,
    both located on the second floor.
- source_sentence: Is there a play area for kids? (variation 121)
  sentences:
  - >-
    The customer service desk is located on the ground floor near the main
    entrance.
  - >-
    Yes, there is a play area for kids on the first floor near the west
    entrance.
  - >-
    Yes, there is a luggage store on the second floor near the central
    atrium.
- source_sentence: Are there any sports stores? (variation 931)
  sentences:
  - Yes, there is a toy store on the first floor near the west entrance.
  - >-
    Event tickets can be purchased at the customer service desk or online
    through the mall's website.
  - >-
    Yes, there are sports stores including Nike and Adidas, both located on
    the first floor.
- source_sentence: Where can I charge my phone? (variation 904)
  sentences:
  - >-
    Yes, reservations for 'The Gourmet Palace' can be made by calling their
    direct line or via their website.
  - >-
    Yes, there is a photography studio on the first floor near the main
    entrance.
  - >-
    Phone charging stations are available throughout the mall, including
    near the food court and at the customer service desk.
- source_sentence: Does the mall have a post office? (variation 1412)
  sentences:
  - >-
    Yes, there is a photography studio on the first floor near the main
    entrance.
  - Yes, there is a game arcade on the third floor next to the cinema.
  - Yes, there is a post office on the ground floor near the west entrance.
---
# SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a [sentence-transformers](https://www.sbert.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) on the train dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

- Model Type: Sentence Transformer
- Base model: [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
- Maximum Sequence Length: 384 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: train
### Model Sources

- Documentation: [Sentence Transformers Documentation](https://www.sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
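The three modules run in sequence: the MPNet transformer emits per-token embeddings, the pooling layer takes a mask-aware mean over them (`pooling_mode_mean_tokens: True`), and `Normalize()` scales the result to unit length so cosine similarity reduces to a dot product. A minimal NumPy sketch of the pooling and normalization stages, using dummy token embeddings in place of the transformer output:

```python
import numpy as np

# Dummy per-token embeddings for one sentence: 5 tokens, 768 dims,
# standing in for the MPNet transformer output.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(5, 768))
# Attention mask: the last token is padding and must not affect the mean.
attention_mask = np.array([1, 1, 1, 1, 0])

# (1) Pooling: mask-aware mean over the real (non-padding) tokens.
mask = attention_mask[:, None]                   # (5, 1)
pooled = (token_embeddings * mask).sum(axis=0) / mask.sum()

# (2) Normalize: scale to unit L2 norm, so the cosine similarity between
# two sentence embeddings is just their dot product.
embedding = pooled / np.linalg.norm(pooled)

print(embedding.shape)                                 # (768,)
print(round(float(np.linalg.norm(embedding)), 6))      # 1.0
```

Masking matters here: without it, padding tokens would be averaged in and shorter sentences would be systematically biased toward the padding embedding.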
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("anomys/gsm-finetunned")

# Run inference
sentences = [
    'Does the mall have a post office? (variation 1412)',
    'Yes, there is a post office on the ground floor near the west entrance.',
    'Yes, there is a game arcade on the third floor next to the cinema.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
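Because the model's final `Normalize()` module emits unit-length vectors, ranking candidate answers against a query reduces to sorting dot products. A self-contained sketch of that ranking step, using small toy vectors in place of `model.encode` output (`rank_by_cosine` is an illustrative helper, not part of the library):

```python
import numpy as np

def rank_by_cosine(query_emb: np.ndarray, candidate_embs: np.ndarray) -> np.ndarray:
    """Return candidate indices sorted from most to least similar."""
    # Normalize so the dot product equals cosine similarity; this mirrors
    # the model's Normalize() module (a no-op on already-unit vectors).
    q = query_emb / np.linalg.norm(query_emb)
    c = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    scores = c @ q
    return np.argsort(-scores)

# Toy 4-dim embeddings standing in for model.encode(...) output:
query = np.array([1.0, 0.0, 0.0, 0.0])
candidates = np.array([
    [0.9, 0.1, 0.0, 0.0],   # close to the query
    [0.0, 1.0, 0.0, 0.0],   # orthogonal to it
    [0.7, 0.7, 0.0, 0.0],   # in between
])
print(rank_by_cosine(query, candidates))  # [0 2 1]
```

In practice you would replace the toy arrays with `model.encode(question)` and `model.encode(answers)` from the snippet above; the ranking logic is unchanged.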
## Training Details

### Training Dataset

#### train

- Dataset: train
- Size: 2,400 training samples
- Columns: `question` and `response`
- Approximate statistics based on the first 1000 samples:

  |         | question                                           | response                                           |
  |---------|----------------------------------------------------|----------------------------------------------------|
  | type    | string                                             | string                                             |
  | details | min: 12 tokens, mean: 15.28 tokens, max: 21 tokens | min: 16 tokens, mean: 21.73 tokens, max: 33 tokens |

- Samples:

  | question | response |
  |----------|----------|
  | Where can I find an ATM in the mall? (variation 643) | ATMs are located on the ground floor next to the information desk and near the west entrance. |
  | Is there a map of the mall available? (variation 701) | Yes, you can find interactive maps on our website and physical maps at the information desks located at each entrance. |
  | Where can I find the customer service desk? (variation 227) | The customer service desk is located on the ground floor near the main entrance. |

- Loss: MultipleNegativesRankingLoss with these parameters:

  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
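With MultipleNegativesRankingLoss, each question is trained to score its own response higher than every other response in the same batch (the in-batch negatives), via cross-entropy over cosine similarities scaled by 20. A minimal NumPy sketch of the objective under the card's parameters (`scale=20.0`, `similarity_fct="cos_sim"`); `mnr_loss` is an illustrative re-implementation, not the library's API:

```python
import numpy as np

def mnr_loss(q: np.ndarray, r: np.ndarray, scale: float = 20.0) -> float:
    """Multiple-negatives ranking loss over a batch of (question, response)
    embedding pairs: row i's positive is response i, and every other
    response in the batch serves as an in-batch negative."""
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    r = r / np.linalg.norm(r, axis=1, keepdims=True)
    logits = scale * (q @ r.T)            # scaled cosine-similarity matrix
    # Cross-entropy with the diagonal (the true pairs) as the targets.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

questions = np.eye(4, 8)    # four orthogonal toy "question" embeddings
matched = np.eye(4, 8)      # each response identical to its question
shuffled = np.roll(matched, 1, axis=0)  # every response paired wrongly

print(round(mnr_loss(questions, matched), 4))   # 0.0
print(round(mnr_loss(questions, shuffled), 2))  # 20.0
```

This also explains the `no_duplicates` batch sampler below: the same (question, response) pair appearing twice in a batch would make a correct response count as a false negative.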
### Evaluation Dataset

#### train

- Dataset: train
- Size: 600 evaluation samples
- Columns: `question` and `response`
- Approximate statistics based on the first 1000 samples:

  |         | question                                           | response                                           |
  |---------|----------------------------------------------------|----------------------------------------------------|
  | type    | string                                             | string                                             |
  | details | min: 12 tokens, mean: 15.22 tokens, max: 21 tokens | min: 16 tokens, mean: 21.35 tokens, max: 33 tokens |

- Samples:

  | question | response |
  |----------|----------|
  | Are there any opticians in the mall? (variation 1802) | Yes, there are opticians including LensCrafters and Visionworks, both located on the first floor. |
  | Is there a map of the mall available? (variation 1191) | Yes, you can find interactive maps on our website and physical maps at the information desks located at each entrance. |
  | Are there any wheelchair-accessible entrances? (variation 1818) | Yes, all main entrances are wheelchair accessible, and we provide complimentary wheelchair rentals at the customer service desk. |

- Loss: MultipleNegativesRankingLoss with these parameters:

  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
### Training Hyperparameters

#### Non-Default Hyperparameters

- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 1
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates
#### All Hyperparameters

<details><summary>Click to expand</summary>

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional

</details>
### Training Logs

| Epoch  | Step | Training Loss | train loss |
|:------:|:----:|:-------------:|:----------:|
| 0.3333 | 50   | 0.0083        | 0.0000     |
| 0.6667 | 100  | 0.0           | 0.0000     |
| 1.0    | 150  | 0.0           | 0.0000     |

(The "train loss" column is the loss on the 600-sample evaluation split, which is also named "train".)
### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.3.0+cu121
- Accelerate: 0.32.1
- Datasets: 2.20.0
- Tokenizers: 0.19.1
## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```