SentenceTransformer

This is a trained sentence-transformers model. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
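
The Pooling module above has only pooling_mode_mean_tokens enabled, so a sentence embedding is the average of the token embeddings, with padding positions excluded. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name mean_pool, the attention-mask convention (1 = real token, 0 = padding), and the toy 3-dimensional vectors are assumptions for illustration (the real model produces 384 dimensions).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy input: two real tokens followed by one padding token (hypothetical values).
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [100.0, 100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

The padding vector is ignored because its mask entry is 0, so only the first two token vectors contribute to the average.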

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("kperkins411/multi-qa-MiniLM-L6-cos-v1_triplet")
# Run inference
sentences = [
    'sponsor acknowledges and agrees that, notwithstanding the grant of exclusivity set forth in this section 4, team shall have the right to solicit and enter into sponsorships with other parties that are not known primarily or exclusively as suppliers or providers of any product or service within the product and services category.',
    "what does 'foregoing restriction' refer to specifically within the context of sponsorships?",
    'for the avoidance of doubt, the parties acknowledge that the foregoing restriction applies only to persistent sponsorship placement as judged by sponsor at its discretion, and not to run-of-site banner advertisements or other rotating promotional placements.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
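
Since this model's similarity function is cosine similarity, each entry of the similarity matrix is the cosine of the angle between two embeddings. The sketch below reproduces that computation in pure Python on toy 2-dimensional vectors; the helper names cosine_similarity and similarity_matrix are hypothetical and the vectors are made up, so treat it as an illustration of the scoring rather than the library's code.

```python
import math
from typing import List


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings: List[List[float]]) -> List[List[float]]:
    """Pairwise cosine similarities; entry [i][j] compares embedding i with j."""
    return [[cosine_similarity(u, v) for v in embeddings] for u in embeddings]


# Toy embeddings (hypothetical): rows 0 and 2 point in the same direction.
toy = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]
matrix = similarity_matrix(toy)
for row in matrix:
    print(row)
```

Vectors that point in the same direction score 1 regardless of length, and orthogonal vectors score 0, which is why cosine similarity suits comparing embeddings of texts with very different lengths.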

Evaluation

Metrics

Triplet (dataset: all-nli-dev)

Metric Value
cosine_accuracy 0.5287
dot_accuracy 0.4732
manhattan_accuracy 0.5104
euclidean_accuracy 0.5142
max_accuracy 0.5287

Triplet (dataset: all-nli-test)

Metric Value
cosine_accuracy 0.5291
dot_accuracy 0.4709
manhattan_accuracy 0.5101
euclidean_accuracy 0.5154
max_accuracy 0.5291
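
To read these numbers, it helps to spell out what a triplet accuracy measures. The sketch below assumes the accuracy is the fraction of (anchor, positive, negative) triplets in which the anchor is strictly closer to the positive than to the negative under the chosen distance. The function names and toy vectors are hypothetical. Under that assumption, a value near 0.5 is about what random ordering would give.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Triplet = Tuple[Vector, Vector, Vector]  # (anchor, positive, negative)


def euclidean_distance(u: Vector, v: Vector) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u: Vector, v: Vector) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_accuracy(triplets: List[Triplet], distance: Callable[[Vector, Vector], float]) -> float:
    """Share of triplets where the positive is closer to the anchor than the negative."""
    correct = sum(1 for a, p, n in triplets if distance(a, p) < distance(a, n))
    return correct / len(triplets)


# Toy triplets (hypothetical embeddings).
toy_triplets = [
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),  # positive is closer: counted as correct
    ([1.0, 0.0], [0.0, 1.0], [0.9, 0.1]),  # negative is closer: counted as incorrect
]
cosine_acc = triplet_accuracy(toy_triplets, cosine_distance)
euclidean_acc = triplet_accuracy(toy_triplets, euclidean_distance)
print(cosine_acc, euclidean_acc)
```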

Training Details

Training Dataset

Unnamed Dataset

  • Size: 32,621 training samples
  • Columns: negative, anchor, and positive
  • Approximate statistics based on the first 1000 samples:
    • negative (string): min 6 tokens, mean 80.74 tokens, max 512 tokens
    • anchor (string): min 5 tokens, mean 17.19 tokens, max 167 tokens
    • positive (string): min 6 tokens, mean 101.64 tokens, max 512 tokens
  • Samples:
    • Sample 1
      • negative: c. the obligations specified in this article shall not apply to information for which the receiving party can reasonably demonstrate that such information: iii. becomes known to the receiving party through disclosure by sources other than the disclosing party, having a right to disclose such information,
      • anchor: what safeguards are in place to protect the information obtained from third-party sources?
      • positive: information we collect from other sources we may also receive information from other sources and combine that with information we collect through our services. for example: if you choose to link, create, or log in to your uber account with a payment provider (e.g., google wallet) or social media service (e.g., facebook), or if you engage with a separate app or website that uses our api (or whose api we use), we may receive information about you or your connections from that site or app.
    • Sample 2
      • negative: 3.2 manufacturing standards the manufacturer covenants that it is and will remain for the term of this agreement in compliance with all international standards in production and manufacturing.
      • anchor: is there a guarantee from the manufacturers regarding the conformity of the items to the mutually approved written standards for a certain duration?
      • positive: each of the suppliers warrants that the products shall comply with the specifications and documentation agreed by the relevant supplier and the company in writing that is applicable to such products for the warranty period.
    • Sample 3
      • negative: planetcad hereby grants to dassault systemes a fully-paid, non-exclusive, worldwide, revocable limited license to the server software and infrastructure for the sole purpose of (i) hosting the co-branded service and (ii) fulfilling itsobligations under this agreement.
      • anchor: what type of authorization has the video conferencing service provided to the british virgin islands-based entity and its associated organization regarding their intellectual property, with respect to the customized software and web platform, including the conditions for customer access to enhanced functionalities that incur additional charges?
      • positive: skype hereby grants to online bvi and the company a limited, non-exclusive, non-sublicensable (except as set forth herein), non-transferable, non-assignable (except as provided in section 14.4), royalty-free (but subject to the provisions of section 5), license during the term to use, market, provide access to, promote, reproduce and display the skype intellectual property solely (i) as incorporated in the company-skype branded application and/or the company-skype toolbar, and (ii) as incorporated in, for the development of, and for transmission pursuant to this agreement of, the company-skype branded content and the company-skype branded web site, in each case for the sole purposes (unless otherwise mutually agreed by the parties) of promoting and distributing, pursuant to this agreement, the company-skype branded application, the company-skype toolbar, the company-skype branded content and the company-skype branded web site in the territory; (a) provided, that it is understood that the company-skype branded customers will have the right under the eula to use the company- skype branded application and the company-skype toolbar and will have the right to access the company-skype branded content, the company-skype branded web site and the online bvi web site through the internet and to otherwise receive support from the company anywhere in the world, and that the company shall be permitted to provide access to and reproduce and display the skype intellectual property through the internet anywhere in the world, and (b) provided further, that online bvi and the company shall ensure that no company-skype branded customer (or potential company-skype branded customer) shall be permitted to access, using the company-skype branded application or the company-skype toolbar or through the company-skype branded web site, any skype premium features requiring payment by the company-skype branded customer (or potential company-skype branded customer), including, but not limited to, skypein, skypeout, or skype plus, unless such company-skype branded customer (or potential company-skype branded customer) uses the payment methods made available by the company pursuant to section 2.5 for the purchase of such premium features.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
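
With the Euclidean distance metric and a margin of 5, the per-triplet loss takes the standard triplet-loss form max(d(anchor, positive) - d(anchor, negative) + margin, 0). The sketch below works through that formula in pure Python; the function names and toy vectors are hypothetical, and it omits the batch averaging that a training loop would apply.

```python
import math
from typing import Sequence


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin: float = 5.0) -> float:
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0) with Euclidean d."""
    gap = euclidean_distance(anchor, positive) - euclidean_distance(anchor, negative)
    return max(gap + margin, 0.0)


anchor = [0.0, 0.0]
positive = [3.0, 4.0]        # distance 5 from the anchor
easy_negative = [6.0, 8.0]   # distance 10: already a full margin farther than the positive
hard_negative = [0.0, 1.0]   # distance 1: closer to the anchor than the positive

easy_loss = triplet_loss(anchor, positive, easy_negative)
hard_loss = triplet_loss(anchor, positive, hard_negative)
print(easy_loss, hard_loss)
```

The loss is zero only once the negative is at least the margin farther from the anchor than the positive, so a margin of 5 in Euclidean space asks for a wide separation between the two.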
    

Evaluation Dataset

Unnamed Dataset

  • Size: 2,641 evaluation samples
  • Columns: negative, anchor, and positive
  • Approximate statistics based on the first 1000 samples:
    • negative (string): min 6 tokens, mean 83.63 tokens, max 512 tokens
    • anchor (string): min 6 tokens, mean 18.61 tokens, max 512 tokens
    • positive (string): min 6 tokens, mean 98.17 tokens, max 512 tokens
  • Samples:
    • Sample 1
      • negative: this agreement shall be governed by, and construed in accordance with the law of the state of new york.
      • anchor: are there any exceptions to the governing law stated?
      • positive: this agreement shall be governed by the laws of the state of california, without regard to the conflicts of law provisions of any jurisdiction.
    • Sample 2
      • negative: you consent to the third party use, sharing and transfer of your personal information (both inside and outside of your jurisdiction) as described in this section. these third parties will use personal information to provide services to us and for their own internal use, including analytics use. we allow third parties such as analytics providers and advertising partners to collect your personal information over time and across different websites or online services when you use our services.
      • anchor: collection of personal data legal basis?
      • positive: 15. notice for malaysia residents close in view of the implementation of the personal data protection act 2010 ("act"), sony mobile recognises the need to process all personal data obtained in a lawful and appropriate manner. the legal responsibility for compliance with the act lies with sony mobile, which is the "data user" under the act. compliance with this privacy policy and the act is the responsibility of all employees of sony mobile. as and when sony mobile is required to collect personal data, sony mobile and its employees must abide by the requirements of this privacy policy and the act. in the context of the act, "processing" is defined as including the collection, recording, holding or storing of personal data which includes, inter alia, nric numbers, home address and contact details.
    • Sample 3
      • negative: you can prevent peel from showing you targeted ads by sending an email to privacy@peel.com and asking to opt-out of targeted advertising. opting-out will only prevent targeted ads from being displayed so you may continue to see generic (non-targeted) ads from peel after you opt-out. for more information on interest-based ads or to stop use of tracking technologies for these purposes, go to www.aboutads.info or www.networkadvertising.org.
      • anchor: how does one opt out from third-party analytics providers?
      • positive: when you use our services, we collect the following information: information about your device (including device model, os version and operator's name), time and date of the connection to the game and/or service, ip or mac address, international mobile equipment id (imei), android id, device mac address, cookie information. we also from time-to-time use services provided by third party companies that might collect information from you, and you can opt-out from this. follow the directions provided by our other third party analytics provider located at http://www.flurry.com/user-opt-out.html, https://help.chartboost.com/legal/privacy, http://privacy.adcolony.com/, http://info.tapjoy.com/about-tapjoy/privacy-policy/, http://sponsorpay.com/. if you "opt out" with our third party analytics providers, that action is specific to the information we collect specifically for that provider, and does not limit our ability to collect information from you, under the terms of this privacy policy, for other third parties.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
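
These settings imply a learning-rate schedule that can be worked out by hand. With 32,621 training samples and a batch size of 64, an epoch is about 510 optimizer steps, so 4 epochs give 2,040 steps, matching the final step in the training logs. The sketch below assumes the linear scheduler listed under All Hyperparameters: the rate rises linearly from 0 to 2e-05 over the first 10% of steps, then decays linearly to 0. The function name and the rounding of the warmup length are assumptions, not the trainer's code.

```python
import math


def linear_warmup_lr(step: int, total_steps: int, warmup_steps: int, base_lr: float = 2e-5) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


steps_per_epoch = math.ceil(32621 / 64)   # samples / per_device_train_batch_size
total_steps = steps_per_epoch * 4         # num_train_epochs
# Assumption: the warmup length is warmup_ratio * total_steps, rounded to an integer.
warmup_steps = int(round(total_steps * 0.1))

for step in (0, 100, warmup_steps, 1000, total_steps):
    print(step, linear_warmup_lr(step, total_steps, warmup_steps))
```

Under these assumptions the peak rate of 2e-05 is reached at step 204, roughly 40% of the way through the first epoch.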

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step  Training Loss  Validation Loss  all-nli-dev_max_accuracy  all-nli-test_max_accuracy
0       0     -              -                0.7235                    -
0.1961  100   4.9029         3.1938           0.6058                    -
0.3922  200   2.4204         1.5424           0.5507                    -
0.5882  300   1.6076         1.0643           0.5344                    -
0.7843  400   1.3142         0.8831           0.5351                    -
0.9804  500   1.1919         0.7455           0.5435                    -
1.1745  600   1.0824         0.6599           0.5427                    -
1.3706  700   0.963          0.6360           0.5518                    -
1.5667  800   0.8922         0.6131           0.5397                    -
1.7627  900   0.8417         0.5900           0.5302                    -
1.9588  1000  0.8165         0.5662           0.5253                    -
2.1529  1100  0.7774         0.5192           0.5177                    -
2.3490  1200  0.7394         0.5158           0.5363                    -
2.5451  1300  0.7003         0.5185           0.5363                    -
2.7412  1400  0.6636         0.5004           0.5310                    -
2.9373  1500  0.6586         0.4872           0.5302                    -
3.1314  1600  0.6831         0.4687           0.5306                    -
3.3275  1700  0.6494         0.4667           0.5268                    -
3.5235  1800  0.624          0.4750           0.5321                    -
3.7196  1900  0.6035         0.4735           0.5264                    -
3.9157  2000  0.6136         0.4679           0.5287                    -
3.9941  2040  -              -                -                         0.5291

Framework Versions

  • Python: 3.11.9
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.1.2+cu121
  • Accelerate: 0.31.0
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification}, 
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Safetensors

  • Model size: 22.7M params
  • Tensor type: F32