	---
	language:
	- en
	license: apache-2.0
	library_name: sentence-transformers
	tags:
	- sentence-transformers
	- sentence-similarity
	- feature-extraction
	- generated_from_trainer
	- dataset_size:161
	- loss:MatryoshkaLoss
	- loss:MultipleNegativesRankingLoss
	base_model: BAAI/bge-base-en-v1.5
	datasets: []
	metrics:
	- cosine_accuracy@1
	- cosine_accuracy@3
	- cosine_accuracy@5
	- cosine_accuracy@10
	- cosine_precision@1
	- cosine_precision@3
	- cosine_precision@5
	- cosine_precision@10
	- cosine_recall@1
	- cosine_recall@3
	- cosine_recall@5
	- cosine_recall@10
	- cosine_ndcg@10
	- cosine_mrr@10
	- cosine_map@100
	widget:
	- source_sentence: 'As per Part II of the PDPA, Personal Data Protection Commission
	(PDPC) is the

	regulatory body to enforce the provisions of PDPA. The PDPC is empowered with

	broad discretion to issue remedial directions, initiate investigation

	inquiries, and impose fines and penalties on the organisations in case of any

	non-compliance of PDPA.


	1


	If organisations misuse the personal data or hide information concerning its

	collection, use, or disclosure, PDPA states penalties not exceeding **S$50,000

	(approx. $36,000)**.


	2


	Penalty for hindering a PDPC investigation can lead to a fine of not more than

	S$100,000 (approx. $72,000). The PDPA states that companies are also

	liable for their employees’ actions, whether they are aware of them or not.


	3


	New amendments to PDPA have enforced increased financial penalties for

	breaches of the PDPA up to 10% of annual gross turnover in Singapore, or

	S$ 1 million , whichever is higher.


	4


	Non-compliance with specific provisions under the PDPA may also constitute an

	offense, for which a fine or a term of imprisonment may be imposed.


	5


	An individual can bring a private civil action against an organisation for

	having suffered loss or damage directly due to a contravention of the

	provisions of the PDPA.'
	sentences:
	- What is the right to notice under the CCPA?
	- What are the risks of non-compliance with the PDPA?
	- What is the definition of personal data under the PDP Law?
	- source_sentence: The DPA requires all data controllers to take appropriate technical
	and organisational measures that are necessary to protect data from unauthorised
	destruction, negligent loss, unauthorised alteration or access and any other unauthorised
	processing of the data.
	sentences:
	- Which regulatory authority enforces GDPR in France?
	- What are the security requirements under the DPA?
	- How do PIPEDA and GDPR differ?
	- source_sentence: if the data controller or the data processor holds a valid registration
	certificate authorizing him or her to store personal data outside Rwanda
	sentences:
	- What is the difference between GDPR and a Data Protection Act?
	- What is the voluntary certification by the CPPA?
	- Where is personal data storage outside of Rwanda permitted?
	- source_sentence: The PDP law will regulate sensitive personal data as well as other
	personal data that may endanger or harm the privacy of the data subject.
	sentences:
	- What is the material scope of the PDP Law?
	- What is the definition of personal information under the DPA in the Philippines?
	- What does Securiti offer to help with data privacy compliance?
	- source_sentence: Thailand's PDPA applies to any legal entity collecting, using,
	or disclosing a natural (and alive) person's personal data.
	sentences:
	- Who does the Thailand's PDPA apply to?
	- What penalties could an organization face for infringing Kenya's Data Protection
	Act?
	- What is the CPRA?
	pipeline_tag: sentence-similarity
	model-index:
	- name: SentenceTransformer based on BAAI/bge-base-en-v1.5
	results:
	- task:
	type: information-retrieval
	name: Information Retrieval
	dataset:
	name: dim 768
	type: dim_768
	metrics:
	- type: cosine_accuracy@1
	value: 0.5
	name: Cosine Accuracy@1
	- type: cosine_accuracy@3
	value: 0.8333333333333334
	name: Cosine Accuracy@3
	- type: cosine_accuracy@5
	value: 0.9444444444444444
	name: Cosine Accuracy@5
	- type: cosine_accuracy@10
	value: 1.0
	name: Cosine Accuracy@10
	- type: cosine_precision@1
	value: 0.5
	name: Cosine Precision@1
	- type: cosine_precision@3
	value: 0.27777777777777773
	name: Cosine Precision@3
	- type: cosine_precision@5
	value: 0.1888888888888889
	name: Cosine Precision@5
	- type: cosine_precision@10
	value: 0.10000000000000002
	name: Cosine Precision@10
	- type: cosine_recall@1
	value: 0.5
	name: Cosine Recall@1
	- type: cosine_recall@3
	value: 0.8333333333333334
	name: Cosine Recall@3
	- type: cosine_recall@5
	value: 0.9444444444444444
	name: Cosine Recall@5
	- type: cosine_recall@10
	value: 1.0
	name: Cosine Recall@10
	- type: cosine_ndcg@10
	value: 0.736082728585743
	name: Cosine Ndcg@10
	- type: cosine_mrr@10
	value: 0.6515432098765431
	name: Cosine Mrr@10
	- type: cosine_map@100
	value: 0.6515432098765432
	name: Cosine Map@100
	- task:
	type: information-retrieval
	name: Information Retrieval
	dataset:
	name: dim 512
	type: dim_512
	metrics:
	- type: cosine_accuracy@1
	value: 0.5
	name: Cosine Accuracy@1
	- type: cosine_accuracy@3
	value: 0.7777777777777778
	name: Cosine Accuracy@3
	- type: cosine_accuracy@5
	value: 0.9444444444444444
	name: Cosine Accuracy@5
	- type: cosine_accuracy@10
	value: 1.0
	name: Cosine Accuracy@10
	- type: cosine_precision@1
	value: 0.5
	name: Cosine Precision@1
	- type: cosine_precision@3
	value: 0.25925925925925924
	name: Cosine Precision@3
	- type: cosine_precision@5
	value: 0.1888888888888889
	name: Cosine Precision@5
	- type: cosine_precision@10
	value: 0.10000000000000002
	name: Cosine Precision@10
	- type: cosine_recall@1
	value: 0.5
	name: Cosine Recall@1
	- type: cosine_recall@3
	value: 0.7777777777777778
	name: Cosine Recall@3
	- type: cosine_recall@5
	value: 0.9444444444444444
	name: Cosine Recall@5
	- type: cosine_recall@10
	value: 1.0
	name: Cosine Recall@10
	- type: cosine_ndcg@10
	value: 0.744344523828935
	name: Cosine Ndcg@10
	- type: cosine_mrr@10
	value: 0.6626543209876543
	name: Cosine Mrr@10
	- type: cosine_map@100
	value: 0.6626543209876543
	name: Cosine Map@100
	- task:
	type: information-retrieval
	name: Information Retrieval
	dataset:
	name: dim 256
	type: dim_256
	metrics:
	- type: cosine_accuracy@1
	value: 0.5
	name: Cosine Accuracy@1
	- type: cosine_accuracy@3
	value: 0.8888888888888888
	name: Cosine Accuracy@3
	- type: cosine_accuracy@5
	value: 0.8888888888888888
	name: Cosine Accuracy@5
	- type: cosine_accuracy@10
	value: 1.0
	name: Cosine Accuracy@10
	- type: cosine_precision@1
	value: 0.5
	name: Cosine Precision@1
	- type: cosine_precision@3
	value: 0.2962962962962962
	name: Cosine Precision@3
	- type: cosine_precision@5
	value: 0.1777777777777778
	name: Cosine Precision@5
	- type: cosine_precision@10
	value: 0.10000000000000002
	name: Cosine Precision@10
	- type: cosine_recall@1
	value: 0.5
	name: Cosine Recall@1
	- type: cosine_recall@3
	value: 0.8888888888888888
	name: Cosine Recall@3
	- type: cosine_recall@5
	value: 0.8888888888888888
	name: Cosine Recall@5
	- type: cosine_recall@10
	value: 1.0
	name: Cosine Recall@10
	- type: cosine_ndcg@10
	value: 0.7569877225340996
	name: Cosine Ndcg@10
	- type: cosine_mrr@10
	value: 0.6790123456790123
	name: Cosine Mrr@10
	- type: cosine_map@100
	value: 0.6790123456790124
	name: Cosine Map@100
	- task:
	type: information-retrieval
	name: Information Retrieval
	dataset:
	name: dim 128
	type: dim_128
	metrics:
	- type: cosine_accuracy@1
	value: 0.5
	name: Cosine Accuracy@1
	- type: cosine_accuracy@3
	value: 0.8333333333333334
	name: Cosine Accuracy@3
	- type: cosine_accuracy@5
	value: 0.8888888888888888
	name: Cosine Accuracy@5
	- type: cosine_accuracy@10
	value: 0.9444444444444444
	name: Cosine Accuracy@10
	- type: cosine_precision@1
	value: 0.5
	name: Cosine Precision@1
	- type: cosine_precision@3
	value: 0.27777777777777773
	name: Cosine Precision@3
	- type: cosine_precision@5
	value: 0.1777777777777778
	name: Cosine Precision@5
	- type: cosine_precision@10
	value: 0.09444444444444446
	name: Cosine Precision@10
	- type: cosine_recall@1
	value: 0.5
	name: Cosine Recall@1
	- type: cosine_recall@3
	value: 0.8333333333333334
	name: Cosine Recall@3
	- type: cosine_recall@5
	value: 0.8888888888888888
	name: Cosine Recall@5
	- type: cosine_recall@10
	value: 0.9444444444444444
	name: Cosine Recall@10
	- type: cosine_ndcg@10
	value: 0.7291386563584304
	name: Cosine Ndcg@10
	- type: cosine_mrr@10
	value: 0.6589506172839507
	name: Cosine Mrr@10
	- type: cosine_map@100
	value: 0.6604938271604938
	name: Cosine Map@100
	- task:
	type: information-retrieval
	name: Information Retrieval
	dataset:
	name: dim 64
	type: dim_64
	metrics:
	- type: cosine_accuracy@1
	value: 0.4444444444444444
	name: Cosine Accuracy@1
	- type: cosine_accuracy@3
	value: 0.6111111111111112
	name: Cosine Accuracy@3
	- type: cosine_accuracy@5
	value: 0.6666666666666666
	name: Cosine Accuracy@5
	- type: cosine_accuracy@10
	value: 1.0
	name: Cosine Accuracy@10
	- type: cosine_precision@1
	value: 0.4444444444444444
	name: Cosine Precision@1
	- type: cosine_precision@3
	value: 0.2037037037037037
	name: Cosine Precision@3
	- type: cosine_precision@5
	value: 0.13333333333333336
	name: Cosine Precision@5
	- type: cosine_precision@10
	value: 0.10000000000000002
	name: Cosine Precision@10
	- type: cosine_recall@1
	value: 0.4444444444444444
	name: Cosine Recall@1
	- type: cosine_recall@3
	value: 0.6111111111111112
	name: Cosine Recall@3
	- type: cosine_recall@5
	value: 0.6666666666666666
	name: Cosine Recall@5
	- type: cosine_recall@10
	value: 1.0
	name: Cosine Recall@10
	- type: cosine_ndcg@10
	value: 0.6740519326169271
	name: Cosine Ndcg@10
	- type: cosine_mrr@10
	value: 0.5768298059964727
	name: Cosine Mrr@10
	- type: cosine_map@100
	value: 0.5768298059964727
	name: Cosine Map@100
	---

	# SentenceTransformer based on BAAI/bge-base-en-v1.5

	This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

	## Model Details

	### Model Description
	- Model Type: Sentence Transformer
	- Base model: [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
	- Maximum Sequence Length: 512 tokens
	- Output Dimensionality: 768 dimensions
	- Similarity Function: Cosine Similarity
	<!-- - Training Dataset: Unknown -->
	- Language: en
	- License: apache-2.0

	### Model Sources

	- Documentation: [Sentence Transformers Documentation](https://sbert.net)
	- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
	- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

	### Full Model Architecture

	```
	SentenceTransformer(
	  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
	  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	  (2): Normalize()
	)
	```
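The last two modules can be illustrated without loading the model. The following is a minimal NumPy sketch of what CLS pooling followed by normalization does, assuming the transformer output is available as an array of shape `(batch, seq_len, 768)`. The function name and the random stand-in data are hypothetical and not part of the library.

```python
import numpy as np


def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Map (batch, seq_len, hidden) token embeddings to (batch, hidden) unit vectors."""
    # CLS pooling: keep only the embedding of the first token of each sequence.
    cls = token_embeddings[:, 0, :]
    # Normalize: scale every sentence vector to unit L2 length.
    norms = np.linalg.norm(cls, axis=1, keepdims=True)
    return cls / np.clip(norms, 1e-12, None)


# Random numbers stand in for the BertModel output of 3 sentences, 16 tokens each.
rng = np.random.default_rng(0)
fake_hidden_states = rng.normal(size=(3, 16, 768))
sentence_embeddings = cls_pool_and_normalize(fake_hidden_states)
print(sentence_embeddings.shape)
```

Because the vectors have unit length, cosine similarity between two sentences reduces to a plain dot product.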

	## Usage

	### Direct Usage (Sentence Transformers)

	First install the Sentence Transformers library:

	```bash
	pip install -U sentence-transformers
	```

	Then you can load this model and run inference.
	```python
	from sentence_transformers import SentenceTransformer

	# Download from the 🤗 Hub
	model = SentenceTransformer("MugheesAwan11/bge-base-securiti-dataset-1-v5")
	# Run inference
	sentences = [
	    "Thailand's PDPA applies to any legal entity collecting, using, or disclosing a natural (and alive) person's personal data.",
	    "Who does the Thailand's PDPA apply to?",
	    "What penalties could an organization face for infringing Kenya's Data Protection Act?",
	]
	embeddings = model.encode(sentences)
	print(embeddings.shape)
	# [3, 768]

	# Get the similarity scores for the embeddings
	similarities = model.similarity(embeddings, embeddings)
	print(similarities.shape)
	# [3, 3]
	```
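The model was trained with MatryoshkaLoss at 768, 512, 256, 128 and 64 dimensions, so shorter prefixes of each embedding are intended to remain usable. The sketch below shows one way to truncate and re-normalize embeddings by hand. It uses random numbers in place of `model.encode(sentences)`, and the helper name `truncate_embeddings` is hypothetical.

```python
import numpy as np

# Dimensions listed in this card's MatryoshkaLoss parameters.
MATRYOSHKA_DIMS = (768, 512, 256, 128, 64)


def truncate_embeddings(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}, got {dim}")
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)


# Stand-in for `model.encode(sentences)`; real embeddings have the same shape.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 768))

small = truncate_embeddings(embeddings, 256)
similarities = small @ small.T  # cosine similarity, since rows are unit vectors
print(small.shape, similarities.shape)
```

Recent sentence-transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which truncates the output for you; check the documentation of the installed version before relying on it. The evaluation tables in this card suggest that 64 dimensions costs noticeably more retrieval quality than 128 or 256.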

	<!--
	### Direct Usage (Transformers)

	<details><summary>Click to see the direct usage in Transformers</summary>

	</details>
	-->

	<!--
	### Downstream Usage (Sentence Transformers)

	You can finetune this model on your own dataset.

	<details><summary>Click to expand</summary>

	</details>
	-->

	<!--
	### Out-of-Scope Use

	List how the model may foreseeably be misused and address what users ought not to do with the model.
	-->

	## Evaluation

	### Metrics

	#### Information Retrieval
	* Dataset: `dim_768`
	* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

	| Metric | Value |
	|:--------------------|:-----------|
	| cosine_accuracy@1 | 0.5 |
	| cosine_accuracy@3 | 0.8333 |
	| cosine_accuracy@5 | 0.9444 |
	| cosine_accuracy@10 | 1.0 |
	| cosine_precision@1 | 0.5 |
	| cosine_precision@3 | 0.2778 |
	| cosine_precision@5 | 0.1889 |
	| cosine_precision@10 | 0.1 |
	| cosine_recall@1 | 0.5 |
	| cosine_recall@3 | 0.8333 |
	| cosine_recall@5 | 0.9444 |
	| cosine_recall@10 | 1.0 |
	| cosine_ndcg@10 | 0.7361 |
	| cosine_mrr@10 | 0.6515 |
	| cosine_map@100 | 0.6515 |

	#### Information Retrieval
	* Dataset: `dim_512`
	* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

	| Metric | Value |
	|:--------------------|:-----------|
	| cosine_accuracy@1 | 0.5 |
	| cosine_accuracy@3 | 0.7778 |
	| cosine_accuracy@5 | 0.9444 |
	| cosine_accuracy@10 | 1.0 |
	| cosine_precision@1 | 0.5 |
	| cosine_precision@3 | 0.2593 |
	| cosine_precision@5 | 0.1889 |
	| cosine_precision@10 | 0.1 |
	| cosine_recall@1 | 0.5 |
	| cosine_recall@3 | 0.7778 |
	| cosine_recall@5 | 0.9444 |
	| cosine_recall@10 | 1.0 |
	| cosine_ndcg@10 | 0.7443 |
	| cosine_mrr@10 | 0.6627 |
	| cosine_map@100 | 0.6627 |

	#### Information Retrieval
	* Dataset: `dim_256`
	* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

	| Metric | Value |
	|:--------------------|:----------|
	| cosine_accuracy@1 | 0.5 |
	| cosine_accuracy@3 | 0.8889 |
	| cosine_accuracy@5 | 0.8889 |
	| cosine_accuracy@10 | 1.0 |
	| cosine_precision@1 | 0.5 |
	| cosine_precision@3 | 0.2963 |
	| cosine_precision@5 | 0.1778 |
	| cosine_precision@10 | 0.1 |
	| cosine_recall@1 | 0.5 |
	| cosine_recall@3 | 0.8889 |
	| cosine_recall@5 | 0.8889 |
	| cosine_recall@10 | 1.0 |
	| cosine_ndcg@10 | 0.757 |
	| cosine_mrr@10 | 0.679 |
	| cosine_map@100 | 0.679 |

	#### Information Retrieval
	* Dataset: `dim_128`
	* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

	| Metric | Value |
	|:--------------------|:-----------|
	| cosine_accuracy@1 | 0.5 |
	| cosine_accuracy@3 | 0.8333 |
	| cosine_accuracy@5 | 0.8889 |
	| cosine_accuracy@10 | 0.9444 |
	| cosine_precision@1 | 0.5 |
	| cosine_precision@3 | 0.2778 |
	| cosine_precision@5 | 0.1778 |
	| cosine_precision@10 | 0.0944 |
	| cosine_recall@1 | 0.5 |
	| cosine_recall@3 | 0.8333 |
	| cosine_recall@5 | 0.8889 |
	| cosine_recall@10 | 0.9444 |
	| cosine_ndcg@10 | 0.7291 |
	| cosine_mrr@10 | 0.659 |
	| cosine_map@100 | 0.6605 |

	#### Information Retrieval
	* Dataset: `dim_64`
	* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

	| Metric | Value |
	|:--------------------|:-----------|
	| cosine_accuracy@1 | 0.4444 |
	| cosine_accuracy@3 | 0.6111 |
	| cosine_accuracy@5 | 0.6667 |
	| cosine_accuracy@10 | 1.0 |
	| cosine_precision@1 | 0.4444 |
	| cosine_precision@3 | 0.2037 |
	| cosine_precision@5 | 0.1333 |
	| cosine_precision@10 | 0.1 |
	| cosine_recall@1 | 0.4444 |
	| cosine_recall@3 | 0.6111 |
	| cosine_recall@5 | 0.6667 |
	| cosine_recall@10 | 1.0 |
	| cosine_ndcg@10 | 0.6741 |
	| cosine_mrr@10 | 0.5768 |
	| cosine_map@100 | 0.5768 |
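In every table above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k (for example 0.8333 / 3 ≈ 0.2778). That pattern is what one would expect if each query has exactly one relevant passage, although the card does not state this. Under that assumption, the metrics can be sketched as follows; the function name and the example ranks are hypothetical and are not taken from the evaluation data.

```python
from typing import Optional, Sequence


def retrieval_metrics(ranks: Sequence[Optional[int]], k: int) -> dict:
    """Metrics at cutoff k, given the 1-based rank of the single relevant
    document for each query (None if it was not retrieved at all)."""
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    # With one relevant document per query, a hit contributes 1/k to precision.
    precision = sum(h / k for h in hits) / n
    # With one relevant document per query, recall@k coincides with accuracy@k.
    recall = accuracy
    # Reciprocal rank counts only when the document appears within the top k.
    mrr = sum(1.0 / r for r, h in zip(ranks, hits) if h) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "mrr": mrr}


# Four made-up queries whose relevant passage was ranked 1st, 2nd, 1st and 7th.
example_ranks = [1, 2, 1, 7]
metrics_at_3 = retrieval_metrics(example_ranks, k=3)
metrics_at_10 = retrieval_metrics(example_ranks, k=10)
print(metrics_at_3)
print(metrics_at_10)
```

This also explains why `cosine_precision@10` is capped at 0.1 here: with a single relevant passage, at most one of the ten retrieved items can be correct.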

	<!--
	## Bias, Risks and Limitations

	What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.
	-->

	<!--
	### Recommendations

	What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.
	-->

	## Training Details

	### Training Dataset

	#### Unnamed Dataset


	* Size: 161 training samples
	* Columns: <code>positive</code> and <code>anchor</code>
	* Approximate statistics based on the first 1000 samples:
	| | positive | anchor |
	|:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
	| type | string | string |
	| details | <ul><li>min: 5 tokens</li><li>mean: 40.09 tokens</li><li>max: 481 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 13.01 tokens</li><li>max: 24 tokens</li></ul> |
	* Samples:
	| positive | anchor |
	|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------|
	| <code>The DPA may impose administrative fines of up to €10 million, or up to 2%<br>of<br>worldwide turnover. The DPA may also impose heavier fines up to €20 million,<br>or up to 4% of worldwide turnover.</code> | <code>What is the penalty for non-compliance with the GDPR in Italy?</code> |
	| <code>As per the DPA, the data handler must seek consent in writing from the data subject to collect any sensitive personal data.</code> | <code>What are the consent requirements under the DPA?</code> |
	| <code>China's cybersecurity laws include the Cybersecurity Law, which governs<br>various aspects of cybersecurity, data protection, and the obligations of<br>organizations to ensure the security of networks and data within China's<br>territory.</code> | <code>What are the cybersecurity laws in China?</code> |
	* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
	```json
	{
	    "loss": "MultipleNegativesRankingLoss",
	    "matryoshka_dims": [
	        768,
	        512,
	        256,
	        128,
	        64
	    ],
	    "matryoshka_weights": [
	        1,
	        1,
	        1,
	        1,
	        1
	    ],
	    "n_dims_per_step": -1
	}
	```
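To make these parameters concrete, here is a rough NumPy sketch of how a Matryoshka objective combines a ranking loss across truncated embeddings. It is an illustration of the idea, not the library's implementation: the function names are hypothetical, the inputs are random stand-ins, and `scale=20.0` is assumed to match the usual default of `MultipleNegativesRankingLoss`. An `n_dims_per_step` of -1 is taken to mean that all listed dimensions are used at every step.

```python
import numpy as np


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities, using the other
    positives in the batch as negatives for each anchor."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)  # (batch, batch); the diagonal holds the true pairs
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


def matryoshka_loss(anchors, positives,
                    dims=(768, 512, 256, 128, 64),
                    weights=(1, 1, 1, 1, 1)):
    """Weighted sum of the base loss evaluated on each embedding prefix."""
    return sum(
        w * multiple_negatives_ranking_loss(anchors[:, :d], positives[:, :d])
        for d, w in zip(dims, weights)
    )


rng = np.random.default_rng(0)
anchor_emb = rng.normal(size=(8, 768))
random_emb = rng.normal(size=(8, 768))

loss_aligned = matryoshka_loss(anchor_emb, anchor_emb)  # positives match anchors
loss_random = matryoshka_loss(anchor_emb, random_emb)   # positives are unrelated
print(loss_aligned, loss_random)
```

Because every prefix contributes to the loss with equal weight, the model is pushed to place the most useful information in the leading dimensions.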

	### Training Hyperparameters
	#### Non-Default Hyperparameters

	- `eval_strategy`: epoch
	- `per_device_train_batch_size`: 32
	- `per_device_eval_batch_size`: 16
	- `gradient_accumulation_steps`: 2
	- `learning_rate`: 2e-05
	- `num_train_epochs`: 2
	- `lr_scheduler_type`: cosine
	- `warmup_ratio`: 0.1
	- `bf16`: True
	- `tf32`: True
	- `load_best_model_at_end`: True
	- `optim`: adamw_torch_fused
	- `batch_sampler`: no_duplicates
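These settings imply a very short training run. The arithmetic below is a sketch that assumes a single device and that the final partial batch is kept (`dataloader_drop_last` is False in the full list); the `no_duplicates` batch sampler could in principle change the batch count slightly. The variable names are illustrative.

```python
import math

num_samples = 161                  # training set size reported in this card
per_device_train_batch_size = 32
gradient_accumulation_steps = 2
num_train_epochs = 2
warmup_ratio = 0.1

# Samples that contribute to one optimizer update.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps

# Mini-batches per epoch, keeping the final partial batch.
batches_per_epoch = math.ceil(num_samples / per_device_train_batch_size)

# Optimizer updates per epoch and over the whole run.
steps_per_epoch = max(batches_per_epoch // gradient_accumulation_steps, 1)
total_steps = steps_per_epoch * num_train_epochs

# Warmup steps derived from the ratio, rounded up.
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch_size, batches_per_epoch, steps_per_epoch, total_steps, warmup_steps)
```

The result of 3 updates per epoch and 6 in total agrees with the Step column of the training logs, so the model saw only six gradient updates during fine-tuning.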

	#### All Hyperparameters
	<details><summary>Click to expand</summary>

	- `overwrite_output_dir`: False
	- `do_predict`: False
	- `eval_strategy`: epoch
	- `prediction_loss_only`: True
	- `per_device_train_batch_size`: 32
	- `per_device_eval_batch_size`: 16
	- `per_gpu_train_batch_size`: None
	- `per_gpu_eval_batch_size`: None
	- `gradient_accumulation_steps`: 2
	- `eval_accumulation_steps`: None
	- `learning_rate`: 2e-05
	- `weight_decay`: 0.0
	- `adam_beta1`: 0.9
	- `adam_beta2`: 0.999
	- `adam_epsilon`: 1e-08
	- `max_grad_norm`: 1.0
	- `num_train_epochs`: 2
	- `max_steps`: -1
	- `lr_scheduler_type`: cosine
	- `lr_scheduler_kwargs`: {}
	- `warmup_ratio`: 0.1
	- `warmup_steps`: 0
	- `log_level`: passive
	- `log_level_replica`: warning
	- `log_on_each_node`: True
	- `logging_nan_inf_filter`: True
	- `save_safetensors`: True
	- `save_on_each_node`: False
	- `save_only_model`: False
	- `restore_callback_states_from_checkpoint`: False
	- `no_cuda`: False
	- `use_cpu`: False
	- `use_mps_device`: False
	- `seed`: 42
	- `data_seed`: None
	- `jit_mode_eval`: False
	- `use_ipex`: False
	- `bf16`: True
	- `fp16`: False
	- `fp16_opt_level`: O1
	- `half_precision_backend`: auto
	- `bf16_full_eval`: False
	- `fp16_full_eval`: False
	- `tf32`: True
	- `local_rank`: 0
	- `ddp_backend`: None
	- `tpu_num_cores`: None
	- `tpu_metrics_debug`: False
	- `debug`: []
	- `dataloader_drop_last`: False
	- `dataloader_num_workers`: 0
	- `dataloader_prefetch_factor`: None
	- `past_index`: -1
	- `disable_tqdm`: False
	- `remove_unused_columns`: True
	- `label_names`: None
	- `load_best_model_at_end`: True
	- `ignore_data_skip`: False
	- `fsdp`: []
	- `fsdp_min_num_params`: 0
	- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
	- `fsdp_transformer_layer_cls_to_wrap`: None
	- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
	- `deepspeed`: None
	- `label_smoothing_factor`: 0.0
	- `optim`: adamw_torch_fused
	- `optim_args`: None
	- `adafactor`: False
	- `group_by_length`: False
	- `length_column_name`: length
	- `ddp_find_unused_parameters`: None
	- `ddp_bucket_cap_mb`: None
	- `ddp_broadcast_buffers`: False
	- `dataloader_pin_memory`: True
	- `dataloader_persistent_workers`: False
	- `skip_memory_metrics`: True
	- `use_legacy_prediction_loop`: False
	- `push_to_hub`: False
	- `resume_from_checkpoint`: None
	- `hub_model_id`: None
	- `hub_strategy`: every_save
	- `hub_private_repo`: False
	- `hub_always_push`: False
	- `gradient_checkpointing`: False
	- `gradient_checkpointing_kwargs`: None
	- `include_inputs_for_metrics`: False
	- `eval_do_concat_batches`: True
	- `fp16_backend`: auto
	- `push_to_hub_model_id`: None
	- `push_to_hub_organization`: None
	- `mp_parameters`:
	- `auto_find_batch_size`: False
	- `full_determinism`: False
	- `torchdynamo`: None
	- `ray_scope`: last
	- `ddp_timeout`: 1800
	- `torch_compile`: False
	- `torch_compile_backend`: None
	- `torch_compile_mode`: None
	- `dispatch_batches`: None
	- `split_batches`: None
	- `include_tokens_per_second`: False
	- `include_num_input_tokens_seen`: False
	- `neftune_noise_alpha`: None
	- `optim_target_modules`: None
	- `batch_eval_metrics`: False
	- `batch_sampler`: no_duplicates
	- `multi_dataset_batch_sampler`: proportional

	</details>

	### Training Logs
	| Epoch | Step | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
	|:-------:|:-----:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
	| 1.0 | 3 | 0.6510 | 0.6691 | 0.6534 | 0.5641 | 0.6515 |
	| 2.0 | 6 | 0.6605 | 0.679 | 0.6627 | 0.5768 | 0.6515 |

	* The bold row denotes the saved checkpoint.

	### Framework Versions
	- Python: 3.10.14
	- Sentence Transformers: 3.0.1
	- Transformers: 4.41.2
	- PyTorch: 2.1.2+cu121
	- Accelerate: 0.31.0
	- Datasets: 2.19.1
	- Tokenizers: 0.19.1

	## Citation

	### BibTeX

	#### Sentence Transformers
	```bibtex
	@inproceedings{reimers-2019-sentence-bert,
	    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
	    author = "Reimers, Nils and Gurevych, Iryna",
	    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
	    month = "11",
	    year = "2019",
	    publisher = "Association for Computational Linguistics",
	    url = "https://arxiv.org/abs/1908.10084",
	}
	```

	#### MatryoshkaLoss
	```bibtex
	@misc{kusupati2024matryoshka,
	    title={Matryoshka Representation Learning},
	    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
	    year={2024},
	    eprint={2205.13147},
	    archivePrefix={arXiv},
	    primaryClass={cs.LG}
	}
	```

	#### MultipleNegativesRankingLoss
	```bibtex
	@misc{henderson2017efficient,
	    title={Efficient Natural Language Response Suggestion for Smart Reply},
	    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
	    year={2017},
	    eprint={1705.00652},
	    archivePrefix={arXiv},
	    primaryClass={cs.CL}
	}
	```

	<!--
	## Glossary

	Clearly define terms in order to be accessible across audiences.
	-->

	<!--
	## Model Card Authors

	Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.
	-->

	<!--
	## Model Card Contact

	Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.
	-->