Behrni committed on
Commit 2819e39
1 Parent(s): 1577e8d

End of training
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
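In the pooling config above, only `pooling_mode_cls_token` is enabled: the sentence embedding is simply the hidden state of the first (`[CLS]`) token, and the remaining token vectors are discarded. A minimal sketch with made-up 4-dimensional token vectors (the real model uses 384 dimensions):

```python
# CLS-token pooling: the sentence vector is the hidden state of the first
# ([CLS]) token; all other token vectors are ignored.
def cls_pooling(token_embeddings):
    """token_embeddings: list of per-token vectors, [CLS] first."""
    return token_embeddings[0]

# Hypothetical hidden states for a 3-token sequence (values are made up).
tokens = [[0.1, 0.2, 0.3, 0.4],   # [CLS]
          [0.5, 0.5, 0.5, 0.5],
          [0.9, 0.8, 0.7, 0.6]]   # [SEP]
print(cls_pooling(tokens))  # → [0.1, 0.2, 0.3, 0.4]
```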
README.md ADDED
@@ -0,0 +1,478 @@
+ ---
+ base_model: avsolatorio/GIST-small-Embedding-v0
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy
+ - dot_accuracy
+ - manhattan_accuracy
+ - euclidean_accuracy
+ - max_accuracy
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:3414
+ - loss:MultipleNegativesRankingLoss
+ widget:
+ - source_sentence: For all components of the structural-technical infrastructure,
+     at least the intervals and requirements for inspection and maintenance recommended
+     by the manufacturer or set by standards shall be complied with. Inspections and
+     maintenance work must be recorded. Fire barriers must be checked to see if they
+     are intact. The results must be documented.
+   sentences:
+   - Security perimeters shall be defined and used to protect areas that contain information
+     and other associated assets.
+   - The use of resources shall be monitored and adjusted in line with current and
+     expected capacity requirements.
+   - A.7.1
+ - source_sentence: All employees and external users must be instructed and sensitized
+     in the safe handling of IT, ICS and IoT components, as far as this is relevant
+     for their work contexts. To this end, binding, understandable and up-to-date guidelines
+     for the use of the respective components must be available. If IT, ICS or IoT
+     systems or services are used in a way that contradicts the interests of the institution,
+     this must be communicated.
+   sentences:
+   - Security perimeters shall be defined and used to protect areas that contain information
+     and other associated assets.
+   - Records shall be protected from loss, destruction, falsification, unauthorized
+     access and unauthorized release.
+   - A.5.33
+ - source_sentence: Data Lost Prevention (DLP) systems should be used at network level.
+   sentences:
+   - A.5.15
+   - Information security shall be integrated into project management.
+   - Rules to control physical and logical access to information and other associated
+     assets shall be established and implemented based on business and information
+     security requirements.
+ - source_sentence: Ensure that audit records contain information that establishes
+     the following:a. What type of event occurred;b. When the event occurred;c. Where
+     the event occurred;d. Source of the event;e. Outcome of the event; andf. Identity
+     of any individuals, subjects, or objects/entities associated with the event.
+   sentences:
+   - 'Für alle Arten von Übertragungseinrichtungen innerhalb der
+ 
+     Organisation und zwischen der Organisation und anderen Parteien
+ 
+     müssen Regeln, Verfahren oder Vereinbarungen zur
+ 
+     Informationsübermittlung vorhanden sein.'
+   - A.8.15
+   - 'Protokolle, die Aktivitäten, Ausnahmen, Fehler und andere relevante
+ 
+     Ereignisse aufzeichnen, müssen erstellt, gespeichert, geschützt und
+ 
+     analysiert werden.'
+ - source_sentence: A security incident must inform all affected internal and external
+     bodies in a timely manner. It is necessary to check whether the Data Protection
+     Officer, the Works and Staff Council and employees from the Legal Department need
+     to be involved. Similarly, the reporting requirements for authorities and regulated
+     sectors must be taken into account. It is also necessary to ensure that relevant
+     bodies are informed of the necessary measures.
+   sentences:
+   - Rules to control physical and logical access to information and other associated
+     assets shall be established and implemented based on business and information
+     security requirements.
+   - The organization shall plan and prepare for managing information security incidents
+     by defining, establishing and communicating information security incident management
+     processes, roles and responsibilities.
+   - A.5.24
+ model-index:
+ - name: SentenceTransformer based on avsolatorio/GIST-small-Embedding-v0
+   results:
+   - task:
+       type: triplet
+       name: Triplet
+     dataset:
+       name: GIST small Embedding v0 4 batch 10 epoch all data en unique split robustness
+         42 eval
+       type: GIST-small-Embedding-v0-4_batch_10_epoch_all_data_en_unique_split_robustness_42_eval
+     metrics:
+     - type: cosine_accuracy
+       value: 0.8762006403415155
+       name: Cosine Accuracy
+     - type: dot_accuracy
+       value: 0.09498399146211313
+       name: Dot Accuracy
+     - type: manhattan_accuracy
+       value: 0.8697972251867663
+       name: Manhattan Accuracy
+     - type: euclidean_accuracy
+       value: 0.8762006403415155
+       name: Euclidean Accuracy
+     - type: max_accuracy
+       value: 0.8762006403415155
+       name: Max Accuracy
+ ---
+ 
+ # SentenceTransformer based on avsolatorio/GIST-small-Embedding-v0
+ 
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [avsolatorio/GIST-small-Embedding-v0](https://huggingface.co/avsolatorio/GIST-small-Embedding-v0). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ 
+ ## Model Details
+ 
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [avsolatorio/GIST-small-Embedding-v0](https://huggingface.co/avsolatorio/GIST-small-Embedding-v0) <!-- at revision d6c4190f9e01b9994dc7cac99cf2f2b85cfb57bc -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 384 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+ 
+ ### Model Sources
+ 
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+ 
+ ### Full Model Architecture
+ 
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
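Because the model ends with a `Normalize()` module, output embeddings have unit L2 length, so cosine similarity reduces to a plain dot product. A stdlib sketch of that identity (the 2-dimensional vectors are illustrative):

```python
import math

def normalize(v):
    """Scale a vector to unit (L2) length, as the Normalize() module does."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])
# For unit vectors, cosine similarity is just the dot product.
cos = sum(x * y for x, y in zip(a, b))
print(round(cos, 4))  # → 0.96
```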
+ 
+ ## Usage
+ 
+ ### Direct Usage (Sentence Transformers)
+ 
+ First install the Sentence Transformers library:
+ 
+ ```bash
+ pip install -U sentence-transformers
+ ```
+ 
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+ 
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("GIST-small-Embedding-v0-4_batch_10_epoch_all_data_en_unique_split")
+ # Run inference
+ sentences = [
+     'A security incident must inform all affected internal and external bodies in a timely manner. It is necessary to check whether the Data Protection Officer, the Works and Staff Council and employees from the Legal Department need to be involved. Similarly, the reporting requirements for authorities and regulated sectors must be taken into account. It is also necessary to ensure that relevant bodies are informed of the necessary measures.',
+     'The organization shall plan and prepare for managing information security incidents by defining, establishing and communicating information security incident management processes, roles and responsibilities.',
+     'A.5.24',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 384]
+ 
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+ 
+ <!--
+ ### Direct Usage (Transformers)
+ 
+ <details><summary>Click to see the direct usage in Transformers</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+ 
+ You can finetune this model on your own dataset.
+ 
+ <details><summary>Click to expand</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Out-of-Scope Use
+ 
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+ 
+ ## Evaluation
+ 
+ ### Metrics
+ 
+ #### Triplet
+ * Dataset: `GIST-small-Embedding-v0-4_batch_10_epoch_all_data_en_unique_split_robustness_42_eval`
+ * Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
+ 
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | **cosine_accuracy** | **0.8762** |
+ | dot_accuracy        | 0.095      |
+ | manhattan_accuracy  | 0.8698     |
+ | euclidean_accuracy  | 0.8762     |
+ | max_accuracy        | 0.8762     |
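Triplet accuracy here is the fraction of evaluation triplets for which the anchor scores higher against its positive than against its negative. A stdlib sketch with made-up similarity scores (not values from this evaluation):

```python
# Triplet accuracy: share of (anchor, positive, negative) triplets where
# sim(anchor, positive) > sim(anchor, negative). Scores below are made up.
def triplet_accuracy(pos_scores, neg_scores):
    correct = sum(p > n for p, n in zip(pos_scores, neg_scores))
    return correct / len(pos_scores)

pos = [0.91, 0.75, 0.60, 0.82]   # sim(anchor, positive)
neg = [0.40, 0.80, 0.20, 0.30]   # sim(anchor, negative)
print(triplet_accuracy(pos, neg))  # → 0.75 (3 of 4 triplets ranked correctly)
```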
+ 
+ <!--
+ ## Bias, Risks and Limitations
+ 
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+ 
+ <!--
+ ### Recommendations
+ 
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+ 
+ ## Training Details
+ 
+ ### Training Dataset
+ 
+ #### Unnamed Dataset
+ 
+ * Size: 3,414 training samples
+ * Columns: <code>anchor</code>, <code>positive</code>, <code>ISO_ID</code>, and <code>negative</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | anchor | positive | ISO_ID | negative |
+   |:--------|:-------|:---------|:-------|:---------|
+   | type    | string | string | string | string |
+   | details | <ul><li>min: 3 tokens</li><li>mean: 79.84 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 23.34 tokens</li><li>max: 192 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 6.99 tokens</li><li>max: 7 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 22.91 tokens</li><li>max: 154 tokens</li></ul> |
+ * Samples:
+   | anchor | positive | ISO_ID | negative |
+   |:-------|:---------|:-------|:---------|
+   | <code>System components in the area of responsibility of the Cloud Service Provider for the provision of the cloud service are automatically checked for known vulnerabilities at least once a month in accordance with the policies for handling vulnerabilities (cf. OPS-18), the severity is assessed in accordance with defined criteria and measures for timely remediation or mitigation are initiated within defined time windows.</code> | <code>Information about technical vulnerabilities of information systems in use shall be obtained, the organization’s exposure to such vulnerabilities shall be evaluated and appropriate measures shall be taken.</code> | <code>A.8.8</code> | <code>Information processing facilities shall be implemented with redundancy sufficient to meet availability requirements.</code> |
+   | <code>System components in the area of responsibility of the Cloud Service Provider for the provision of the cloud service are automatically checked for known vulnerabilities at least once a month in accordance with the policies for handling vulnerabilities (cf. OPS-18), the severity is assessed in accordance with defined criteria and measures for timely remediation or mitigation are initiated within defined time windows.</code> | <code>Changes to information processing facilities and information systems shall be subject to change management procedures.</code> | <code>A.8.32</code> | <code>Rules for the effective use of cryptography, including cryptographic key management, shall be defined and implemented.</code> |
+   | <code>The Cloud Service Provider retains the generated log data and keeps these in an appropriate, unchangeable and aggregated form, regardless of the source of such data, so that a central, authorised evaluation of the data is possible. Log data is deleted if it is no longer required for the purpose for which they were collected. <br><br>Between logging servers and the assets to be logged, authentication takes place to protect the integrity and authenticity of the information transmitted and stored. The transfer takes place using state-of-the-art encryption or a dedicated administration network (out-of-band management).</code> | <code>Logs that record activities, exceptions, faults and other relevant events shall be produced, stored, protected and analysed.</code> | <code>A.8.15</code> | <code>Configurations, including security configurations, of hardware, software, services and networks shall be established, documented, implemented, monitored and reviewed.</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
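MultipleNegativesRankingLoss treats the other positives in a batch as negatives: for each anchor it applies a softmax cross-entropy over scaled similarity scores, pushing the true positive's score above the rest. A stdlib sketch for a single anchor, using `scale=20.0` from the config above (the similarity scores themselves are illustrative):

```python
import math

def mnrl_one_anchor(pos_sim, neg_sims, scale=20.0):
    """-log softmax of the positive over [positive] + in-batch negatives."""
    logits = [scale * s for s in [pos_sim] + neg_sims]
    log_denominator = math.log(sum(math.exp(l) for l in logits))
    return log_denominator - logits[0]

# One anchor with its positive (sim 0.5) and three in-batch negatives.
loss = mnrl_one_anchor(0.5, [0.4, 0.3, 0.2])
print(round(loss, 3))  # → 0.145
```

As the positive's similarity pulls ahead of the negatives', the loss falls toward zero, which is exactly the ranking behaviour the training objective rewards.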
+ 
+ ### Evaluation Dataset
+ 
+ #### Unnamed Dataset
+ 
+ * Size: 937 evaluation samples
+ * Columns: <code>anchor</code>, <code>positive</code>, <code>ISO_ID</code>, and <code>negative</code>
+ * Approximate statistics based on the first 937 samples:
+   |         | anchor | positive | ISO_ID | negative |
+   |:--------|:-------|:---------|:-------|:---------|
+   | type    | string | string | string | string |
+   | details | <ul><li>min: 12 tokens</li><li>mean: 76.9 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 41.55 tokens</li><li>max: 495 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 6.91 tokens</li><li>max: 7 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 40.68 tokens</li><li>max: 495 tokens</li></ul> |
+ * Samples:
+   | anchor | positive | ISO_ID | negative |
+   |:-------|:---------|:-------|:---------|
+   | <code>The Cloud Service Provider's internal and external employees are required by the employment terms and conditions to comply with applicable policies and instructions relating to information security.<br><br>The information security policy, and the policies and instructions based on it, are to be acknowledged by the internal and external personnel in a documented form before access is granted to any cloud customer data or system components under the responsibility of the Cloud Service Provider used to provide the cloud service in the production environment.</code> | <code>The employment contractual agreements shall state the personnel’s and the organization’s responsibilities for information security.</code> | <code>A.6.2</code> | <code>The organization shall establish and implement procedures for the identification, collection, acquisition and preservation of evidence related to information security events.</code> |
+   | <code>The Cloud Service Provider has established procedures for inventorying assets.<br><br>The inventory is performed automatically and/or by the people or teams responsible for the assets to ensure complete, accurate, valid and consistent inventory throughout the asset lifecycle.<br><br>Assets are recorded with the information needed to apply the Risk Management Procedure (Cf. OIS-07), including the measures taken to manage these risks throughout the asset lifecycle. Changes to this information are logged.</code> | <code>An inventory of information and other associated assets, including owners, shall be developed and maintained.</code> | <code>A.5.9</code> | <code>Access rights to information and other associated assets shall be provisioned, reviewed, modified and removed in accordance with the organization’s topic-specific policy on and rules for access control.</code> |
+   | <code>The Cloud Service Provider provides a training program for regular, target group-oriented security training and awareness for internal and external employees on standards and methods of secure software development and provision as well as on how to use the tools used for this purpose. The program is regularly reviewed and updated with regard to the applicable policies and instructions, the assigned roles and responsibilities and the tools used.</code> | <code>The organization shall:<br>a) determine the necessary competence of person(s) doing work under its control that affects its information security performance;<br>b) ensure that these persons are competent on the basis of appropriate education, training, or experience;<br>c) where applicable, take actions to acquire the necessary competence, and evaluate the effectiveness of the actions taken; and<br>d) retain appropriate documented information as evidence of competence.<br>NOTE Applicable actions can include, for example: the provision of training to, the mentoring of, or the re- assignment of current employees; or the hiring or contracting of competent persons.</code> | <code>7.2</code> | <code>Knowledge gained from information security incidents shall be used to strengthen and improve the information security controls.</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
+ 
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+ 
+ - `eval_strategy`: epoch
+ - `per_device_train_batch_size`: 4
+ - `per_device_eval_batch_size`: 4
+ - `num_train_epochs`: 10
+ - `warmup_ratio`: 0.1
+ - `bf16`: True
+ - `ddp_find_unused_parameters`: True
+ - `batch_sampler`: no_duplicates
+ 
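With `warmup_ratio: 0.1` and the default `linear` scheduler, the learning rate ramps up from zero over the first 10% of training steps and then decays linearly to zero. A stdlib sketch of that schedule (the total step count here is illustrative; the base LR of 5e-05 matches the hyperparameters listed):

```python
# Linear warmup followed by linear decay, as with warmup_ratio=0.1 and the
# default `linear` scheduler. total_steps below is illustrative.
def lr_at(step, total_steps=1000, base_lr=5e-05, warmup_ratio=0.1):
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / warmup_steps          # ramp up
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay

print(lr_at(50))    # halfway through warmup: half the peak LR
print(lr_at(100))   # end of warmup: peak LR (5e-05)
print(lr_at(1000))  # final step: 0.0
```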
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+ 
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: epoch
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 4
+ - `per_device_eval_batch_size`: 4
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 10
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: True
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: True
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+ 
+ </details>
+ 
+ ### Training Logs
+ | Epoch  | Step | Training Loss | loss   | GIST-small-Embedding-v0-4_batch_10_epoch_all_data_en_unique_split_robustness_42_eval_cosine_accuracy |
+ |:------:|:----:|:-------------:|:------:|:----------------------------------------------------------------------------------------------------:|
+ | 0.9977 | 425  | 1.7795        | 1.4178 | 0.8036 |
+ | 1.9977 | 850  | 1.2852        | 1.1081 | 0.8591 |
+ | 2.9977 | 1275 | 1.0536        | 1.0428 | 0.8698 |
+ | 3.9977 | 1700 | 0.9389        | 1.0188 | 0.8741 |
+ | 4.9977 | 2125 | 0.8879        | 1.0129 | 0.8709 |
+ | 5.9977 | 2550 | 0.8557        | 1.0079 | 0.8698 |
+ | 6.9977 | 2975 | 0.8355        | 1.0076 | 0.8719 |
+ | 7.9977 | 3400 | 0.8151        | 1.0067 | 0.8751 |
+ | 8.9977 | 3825 | 0.8228        | 1.0065 | 0.8751 |
+ | 9.9977 | 4250 | 0.8174        | 1.0067 | 0.8762 |
+ 
+ 
+ ### Framework Versions
+ - Python: 3.10.14
+ - Sentence Transformers: 3.1.0
+ - Transformers: 4.45.1
+ - PyTorch: 2.4.1+cu121
+ - Accelerate: 0.34.2
+ - Datasets: 3.0.1
+ - Tokenizers: 0.20.0
+ 
+ ## Citation
+ 
+ ### BibTeX
+ 
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+ 
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+ 
+ <!--
+ ## Glossary
+ 
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+ 
+ <!--
+ ## Model Card Authors
+ 
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+ 
+ <!--
+ ## Model Card Contact
+ 
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "_name_or_path": "avsolatorio/GIST-small-Embedding-v0",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "bfloat16",
+   "transformers_version": "4.45.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.1.0",
+     "transformers": "4.45.1",
+     "pytorch": "2.4.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d990b6ef6d2741ade04d1fd25e7acf7ac756a3ea52e626af22c15be5b6fa3872
+ size 66742184
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
runs/Oct28_13-36-18_7fc723fca212/events.out.tfevents.1730122584.7fc723fca212.223.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9cc1bcbb81b8294e60eda38b7d9c4d33256644af1f6cce11e7dcfde26a9eae85
+ size 16860
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:941b26cae7eea7bf38d1f26b5ebfcdfdae7b7d832c3f3c8ba7dc99921b254ed7
+ size 5688
vocab.txt ADDED
The diff for this file is too large to render. See raw diff