tomaarsen committed
Commit 477b3ec
Parent: 52675bb

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
    "word_embedding_dimension": 768,
    "pooling_mode_cls_token": false,
    "pooling_mode_mean_tokens": true,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
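
This pooling config enables mean pooling only: the sentence embedding is the average of the token embeddings, with padding masked out. A rough sketch of the idea (illustrative only, not the library's actual implementation):

```python
import torch

def mean_pooling(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over real (non-padding) tokens."""
    # token_embeddings: (batch, seq_len, 768); attention_mask: (batch, seq_len)
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(dim=1)  # sum over real tokens only
    counts = mask.sum(dim=1).clamp(min=1e-9)       # number of real tokens
    return summed / counts                         # (batch, 768)
```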
README.md ADDED
@@ -0,0 +1,470 @@
---
language:
- en
library_name: sentence-transformers
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated
base_model: microsoft/mpnet-base
metrics:
- accuracy
widget:
- source_sentence: Many youth are lazy.
  sentences:
  - Lincoln took his hat off.
  - At the end of the fourth century was when baked goods flourished.
  - DOD's common practice for managing this environment has been to create aggressive
    risk reduction efforts in its programs.
- source_sentence: a guy on a bike
  sentences:
  - A man is on a bike.
  - two men sit in a train car
  - She is the boy's aunt.
- source_sentence: The dog is wet.
  sentences:
  - A child and small dog running.
  - The man is riding a sheep.
  - The man is doing a bike trick.
- source_sentence: yeah really no kidding
  sentences:
  - 'Really? No kidding! '
  - yeah i mean just when uh the they military paid for her education
  - Changes were made to the Grant Renewal Application to provide extra information
    to the LSC.
- source_sentence: 'Harlem did a great job '
  sentences:
  - 'Missouri was happy to continue it''s planning efforts. '
  - yeah i mean just when uh the they military paid for her education
  - I know exactly.
pipeline_tag: sentence-similarity
co2_eq_emissions:
  emissions: 18.165192544667764
  source: codecarbon
  training_type: fine-tuning
  on_cloud: false
  cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
  ram_total_size: 31.777088165283203
  hours_used: 0.141
  hardware_used: 1 x NVIDIA GeForce RTX 3090
---

# SentenceTransformer

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/mpnet-base](https://huggingface.co/microsoft/mpnet-base) on the [multi_nli](https://huggingface.co/datasets/nyu-mll/multi_nli), [snli](https://huggingface.co/datasets/stanfordnlp/snli) and [stsb](https://huggingface.co/datasets/mteb/stsbenchmark-sts) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [microsoft/mpnet-base](https://huggingface.co/microsoft/mpnet-base)
- **Maximum Sequence Length:** 384 tokens
- **Output Dimensionality:** 768 dimensions
- **Training Datasets:**
    - [multi_nli](https://huggingface.co/datasets/nyu-mll/multi_nli)
    - [snli](https://huggingface.co/datasets/stanfordnlp/snli)
    - [stsb](https://huggingface.co/datasets/mteb/stsbenchmark-sts)
- **Language:** en
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference:
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("tomaarsen/st-v3-test-mpnet-base-allnli-stsb")
# Run inference
sentences = [
    "Harlem did a great job ",
    "Missouri was happy to continue it's planning efforts. ",
    "yeah i mean just when uh the they military paid for her education",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
```
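
The embeddings can then be compared for semantic similarity. Continuing the snippet above, a minimal sketch using the long-standing `sentence_transformers.util.cos_sim` helper:

```python
from sentence_transformers import util

# Pairwise cosine similarities between the three embeddings above
similarities = util.cos_sim(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```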
<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details

### Training Datasets

#### multi_nli

* Dataset: [multi_nli](https://huggingface.co/datasets/nyu-mll/multi_nli) at [da70db2](https://huggingface.co/datasets/nyu-mll/multi_nli/tree/da70db2af9d09693783c3320c4249840212ee221)
* Size: 10,000 training samples
* Columns: <code>premise</code>, <code>hypothesis</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
  | | premise | hypothesis | label |
  |:--------|:--------|:-----------|:------|
  | type | string | string | int |
  | details | <ul><li>min: 4 tokens</li><li>mean: 26.95 tokens</li><li>max: 189 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 14.11 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>0: ~34.30%</li><li>1: ~28.20%</li><li>2: ~37.50%</li></ul> |
* Samples:
  | premise | hypothesis | label |
  |:--------|:-----------|:------|
  | <code>Conceptually cream skimming has two basic dimensions - product and geography.</code> | <code>Product and geography are what make cream skimming work. </code> | <code>1</code> |
  | <code>you know during the season and i guess at at your level uh you lose them to the next level if if they decide to recall the the parent team the Braves decide to call to recall a guy from triple A then a double A guy goes up to replace him and a single A guy goes up to replace him</code> | <code>You lose the things to the following level if the people recall.</code> | <code>0</code> |
  | <code>One of our number will carry out your instructions minutely.</code> | <code>A member of my team will execute your orders with immense precision.</code> | <code>0</code> |
* Loss: [<code>sentence_transformers.losses.SoftmaxLoss.SoftmaxLoss</code>](https://sbert.net/docs/package_reference/losses.html#softmaxloss)

#### snli

* Dataset: [snli](https://huggingface.co/datasets/stanfordnlp/snli) at [cdb5c3d](https://huggingface.co/datasets/stanfordnlp/snli/tree/cdb5c3d5eed6ead6e5a341c8e56e669bb666725b)
* Size: 10,000 training samples
* Columns: <code>snli_premise</code>, <code>hypothesis</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
  | | snli_premise | hypothesis | label |
  |:--------|:-------------|:-----------|:------|
  | type | string | string | int |
  | details | <ul><li>min: 6 tokens</li><li>mean: 17.38 tokens</li><li>max: 52 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 10.7 tokens</li><li>max: 31 tokens</li></ul> | <ul><li>0: ~33.40%</li><li>1: ~33.30%</li><li>2: ~33.30%</li></ul> |
* Samples:
  | snli_premise | hypothesis | label |
  |:-------------|:-----------|:------|
  | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is training his horse for a competition.</code> | <code>1</code> |
  | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is at a diner, ordering an omelette.</code> | <code>2</code> |
  | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is outdoors, on a horse.</code> | <code>0</code> |
* Loss: [<code>sentence_transformers.losses.SoftmaxLoss.SoftmaxLoss</code>](https://sbert.net/docs/package_reference/losses.html#softmaxloss)

#### stsb

* Dataset: [stsb](https://huggingface.co/datasets/mteb/stsbenchmark-sts) at [8913289](https://huggingface.co/datasets/mteb/stsbenchmark-sts/tree/8913289635987208e6e7c72789e4be2fe94b6abd)
* Size: 5,749 training samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
  | | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type | string | string | float |
  | details | <ul><li>min: 6 tokens</li><li>mean: 10.0 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 9.95 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.54</li><li>max: 1.0</li></ul> |
* Samples:
  | sentence1 | sentence2 | label |
  |:----------|:----------|:------|
  | <code>A plane is taking off.</code> | <code>An air plane is taking off.</code> | <code>1.0</code> |
  | <code>A man is playing a large flute.</code> | <code>A man is playing a flute.</code> | <code>0.76</code> |
  | <code>A man is spreading shreded cheese on a pizza.</code> | <code>A man is spreading shredded cheese on an uncooked pizza.</code> | <code>0.76</code> |
* Loss: [<code>sentence_transformers.losses.CosineSimilarityLoss.CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
      "loss_fct": "torch.nn.modules.loss.MSELoss"
  }
  ```
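
For reference, a minimal sketch of how these two loss types are typically instantiated in Sentence Transformers (the exact training script is not included here, so treat this as illustrative):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CosineSimilarityLoss, SoftmaxLoss

model = SentenceTransformer("microsoft/mpnet-base")

# The NLI datasets have integer labels (entailment / neutral / contradiction),
# so they train a softmax classifier over the paired sentence embeddings.
nli_loss = SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)

# STSB has float labels in [0, 1], so it regresses the cosine similarity of
# the two embeddings against the gold score (MSELoss by default, as above).
stsb_loss = CosineSimilarityLoss(model=model)
```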
### Evaluation Datasets

#### multi_nli

* Dataset: [multi_nli](https://huggingface.co/datasets/nyu-mll/multi_nli) at [da70db2](https://huggingface.co/datasets/nyu-mll/multi_nli/tree/da70db2af9d09693783c3320c4249840212ee221)
* Size: 100 evaluation samples
* Columns: <code>premise</code>, <code>hypothesis</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
  | | premise | hypothesis | label |
  |:--------|:--------|:-----------|:------|
  | type | string | string | int |
  | details | <ul><li>min: 5 tokens</li><li>mean: 27.67 tokens</li><li>max: 138 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 13.48 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>0: ~35.00%</li><li>1: ~31.00%</li><li>2: ~34.00%</li></ul> |
* Samples:
  | premise | hypothesis | label |
  |:--------|:-----------|:------|
  | <code>The new rights are nice enough</code> | <code>Everyone really likes the newest benefits </code> | <code>1</code> |
  | <code>This site includes a list of all award winners and a searchable database of Government Executive articles.</code> | <code>The Government Executive articles housed on the website are not able to be searched.</code> | <code>2</code> |
  | <code>uh i don't know i i have mixed emotions about him uh sometimes i like him but at the same times i love to see somebody beat him</code> | <code>I like him for the most part, but would still enjoy seeing someone beat him.</code> | <code>0</code> |
* Loss: [<code>sentence_transformers.losses.SoftmaxLoss.SoftmaxLoss</code>](https://sbert.net/docs/package_reference/losses.html#softmaxloss)

#### snli

* Dataset: [snli](https://huggingface.co/datasets/stanfordnlp/snli) at [cdb5c3d](https://huggingface.co/datasets/stanfordnlp/snli/tree/cdb5c3d5eed6ead6e5a341c8e56e669bb666725b)
* Size: 9,842 evaluation samples
* Columns: <code>snli_premise</code>, <code>hypothesis</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
  | | snli_premise | hypothesis | label |
  |:--------|:-------------|:-----------|:------|
  | type | string | string | int |
  | details | <ul><li>min: 6 tokens</li><li>mean: 18.44 tokens</li><li>max: 57 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 10.57 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>0: ~33.10%</li><li>1: ~33.30%</li><li>2: ~33.60%</li></ul> |
* Samples:
  | snli_premise | hypothesis | label |
  |:-------------|:-----------|:------|
  | <code>Two women are embracing while holding to go packages.</code> | <code>The sisters are hugging goodbye while holding to go packages after just eating lunch.</code> | <code>1</code> |
  | <code>Two women are embracing while holding to go packages.</code> | <code>Two woman are holding packages.</code> | <code>0</code> |
  | <code>Two women are embracing while holding to go packages.</code> | <code>The men are fighting outside a deli.</code> | <code>2</code> |
* Loss: [<code>sentence_transformers.losses.SoftmaxLoss.SoftmaxLoss</code>](https://sbert.net/docs/package_reference/losses.html#softmaxloss)

#### stsb

* Dataset: [stsb](https://huggingface.co/datasets/mteb/stsbenchmark-sts) at [8913289](https://huggingface.co/datasets/mteb/stsbenchmark-sts/tree/8913289635987208e6e7c72789e4be2fe94b6abd)
* Size: 1,500 evaluation samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
  | | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type | string | string | float |
  | details | <ul><li>min: 5 tokens</li><li>mean: 15.1 tokens</li><li>max: 45 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.11 tokens</li><li>max: 53 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.47</li><li>max: 1.0</li></ul> |
* Samples:
  | sentence1 | sentence2 | label |
  |:----------|:----------|:------|
  | <code>A man with a hard hat is dancing.</code> | <code>A man wearing a hard hat is dancing.</code> | <code>1.0</code> |
  | <code>A young child is riding a horse.</code> | <code>A child is riding a horse.</code> | <code>0.95</code> |
  | <code>A man is feeding a mouse to a snake.</code> | <code>The man is feeding a mouse to the snake.</code> | <code>1.0</code> |
* Loss: [<code>sentence_transformers.losses.CosineSimilarityLoss.CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
      "loss_fct": "torch.nn.modules.loss.MSELoss"
  }
  ```
### Training Hyperparameters
#### Non-Default Hyperparameters

- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- seed: 33
- bf16: True
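
These non-default values correspond to standard Hugging Face `TrainingArguments` fields; a minimal sketch (the `output_dir` name is hypothetical):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="mpnet-base-allnli-stsb",  # hypothetical directory name
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    learning_rate=2e-5,
    num_train_epochs=1,
    warmup_ratio=0.1,
    seed=33,
    bf16=True,  # needs bf16-capable hardware, e.g. the RTX 3090 used here
)
```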
#### All Hyperparameters
<details><summary>Click to expand</summary>

- overwrite_output_dir: False
- do_predict: False
- prediction_loss_only: False
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 33
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: None
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- round_robin_sampler: False

</details>
### Training Logs
| Epoch | Step | Training Loss | multi_nli loss | snli loss | stsb loss |
|:------:|:----:|:-------------:|:--------------:|:---------:|:---------:|
| 0.0493 | 10 | 0.9204 | 1.0998 | 1.1022 | 0.2997 |
| 0.0985 | 20 | 1.0074 | 1.0983 | 1.0971 | 0.2499 |
| 0.1478 | 30 | 1.0037 | 1.0994 | 1.0939 | 0.1667 |
| 0.1970 | 40 | 0.7961 | 1.0945 | 1.0877 | 0.0814 |
| 0.2463 | 50 | 0.9882 | 1.0950 | 1.0806 | 0.0840 |
| 0.2956 | 60 | 0.7814 | 1.0873 | 1.0711 | 0.0681 |
| 0.3448 | 70 | 0.6678 | 1.0829 | 1.0673 | 0.0504 |
| 0.3941 | 80 | 0.7669 | 1.0771 | 1.0638 | 0.0501 |
| 0.4433 | 90 | 0.9718 | 1.0704 | 1.0517 | 0.0482 |
| 0.4926 | 100 | 0.8494 | 1.0609 | 1.0388 | 0.0526 |
| 0.5419 | 110 | 0.745 | 1.0631 | 1.0285 | 0.0527 |
| 0.5911 | 120 | 0.6416 | 1.0564 | 1.0148 | 0.0588 |
| 0.6404 | 130 | 1.0331 | 1.0504 | 1.0026 | 0.0627 |
| 0.6897 | 140 | 0.8305 | 1.0417 | 1.0023 | 0.0664 |
| 0.7389 | 150 | 0.7362 | 1.0282 | 0.9937 | 0.0672 |
| 0.7882 | 160 | 0.7164 | 1.0288 | 0.9930 | 0.0688 |
| 0.8374 | 170 | 0.8217 | 1.0264 | 0.9819 | 0.0677 |
| 0.8867 | 180 | 0.9046 | 1.0200 | 0.9734 | 0.0742 |
| 0.9360 | 190 | 0.5327 | 1.0221 | 0.9764 | 0.0698 |
| 0.9852 | 200 | 0.8974 | 1.0233 | 0.9776 | 0.0691 |

### Environmental Impact
Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
- **Carbon Emitted**: 0.018 kg of CO2
- **Hours Used**: 0.141 hours

### Training Hardware
- **On Cloud**: No
- **GPU Model**: 1 x NVIDIA GeForce RTX 3090
- **CPU Model**: 13th Gen Intel(R) Core(TM) i7-13700K
- **RAM Size**: 31.78 GB

### Framework Versions
- Python: 3.11.6
- Sentence Transformers: 2.7.0.dev0
- Transformers: 4.39.3
- PyTorch: 2.1.0+cu121
- Accelerate: 0.26.1
- Datasets: 2.18.0
- Tokenizers: 0.15.2

## Citation

### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,24 @@
{
  "_name_or_path": "microsoft/mpnet-base",
  "architectures": [
    "MPNetModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "mpnet",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "relative_attention_num_buckets": 32,
  "torch_dtype": "float32",
  "transformers_version": "4.39.3",
  "vocab_size": 30527
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,9 @@
{
  "__version__": {
    "sentence_transformers": "2.7.0.dev0",
    "transformers": "4.39.3",
    "pytorch": "2.1.0+cu121"
  },
  "prompts": {},
  "default_prompt_name": null
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d3cb8b37cb903e8fff0694a014f7a025675929a40ee90b9d5f887df4530a281e
size 437967672
modules.json ADDED
@@ -0,0 +1,14 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  }
]
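
modules.json declares the two-stage pipeline: the Transformer at the repository root, then the Pooling module stored under `1_Pooling`. A minimal sketch of the equivalent manual assembly (assuming the standard `sentence_transformers.models` API):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import Pooling, Transformer

# Module 0: MPNet transformer; Module 1: mean pooling over its token embeddings
word_embedding_model = Transformer("microsoft/mpnet-base", max_seq_length=384)
pooling_model = Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768
    pooling_mode="mean",
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
```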
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 384,
  "do_lower_case": false
}
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "cls_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "<mask>",
    "lstrip": true,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "104": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "30526": {
      "content": "<mask>",
      "lstrip": true,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": true,
  "cls_token": "<s>",
  "do_lower_case": true,
  "eos_token": "</s>",
  "mask_token": "<mask>",
  "model_max_length": 384,
  "pad_token": "<pad>",
  "sep_token": "</s>",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "MPNetTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff