evilfreelancer committed
Commit bc29382
1 parent: 1ca0329

Upload 13 files

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
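This config selects plain mean pooling over token embeddings (all other modes are disabled). As a minimal sketch of how it is consumed, assuming the repository has been downloaded locally and a sentence-transformers version matching `config_sentence_transformers.json` (2.7.x) is installed:

```python
# Minimal sketch: rebuild the pooling module from the saved config.
# Assumes the repository files are in the current working directory and a
# sentence-transformers release that knows the `include_prompt` flag (>= 2.7).
from sentence_transformers.models import Pooling

pooling = Pooling.load("1_Pooling")                 # reads 1_Pooling/config.json
print(pooling.get_pooling_mode_str())               # expected: "mean"
print(pooling.get_sentence_embedding_dimension())   # expected: 768
```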
README.md CHANGED
@@ -1,3 +1,128 @@
- ---
- license: mit
- ---
+ ---
+ library_name: sentence-transformers
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ - transformers
+
+ ---
+
+ # {MODEL_NAME}
+
+ This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search.
+
+ <!--- Describe your model here -->
+
+ ## Usage (Sentence-Transformers)
+
+ Using this model is straightforward once you have [sentence-transformers](https://www.SBERT.net) installed:
+
+ ```
+ pip install -U sentence-transformers
+ ```
+
+ Then you can use the model like this:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ sentences = ["This is an example sentence", "Each sentence is converted"]
+
+ model = SentenceTransformer('{MODEL_NAME}')
+ embeddings = model.encode(sentences)
+ print(embeddings)
+ ```
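Since the pipeline tag is `sentence-similarity`, a natural next step is scoring sentence pairs by cosine similarity of their embeddings. A minimal sketch, assuming the `{MODEL_NAME}` placeholder is replaced with the actual model id; the query and candidate sentences are invented for illustration:

```python
# Sketch: rank candidate sentences against a query by cosine similarity.
# '{MODEL_NAME}' is the card's placeholder, not a real model id.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('{MODEL_NAME}')

query = "How do neural networks learn?"
candidates = [
    "Neural networks are trained with gradient descent.",
    "The weather is nice today.",
]

query_emb = model.encode(query, convert_to_tensor=True)
cand_emb = model.encode(candidates, convert_to_tensor=True)

scores = util.cos_sim(query_emb, cand_emb)[0]  # shape: (len(candidates),)
for sentence, score in zip(candidates, scores):
    print(f"{score.item():.3f}  {sentence}")
```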
+
+
+
+ ## Usage (HuggingFace Transformers)
+ Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: first, you pass your input through the transformer model, then you apply the right pooling operation on top of the contextualized word embeddings.
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+ import torch
+
+
+ # Mean Pooling - take the attention mask into account for correct averaging
+ def mean_pooling(model_output, attention_mask):
+     token_embeddings = model_output[0]  # first element of model_output contains all token embeddings
+     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
+
+
+ # Sentences we want sentence embeddings for
+ sentences = ['This is an example sentence', 'Each sentence is converted']
+
+ # Load model from HuggingFace Hub
+ tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
+ model = AutoModel.from_pretrained('{MODEL_NAME}')
+
+ # Tokenize sentences
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
+
+ # Compute token embeddings
+ with torch.no_grad():
+     model_output = model(**encoded_input)
+
+ # Perform pooling. In this case, mean pooling.
+ sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
+
+ print("Sentence embeddings:")
+ print(sentence_embeddings)
+ ```
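To confirm that this manual route really is "the right pooling operation" for this model (per `modules.json` it is just a Transformer followed by mean pooling, with no normalization module), here is a rough sanity-check sketch; `{MODEL_NAME}` is again the card's placeholder:

```python
# Sketch: check that SentenceTransformer.encode and the manual
# Transformers + mean-pooling route agree for this model.
import torch
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, AutoModel

sentences = ["This is an example sentence", "Each sentence is converted"]

st_embeddings = SentenceTransformer('{MODEL_NAME}').encode(sentences, convert_to_tensor=True)

tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
hf_model = AutoModel.from_pretrained('{MODEL_NAME}')
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    token_embeddings = hf_model(**encoded)[0]
mask = encoded["attention_mask"].unsqueeze(-1).float()
manual_embeddings = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

print(torch.allclose(st_embeddings, manual_embeddings, atol=1e-4))  # expected: True
```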
+
+
+
+ ## Evaluation Results
+
+ <!--- Describe how your model was evaluated -->
+
+ For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={MODEL_NAME})
+
+
+ ## Training
+ The model was trained with the following parameters:
+
+ **DataLoader**:
+
+ `torch.utils.data.dataloader.DataLoader` of length 573 with parameters:
+ ```
+ {'batch_size': 64, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
+ ```
+
+ **Loss**:
+
+ `sentence_transformers.losses.MSELoss.MSELoss`
+
+ Parameters of the fit() method:
+ ```
+ {
+     "epochs": 20,
+     "evaluation_steps": 100,
+     "evaluator": "sentence_transformers.evaluation.SequentialEvaluator.SequentialEvaluator",
+     "max_grad_norm": 1,
+     "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
+     "optimizer_params": {
+         "eps": 1e-06,
+         "lr": 2e-05
+     },
+     "scheduler": "WarmupLinear",
+     "steps_per_epoch": null,
+     "warmup_steps": 10000,
+     "weight_decay": 0.01
+ }
+ ```
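Taken together, the `MSELoss` objective, the multilingual BERT student, and the `talks-en-ru-dev.tsv.gz` MSE/translation evaluation files in this commit look like the standard sentence-transformers multilingual distillation recipe, where a student learns to reproduce a teacher's sentence embeddings for parallel English-Russian pairs. The following is only a plausible reconstruction under that assumption: the teacher model id and the training file name are hypothetical, and only the hyperparameters listed above are taken from the card.

```python
# Hypothetical reconstruction of the training setup; not the author's script.
# The teacher model id and the training file name are invented for illustration;
# batch size, loss, evaluators, and fit() arguments follow the card above.
import gzip

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses, evaluation, datasets

# Student: multilingual BERT + mean pooling (matches config.json and 1_Pooling/).
word_emb = models.Transformer("bert-base-multilingual-uncased", max_seq_length=512)
pooling = models.Pooling(word_emb.get_word_embedding_dimension(), pooling_mode_mean_tokens=True)
student = SentenceTransformer(modules=[word_emb, pooling])

# Teacher whose embeddings the student should imitate (hypothetical choice).
teacher = SentenceTransformer("paraphrase-distilroberta-base-v2")

# Parallel "english<TAB>russian" training pairs (hypothetical file name).
train_data = datasets.ParallelSentencesDataset(student_model=student, teacher_model=teacher)
train_data.load_data("talks-en-ru-train.tsv.gz")
train_dataloader = DataLoader(train_data, shuffle=True, batch_size=64)

train_loss = losses.MSELoss(model=student)

# Dev evaluators; with this `name` they write the result files shipped in this commit.
src_sentences, trg_sentences = [], []
with gzip.open("talks-en-ru-dev.tsv.gz", "rt", encoding="utf-8") as f:
    for line in f:
        en, ru = line.strip().split("\t")
        src_sentences.append(en)
        trg_sentences.append(ru)

evaluator = evaluation.SequentialEvaluator([
    evaluation.MSEEvaluator(src_sentences, trg_sentences, teacher_model=teacher,
                            name="talks-en-ru-dev.tsv.gz"),
    evaluation.TranslationEvaluator(src_sentences, trg_sentences,
                                    name="talks-en-ru-dev.tsv.gz"),
])

student.fit(
    train_objectives=[(train_dataloader, train_loss)],
    evaluator=evaluator,
    epochs=20,
    evaluation_steps=100,
    warmup_steps=10000,
    scheduler="WarmupLinear",
    optimizer_params={"lr": 2e-5, "eps": 1e-6},  # optimizer defaults to AdamW
    weight_decay=0.01,
    max_grad_norm=1,
    output_path="output/multilingual-student",   # hypothetical output directory
)
```

With `name="talks-en-ru-dev.tsv.gz"`, the two evaluators would write result files named exactly like the `mse_evaluation_…_results.csv` and `translation_evaluation_…_results.csv` files included in this commit.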
+
+
+ ## Full Model Architecture
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+
+ ## Citing & Authors
+
+ <!--- Describe where people can find more information -->
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "_name_or_path": "bert-base-multilingual-uncased",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "directionality": "bidi",
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "pooler_fc_size": 768,
+   "pooler_num_attention_heads": 12,
+   "pooler_num_fc_layers": 3,
+   "pooler_size_per_head": 128,
+   "pooler_type": "first_token_transform",
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.40.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 105879
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.7.0",
+     "transformers": "4.40.2",
+     "pytorch": "2.3.0+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null
+ }
eval/mse_evaluation_talks-en-ru-dev.tsv.gz_results.csv ADDED
@@ -0,0 +1,121 @@
+ epoch,steps,MSE
+ 0,100,2.8928473591804504
+ 0,200,2.5566164404153824
+ 0,300,2.2533923387527466
+ 0,400,2.1163906902074814
+ 0,500,2.0060131326317787
+ 0,-1,1.9447039812803268
+ 1,100,1.8183402717113495
+ 1,200,1.7153803259134293
+ 1,300,1.6524117439985275
+ 1,400,1.6069402918219566
+ 1,500,1.557883806526661
+ 1,-1,1.5349174849689007
+ 2,100,1.5173410065472126
+ 2,200,1.491781324148178
+ 2,300,1.465703547000885
+ 2,400,1.4556304551661015
+ 2,500,1.4481942169368267
+ 2,-1,1.426389440894127
+ 3,100,1.39577966183424
+ 3,200,1.397869735956192
+ 3,300,1.3791386038064957
+ 3,400,1.3692201115190983
+ 3,500,1.3584598898887634
+ 3,-1,1.3564773835241795
+ 4,100,1.3375140726566315
+ 4,200,1.350619550794363
+ 4,300,1.3260378502309322
+ 4,400,1.3261851854622364
+ 4,500,1.3078000396490097
+ 4,-1,1.3175654225051403
+ 5,100,1.3159305788576603
+ 5,200,1.312108337879181
+ 5,300,1.3263111002743244
+ 5,400,1.2919967994093895
+ 5,500,1.294526644051075
+ 5,-1,1.279385481029749
+ 6,100,1.2747685424983501
+ 6,200,1.2767380103468895
+ 6,300,1.254996843636036
+ 6,400,1.2535860762000084
+ 6,500,1.2596610002219677
+ 6,-1,1.2512801215052605
+ 7,100,1.2597366236150265
+ 7,200,1.2454991228878498
+ 7,300,1.2466656975448132
+ 7,400,1.246230211108923
+ 7,500,1.2257634662091732
+ 7,-1,1.232503354549408
+ 8,100,1.2363849207758904
+ 8,200,1.2266166508197784
+ 8,300,1.2310810387134552
+ 8,400,1.222414243966341
+ 8,500,1.2164912186563015
+ 8,-1,1.219892967492342
+ 9,100,1.227688044309616
+ 9,200,1.2137078680098057
+ 9,300,1.2234610505402088
+ 9,400,1.1979183182120323
+ 9,500,1.2115975841879845
+ 9,-1,1.2157601304352283
+ 10,100,1.1952522210776806
+ 10,200,1.206289790570736
+ 10,300,1.2036706320941448
+ 10,400,1.2108158320188522
+ 10,500,1.2024701572954655
+ 10,-1,1.2090140953660011
+ 11,100,1.2182735837996006
+ 11,200,1.2103605084121227
+ 11,300,1.1928360909223557
+ 11,400,1.177027728408575
+ 11,500,1.1905891820788383
+ 11,-1,1.1902272701263428
+ 12,100,1.186790969222784
+ 12,200,1.186472550034523
+ 12,300,1.191321574151516
+ 12,400,1.1821781285107136
+ 12,500,1.188904233276844
+ 12,-1,1.1863697320222855
+ 13,100,1.1789905838668346
+ 13,200,1.1883879080414772
+ 13,300,1.1763601563870907
+ 13,400,1.170448400080204
+ 13,500,1.180951576679945
+ 13,-1,1.1650272645056248
+ 14,100,1.1710796505212784
+ 14,200,1.2021472677588463
+ 14,300,1.1722701601684093
+ 14,400,1.1790373362600803
+ 14,500,1.1628378182649612
+ 14,-1,1.1735523119568825
+ 15,100,1.1557640507817268
+ 15,200,1.1770031414926052
+ 15,300,1.1650467291474342
+ 15,400,1.1652232147753239
+ 15,500,1.170414499938488
+ 15,-1,1.1694228276610374
+ 16,100,1.1606036685407162
+ 16,200,1.1659459210932255
+ 16,300,1.1610891669988632
+ 16,400,1.1743003502488136
+ 16,500,1.1713393963873386
+ 16,-1,1.1815223842859268
+ 17,100,1.170582603663206
+ 17,200,1.1600999161601067
+ 17,300,1.1696469970047474
+ 17,400,1.1701789684593678
+ 17,500,1.1631796136498451
+ 17,-1,1.1578342877328396
+ 18,100,1.1727497912943363
+ 18,200,1.1746379546821117
+ 18,300,1.1589095927774906
+ 18,400,1.1534282937645912
+ 18,500,1.16147855296731
+ 18,-1,1.1479795910418034
+ 19,100,1.1532638221979141
+ 19,200,1.1479231528937817
+ 19,300,1.1513863690197468
+ 19,400,1.148157473653555
+ 19,500,1.1485524475574493
+ 19,-1,1.1491804383695126
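In this file, `steps = -1` marks the evaluation run at the end of an epoch. The metric itself appears to be the mean squared error between teacher and student embeddings on the dev pairs, reported scaled by 100 as in sentence-transformers' `MSEEvaluator`; that scaling is an assumption inferred from the magnitude of the values. A minimal sketch of the tracked quantity, with hypothetical inputs:

```python
# Sketch of the metric plausibly tracked above: MSE between teacher embeddings
# of the source sentences and student embeddings of the target sentences,
# multiplied by 100 (assumed MSEEvaluator convention).
import numpy as np

def mse_times_100(teacher_embeddings: np.ndarray, student_embeddings: np.ndarray) -> float:
    return float(((teacher_embeddings - student_embeddings) ** 2).mean() * 100)
```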
eval/translation_evaluation_talks-en-ru-dev.tsv.gz_results.csv ADDED
@@ -0,0 +1,121 @@
+ epoch,steps,src2trg,trg2src
+ 0,100,0.764,0.69
+ 0,200,0.776,0.706
+ 0,300,0.785,0.734
+ 0,400,0.793,0.747
+ 0,500,0.8,0.758
+ 0,-1,0.808,0.763
+ 1,100,0.807,0.777
+ 1,200,0.81,0.787
+ 1,300,0.813,0.793
+ 1,400,0.824,0.796
+ 1,500,0.828,0.796
+ 1,-1,0.837,0.801
+ 2,100,0.839,0.806
+ 2,200,0.841,0.804
+ 2,300,0.849,0.808
+ 2,400,0.853,0.812
+ 2,500,0.854,0.809
+ 2,-1,0.858,0.81
+ 3,100,0.865,0.817
+ 3,200,0.875,0.816
+ 3,300,0.877,0.82
+ 3,400,0.877,0.82
+ 3,500,0.882,0.824
+ 3,-1,0.884,0.82
+ 4,100,0.881,0.823
+ 4,200,0.884,0.823
+ 4,300,0.888,0.828
+ 4,400,0.889,0.822
+ 4,500,0.894,0.823
+ 4,-1,0.888,0.812
+ 5,100,0.89,0.819
+ 5,200,0.889,0.82
+ 5,300,0.891,0.82
+ 5,400,0.887,0.826
+ 5,500,0.891,0.823
+ 5,-1,0.892,0.827
+ 6,100,0.891,0.831
+ 6,200,0.893,0.826
+ 6,300,0.893,0.834
+ 6,400,0.895,0.832
+ 6,500,0.896,0.829
+ 6,-1,0.894,0.837
+ 7,100,0.897,0.836
+ 7,200,0.897,0.832
+ 7,300,0.9,0.829
+ 7,400,0.896,0.831
+ 7,500,0.902,0.838
+ 7,-1,0.9,0.834
+ 8,100,0.899,0.842
+ 8,200,0.899,0.842
+ 8,300,0.903,0.839
+ 8,400,0.904,0.845
+ 8,500,0.901,0.841
+ 8,-1,0.898,0.846
+ 9,100,0.901,0.851
+ 9,200,0.901,0.851
+ 9,300,0.9,0.85
+ 9,400,0.904,0.855
+ 9,500,0.906,0.854
+ 9,-1,0.902,0.852
+ 10,100,0.908,0.857
+ 10,200,0.897,0.856
+ 10,300,0.904,0.857
+ 10,400,0.904,0.851
+ 10,500,0.905,0.861
+ 10,-1,0.901,0.856
+ 11,100,0.906,0.856
+ 11,200,0.903,0.862
+ 11,300,0.911,0.864
+ 11,400,0.904,0.867
+ 11,500,0.907,0.867
+ 11,-1,0.906,0.868
+ 12,100,0.907,0.868
+ 12,200,0.914,0.867
+ 12,300,0.902,0.869
+ 12,400,0.909,0.867
+ 12,500,0.908,0.866
+ 12,-1,0.905,0.861
+ 13,100,0.907,0.871
+ 13,200,0.913,0.875
+ 13,300,0.909,0.875
+ 13,400,0.913,0.873
+ 13,500,0.909,0.871
+ 13,-1,0.904,0.88
+ 14,100,0.911,0.871
+ 14,200,0.914,0.876
+ 14,300,0.917,0.877
+ 14,400,0.913,0.871
+ 14,500,0.917,0.878
+ 14,-1,0.914,0.879
+ 15,100,0.917,0.887
+ 15,200,0.91,0.885
+ 15,300,0.908,0.882
+ 15,400,0.915,0.886
+ 15,500,0.914,0.883
+ 15,-1,0.919,0.883
+ 16,100,0.922,0.885
+ 16,200,0.922,0.887
+ 16,300,0.924,0.893
+ 16,400,0.915,0.891
+ 16,500,0.914,0.885
+ 16,-1,0.915,0.887
+ 17,100,0.916,0.889
+ 17,200,0.918,0.888
+ 17,300,0.921,0.882
+ 17,400,0.921,0.89
+ 17,500,0.915,0.887
+ 17,-1,0.918,0.895
+ 18,100,0.916,0.888
+ 18,200,0.916,0.884
+ 18,300,0.922,0.889
+ 18,400,0.921,0.897
+ 18,500,0.92,0.895
+ 18,-1,0.921,0.892
+ 19,100,0.921,0.895
+ 19,200,0.923,0.897
+ 19,300,0.921,0.897
+ 19,400,0.921,0.895
+ 19,500,0.922,0.895
+ 19,-1,0.918,0.895
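`src2trg` and `trg2src` read like the bidirectional retrieval accuracies reported by sentence-transformers' `TranslationEvaluator`: for each sentence pair in the dev set, the aligned translation should be the nearest neighbour by cosine similarity among all candidates on the other side. A minimal sketch of that metric (the function and variable names are hypothetical):

```python
# Sketch: bidirectional translation-matching accuracy, in the spirit of
# sentence_transformers.evaluation.TranslationEvaluator.
import torch
from sentence_transformers import SentenceTransformer, util

def translation_accuracy(model: SentenceTransformer, src_sentences, trg_sentences):
    src_emb = model.encode(src_sentences, convert_to_tensor=True)
    trg_emb = model.encode(trg_sentences, convert_to_tensor=True)
    sims = util.cos_sim(src_emb, trg_emb)               # (n_src, n_trg)
    gold = torch.arange(len(src_sentences), device=sims.device)
    src2trg = (sims.argmax(dim=1) == gold).float().mean().item()
    trg2src = (sims.argmax(dim=0) == gold).float().mean().item()
    return src2trg, trg2src
```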
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0cf601c0bb9ffa9ed108f3383d6aca9776929bf35b211a3919efd8704a6b21bf
+ size 669448040
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,55 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
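Note `do_lower_case: true`: the tokenizer inherits the uncased behaviour of `bert-base-multilingual-uncased`, so input is lowercased before WordPiece splitting. A small sketch, assuming the repository files are checked out in the current working directory:

```python
# Sketch: inspect the uncased multilingual tokenizer shipped with this repo.
# "." assumes the repository has been cloned to the current directory.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(".")
print(tokenizer.tokenize("Привет, Мир!"))  # expect lowercased WordPiece tokens
```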
vocab.txt ADDED
The diff for this file is too large to render. See raw diff