evilfreelancer committed
Commit bc29382
1 parent: 1ca0329

Upload 13 files

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
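This config selects plain mean pooling over token embeddings (all other modes are disabled). As a minimal sketch of how it is consumed, assuming the repository has been downloaded locally and a sentence-transformers version matching `config_sentence_transformers.json` (2.7.x) is installed:

```python
# Minimal sketch: rebuild the pooling module from the saved config.
# Assumes the repository files are in the current working directory and a
# sentence-transformers release that knows the `include_prompt` flag (>= 2.7).
from sentence_transformers.models import Pooling

pooling = Pooling.load("1_Pooling")                 # reads 1_Pooling/config.json
print(pooling.get_pooling_mode_str())               # expected: "mean"
print(pooling.get_sentence_embedding_dimension())   # expected: 768
```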
README.md CHANGED
@@ -1,3 +1,128 @@
- ---
- license: mit
- ---
+ ---
+ library_name: sentence-transformers
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ - transformers
+
+ ---
+
+ # {MODEL_NAME}
+
+ This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search.
+
+ <!--- Describe your model here -->
+
+ ## Usage (Sentence-Transformers)
+
+ Using this model is straightforward once you have [sentence-transformers](https://www.SBERT.net) installed:
+
+ ```
+ pip install -U sentence-transformers
+ ```
+
+ Then you can use the model like this:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ sentences = ["This is an example sentence", "Each sentence is converted"]
+
+ model = SentenceTransformer('{MODEL_NAME}')
+ embeddings = model.encode(sentences)
+ print(embeddings)
+ ```
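Since the pipeline tag is `sentence-similarity`, a natural next step is scoring sentence pairs by cosine similarity of their embeddings. A minimal sketch, assuming the `{MODEL_NAME}` placeholder is replaced with the actual model id; the query and candidate sentences are invented for illustration:

```python
# Sketch: rank candidate sentences against a query by cosine similarity.
# '{MODEL_NAME}' is the card's placeholder, not a real model id.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('{MODEL_NAME}')

query = "How do neural networks learn?"
candidates = [
    "Neural networks are trained with gradient descent.",
    "The weather is nice today.",
]

query_emb = model.encode(query, convert_to_tensor=True)
cand_emb = model.encode(candidates, convert_to_tensor=True)

scores = util.cos_sim(query_emb, cand_emb)[0]  # shape: (len(candidates),)
for sentence, score in zip(candidates, scores):
    print(f"{score.item():.3f}  {sentence}")
```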
+
+
+
+ ## Usage (HuggingFace Transformers)
+ Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: first, you pass your input through the transformer model, then you apply the right pooling operation on top of the contextualized word embeddings.
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+ import torch
+
+
+ # Mean Pooling - take the attention mask into account for correct averaging
+ def mean_pooling(model_output, attention_mask):
+     token_embeddings = model_output[0]  # first element of model_output contains all token embeddings
+     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
+
+
+ # Sentences we want sentence embeddings for
+ sentences = ['This is an example sentence', 'Each sentence is converted']
+
+ # Load model from HuggingFace Hub
+ tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
+ model = AutoModel.from_pretrained('{MODEL_NAME}')
+
+ # Tokenize sentences
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
+
+ # Compute token embeddings
+ with torch.no_grad():
+     model_output = model(**encoded_input)
+
+ # Perform pooling. In this case, mean pooling.
+ sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
+
+ print("Sentence embeddings:")
+ print(sentence_embeddings)
+ ```
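To confirm that this manual route really is "the right pooling operation" for this model (per `modules.json` it is just a Transformer followed by mean pooling, with no normalization module), here is a rough sanity-check sketch; `{MODEL_NAME}` is again the card's placeholder:

```python
# Sketch: check that SentenceTransformer.encode and the manual
# Transformers + mean-pooling route agree for this model.
import torch
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, AutoModel

sentences = ["This is an example sentence", "Each sentence is converted"]

st_embeddings = SentenceTransformer('{MODEL_NAME}').encode(sentences, convert_to_tensor=True)

tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
hf_model = AutoModel.from_pretrained('{MODEL_NAME}')
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    token_embeddings = hf_model(**encoded)[0]
mask = encoded["attention_mask"].unsqueeze(-1).float()
manual_embeddings = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

print(torch.allclose(st_embeddings, manual_embeddings, atol=1e-4))  # expected: True
```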
+
+
+
+ ## Evaluation Results
+
+ <!--- Describe how your model was evaluated -->
+
+ For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={MODEL_NAME})
+
+
+ ## Training
+ The model was trained with the following parameters:
+
+ **DataLoader**:
+
+ `torch.utils.data.dataloader.DataLoader` of length 573 with parameters:
+ ```
+ {'batch_size': 64, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
+ ```
+
+ **Loss**:
+
+ `sentence_transformers.losses.MSELoss.MSELoss`
+
+ Parameters of the fit() method:
+ ```
+ {
+     "epochs": 20,
+     "evaluation_steps": 100,
+     "evaluator": "sentence_transformers.evaluation.SequentialEvaluator.SequentialEvaluator",
+     "max_grad_norm": 1,
+     "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
+     "optimizer_params": {
+         "eps": 1e-06,
+         "lr": 2e-05
+     },
+     "scheduler": "WarmupLinear",
+     "steps_per_epoch": null,
+     "warmup_steps": 10000,
+     "weight_decay": 0.01
+ }
+ ```
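Taken together, the `MSELoss` objective, the multilingual BERT student, and the `talks-en-ru-dev.tsv.gz` MSE/translation evaluation files in this commit look like the standard sentence-transformers multilingual distillation recipe, where a student learns to reproduce a teacher's sentence embeddings for parallel English-Russian pairs. The following is only a plausible reconstruction under that assumption: the teacher model id and the training file name are hypothetical, and only the hyperparameters listed above are taken from the card.

```python
# Hypothetical reconstruction of the training setup; not the author's script.
# The teacher model id and the training file name are invented for illustration;
# batch size, loss, evaluators, and fit() arguments follow the card above.
import gzip

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses, evaluation, datasets

# Student: multilingual BERT + mean pooling (matches config.json and 1_Pooling/).
word_emb = models.Transformer("bert-base-multilingual-uncased", max_seq_length=512)
pooling = models.Pooling(word_emb.get_word_embedding_dimension(), pooling_mode_mean_tokens=True)
student = SentenceTransformer(modules=[word_emb, pooling])

# Teacher whose embeddings the student should imitate (hypothetical choice).
teacher = SentenceTransformer("paraphrase-distilroberta-base-v2")

# Parallel "english<TAB>russian" training pairs (hypothetical file name).
train_data = datasets.ParallelSentencesDataset(student_model=student, teacher_model=teacher)
train_data.load_data("talks-en-ru-train.tsv.gz")
train_dataloader = DataLoader(train_data, shuffle=True, batch_size=64)

train_loss = losses.MSELoss(model=student)

# Dev evaluators; with this `name` they write the result files shipped in this commit.
src_sentences, trg_sentences = [], []
with gzip.open("talks-en-ru-dev.tsv.gz", "rt", encoding="utf-8") as f:
    for line in f:
        en, ru = line.strip().split("\t")
        src_sentences.append(en)
        trg_sentences.append(ru)

evaluator = evaluation.SequentialEvaluator([
    evaluation.MSEEvaluator(src_sentences, trg_sentences, teacher_model=teacher,
                            name="talks-en-ru-dev.tsv.gz"),
    evaluation.TranslationEvaluator(src_sentences, trg_sentences,
                                    name="talks-en-ru-dev.tsv.gz"),
])

student.fit(
    train_objectives=[(train_dataloader, train_loss)],
    evaluator=evaluator,
    epochs=20,
    evaluation_steps=100,
    warmup_steps=10000,
    scheduler="WarmupLinear",
    optimizer_params={"lr": 2e-5, "eps": 1e-6},  # optimizer defaults to AdamW
    weight_decay=0.01,
    max_grad_norm=1,
    output_path="output/multilingual-student",   # hypothetical output directory
)
```

With `name="talks-en-ru-dev.tsv.gz"`, the two evaluators would write result files named exactly like the `mse_evaluation_…_results.csv` and `translation_evaluation_…_results.csv` files included in this commit.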
+
+
+ ## Full Model Architecture
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+
+ ## Citing & Authors
+
+ <!--- Describe where people can find more information -->
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "_name_or_path": "bert-base-multilingual-uncased",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "directionality": "bidi",
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "pooler_fc_size": 768,
+   "pooler_num_attention_heads": 12,
+   "pooler_num_fc_layers": 3,
+   "pooler_size_per_head": 128,
+   "pooler_type": "first_token_transform",
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.40.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 105879
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.7.0",
+     "transformers": "4.40.2",
+     "pytorch": "2.3.0+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null
+ }
eval/mse_evaluation_talks-en-ru-dev.tsv.gz_results.csv ADDED
@@ -0,0 +1,121 @@
+ epoch,steps,MSE
+ 0,100,2.8928473591804504
+ 0,200,2.5566164404153824
+ 0,300,2.2533923387527466
+ 0,400,2.1163906902074814
+ 0,500,2.0060131326317787
+ 0,-1,1.9447039812803268
+ 1,100,1.8183402717113495
+ 1,200,1.7153803259134293
+ 1,300,1.6524117439985275
+ 1,400,1.6069402918219566
+ 1,500,1.557883806526661
+ 1,-1,1.5349174849689007
+ 2,100,1.5173410065472126
+ 2,200,1.491781324148178
+ 2,300,1.465703547000885
+ 2,400,1.4556304551661015
+ 2,500,1.4481942169368267
+ 2,-1,1.426389440894127
+ 3,100,1.39577966183424
+ 3,200,1.397869735956192
+ 3,300,1.3791386038064957
+ 3,400,1.3692201115190983
+ 3,500,1.3584598898887634
+ 3,-1,1.3564773835241795
+ 4,100,1.3375140726566315
+ 4,200,1.350619550794363
+ 4,300,1.3260378502309322
+ 4,400,1.3261851854622364
+ 4,500,1.3078000396490097
+ 4,-1,1.3175654225051403
+ 5,100,1.3159305788576603
+ 5,200,1.312108337879181
+ 5,300,1.3263111002743244
+ 5,400,1.2919967994093895
+ 5,500,1.294526644051075
+ 5,-1,1.279385481029749
+ 6,100,1.2747685424983501
+ 6,200,1.2767380103468895
+ 6,300,1.254996843636036
+ 6,400,1.2535860762000084
+ 6,500,1.2596610002219677
+ 6,-1,1.2512801215052605
+ 7,100,1.2597366236150265
+ 7,200,1.2454991228878498
+ 7,300,1.2466656975448132
+ 7,400,1.246230211108923
+ 7,500,1.2257634662091732
+ 7,-1,1.232503354549408
+ 8,100,1.2363849207758904
+ 8,200,1.2266166508197784
+ 8,300,1.2310810387134552
+ 8,400,1.222414243966341
+ 8,500,1.2164912186563015
+ 8,-1,1.219892967492342
+ 9,100,1.227688044309616
+ 9,200,1.2137078680098057
+ 9,300,1.2234610505402088
+ 9,400,1.1979183182120323
+ 9,500,1.2115975841879845
+ 9,-1,1.2157601304352283
+ 10,100,1.1952522210776806
+ 10,200,1.206289790570736
+ 10,300,1.2036706320941448
+ 10,400,1.2108158320188522
+ 10,500,1.2024701572954655
+ 10,-1,1.2090140953660011
+ 11,100,1.2182735837996006
+ 11,200,1.2103605084121227
+ 11,300,1.1928360909223557
+ 11,400,1.177027728408575
+ 11,500,1.1905891820788383
+ 11,-1,1.1902272701263428
+ 12,100,1.186790969222784
+ 12,200,1.186472550034523
+ 12,300,1.191321574151516
+ 12,400,1.1821781285107136
+ 12,500,1.188904233276844
+ 12,-1,1.1863697320222855
+ 13,100,1.1789905838668346
+ 13,200,1.1883879080414772
+ 13,300,1.1763601563870907
+ 13,400,1.170448400080204
+ 13,500,1.180951576679945
+ 13,-1,1.1650272645056248
+ 14,100,1.1710796505212784
+ 14,200,1.2021472677588463
+ 14,300,1.1722701601684093
+ 14,400,1.1790373362600803
+ 14,500,1.1628378182649612
+ 14,-1,1.1735523119568825
+ 15,100,1.1557640507817268
+ 15,200,1.1770031414926052
+ 15,300,1.1650467291474342
+ 15,400,1.1652232147753239
+ 15,500,1.170414499938488
+ 15,-1,1.1694228276610374
+ 16,100,1.1606036685407162
+ 16,200,1.1659459210932255
+ 16,300,1.1610891669988632
+ 16,400,1.1743003502488136
+ 16,500,1.1713393963873386
+ 16,-1,1.1815223842859268
+ 17,100,1.170582603663206
+ 17,200,1.1600999161601067
+ 17,300,1.1696469970047474
+ 17,400,1.1701789684593678
+ 17,500,1.1631796136498451
+ 17,-1,1.1578342877328396
+ 18,100,1.1727497912943363
+ 18,200,1.1746379546821117
+ 18,300,1.1589095927774906
+ 18,400,1.1534282937645912
+ 18,500,1.16147855296731
+ 18,-1,1.1479795910418034
+ 19,100,1.1532638221979141
+ 19,200,1.1479231528937817
+ 19,300,1.1513863690197468
+ 19,400,1.148157473653555
+ 19,500,1.1485524475574493
+ 19,-1,1.1491804383695126
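In this file, `steps = -1` marks the evaluation run at the end of an epoch. The metric itself appears to be the mean squared error between teacher and student embeddings on the dev pairs, reported scaled by 100 as in sentence-transformers' `MSEEvaluator`; that scaling is an assumption inferred from the magnitude of the values. A minimal sketch of the tracked quantity, with hypothetical inputs:

```python
# Sketch of the metric plausibly tracked above: MSE between teacher embeddings
# of the source sentences and student embeddings of the target sentences,
# multiplied by 100 (assumed MSEEvaluator convention).
import numpy as np

def mse_times_100(teacher_embeddings: np.ndarray, student_embeddings: np.ndarray) -> float:
    return float(((teacher_embeddings - student_embeddings) ** 2).mean() * 100)
```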
eval/translation_evaluation_talks-en-ru-dev.tsv.gz_results.csv ADDED
@@ -0,0 +1,121 @@
+ epoch,steps,src2trg,trg2src
+ 0,100,0.764,0.69
+ 0,200,0.776,0.706
+ 0,300,0.785,0.734
+ 0,400,0.793,0.747
+ 0,500,0.8,0.758
+ 0,-1,0.808,0.763
+ 1,100,0.807,0.777
+ 1,200,0.81,0.787
+ 1,300,0.813,0.793
+ 1,400,0.824,0.796
+ 1,500,0.828,0.796
+ 1,-1,0.837,0.801
+ 2,100,0.839,0.806
+ 2,200,0.841,0.804
+ 2,300,0.849,0.808
+ 2,400,0.853,0.812
+ 2,500,0.854,0.809
+ 2,-1,0.858,0.81
+ 3,100,0.865,0.817
+ 3,200,0.875,0.816
+ 3,300,0.877,0.82
+ 3,400,0.877,0.82
+ 3,500,0.882,0.824
+ 3,-1,0.884,0.82
+ 4,100,0.881,0.823
+ 4,200,0.884,0.823
+ 4,300,0.888,0.828
+ 4,400,0.889,0.822
+ 4,500,0.894,0.823
+ 4,-1,0.888,0.812
+ 5,100,0.89,0.819
+ 5,200,0.889,0.82
+ 5,300,0.891,0.82
+ 5,400,0.887,0.826
+ 5,500,0.891,0.823
+ 5,-1,0.892,0.827
+ 6,100,0.891,0.831
+ 6,200,0.893,0.826
+ 6,300,0.893,0.834
+ 6,400,0.895,0.832
+ 6,500,0.896,0.829
+ 6,-1,0.894,0.837
+ 7,100,0.897,0.836
+ 7,200,0.897,0.832
+ 7,300,0.9,0.829
+ 7,400,0.896,0.831
+ 7,500,0.902,0.838
+ 7,-1,0.9,0.834
+ 8,100,0.899,0.842
+ 8,200,0.899,0.842
+ 8,300,0.903,0.839
+ 8,400,0.904,0.845
+ 8,500,0.901,0.841
+ 8,-1,0.898,0.846
+ 9,100,0.901,0.851
+ 9,200,0.901,0.851
+ 9,300,0.9,0.85
+ 9,400,0.904,0.855
+ 9,500,0.906,0.854
+ 9,-1,0.902,0.852
+ 10,100,0.908,0.857
+ 10,200,0.897,0.856
+ 10,300,0.904,0.857
+ 10,400,0.904,0.851
+ 10,500,0.905,0.861
+ 10,-1,0.901,0.856
+ 11,100,0.906,0.856
+ 11,200,0.903,0.862
+ 11,300,0.911,0.864
+ 11,400,0.904,0.867
+ 11,500,0.907,0.867
+ 11,-1,0.906,0.868
+ 12,100,0.907,0.868
+ 12,200,0.914,0.867
+ 12,300,0.902,0.869
+ 12,400,0.909,0.867
+ 12,500,0.908,0.866
+ 12,-1,0.905,0.861
+ 13,100,0.907,0.871
+ 13,200,0.913,0.875
+ 13,300,0.909,0.875
+ 13,400,0.913,0.873
+ 13,500,0.909,0.871
+ 13,-1,0.904,0.88
+ 14,100,0.911,0.871
+ 14,200,0.914,0.876
+ 14,300,0.917,0.877
+ 14,400,0.913,0.871
+ 14,500,0.917,0.878
+ 14,-1,0.914,0.879
+ 15,100,0.917,0.887
+ 15,200,0.91,0.885
+ 15,300,0.908,0.882
+ 15,400,0.915,0.886
+ 15,500,0.914,0.883
+ 15,-1,0.919,0.883
+ 16,100,0.922,0.885
+ 16,200,0.922,0.887
+ 16,300,0.924,0.893
+ 16,400,0.915,0.891
+ 16,500,0.914,0.885
+ 16,-1,0.915,0.887
+ 17,100,0.916,0.889
+ 17,200,0.918,0.888
+ 17,300,0.921,0.882
+ 17,400,0.921,0.89
+ 17,500,0.915,0.887
+ 17,-1,0.918,0.895
+ 18,100,0.916,0.888
+ 18,200,0.916,0.884
+ 18,300,0.922,0.889
+ 18,400,0.921,0.897
+ 18,500,0.92,0.895
+ 18,-1,0.921,0.892
+ 19,100,0.921,0.895
+ 19,200,0.923,0.897
+ 19,300,0.921,0.897
+ 19,400,0.921,0.895
+ 19,500,0.922,0.895
+ 19,-1,0.918,0.895
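`src2trg` and `trg2src` read like the bidirectional retrieval accuracies reported by sentence-transformers' `TranslationEvaluator`: for each sentence pair in the dev set, the aligned translation should be the nearest neighbour by cosine similarity among all candidates on the other side. A minimal sketch of that metric (the function and variable names are hypothetical):

```python
# Sketch: bidirectional translation-matching accuracy, in the spirit of
# sentence_transformers.evaluation.TranslationEvaluator.
import torch
from sentence_transformers import SentenceTransformer, util

def translation_accuracy(model: SentenceTransformer, src_sentences, trg_sentences):
    src_emb = model.encode(src_sentences, convert_to_tensor=True)
    trg_emb = model.encode(trg_sentences, convert_to_tensor=True)
    sims = util.cos_sim(src_emb, trg_emb)               # (n_src, n_trg)
    gold = torch.arange(len(src_sentences), device=sims.device)
    src2trg = (sims.argmax(dim=1) == gold).float().mean().item()
    trg2src = (sims.argmax(dim=0) == gold).float().mean().item()
    return src2trg, trg2src
```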
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0cf601c0bb9ffa9ed108f3383d6aca9776929bf35b211a3919efd8704a6b21bf
+ size 669448040
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,55 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
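Note `do_lower_case: true`: the tokenizer inherits the uncased behaviour of `bert-base-multilingual-uncased`, so input is lowercased before WordPiece splitting. A small sketch, assuming the repository files are checked out in the current working directory:

```python
# Sketch: inspect the uncased multilingual tokenizer shipped with this repo.
# "." assumes the repository has been cloned to the current directory.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(".")
print(tokenizer.tokenize("Привет, Мир!"))  # expect lowercased WordPiece tokens
```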
vocab.txt ADDED
The diff for this file is too large to render. See raw diff