efederici committed on
Commit
9311468
1 Parent(s): f3edb33

Upload with huggingface_hub

0_SentenceTransformer/1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false
+ }
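
This pooling config describes plain mean pooling over 768-dimensional token embeddings. For reference, a minimal sketch of constructing the equivalent module by hand (sentence-transformers normally instantiates it from this JSON directly):

```python
from sentence_transformers import models

# Sketch only: the Pooling module the config above describes.
pooling = models.Pooling(
    word_embedding_dimension=768,
    pooling_mode_cls_token=False,
    pooling_mode_mean_tokens=True,   # mean pooling is the only active mode
    pooling_mode_max_tokens=False,
    pooling_mode_mean_sqrt_len_tokens=False,
)
```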
0_SentenceTransformer/README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ pipeline_tag: sentence-similarity
+ license: apache-2.0
+ language:
+ - it
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ - transformers
+ ---
+
+ # sentence-BERTino
+
+ This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search. It was trained on a dataset of question/context pairs ([squad-it](https://github.com/crux82/squad-it)) and tag/news-article pairs (collected via web scraping).
+
+ ## Usage (Sentence-Transformers)
+
+ Using this model is straightforward once you have [sentence-transformers](https://www.SBERT.net) installed:
+
+ ```
+ pip install -U sentence-transformers
+ ```
+
+ Then you can use the model like this:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ sentences = ["Questo è un esempio di frase", "Questo è un ulteriore esempio"]
+
+ model = SentenceTransformer('efederici/sentence-BERTino')
+ embeddings = model.encode(sentences)
+ print(embeddings)
+ ```
+
+ ## Usage (HuggingFace Transformers)
+
+ Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: first pass your input through the transformer model, then apply the right pooling operation on top of the contextualized word embeddings.
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+ import torch
+
+
+ # Mean pooling: take the attention mask into account for correct averaging
+ def mean_pooling(model_output, attention_mask):
+     token_embeddings = model_output[0]  # first element of model_output contains all token embeddings
+     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
+
+
+ # Sentences we want sentence embeddings for
+ sentences = ["Questo è un esempio di frase", "Questo è un ulteriore esempio"]
+
+ # Load model from the HuggingFace Hub
+ tokenizer = AutoTokenizer.from_pretrained('efederici/sentence-BERTino')
+ model = AutoModel.from_pretrained('efederici/sentence-BERTino')
+
+ # Tokenize sentences
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
+
+ # Compute token embeddings
+ with torch.no_grad():
+     model_output = model(**encoded_input)
+
+ # Perform pooling (mean pooling in this case)
+ sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
+
+ print("Sentence embeddings:")
+ print(sentence_embeddings)
+ ```
+
+ ## Full Model Architecture
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
+ )
+ ```
0_SentenceTransformer/config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "_name_or_path": "/root/.cache/torch/sentence_transformers/efederici_sentence-BERTino/",
+   "activation": "gelu",
+   "architectures": [
+     "DistilBertModel"
+   ],
+   "attention_dropout": 0.1,
+   "dim": 768,
+   "dropout": 0.1,
+   "hidden_dim": 3072,
+   "initializer_range": 0.02,
+   "max_position_embeddings": 512,
+   "model_type": "distilbert",
+   "n_heads": 12,
+   "n_layers": 3,
+   "output_hidden_states": true,
+   "pad_token_id": 0,
+   "qa_dropout": 0.1,
+   "seq_classif_dropout": 0.2,
+   "sinusoidal_pos_embds": false,
+   "tie_weights_": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.23.1",
+   "vocab_size": 32102
+ }
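
Note the unusually shallow backbone: `n_layers` is 3 rather than DistilBERT's usual 6, which is what keeps the checkpoint at roughly 185 MB. An illustrative check of the shape described by this config (the asserted values are taken from the JSON above):

```python
from transformers import AutoConfig

# Illustrative only: confirm the backbone described by this config.
config = AutoConfig.from_pretrained('efederici/sentence-BERTino')
assert config.model_type == 'distilbert'
assert config.n_layers == 3   # a 3-layer DistilBERT, not the usual 6
assert config.dim == 768      # 768-dim hidden states feed the pooling layer
```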
0_SentenceTransformer/config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.2.0",
+     "transformers": "4.17.0",
+     "pytorch": "1.10.0+cu111"
+   }
+ }
0_SentenceTransformer/modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
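
This modules.json wires a Transformer module to the Pooling module above. A hedged sketch of the same two-stage pipeline built explicitly:

```python
from sentence_transformers import SentenceTransformer, models

# Sketch only: the Transformer -> Pooling pipeline this modules.json encodes.
word_model = models.Transformer('efederici/sentence-BERTino', max_seq_length=512)
pooling = models.Pooling(
    word_model.get_word_embedding_dimension(),  # 768 for this backbone
    pooling_mode_mean_tokens=True,
)
model = SentenceTransformer(modules=[word_model, pooling])
```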
0_SentenceTransformer/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:edeaf53e3122eab7cf32f5f3deaaf2710c440a7b0da5afe4c281673ee0830667
+ size 185267017
0_SentenceTransformer/sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
0_SentenceTransformer/special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
0_SentenceTransformer/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
0_SentenceTransformer/tokenizer_config.json ADDED
@@ -0,0 +1,17 @@
+ {
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "full_tokenizer_file": null,
+   "mask_token": "[MASK]",
+   "max_len": 512,
+   "name_or_path": "/root/.cache/torch/sentence_transformers/efederici_sentence-BERTino/",
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "special_tokens_map_file": null,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "DistilBertTokenizer",
+   "unk_token": "[UNK]"
+ }
0_SentenceTransformer/vocab.txt ADDED
The diff for this file is too large to render. See raw diff
 
1_Dense/config.json ADDED
@@ -0,0 +1 @@
+ {"in_features": 768, "out_features": 64, "bias": true, "activation_function": "torch.nn.modules.activation.Tanh"}
1_Dense/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:239c2707843befbc4780fe214749fc03e0614fdc7e9255dadafe47beae81a24f
+ size 197927
README.md ADDED
@@ -0,0 +1,91 @@
+ ---
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ ---
+
+ # {MODEL_NAME}
+
+ This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences & paragraphs to a 64-dimensional dense vector space and can be used for tasks like clustering or semantic search.
+
+ <!--- Describe your model here -->
+
+ ## Usage (Sentence-Transformers)
+
+ Using this model is straightforward once you have [sentence-transformers](https://www.SBERT.net) installed:
+
+ ```
+ pip install -U sentence-transformers
+ ```
+
+ Then you can use the model like this:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ sentences = ["This is an example sentence", "Each sentence is converted"]
+
+ model = SentenceTransformer('{MODEL_NAME}')
+ embeddings = model.encode(sentences)
+ print(embeddings)
+ ```
+
+ ## Evaluation Results
+
+ <!--- Describe how your model was evaluated -->
+
+ For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={MODEL_NAME})
+
+ ## Training
+ The model was trained with the parameters:
+
+ **DataLoader**:
+
+ `torch.utils.data.dataloader.DataLoader` of length 1724 with parameters:
+ ```
+ {'batch_size': 64, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
+ ```
+
+ **Loss**:
+
+ `sentence_transformers.losses.MSELoss.MSELoss`
+
+ Parameters of the fit() method:
+ ```
+ {
+     "epochs": 1,
+     "evaluation_steps": 500,
+     "evaluator": "sentence_transformers.evaluation.SequentialEvaluator.SequentialEvaluator",
+     "max_grad_norm": 1,
+     "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
+     "optimizer_params": {
+         "eps": 1e-06,
+         "lr": 1e-07
+     },
+     "scheduler": "WarmupLinear",
+     "steps_per_epoch": null,
+     "warmup_steps": 100,
+     "weight_decay": 0.01
+ }
+ ```
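
These logged hyperparameters map directly onto sentence-transformers' fit() API. A hedged reconstruction of the training call, where `model`, `train_dataloader`, and `evaluator` are placeholders rather than artifacts from this repository:

```python
import torch
from sentence_transformers import losses

# Hedged reconstruction from the logged parameters above;
# model, train_dataloader, and evaluator are placeholders.
train_loss = losses.MSELoss(model=model)
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    evaluator=evaluator,
    epochs=1,
    evaluation_steps=500,
    scheduler='WarmupLinear',
    warmup_steps=100,
    optimizer_class=torch.optim.AdamW,
    optimizer_params={'lr': 1e-07, 'eps': 1e-06},
    weight_decay=0.01,
    max_grad_norm=1,
)
```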
+
+ ## Full Model Architecture
+ ```
+ SentenceTransformer(
+   (0): SentenceTransformer(
+     (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel
+     (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
+   )
+   (1): Dense({'in_features': 768, 'out_features': 64, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
+ )
+ ```
+
+ ## Citing & Authors
+
+ <!--- Describe where people can find more information -->
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.2.2",
+     "transformers": "4.23.1",
+     "pytorch": "1.12.1+cu113"
+   }
+ }
eval/mse_evaluation__results.csv ADDED
@@ -0,0 +1,29 @@
+ epoch,steps,MSE
+ 0,500,160.0018858909607
+ 0,1000,157.27769136428833
+ 0,1500,155.6653618812561
+ 0,-1,155.4488182067871
+ 0,500,98.15781712532043
+ 0,1000,76.94787383079529
+ 0,1500,68.42235922813416
+ 0,-1,66.3095772266388
+ 1,500,63.229817152023315
+ 1,1000,61.31596565246582
+ 1,1500,60.14971137046814
+ 1,-1,59.68950390815735
+ 2,500,59.01828408241272
+ 2,1000,58.600419759750366
+ 2,1500,58.39851498603821
+ 2,-1,58.3707332611084
+ 0,500,58.216989040374756
+ 0,1000,58.06503891944885
+ 0,1500,57.93534517288208
+ 0,-1,57.89196491241455
+ 1,500,57.828253507614136
+ 1,1000,57.760089635849
+ 1,1500,57.728540897369385
+ 1,-1,57.72488713264465
+ 0,500,57.713496685028076
+ 0,1000,57.70211219787598
+ 0,1500,57.697200775146484
+ 0,-1,57.69643187522888
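
The MSE column appears to follow sentence-transformers' MSEEvaluator convention of reporting the mean squared error between reference (teacher) and student embeddings multiplied by 100, which is why values start above 100 and settle near 58. A toy recomputation of the metric, with random tensors standing in for the real embeddings:

```python
import torch

# Toy recomputation of the MSE-x100 metric; random tensors
# stand in for the teacher and student embeddings.
teacher = torch.randn(32, 64)
student = torch.randn(32, 64)
mse_x100 = torch.nn.functional.mse_loss(student, teacher).item() * 100
print(f"MSE x100: {mse_x100:.2f}")
```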
eval/similarity_evaluation_sts-dev_results.csv ADDED
@@ -0,0 +1,29 @@
+ epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
+ 0,500,0.5865738040483492,0.6080073489584343,0.6135697139081404,0.6209144397277622,0.6080297443609035,0.6150423947449148,0.30760717559007184,0.31234556074873465
+ 0,1000,0.5932329861734787,0.6145486559695778,0.6209473347843074,0.6285679752654623,0.6156857991443561,0.6230859536837934,0.2902995170696891,0.29410016166955844
+ 0,1500,0.5997207615500473,0.6197545563640058,0.6266288668984643,0.6333906867830073,0.6216103921025053,0.6283979609131071,0.29927639170944115,0.3039065427395013
+ 0,-1,0.6004929935405958,0.6203905760382272,0.6272614520062297,0.634024206793723,0.622256060456697,0.6291191927282038,0.3009699947338997,0.3056782821262222
+ 0,500,0.6082876631364406,0.6383080638463929,0.6713194802395258,0.6666769766877335,0.6696520375976636,0.6676769987310615,0.4984370002871881,0.48647455814121354
+ 0,1000,0.6695328825953906,0.6838212732319437,0.7159880183243681,0.7080623220109675,0.7104452645368556,0.7043184333029457,0.6001198237468279,0.5904204021173427
+ 0,1500,0.6938622625314218,0.7027819962982079,0.7300484668389849,0.7227559033386031,0.7239534109735188,0.7171358229426745,0.6432735292808788,0.6345696105211257
+ 0,-1,0.6942694267356905,0.7037752140663714,0.7307942823803945,0.7232597720141782,0.7239209364944754,0.7174398302701811,0.6447556662871575,0.6365812702723829
+ 1,500,0.7026776271111842,0.7103170015048932,0.7351407131757489,0.7278891702308841,0.7283502666392208,0.7219225344353434,0.6598208883116933,0.6523325409289799
+ 1,1000,0.7051954773480473,0.712368926410317,0.7357815021563106,0.7286132668166436,0.7292599158711848,0.7226115402495245,0.6661954715118892,0.6596294534379665
+ 1,1500,0.7082188011794502,0.7152173340725155,0.7378006755328732,0.7308711900850527,0.7315619929431638,0.7252391043427502,0.6710356608646818,0.6647982399231508
+ 1,-1,0.70904626023681,0.7157006511324661,0.7386734692835749,0.7315364734629127,0.7322374623337847,0.7257611918921245,0.6716495784644925,0.6648813111548563
+ 2,500,0.7100771821625964,0.7167180521913615,0.7391966431541905,0.7324477676869227,0.7327015552647981,0.7262537331577372,0.6736893242334752,0.667143259335068
+ 2,1000,0.7108737865640337,0.7169675381288808,0.7395318432365394,0.7325689012225832,0.7331158055268099,0.726662598208946,0.6749191654461464,0.6684612228780606
+ 2,1500,0.7117215803275778,0.7177861472354646,0.7399094584535856,0.7329357199813928,0.7336377259374134,0.7272406822382951,0.6764804931321545,0.6700393218318923
+ 2,-1,0.711675752291452,0.71773214541879,0.7398356721932455,0.7328735135327141,0.7335426460029394,0.7271314488233886,0.6764356227769781,0.6698884267099644
+ 0,500,0.7121460455239379,0.7181123574252887,0.7399511395506606,0.7331083588907218,0.7336774177428855,0.7272558299355949,0.6771142222790999,0.6705752866145787
+ 0,1000,0.7123861478820172,0.7181627442194691,0.7400716088926611,0.7330949496287871,0.7338919056678781,0.727296143130711,0.6775155455308138,0.6708743366430204
+ 0,1500,0.7122284907285029,0.7180972453152019,0.7401198061461532,0.7332444924367473,0.7337385450265813,0.7272772676728283,0.6771726702603322,0.6703615444741903
+ 0,-1,0.7120537711301091,0.7179151099580299,0.739928382899804,0.7328671734324397,0.7336732597926334,0.7270930896096183,0.6770149506244509,0.6703754003556386
+ 1,500,0.712251279207934,0.7180605495603078,0.7399009046552868,0.7328821274176701,0.7337147222065724,0.7271828049741026,0.6775682024150373,0.6710392299952491
+ 1,1000,0.7125332144097017,0.7182585034337489,0.7401266535032988,0.7332453749996348,0.7339200062049424,0.7273773976352566,0.6778856175804594,0.6712869419099701
+ 1,1500,0.7126742859770369,0.7183989294362636,0.740284929037369,0.7333443804061257,0.7340499087406678,0.7274551307850554,0.6779089373279104,0.6712837070322928
+ 1,-1,0.712748264487567,0.7184477986871575,0.7403131695377074,0.7333492451781706,0.7340709425847874,0.7274524030251635,0.6779903965891169,0.6714038352290735
+ 0,500,0.7126967895258003,0.7184174555008799,0.7402840486509272,0.733308238032397,0.7340461198296704,0.7274432339796818,0.6779725544896934,0.6713037515301116
+ 0,1000,0.712916278926716,0.7186633827170787,0.7404429389807402,0.7335128164655754,0.7342023524428132,0.727619871112414,0.678277663787251,0.6716970293815879
+ 0,1500,0.7128783807214034,0.7186522314638217,0.7404290748112105,0.7334913484792108,0.7341734466453587,0.727559420892707,0.6782398243891938,0.6716509314850065
+ 0,-1,0.712884474626804,0.7186568382285708,0.7404293672433467,0.7334823929570085,0.7341750526470242,0.7275848746469504,0.678234793119852,0.6716365702651312
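
Each row reports Pearson and Spearman correlations between gold STS scores and four similarity measures over the embedding pairs (cosine, Euclidean, Manhattan, dot product), matching sentence-transformers' EmbeddingSimilarityEvaluator output. A toy sketch of how the headline cosine_spearman number is computed, with random arrays standing in for real embeddings and gold scores:

```python
import numpy as np
from scipy.stats import spearmanr

# Toy sketch: cosine_spearman is the Spearman correlation between
# gold similarity scores and cosine similarities of embedding pairs.
emb1 = np.random.randn(100, 64)
emb2 = np.random.randn(100, 64)
gold = np.random.rand(100)

cos = np.sum(emb1 * emb2, axis=1) / (
    np.linalg.norm(emb1, axis=1) * np.linalg.norm(emb2, axis=1)
)
print(spearmanr(gold, cos).correlation)
```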
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "0_SentenceTransformer",
+     "type": "sentence_transformers.SentenceTransformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Dense",
+     "type": "sentence_transformers.models.Dense"
+   }
+ ]
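
The top-level modules.json nests the entire 0_SentenceTransformer model as module 0 and the 1_Dense projection as module 1, which is exactly the nested architecture printed in the README above. A hedged sketch of assembling the same composition by hand:

```python
import torch
from sentence_transformers import SentenceTransformer, models

# Sketch only: the nested encoder + projection this modules.json encodes.
encoder = SentenceTransformer('efederici/sentence-BERTino')  # 768-dim output
dense = models.Dense(in_features=768, out_features=64, bias=True,
                     activation_function=torch.nn.Tanh())
model = SentenceTransformer(modules=[encoder, dense])

embeddings = model.encode(["Questo è un esempio di frase"])
print(embeddings.shape)  # expected: (1, 64)
```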