efederici committed on
Commit
04750b5
1 Parent(s): 2656a34

Update README.md

Files changed (1)
  1. README.md +10 -59
README.md CHANGED
@@ -1,5 +1,7 @@
1
  ---
2
  pipeline_tag: sentence-similarity
 
 
3
  tags:
4
  - sentence-transformers
5
  - feature-extraction
@@ -7,11 +9,9 @@ tags:
7
  - transformers
8
  ---
9
 
10
- # {MODEL_NAME}
11
 
12
- This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
13
-
14
- <!--- Describe your model here -->
15
 
16
  ## Usage (Sentence-Transformers)
17
 
@@ -25,9 +25,9 @@ Then you can use the model like this:
25
 
26
  ```python
27
  from sentence_transformers import SentenceTransformer
28
- sentences = ["This is an example sentence", "Each sentence is converted"]
29
 
30
- model = SentenceTransformer('{MODEL_NAME}')
31
  embeddings = model.encode(sentences)
32
  print(embeddings)
33
  ```
@@ -50,11 +50,11 @@ def mean_pooling(model_output, attention_mask):
50
 
51
 
52
  # Sentences we want sentence embeddings for
53
- sentences = ['This is an example sentence', 'Each sentence is converted']
54
 
55
  # Load model from HuggingFace Hub
56
- tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
57
- model = AutoModel.from_pretrained('{MODEL_NAME}')
58
 
59
  # Tokenize sentences
60
  encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
@@ -70,59 +70,10 @@ print("Sentence embeddings:")
70
  print(sentence_embeddings)
71
  ```
72
 
73
-
74
-
75
- ## Evaluation Results
76
-
77
- <!--- Describe how your model was evaluated -->
78
-
79
- For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={MODEL_NAME})
80
-
81
-
82
- ## Training
83
- The model was trained with the parameters:
84
-
85
- **DataLoader**:
86
-
87
- `torch.utils.data.dataloader.DataLoader` of length 5121 with parameters:
88
- ```
89
- {'batch_size': 16, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
90
- ```
91
-
92
- **Loss**:
93
-
94
- `sentence_transformers.losses.MultipleNegativesRankingLoss.MultipleNegativesRankingLoss` with parameters:
95
- ```
96
- {'scale': 20.0, 'similarity_fct': 'cos_sim'}
97
- ```
98
-
99
- Parameters of the fit()-Method:
100
- ```
101
- {
102
- "epochs": 2,
103
- "evaluation_steps": 0,
104
- "evaluator": "NoneType",
105
- "max_grad_norm": 1,
106
- "optimizer_class": "<class 'transformers.optimization.AdamW'>",
107
- "optimizer_params": {
108
- "lr": 2e-05
109
- },
110
- "scheduler": "WarmupLinear",
111
- "steps_per_epoch": null,
112
- "warmup_steps": 1024,
113
- "weight_decay": 0.01
114
- }
115
- ```
116
-
117
-
118
  ## Full Model Architecture
119
  ```
120
  SentenceTransformer(
121
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel
122
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
123
  )
124
- ```
125
-
126
- ## Citing & Authors
127
-
128
- <!--- Describe where people can find more information -->
 
1
  ---
2
  pipeline_tag: sentence-similarity
3
+ language:
4
+ - it
5
  tags:
6
  - sentence-transformers
7
  - feature-extraction
 
9
  - transformers
10
  ---
11
 
12
+ # sentence-BERTino
13
 
14
+ This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search. It was trained on question/context pairs from [squad-it](https://github.com/crux82/squad-it) (54k) and on tag/news-article pairs (28k) collected by scraping.
 
 
15
 
16
  ## Usage (Sentence-Transformers)
17
 
 
25
 
26
  ```python
27
  from sentence_transformers import SentenceTransformer
28
+ sentences = ["Questo è un esempio di frase", "Questo è un ulteriore esempio"]
29
 
30
+ model = SentenceTransformer('efederici/sentence-BERTino')
31
  embeddings = model.encode(sentences)
32
  print(embeddings)
33
  ```
 
50
 
51
 
52
  # Sentences we want sentence embeddings for
53
+ sentences = ["Questo è un esempio di frase", "Questo è un ulteriore esempio"]
54
 
55
  # Load model from HuggingFace Hub
56
+ tokenizer = AutoTokenizer.from_pretrained('efederici/sentence-BERTino')
57
+ model = AutoModel.from_pretrained('efederici/sentence-BERTino')
58
 
59
  # Tokenize sentences
60
  encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
 
70
  print(sentence_embeddings)
71
  ```
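
The README names semantic search as a use case for these embeddings. As a minimal sketch of that idea, the snippet below ranks a few vectors by cosine similarity to a query. It is not part of the commit: `cosine_similarity` and `rank_by_similarity` are hypothetical helper names, and the 3-dimensional toy vectors stand in for the 768-dimensional output of `model.encode(...)`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, corpus):
    """Return corpus indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query, doc) for doc in corpus]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)


# Toy vectors (assumption): real embeddings would come from model.encode(sentences).
query = [1.0, 0.0, 0.0]
corpus = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction as the query, different length
    [1.0, 1.0, 0.0],  # 45 degrees from the query
]

ranking = rank_by_similarity(query, corpus)
print(ranking)
```

Cosine similarity ignores vector length, so the second corpus entry ranks first even though its norm differs from the query's.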
72
 
73
  ## Full Model Architecture
74
  ```
75
  SentenceTransformer(
76
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel
77
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
78
  )
79
+ ```
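
The `Pooling` module above has `pooling_mode_mean_tokens` set to `True`, so a sentence vector is the average of its token vectors. The sketch below illustrates that averaging for a single sentence in plain Python, assuming padding positions are marked with `0` in the attention mask. `mean_pool` is a hypothetical name and this is not the library's implementation; the 2-dimensional vectors are toy stand-ins for 768-dimensional token embeddings.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sentence, skipping padding (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Two real tokens followed by one padding token (toy values, not model output).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
print(pooled)
```

The padding vector is excluded from both the sum and the count, which is why the mask matters: averaging over all three rows would let padding distort the sentence embedding.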