Commit 745a00c (parent 7cbe116): Update README.md

README.md CHANGED

## Description

In order to improve the performance of the [RoBERTa-large-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-large-bne) encoder, this model has been trained using the generated corpus ([in this repository](https://huggingface.co/oeg/RoBERTa-CelebA-Sp/)), following the strategy of a Siamese network together with a cosine-similarity loss function. The following steps were followed:

- Define the [sentence-transformers](https://www.sbert.net/) and _torch_ libraries for the implementation of the encoder.
- Divide the training corpus into two parts: training with 249,999 sentences and validation with 10,000 sentences.
- Load the training and validation data for the model. Two lists are generated to store the information and, in each of them, the entries are composed of a pair of descriptive sentences and their similarity value.
- Implement [RoBERTa-large-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-large-bne) as the baseline model for transformer training.
- Train with a Siamese network in which, for a pair of sentences _A_ and _B_ from the training corpus, the similarity of their embedding vectors _u_ and _v_, computed with the cosine-similarity metric (_CosineSimilarityLoss()_), is compared with the real similarity value obtained from the training corpus. The performance of the model during training was measured with Spearman's correlation coefficient between the real similarity vector and the computed similarity vector (see the sketch after this list).
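
The following is a minimal sketch of this training setup using the sentence-transformers API; the toy sentence pairs, batch size, epochs, and warmup steps are illustrative assumptions, not the exact configuration used to train the published model:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, models, losses, evaluation

# Build a sentence encoder on top of the RoBERTa-large-bne baseline:
# a transformer module followed by mean pooling over token embeddings.
word_embedding = models.Transformer('PlanTL-GOB-ES/roberta-large-bne')
pooling = models.Pooling(word_embedding.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding, pooling])

# Toy placeholder pairs; the real corpus has 249,999 training and
# 10,000 validation entries, each a sentence pair plus a similarity value.
train_pairs = [('una mujer joven con el pelo largo', 'una chica de cabello largo', 0.9),
               ('un hombre con barba', 'una mujer sonriente', 0.1),
               ('una persona con gafas', 'alguien que lleva gafas', 0.8)]
val_pairs = train_pairs

train_examples = [InputExample(texts=[s1, s2], label=score) for s1, s2, score in train_pairs]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

# Siamese objective: the cosine similarity of the embeddings u and v of a
# sentence pair is compared against the gold similarity value of that pair.
train_loss = losses.CosineSimilarityLoss(model)

# During training, performance is tracked as Spearman's correlation between
# gold similarities and cosine similarities on the validation split.
val_s1, val_s2, val_scores = zip(*val_pairs)
evaluator = evaluation.EmbeddingSimilarityEvaluator(list(val_s1), list(val_s2), list(val_scores))

model.fit(train_objectives=[(train_dataloader, train_loss)],
          evaluator=evaluator,
          epochs=1,
          warmup_steps=100,
          output_path='roberta-large-bne-celebAEs-UNI')
```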

The total training time using the _sentence-transformers_ library in Python was 42 days, using all the available GPUs of the server with exclusive dedication.

Spearman's correlation over 1,000 test sentences was compared between the base model and our trained model. As can be seen in the following table, our model obtains better results (a correlation closer to 1).

| Model | Spearman's correlation |
| :---: | :---: |
| RoBERTa-base-bne | 0.827176427 |
| RoBERTa-celebA-Sp | 0.999913276 |
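
As an illustrative sketch, such a comparison can be computed with the evaluation module of sentence-transformers; the test pairs below are hypothetical placeholders, not the actual 1,000-sentence test set:

```python
from sentence_transformers import SentenceTransformer, evaluation

# Hypothetical held-out pairs standing in for the 1,000-sentence test set
test_s1 = ['una mujer joven con el pelo largo', 'un hombre con barba y gafas', 'una chica sonriente']
test_s2 = ['una chica de cabello largo', 'un señor con gafas y barba', 'una mujer seria']
gold_scores = [0.9, 0.8, 0.2]

evaluator = evaluation.EmbeddingSimilarityEvaluator(
    test_s1, test_s2, gold_scores,
    main_similarity=evaluation.SimilarityFunction.COSINE)

# The evaluator embeds both sides and returns Spearman's correlation between
# the gold scores and the cosine similarities of the embedding pairs.
model = SentenceTransformer('roberta-large-bne-celebAEs-UNI')
print(evaluator(model))
```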

## How to use

Downloading the model results in a directory called **roberta-large-bne-celebAEs-UNI** that contains its main files. To use the model, run the following Python code:

```python
from sentence_transformers import SentenceTransformer, InputExample, models, losses, util, evaluation

# Load the fine-tuned encoder from the downloaded model directory
model_sbert = SentenceTransformer('roberta-large-bne-celebAEs-UNI')
```
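
Continuing from the block above, a short illustrative usage example (the sentences are assumptions, not from the original card): the encoder maps sentences to embedding vectors whose closeness can be measured with util.cos_sim.

```python
# Encode a pair of descriptive sentences with the loaded model
sentences = ['una mujer joven con el pelo largo', 'una chica de cabello largo']
embeddings = model_sbert.encode(sentences)

# Cosine similarity of the two embedding vectors (closer to 1 = more similar)
print(util.cos_sim(embeddings[0], embeddings[1]))
```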