JulienRPA/BERT2BERT_pretrained_LC-QuAD_2.0

Text2Text Generation

encoder-decoder

Generated from Trainer


JulienRPA committed on May 23, 2023

Commit 2c1c3e8 • 1 Parent(s): adcf6be

update model card README.md

Files changed (1)

README.md +26 -6

README.md CHANGED

@@ -1,6 +1,8 @@
 ---
 tags:
 - generated_from_trainer
 model-index:
 - name: BERT2BERT_pretrained_LC-QuAD_2.0
   results: []
@@ -12,6 +14,12 @@ should probably proofread and complete it, then remove this comment. -->
 # BERT2BERT_pretrained_LC-QuAD_2.0
 This model was trained from scratch on an unknown dataset.
 ## Model description
@@ -32,22 +40,34 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
 - train_batch_size: 8
-- eval_batch_size: 16
 - seed: 42
-- gradient_accumulation_steps: 100
-- total_train_batch_size: 800
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 500
-- num_epochs: 1.0
 ### Training results
 ### Framework versions
-- Transformers 4.29.2
 - Pytorch 2.0.1+cu118
 - Datasets 2.12.0
 - Tokenizers 0.13.3

 ---
 tags:
 - generated_from_trainer
+metrics:
+- bleu
 model-index:
 - name: BERT2BERT_pretrained_LC-QuAD_2.0
   results: []
 # BERT2BERT_pretrained_LC-QuAD_2.0
 This model was trained from scratch on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.9002
+- Bleu: 79.5906
+- Em: 0.08
+- Rm: 0.9565
+- Gen Len: 48.48
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
 - train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 2500
+- num_epochs: 10.0
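
The updated hyperparameters combine a linear scheduler with 2500 warmup steps and a peak learning rate of 5e-05. As a rough illustration of what that shape means, the sketch below computes the learning rate at a few steps with a plain linear warmup followed by linear decay to zero. `TOTAL_STEPS` is an assumption (the card does not state it; it is estimated here as 10 epochs at roughly 2,410 steps each), and `linear_lr` is a hypothetical helper, not the Trainer's own code.

```python
# Sketch of a linear warmup / linear decay schedule using values from the card.
BASE_LR = 5e-05        # learning_rate from the card
WARMUP_STEPS = 2500    # lr_scheduler_warmup_steps from the card
TOTAL_STEPS = 24_096   # ASSUMPTION: ~10 epochs at ~2,410 steps per epoch


def linear_lr(step: int) -> float:
    """Return the learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Ramp up linearly from 0 to BASE_LR during warmup.
        return BASE_LR * step / WARMUP_STEPS
    # After warmup, decay linearly from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 1250, 2500, 12000, 24000):
    print(f"step {step:>6}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate peaks at step 2500, about one epoch in, and is close to zero by the last logged step.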
 ### Training results
+| Training Loss | Epoch | Step  | Bleu    | Em   | Gen Len | Validation Loss | Rm     |
+|:-------------:|:-----:|:-----:|:-------:|:----:|:-------:|:---------------:|:------:|
+| 7.2445        | 0.83  | 2000  | 1.523   | 0.0  | 204.48  | 7.5619          | nan    |
+| 3.4458        | 1.66  | 4000  | 16.0372 | 0.0  | 76.36   | 3.4202          | nan    |
+| 2.2571        | 2.49  | 6000  | 34.5826 | 0.0  | 48.98   | 2.1971          | 0.8125 |
+| 1.5696        | 3.32  | 8000  | 53.8822 | 0.0  | 48.7    | 1.6802          | 1.0    |
+| 1.1359        | 4.15  | 10000 | 64.5591 | 0.02 | 45.2    | 1.3464          | 0.973  |
+| 0.9994        | 4.98  | 12000 | 68.0869 | 0.02 | 47.76   | 1.0576          | 0.8889 |
+| 0.7275        | 5.81  | 14000 | 74.1032 | 0.02 | 46.52   | 0.9522          | 0.9556 |
+| 0.5868        | 6.64  | 16000 | 71.6124 | 0.02 | 47.52   | 0.9307          | 0.9556 |
+| 0.4499        | 7.47  | 18000 | 77.237  | 0.06 | 46.0    | 0.8866          | 0.9574 |
+| 0.3515        | 8.3   | 20000 | 77.5798 | 0.08 | 47.5    | 0.9070          | 0.9574 |
+| 0.292         | 9.13  | 22000 | 78.649  | 0.06 | 47.96   | 0.8905          | 0.9574 |
+| 0.2658        | 9.96  | 24000 | 79.5906 | 0.08 | 48.48   | 0.9002          | 0.9565 |
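
The Epoch and Step columns of the table above can be cross-checked with some quick arithmetic. The sketch below estimates steps per epoch from the first and last logged rows, then derives an approximate training-set size and the warmup length in epochs. It assumes a single device and no gradient accumulation for this run (the card lists neither), and the epoch values are rounded to two decimals, so the results are estimates only.

```python
# Values copied from the training results table and the hyperparameter list.
TRAIN_BATCH_SIZE = 8          # train_batch_size from the card
WARMUP_STEPS = 2500           # lr_scheduler_warmup_steps from the card
first_log = (2000, 0.83)      # (step, epoch) of the first logged row
last_log = (24000, 9.96)      # (step, epoch) of the last logged row


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return step / epoch


early = steps_per_epoch(*first_log)
late = steps_per_epoch(*last_log)

# ASSUMPTION: one device, no gradient accumulation, so one step = one batch.
approx_examples = late * TRAIN_BATCH_SIZE
warmup_epochs = WARMUP_STEPS / late

print(f"steps per epoch (first row): {early:.1f}")
print(f"steps per epoch (last row):  {late:.1f}")
print(f"approx. training examples:   {approx_examples:.0f}")
print(f"warmup length in epochs:     {warmup_epochs:.2f}")
```

Both rows give roughly 2,410 steps per epoch, which suggests the logging is internally consistent and that warmup lasted slightly more than one epoch.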
 ### Framework versions
+- Transformers 4.30.0.dev0
 - Pytorch 2.0.1+cu118
 - Datasets 2.12.0
 - Tokenizers 0.13.3