metadata
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: barthez-deft-sciences_de_l_information
  results:
  - task:
      name: Summarization
      type: summarization
    metrics:
    - name: Rouge1
      type: rouge
      value: 34.5672
barthez-deft-sciences_de_l_information
This model is a fine-tuned version of moussaKam/barthez on an unknown dataset.
Note: this model is one of the preliminary experiments and underperforms the models published in the paper, which use MBartHez and HAL/Wiki pre-training together with copy mechanisms.
It achieves the following results on the evaluation set:
- Loss: 2.0258
- Rouge1: 34.5672
- Rouge2: 16.7861
- Rougel: 27.5573
- Rougelsum: 27.6099
- Gen Len: 17.8857
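To make the ROUGE figures above easier to read, here is a simplified sketch of an n-gram overlap F1 score on a 0 to 100 scale, which is the scale the reported values appear to use. It assumes whitespace tokenization and lowercasing only; the actual evaluation was most likely run with a full ROUGE package, so this illustrates the idea rather than reproducing the numbers. The function name `rouge_n_f1` is hypothetical.

```python
from collections import Counter


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1 (0-100) using whitespace tokens; illustration only."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()

    # Count every n-gram in the reference and in the candidate summary.
    ref_ngrams = Counter(
        tuple(ref_tokens[i:i + n]) for i in range(len(ref_tokens) - n + 1)
    )
    cand_ngrams = Counter(
        tuple(cand_tokens[i:i + n]) for i in range(len(cand_tokens) - n + 1)
    )

    # Clipped overlap: each n-gram counts at most as often as it occurs in both.
    overlap = sum((ref_ngrams & cand_ngrams).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_ngrams.values())
    recall = overlap / sum(ref_ngrams.values())
    return 100.0 * 2 * precision * recall / (precision + recall)
```

With `n=1` this corresponds to the idea behind Rouge1 and with `n=2` to Rouge2; Rougel and Rougelsum rely on longest common subsequences and are not covered by this sketch.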
Model description
More information needed
Intended uses & limitations
More information needed
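Until this section is filled in, the following is a minimal sketch of how a fine-tuned BARThez summarization checkpoint could be loaded with the Transformers auto classes. Everything specific here is an assumption: the `summarize` helper is hypothetical, the `model_id` must be supplied by the caller because the card does not state the Hub namespace, and the input truncation length, output length and beam count are illustrative defaults rather than the settings used for evaluation.

```python
def summarize(text, model_id, max_length=32, num_beams=4):
    """Generate a summary of `text` with a seq2seq checkpoint (hypothetical helper)."""
    # Imported lazily so that defining the function does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncating the input to 512 tokens is an assumption, not a documented setting.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,   # assumed cap on the generated length
        num_beams=num_beams,     # assumed beam width
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would look like `summarize(french_text, "<namespace>/barthez-deft-sciences_de_l_information")`, where `<namespace>` stands for the account that hosts the checkpoint.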
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20.0
- mixed_precision_training: Native AMP
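As an illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup, since none is listed above, and takes the total of 2120 steps (106 per epoch over 20 epochs) from the training results; the function name `linear_lr` is hypothetical.

```python
BASE_LR = 3e-05         # learning_rate from the list above
STEPS_PER_EPOCH = 106   # step count reported after epoch 1
NUM_EPOCHS = 20
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 2120


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup (unused when warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly so that the rate reaches 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the rate starts at 3e-05, is halved by the end of epoch 10 (step 1060), and reaches zero at step 2120.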
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
3.3405 | 1.0 | 106 | 2.3682 | 31.3511 | 12.1973 | 25.6977 | 25.6851 | 14.9714 |
2.4219 | 2.0 | 212 | 2.1891 | 30.1154 | 13.3459 | 25.4854 | 25.5403 | 14.0429 |
2.0789 | 3.0 | 318 | 2.0994 | 32.153 | 15.3865 | 26.1859 | 26.1672 | 15.2 |
1.869 | 4.0 | 424 | 2.0258 | 34.5797 | 16.4194 | 27.6909 | 27.7201 | 16.9857 |
1.6569 | 5.0 | 530 | 2.0417 | 34.3854 | 16.5237 | 28.7036 | 28.8258 | 15.2429 |
1.5414 | 6.0 | 636 | 2.0503 | 33.1768 | 15.4851 | 27.2818 | 27.2884 | 16.0143 |
1.4461 | 7.0 | 742 | 2.0293 | 35.4273 | 16.118 | 27.3622 | 27.393 | 16.6857 |
1.3435 | 8.0 | 848 | 2.0336 | 35.3471 | 15.9695 | 27.668 | 27.6749 | 17.2 |
1.2624 | 9.0 | 954 | 2.0779 | 35.9201 | 17.2547 | 27.409 | 27.3293 | 17.1857 |
1.1807 | 10.0 | 1060 | 2.1301 | 35.7061 | 15.9138 | 27.3968 | 27.4716 | 17.1286 |
1.0972 | 11.0 | 1166 | 2.1726 | 34.3194 | 16.1313 | 27.0367 | 27.0737 | 17.1429 |
1.0224 | 12.0 | 1272 | 2.1704 | 34.9278 | 16.7958 | 27.8754 | 27.932 | 16.6571 |
1.0181 | 13.0 | 1378 | 2.2458 | 34.472 | 15.9111 | 28.2938 | 28.2946 | 16.7571 |
0.9769 | 14.0 | 1484 | 2.3405 | 35.1592 | 16.3135 | 29.0956 | 29.0858 | 16.5429 |
0.8866 | 15.0 | 1590 | 2.3303 | 34.8732 | 15.6709 | 27.5858 | 27.6169 | 16.2429 |
0.8888 | 16.0 | 1696 | 2.2976 | 35.3034 | 16.8011 | 27.7988 | 27.7569 | 17.5143 |
0.8358 | 17.0 | 1802 | 2.3349 | 35.505 | 16.8851 | 28.3651 | 28.413 | 16.8143 |
0.8026 | 18.0 | 1908 | 2.3738 | 35.2328 | 17.0358 | 28.544 | 28.6211 | 16.6143 |
0.7487 | 19.0 | 2014 | 2.4103 | 34.0793 | 15.4468 | 27.8057 | 27.8586 | 16.7286 |
0.7722 | 20.0 | 2120 | 2.3991 | 34.8116 | 15.8706 | 27.9173 | 27.983 | 16.9286 |
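The table can be inspected programmatically to see where the run performs best. The sketch below copies the epoch, training loss, validation loss and Rouge1 columns from the table and picks the epoch with the lowest validation loss and the one with the highest Rouge1. The variable names are illustrative, and nothing here is meant to describe how the published checkpoint was actually selected.

```python
# (epoch, training_loss, validation_loss, rouge1), copied from the table above
RESULTS = [
    (1, 3.3405, 2.3682, 31.3511),
    (2, 2.4219, 2.1891, 30.1154),
    (3, 2.0789, 2.0994, 32.153),
    (4, 1.869, 2.0258, 34.5797),
    (5, 1.6569, 2.0417, 34.3854),
    (6, 1.5414, 2.0503, 33.1768),
    (7, 1.4461, 2.0293, 35.4273),
    (8, 1.3435, 2.0336, 35.3471),
    (9, 1.2624, 2.0779, 35.9201),
    (10, 1.1807, 2.1301, 35.7061),
    (11, 1.0972, 2.1726, 34.3194),
    (12, 1.0224, 2.1704, 34.9278),
    (13, 1.0181, 2.2458, 34.472),
    (14, 0.9769, 2.3405, 35.1592),
    (15, 0.8866, 2.3303, 34.8732),
    (16, 0.8888, 2.2976, 35.3034),
    (17, 0.8358, 2.3349, 35.505),
    (18, 0.8026, 2.3738, 35.2328),
    (19, 0.7487, 2.4103, 34.0793),
    (20, 0.7722, 2.3991, 34.8116),
]

# Epoch with the lowest validation loss.
best_by_loss = min(RESULTS, key=lambda row: row[2])

# Epoch with the highest Rouge1 score.
best_by_rouge1 = max(RESULTS, key=lambda row: row[3])

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
```

Validation loss bottoms out at epoch 4 (2.0258, the same loss reported for the evaluation set), while Rouge1 peaks at epoch 9. Training loss keeps falling through epoch 20 as validation loss climbs, which is consistent with overfitting on later epochs.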
Framework versions
- Transformers 4.10.2
- PyTorch 1.7.1+cu110
- Datasets 1.11.0
- Tokenizers 0.10.3