José Ángel González
metadata
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: barthez-deft-sciences_de_l_information
    results:
      - task:
          name: Summarization
          type: summarization
        metrics:
          - name: Rouge1
            type: rouge
            value: 34.5672

barthez-deft-sciences_de_l_information

This model is a fine-tuned version of moussaKam/barthez on an unknown dataset.

Note: this model comes from the preliminary experiments and underperforms the models published in the paper, which use MBartHez with HAL/Wiki pre-training and copy mechanisms.

It achieves the following results on the evaluation set:

  • Loss: 2.0258
  • Rouge1: 34.5672
  • Rouge2: 16.7861
  • Rougel: 27.5573
  • Rougelsum: 27.6099
  • Gen Len: 17.8857
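As a rough illustration of how a checkpoint like this one is typically used, the sketch below loads a seq2seq model with the Transformers `Auto*` classes and generates a short summary. Everything specific in it is an assumption rather than something this card states: `MODEL_ID` contains a placeholder namespace, and `max_input_tokens`, `max_summary_tokens`, and `num_beams` are illustrative values (the short cap is only motivated by the average generated length of about 18 tokens reported above). The BARThez tokenizer also needs the `sentencepiece` package installed.

```python
# Hypothetical placeholder: the card does not state the Hub namespace,
# so replace "<namespace>" with the actual owner of the repository.
MODEL_ID = "<namespace>/barthez-deft-sciences_de_l_information"


def summarize(text, model_id=MODEL_ID, max_input_tokens=512, max_summary_tokens=32):
    """Generate a short summary of `text` with a seq2seq checkpoint.

    The length limits and beam size are illustrative assumptions,
    not the settings used to produce the scores in this card.
    """
    # Imported lazily so the function can be defined without the
    # heavy dependencies (transformers, torch, sentencepiece) installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize and truncate the source document to the input budget.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )

    # Beam search with a short output cap.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_summary_tokens,
        num_beams=4,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("...")` with a French scientific abstract should return a single short string. Scores obtained this way may differ from the ones listed above, since the generation settings used for evaluation are not documented here.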

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
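The listed values can be combined with the step counts from the results table to sketch what the linear schedule looks like over the run. This is a minimal illustration under stated assumptions: zero warmup steps (no warmup setting is listed), a single device, and no gradient accumulation. The function names are hypothetical helpers, not part of any library.

```python
# Values copied from the hyperparameter list and the results table.
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 4
NUM_EPOCHS = 20
STEPS_PER_EPOCH = 106  # step count at the end of epoch 1
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 2120, the last step in the table


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def train_set_size_bounds(steps_per_epoch=STEPS_PER_EPOCH, batch_size=TRAIN_BATCH_SIZE):
    """Smallest and largest training-set sizes consistent with the step count.

    Assumes one device, no gradient accumulation, and that the final
    partial batch of each epoch is kept rather than dropped.
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


if __name__ == "__main__":
    for step in (0, 530, 1060, 2120):
        print(step, linear_lr(step))
    print(train_set_size_bounds())
```

Under these assumptions the learning rate falls to half its initial value at step 1060 (end of epoch 10), and 106 steps per epoch at batch size 4 corresponds to a training set of roughly 421 to 424 examples.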

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 3.3405        | 1.0   | 106  | 2.3682          | 31.3511 | 12.1973 | 25.6977 | 25.6851   | 14.9714 |
| 2.4219        | 2.0   | 212  | 2.1891          | 30.1154 | 13.3459 | 25.4854 | 25.5403   | 14.0429 |
| 2.0789        | 3.0   | 318  | 2.0994          | 32.153  | 15.3865 | 26.1859 | 26.1672   | 15.2    |
| 1.869         | 4.0   | 424  | 2.0258          | 34.5797 | 16.4194 | 27.6909 | 27.7201   | 16.9857 |
| 1.6569        | 5.0   | 530  | 2.0417          | 34.3854 | 16.5237 | 28.7036 | 28.8258   | 15.2429 |
| 1.5414        | 6.0   | 636  | 2.0503          | 33.1768 | 15.4851 | 27.2818 | 27.2884   | 16.0143 |
| 1.4461        | 7.0   | 742  | 2.0293          | 35.4273 | 16.118  | 27.3622 | 27.393    | 16.6857 |
| 1.3435        | 8.0   | 848  | 2.0336          | 35.3471 | 15.9695 | 27.668  | 27.6749   | 17.2    |
| 1.2624        | 9.0   | 954  | 2.0779          | 35.9201 | 17.2547 | 27.409  | 27.3293   | 17.1857 |
| 1.1807        | 10.0  | 1060 | 2.1301          | 35.7061 | 15.9138 | 27.3968 | 27.4716   | 17.1286 |
| 1.0972        | 11.0  | 1166 | 2.1726          | 34.3194 | 16.1313 | 27.0367 | 27.0737   | 17.1429 |
| 1.0224        | 12.0  | 1272 | 2.1704          | 34.9278 | 16.7958 | 27.8754 | 27.932    | 16.6571 |
| 1.0181        | 13.0  | 1378 | 2.2458          | 34.472  | 15.9111 | 28.2938 | 28.2946   | 16.7571 |
| 0.9769        | 14.0  | 1484 | 2.3405          | 35.1592 | 16.3135 | 29.0956 | 29.0858   | 16.5429 |
| 0.8866        | 15.0  | 1590 | 2.3303          | 34.8732 | 15.6709 | 27.5858 | 27.6169   | 16.2429 |
| 0.8888        | 16.0  | 1696 | 2.2976          | 35.3034 | 16.8011 | 27.7988 | 27.7569   | 17.5143 |
| 0.8358        | 17.0  | 1802 | 2.3349          | 35.505  | 16.8851 | 28.3651 | 28.413    | 16.8143 |
| 0.8026        | 18.0  | 1908 | 2.3738          | 35.2328 | 17.0358 | 28.544  | 28.6211   | 16.6143 |
| 0.7487        | 19.0  | 2014 | 2.4103          | 34.0793 | 15.4468 | 27.8057 | 27.8586   | 16.7286 |
| 0.7722        | 20.0  | 2120 | 2.3991          | 34.8116 | 15.8706 | 27.9173 | 27.983    | 16.9286 |
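To make the trend in these results easier to read, the sketch below picks the best epoch by validation loss and by Rouge1 from the per-epoch values. The `RESULTS` rows are copied from the training results; `best_epoch` is a hypothetical helper. The card does not say how the reported checkpoint was selected, so any link between this selection and the headline scores is an inference, not a documented fact.

```python
# (epoch, validation_loss, rouge1) copied from the training results.
RESULTS = [
    (1, 2.3682, 31.3511), (2, 2.1891, 30.1154), (3, 2.0994, 32.153),
    (4, 2.0258, 34.5797), (5, 2.0417, 34.3854), (6, 2.0503, 33.1768),
    (7, 2.0293, 35.4273), (8, 2.0336, 35.3471), (9, 2.0779, 35.9201),
    (10, 2.1301, 35.7061), (11, 2.1726, 34.3194), (12, 2.1704, 34.9278),
    (13, 2.2458, 34.472), (14, 2.3405, 35.1592), (15, 2.3303, 34.8732),
    (16, 2.2976, 35.3034), (17, 2.3349, 35.505), (18, 2.3738, 35.2328),
    (19, 2.4103, 34.0793), (20, 2.3991, 34.8116),
]


def best_epoch(rows, column, mode):
    """Return the row with the lowest ("min") or highest ("max") value in `column`."""
    pick = min if mode == "min" else max
    return pick(rows, key=lambda row: row[column])


best_by_loss = best_epoch(RESULTS, column=1, mode="min")
best_by_rouge1 = best_epoch(RESULTS, column=2, mode="max")

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest Rouge1:", best_by_rouge1)
```

Validation loss bottoms out at epoch 4 (2.0258, the same value as the evaluation loss reported at the top of the card) and rises afterwards while training loss keeps falling, which is the usual sign of overfitting. Rouge1 peaks later, at epoch 9 (35.9201), so the two criteria would select different checkpoints.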

Framework versions

  • Transformers 4.10.2
  • PyTorch 1.7.1+cu110
  • Datasets 1.11.0
  • Tokenizers 0.10.3