barthez-deft-archeologie

This model is a fine-tuned version of moussaKam/barthez on an unknown dataset.

Note: this model comes from a preliminary experiment and underperforms the models published in the paper, which use MBartHez and HAL/Wiki pre-training plus copy mechanisms.

It achieves the following results on the evaluation set:

  • Loss: 2.0733
  • Rouge1: 37.1845
  • Rouge2: 16.9534
  • Rougel: 28.8416
  • Rougelsum: 29.077
  • Gen Len: 34.4028
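
For readers unfamiliar with the metric, the sketch below shows what a Rouge1 score measures: the F1 of unigram overlap between a reference and a generated text, on the 0-100 scale used above. It is a simplified illustration, not the scorer used for this card; the function name `rouge1_f1` and the whitespace tokenization are assumptions, and real ROUGE implementations add their own tokenization and optional stemming.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1 on a 0-100 scale (hypothetical helper).

    Assumption: lowercase + whitespace tokenization, no stemming.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Two of four unigrams match on each side: precision = recall = 0.5.
score = rouge1_f1("a b c d", "a b x y")
```

Rouge2 applies the same idea to bigrams, and RougeL replaces n-gram overlap with the longest common subsequence.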

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
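
As a rough illustration of what `lr_scheduler_type: linear` implies for this run, the sketch below computes the learning rate at a given optimizer step. It combines the hyperparameters above with the 108 steps per epoch reported in the training results. Two things are assumptions, not facts from this card: zero warmup steps, and the helper name `linear_lr`.

```python
BASE_LR = 3e-5          # learning_rate from the list above
STEPS_PER_EPOCH = 108   # from the training results table
NUM_EPOCHS = 20
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 2160, the final step in the table
WARMUP_STEPS = 0        # assumption: the card does not list any warmup


def linear_lr(step: int) -> float:
    """Learning rate at `step` under linear decay to zero (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear ramp-up; never taken while WARMUP_STEPS is 0.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Learning rate at the end of epochs 0, 5, 10, 15 and 20.
schedule = {epoch: linear_lr(epoch * STEPS_PER_EPOCH) for epoch in (0, 5, 10, 15, 20)}
```

Under these assumptions the rate has halved by the end of epoch 10 and reaches zero at step 2160. With a batch size of 4, 108 steps per epoch also suggests a training set of roughly 430 examples, assuming a single device and no gradient accumulation.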

Training results

Training Loss  Epoch  Step  Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
3.4832         1.0    108   2.4237           22.6662  10.009   19.8729  19.8814    15.8333
2.557          2.0    216   2.2328           24.8102  11.9911  20.4773  20.696     19.0139
2.2702         3.0    324   2.2002           25.6482  11.6191  21.8383  21.9341    18.1944
2.1119         4.0    432   2.1266           25.5806  11.9765  21.3973  21.3503    19.4306
1.9582         5.0    540   2.1072           25.6578  12.2709  22.182   22.0548    19.1528
1.8137         6.0    648   2.1008           26.5272  11.4033  22.359   22.3259    19.4722
1.7725         7.0    756   2.1074           25.0405  11.1773  21.1369  21.1847    19.1806
1.6772         8.0    864   2.0959           26.5237  11.6028  22.5018  22.3931    19.3333
1.5798         9.0    972   2.0976           27.7443  11.9898  22.4052  22.2954    19.7222
1.4753         10.0   1080  2.0733           28.3502  12.9162  22.6352  22.6015    19.8194
1.4646         11.0   1188  2.1091           27.9198  12.8591  23.0718  23.0779    19.6111
1.4082         12.0   1296  2.1036           28.8509  13.0987  23.4189  23.5044    19.4861
1.2862         13.0   1404  2.1222           28.6641  12.8157  22.6799  22.7051    19.8611
1.2612         14.0   1512  2.1487           26.9709  11.6084  22.0312  22.0543    19.875
1.2327         15.0   1620  2.1808           28.218   12.6239  22.7372  22.7881    19.7361
1.2264         16.0   1728  2.1778           26.7393  11.4474  21.6057  21.555     19.7639
1.1848         17.0   1836  2.1995           27.6902  12.1082  22.0406  22.0101    19.6806
1.133          18.0   1944  2.2038           27.0402  12.1846  21.7793  21.7513    19.8056
1.168          19.0   2052  2.2116           27.5149  11.9876  22.1113  22.1527    19.7222
1.1206         20.0   2160  2.2133           28.2321  12.677   22.749   22.8485    19.5972
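
The short sketch below picks the best epoch from the table, once by validation loss and once by Rouge1. The tuple layout and variable names are illustrative, and only three of the table's columns are copied in. The card does not say which checkpoint was published; the lowest validation loss in the table (2.0733, epoch 10) does match the loss reported at the top, while the ROUGE and Gen Len values reported there differ from that row.

```python
# (epoch, validation_loss, rouge1) copied from the table above.
rows = [
    (1, 2.4237, 22.6662), (2, 2.2328, 24.8102), (3, 2.2002, 25.6482),
    (4, 2.1266, 25.5806), (5, 2.1072, 25.6578), (6, 2.1008, 26.5272),
    (7, 2.1074, 25.0405), (8, 2.0959, 26.5237), (9, 2.0976, 27.7443),
    (10, 2.0733, 28.3502), (11, 2.1091, 27.9198), (12, 2.1036, 28.8509),
    (13, 2.1222, 28.6641), (14, 2.1487, 26.9709), (15, 2.1808, 28.218),
    (16, 2.1778, 26.7393), (17, 2.1995, 27.6902), (18, 2.2038, 27.0402),
    (19, 2.2116, 27.5149), (20, 2.2133, 28.2321),
]

# Lowest validation loss and highest Rouge1 need not fall on the same epoch.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])

print(f"best validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"best Rouge1:          epoch {best_by_rouge1[0]} ({best_by_rouge1[2]})")
```

Validation loss rises steadily after epoch 10 while training loss keeps falling, which is the usual sign of overfitting on a small training set.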

Framework versions

  • Transformers 4.10.2
  • PyTorch 1.7.1+cu110
  • Datasets 1.11.0
  • Tokenizers 0.10.3