---
tags:
- generated_from_trainer
datasets:
- xsum
metrics:
- rouge
model-index:
- name: t5-small_adafactor
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: xsum
      type: xsum
      args: default
    metrics:
    - name: Rouge1
      type: rouge
      value: 32.3784
---

# t5-small_adafactor

This model was trained from scratch on the xsum dataset. It achieves the following results on the evaluation set:

- Loss: 2.1513
- Rouge1: 32.3784
- Rouge2: 11.2335
- Rougel: 26.1197
- Rougelsum: 26.1212
- Gen Len: 18.8066

## Model description

More information needed

## Intended uses & limitations

More information needed
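
No specific usage guidance was provided, but the model targets abstractive summarization of XSum-style news articles. Below is a minimal inference sketch; the Hub id `oMateos2020/t5-small_adafactor` (inferred from this repository's name), the `summarize:` prefix, and the generation settings are assumptions rather than documented details of this card.

```python
# Minimal inference sketch; the model id, prefix, and generation settings are assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "oMateos2020/t5-small_adafactor"  # assumed Hub id for this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = (
    "A new study suggests that regular exercise can significantly improve "
    "sleep quality in adults over the age of 40, researchers said on Monday."
)

# T5 checkpoints are usually prompted with a task prefix for summarization.
inputs = tokenizer("summarize: " + article, return_tensors="pt",
                   max_length=512, truncation=True)
summary_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```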

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

- learning_rate: 0.001
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
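
The `generated_from_trainer` tag indicates the Hugging Face `Trainer` API was used. The sketch below maps the hyperparameters above onto `Seq2SeqTrainingArguments`; the base checkpoint (`t5-small`), the `summarize:` prefix, the tokenization lengths, and the 200-step evaluation cadence (taken from the results table) are assumptions, not confirmed details of the original run.

```python
# Hedged reproduction sketch of the training setup; see the caveats in the text above.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

# The card says the model was "trained from scratch"; starting from t5-small is an
# assumption based on the model name. A true from-scratch run would instead build the
# model from a config, e.g. AutoModelForSeq2SeqLM.from_config(...).
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

raw = load_dataset("xsum")

def preprocess(batch):
    # Prefix and max lengths are illustrative assumptions.
    inputs = tokenizer(["summarize: " + doc for doc in batch["document"]],
                       max_length=512, truncation=True)
    with tokenizer.as_target_tokenizer():  # Transformers 4.20 API for target text
        labels = tokenizer(batch["summary"], max_length=64, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = raw.map(preprocess, batched=True,
                    remove_columns=raw["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-small_adafactor",
    learning_rate=1e-3,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    optim="adafactor",            # Adafactor optimizer
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,                    # mixed precision ("Native AMP")
    evaluation_strategy="steps",
    eval_steps=200,               # matches the 200-step cadence in the results table
    logging_steps=200,
    predict_with_generate=True,   # needed to compute Rouge during evaluation
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```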

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.4206        | 0.02  | 200  | 2.2951          | 30.6414 | 9.9248  | 24.5953 | 24.6021   | 18.7814 |
| 2.4363        | 0.05  | 400  | 2.3041          | 30.969  | 9.9594  | 24.9531 | 24.9484   | 18.7812 |
| 2.4442        | 0.07  | 600  | 2.3042          | 30.9605 | 9.8821  | 24.9273 | 24.9343   | 18.787  |
| 2.4402        | 0.09  | 800  | 2.2985          | 31.1667 | 9.9976  | 25.034  | 25.0346   | 18.7505 |
| 2.4394        | 0.12  | 1000 | 2.2951          | 30.8935 | 9.8125  | 24.8084 | 24.8066   | 18.878  |
| 2.4148        | 0.14  | 1200 | 2.2965          | 31.4419 | 10.1935 | 25.1234 | 25.1165   | 18.8134 |
| 2.4329        | 0.16  | 1400 | 2.2891          | 30.735  | 9.7912  | 24.6127 | 24.6084   | 18.7797 |
| 2.4308        | 0.19  | 1600 | 2.2950          | 31.0388 | 10.13   | 24.9166 | 24.9086   | 18.8409 |
| 2.4302        | 0.21  | 1800 | 2.2808          | 30.978  | 10.0544 | 24.9191 | 24.9158   | 18.8147 |
| 2.4165        | 0.24  | 2000 | 2.2785          | 31.2423 | 10.2329 | 25.2027 | 25.192    | 18.7531 |
| 2.4227        | 0.26  | 2200 | 2.2705          | 30.8977 | 10.0552 | 24.8875 | 24.8869   | 18.8472 |
| 2.4117        | 0.28  | 2400 | 2.2691          | 30.9478 | 10.1551 | 24.8565 | 24.8527   | 18.8049 |
| 2.4229        | 0.31  | 2600 | 2.2635          | 31.1634 | 10.2055 | 25.0868 | 25.084    | 18.8424 |
| 2.4163        | 0.33  | 2800 | 2.2554          | 31.2877 | 10.4018 | 25.2972 | 25.2924   | 18.8127 |
| 2.4109        | 0.35  | 3000 | 2.2498          | 31.5192 | 10.3888 | 25.3461 | 25.3489   | 18.8066 |
| 2.3883        | 0.38  | 3200 | 2.2473          | 31.4033 | 10.3393 | 25.2324 | 25.2297   | 18.8657 |
| 2.3946        | 0.4   | 3400 | 2.2443          | 31.9869 | 10.7348 | 25.7509 | 25.7521   | 18.7703 |
| 2.3726        | 0.42  | 3600 | 2.2398          | 31.6649 | 10.4532 | 25.4268 | 25.4221   | 18.8244 |
| 2.3949        | 0.45  | 3800 | 2.2335          | 31.7186 | 10.6587 | 25.5281 | 25.5234   | 18.7766 |
| 2.387         | 0.47  | 4000 | 2.2267          | 32.015  | 10.7906 | 25.7612 | 25.7634   | 18.7552 |
| 2.3737        | 0.49  | 4200 | 2.2262          | 31.7823 | 10.7758 | 25.6306 | 25.6343   | 18.7436 |
| 2.37          | 0.52  | 4400 | 2.2238          | 31.5111 | 10.6443 | 25.3768 | 25.3782   | 18.7801 |
| 2.3748        | 0.54  | 4600 | 2.2166          | 31.6585 | 10.5958 | 25.4283 | 25.4321   | 18.7989 |
| 2.3789        | 0.56  | 4800 | 2.2100          | 31.829  | 10.7779 | 25.6561 | 25.648    | 18.7688 |
| 2.3659        | 0.59  | 5000 | 2.2064          | 32.0499 | 10.9069 | 25.8784 | 25.8725   | 18.8464 |
| 2.3656        | 0.61  | 5200 | 2.2032          | 31.8874 | 10.7972 | 25.6996 | 25.6948   | 18.75   |
| 2.3593        | 0.64  | 5400 | 2.1987          | 31.9182 | 10.7176 | 25.672  | 25.6662   | 18.8595 |
| 2.3445        | 0.66  | 5600 | 2.1935          | 31.9871 | 10.803  | 25.7289 | 25.7247   | 18.7972 |
| 2.3439        | 0.68  | 5800 | 2.1870          | 32.1788 | 10.9332 | 25.9597 | 25.9605   | 18.8062 |
| 2.3489        | 0.71  | 6000 | 2.1845          | 32.0946 | 10.9864 | 25.9296 | 25.9342   | 18.8307 |
| 2.3759        | 0.73  | 6200 | 2.1796          | 32.3321 | 11.0971 | 26.084  | 26.0843   | 18.7956 |
| 2.3611        | 0.75  | 6400 | 2.1759          | 32.0703 | 10.8886 | 25.8437 | 25.8369   | 18.7629 |
| 2.3319        | 0.78  | 6600 | 2.1722          | 31.8674 | 10.8993 | 25.6791 | 25.686    | 18.8292 |
| 2.3445        | 0.8   | 6800 | 2.1686          | 32.1679 | 11.0594 | 25.8591 | 25.8604   | 18.817  |
| 2.3523        | 0.82  | 7000 | 2.1667          | 32.2232 | 11.1537 | 25.9326 | 25.9359   | 18.8073 |
| 2.3439        | 0.85  | 7200 | 2.1641          | 32.246  | 11.1854 | 26.015  | 26.0097   | 18.7954 |
| 2.3496        | 0.87  | 7400 | 2.1603          | 32.1141 | 11.0758 | 25.9561 | 25.9623   | 18.7639 |
| 2.3368        | 0.89  | 7600 | 2.1580          | 32.3447 | 11.1661 | 26.0906 | 26.0888   | 18.7936 |
| 2.3634        | 0.92  | 7800 | 2.1553          | 32.3039 | 11.2246 | 26.0819 | 26.0828   | 18.7922 |
| 2.3396        | 0.94  | 8000 | 2.1534          | 32.2979 | 11.262  | 26.0726 | 26.071    | 18.8069 |
| 2.3645        | 0.96  | 8200 | 2.1520          | 32.4169 | 11.292  | 26.1811 | 26.187    | 18.7921 |
| 2.341         | 0.99  | 8400 | 2.1513          | 32.3784 | 11.2335 | 26.1197 | 26.1212   | 18.8066 |
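
In the table above, the Rouge columns are F-measure scores scaled to 0-100 and "Gen Len" is the average length of the generated summaries in tokens. Below is a hedged sketch of that computation, assuming the standard summarization `compute_metrics` recipe and the `rouge` metric shipped with `datasets` 2.3.2; the exact metric code used in the original run is not documented here.

```python
# Sketch of how Rouge / Gen Len columns like those above are typically computed;
# assumed recipe, not taken verbatim from the original training script.
import numpy as np
from datasets import load_metric

rouge = load_metric("rouge")  # available in datasets 2.3.2

def compute_metrics(predictions, labels, tokenizer):
    predictions = np.asarray(predictions)
    labels = np.asarray(labels)

    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # -100 marks ignored label positions; restore pad tokens before decoding
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = rouge.compute(predictions=decoded_preds,
                           references=decoded_labels,
                           use_stemmer=True)
    # Keep the mid F-measure of each aggregate score, scaled to 0-100
    result = {key: value.mid.fmeasure * 100 for key, value in result.items()}

    # "Gen Len": mean generated length in non-padding tokens
    gen_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
    result["gen_len"] = float(np.mean(gen_lens))
    return {key: round(value, 4) for key, value in result.items()}
```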

### Framework versions

- Transformers 4.20.1
- Pytorch 1.12.0+cu113
- Datasets 2.3.2
- Tokenizers 0.12.1