---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- xlsum
metrics:
- rouge
model-index:
- name: mt5-summarize-ch_trad-v2
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: xlsum
      type: xlsum
      config: chinese_traditional
      split: validation
      args: chinese_traditional
    metrics:
    - name: Rouge1
      type: rouge
      value: 0.292
---


# mt5-summarize-ch_trad-v2

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the `chinese_traditional` configuration of the xlsum dataset.
It achieves the following results on the validation split:
- Loss: 3.1706
- Rouge1: 0.292
- Rouge2: 0.1413
- Rougel: 0.2218
- Rougelsum: 0.2383
- Gen Len: 126.9946
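
The following is a minimal sketch of how a checkpoint like this one could be loaded for summarization with the Transformers library. It is not taken from the training code. `MODEL_PATH` is a hypothetical placeholder, the 512-token input limit is an assumption, and the `max_length` of 128 is only inferred from the Gen Len values above, which plateau at about 127.

```python
# Hypothetical location of the checkpoint: replace with the real Hub id or a local directory.
MODEL_PATH = "path/to/mt5-summarize-ch_trad-v2"

# Assumption: generation was capped near 128 tokens, since Gen Len plateaus at about 127.
GENERATION_KWARGS = {"max_length": 128}


def summarize(text: str, model_path: str = MODEL_PATH) -> str:
    """Return a summary of `text` generated by the fine-tuned checkpoint."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_path)

    # Truncate long articles to an assumed 512-token encoder input.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` with a Traditional Chinese news article should return a single summary string. Whether the input needs a task prefix depends on how the training data was preprocessed, which this card does not state.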

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

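The model was fine-tuned on the `chinese_traditional` configuration of the xlsum dataset. The reported metrics were computed on its `validation` split.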

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 10
- mixed_precision_training: Native AMP
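
As a rough guide to reproducing this setup, the hyperparameters above map onto `Seq2SeqTrainingArguments` keyword arguments roughly as sketched below. The sketch assumes a single device with no gradient accumulation, so that the reported batch size equals the per-device batch size. The `output_dir` value and the `predict_with_generate` flag are assumptions and do not appear in the list above.

```python
# Keyword arguments mirroring the hyperparameters listed in this card.
TRAINING_KWARGS = {
    "output_dir": "mt5-summarize-ch_trad-v2",  # hypothetical output directory
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 32,  # assumes one device, no gradient accumulation
    "per_device_eval_batch_size": 32,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 1000,
    "num_train_epochs": 10,
    "fp16": True,  # "Native AMP" mixed precision; requires a CUDA device
    "predict_with_generate": True,  # assumption: generate text so ROUGE can be computed
}


def build_training_args():
    """Create the training arguments object from the settings above."""
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**TRAINING_KWARGS)
```

Keeping the values in a plain dictionary makes them easy to log or compare against another run before any trainer object is created.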

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len  |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:--------:|
| 7.599         | 0.43  | 500   | 5.9495          | 0.2214 | 0.0975 | 0.1686 | 0.1785    | 124.4867 |
| 6.051         | 0.86  | 1000  | 5.1437          | 0.2508 | 0.1156 | 0.1915 | 0.2024    | 126.4152 |
| 5.2303        | 1.28  | 1500  | 4.4085          | 0.2586 | 0.1206 | 0.1985 | 0.2091    | 126.5906 |
| 4.6814        | 1.71  | 2000  | 4.0174          | 0.281  | 0.1314 | 0.2124 | 0.2282    | 126.8248 |
| 4.388         | 2.14  | 2500  | 3.7829          | 0.2732 | 0.1278 | 0.2101 | 0.223     | 126.8782 |
| 4.1681        | 2.57  | 3000  | 3.6421          | 0.2655 | 0.1251 | 0.2068 | 0.2171    | 126.8794 |
| 4.0634        | 3.0   | 3500  | 3.5647          | 0.2732 | 0.129  | 0.2099 | 0.2217    | 126.9833 |
| 3.9309        | 3.42  | 4000  | 3.4990          | 0.2758 | 0.1295 | 0.2114 | 0.2254    | 126.9901 |
| 3.868         | 3.85  | 4500  | 3.4264          | 0.2769 | 0.1328 | 0.2152 | 0.2252    | 126.9861 |
| 3.7944        | 4.28  | 5000  | 3.4014          | 0.2857 | 0.1378 | 0.2187 | 0.2326    | 126.9694 |
| 3.7583        | 4.71  | 5500  | 3.3351          | 0.2822 | 0.136  | 0.2186 | 0.2311    | 126.9944 |
| 3.6907        | 5.14  | 6000  | 3.3172          | 0.2792 | 0.1335 | 0.2144 | 0.2273    | 126.9874 |
| 3.6542        | 5.57  | 6500  | 3.2911          | 0.2798 | 0.1343 | 0.2147 | 0.228     | 126.9916 |
| 3.6186        | 5.99  | 7000  | 3.2548          | 0.2802 | 0.134  | 0.2152 | 0.2277    | 126.9916 |
| 3.5894        | 6.42  | 7500  | 3.2287          | 0.2859 | 0.1376 | 0.2181 | 0.2328    | 126.9972 |
| 3.5615        | 6.85  | 8000  | 3.2264          | 0.2872 | 0.1374 | 0.2179 | 0.2343    | 126.9972 |
| 3.5321        | 7.28  | 8500  | 3.2069          | 0.2861 | 0.1374 | 0.2178 | 0.233     | 126.9972 |
| 3.5242        | 7.71  | 9000  | 3.2076          | 0.289  | 0.1385 | 0.2193 | 0.2357    | 126.9919 |
| 3.5195        | 8.13  | 9500  | 3.1825          | 0.2878 | 0.1384 | 0.2189 | 0.2352    | 126.9919 |
| 3.4815        | 8.56  | 10000 | 3.1852          | 0.289  | 0.1386 | 0.22   | 0.2358    | 126.9944 |
| 3.4823        | 8.99  | 10500 | 3.1775          | 0.2918 | 0.1413 | 0.2218 | 0.2383    | 127.0    |
| 3.4705        | 9.42  | 11000 | 3.1704          | 0.2912 | 0.1407 | 0.2218 | 0.2375    | 126.9972 |
| 3.4634        | 9.85  | 11500 | 3.1706          | 0.292  | 0.1413 | 0.2218 | 0.2383    | 126.9946 |
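
The Epoch and Step columns above, combined with the linear scheduler and 1000 warmup steps, allow a rough estimate of the learning rate at each logged step. The sketch below is a plain-Python approximation of a linear warmup followed by linear decay. The steps-per-epoch figure is estimated from the last table row (epochs are rounded to two decimals), so `total_steps` is an assumption and not a reported value.

```python
PEAK_LR = 3e-4        # learning_rate from the hyperparameter list
WARMUP_STEPS = 1000   # lr_scheduler_warmup_steps
NUM_EPOCHS = 10       # num_epochs

# Last row of the results table: step 11500 was logged at epoch 9.85.
LAST_LOGGED_STEP = 11500
LAST_LOGGED_EPOCH = 9.85

# Estimated, not reported: optimizer steps per epoch and in total.
steps_per_epoch = round(LAST_LOGGED_STEP / LAST_LOGGED_EPOCH)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, peak=PEAK_LR, warmup=WARMUP_STEPS, total=total_steps):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup:
        # Warmup phase: ramp linearly from 0 up to the peak rate.
        return peak * step / warmup
    # Decay phase: fall linearly from the peak to 0 at the final step.
    return peak * max(0.0, (total - step) / (total - warmup))


if __name__ == "__main__":
    for step in (500, 1000, 5000, 11500):
        print(step, linear_lr(step))
```

Under these assumptions the rate peaks at step 1000, the second row of the table, and decays for the rest of training.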


### Framework versions

- Transformers 4.26.1
- PyTorch 1.13.0
- Datasets 2.1.0
- Tokenizers 0.13.2