# mt5_keep_training
This model is a fine-tuned version of [kyle0518/mt5_baseline](https://huggingface.co/kyle0518/mt5_baseline) on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how scores in this format are computed follows the list):
- Loss: 3.3480
- Rouge-1: recall 0.2419, precision 0.3075, F1 0.2611
- Rouge-2: recall 0.0932, precision 0.1136, F1 0.0989
- Rouge-L: recall 0.2162, precision 0.2753, F1 0.2334
- Gen Len: 20.4044
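These recall/precision/F1 triples match the output format of the `rouge` Python package. Below is a minimal sketch of how such scores can be produced; the hypotheses and references are placeholders, not data from this model's evaluation set:

```python
# pip install rouge
from rouge import Rouge

# Placeholder predictions and references; the actual evaluation
# data for this model is not documented in the card.
hypotheses = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

# With avg=True, get_scores returns one dict per metric with
# 'r' (recall), 'p' (precision), and 'f' (F1) keys -- the same
# structure reported in the results above.
scores = Rouge().get_scores(hypotheses, references, avg=True)
print(scores["rouge-1"], scores["rouge-2"], scores["rouge-l"])
```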
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch reproducing them follows the list):
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- total_eval_batch_size: 2
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 10.0
- mixed_precision_training: Native AMP
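For reference, these settings map onto `transformers` training arguments roughly as in the sketch below. This is not the actual training script: the output directory, evaluation strategy, and `predict_with_generate` flag are assumptions. Note that 1 sample per device × 2 GPUs × 8 accumulation steps yields the total train batch size of 16 listed above; the Adam betas and epsilon listed above are the library defaults.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5_keep_training",  # assumed output directory
    learning_rate=1e-5,
    per_device_train_batch_size=1,   # x 2 GPUs x 8 accum. steps = 16 total
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    num_train_epochs=10.0,
    fp16=True,                       # native AMP mixed precision
    evaluation_strategy="epoch",     # assumption, matching the per-epoch results table
    predict_with_generate=True,      # assumption, needed to compute ROUGE and Gen Len
)
```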
### Training results

ROUGE scores are reported as recall / precision / F1, rounded to four decimal places.

| Training Loss | Epoch | Step | Validation Loss | Rouge-1 (r / p / f) | Rouge-2 (r / p / f) | Rouge-L (r / p / f) | Gen Len |
|---|---|---|---|---|---|---|---|
| 3.5447 | 1.0 | 1221 | 3.3790 | 0.2378 / 0.3044 / 0.2571 | 0.0906 / 0.1111 / 0.0961 | 0.2137 / 0.2738 / 0.2309 | 20.5440 |
| 3.5237 | 2.0 | 2442 | 3.3656 | 0.2372 / 0.3035 / 0.2564 | 0.0909 / 0.1107 / 0.0962 | 0.2127 / 0.2720 / 0.2297 | 20.4450 |
| 3.5214 | 3.0 | 3663 | 3.3586 | 0.2390 / 0.3042 / 0.2579 | 0.0910 / 0.1110 / 0.0965 | 0.2141 / 0.2731 / 0.2311 | 20.4749 |
| 3.5134 | 4.0 | 4884 | 3.3520 | 0.2395 / 0.3071 / 0.2595 | 0.0917 / 0.1120 / 0.0974 | 0.2141 / 0.2751 / 0.2320 | 20.3257 |
| 3.4993 | 5.0 | 6105 | 3.3530 | 0.2405 / 0.3065 / 0.2599 | 0.0924 / 0.1127 / 0.0981 | 0.2155 / 0.2750 / 0.2329 | 20.3947 |
| 3.4818 | 6.0 | 7326 | 3.3510 | 0.2399 / 0.3060 / 0.2593 | 0.0917 / 0.1121 / 0.0973 | 0.2144 / 0.2737 / 0.2316 | 20.3197 |
| 3.4701 | 7.0 | 8547 | 3.3478 | 0.2420 / 0.3088 / 0.2616 | 0.0936 / 0.1143 / 0.0994 | 0.2166 / 0.2766 / 0.2340 | 20.3832 |
| 3.4628 | 8.0 | 9768 | 3.3494 | 0.2416 / 0.3072 / 0.2607 | 0.0931 / 0.1136 / 0.0987 | 0.2162 / 0.2752 / 0.2332 | 20.4224 |
| 3.4575 | 9.0 | 10989 | 3.3482 | 0.2416 / 0.3070 / 0.2607 | 0.0930 / 0.1133 / 0.0986 | 0.2161 / 0.2748 / 0.2331 | 20.4127 |
| 3.483 | 10.0 | 12210 | 3.3480 | 0.2419 / 0.3075 / 0.2611 | 0.0932 / 0.1136 / 0.0989 | 0.2162 / 0.2753 / 0.2334 | 20.4044 |
### Framework versions
- Transformers 4.18.0.dev0
- Pytorch 2.0.0
- Datasets 2.14.5
- Tokenizers 0.12.1
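As a usage sketch, the model can be loaded for generation as shown below. The repository id `kyle0518/mt5_keep_training` is inferred from the card title and the base model's namespace, and the input text is a placeholder, since the task and dataset are not documented:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed repo id, inferred from the card title and base-model namespace.
model_id = "kyle0518/mt5_keep_training"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input; the actual task/dataset is not documented.
text = "Your input text here."
inputs = tokenizer(text, return_tensors="pt")

# Gen Len averaged ~20 tokens during evaluation, so a cap around
# that length is a reasonable starting point.
outputs = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```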