mqy/mt5-small-finetuned

@@ -17,10 +17,10 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.3471
-- Rouge1: 20.34
-- Rouge2: 5.8
-- Rougel: 19.87
 ## Model description
@@ -45,38 +45,24 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 26
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|
-| 5.1562        | 1.0   | 345  | 2.6737          | 15.12  | 4.3    | 14.87  |
-| 3.301         | 2.0   | 690  | 2.5121          | 17.39  | 5.58   | 17.25  |
-| 3.0661        | 3.0   | 1035 | 2.4983          | 17.72  | 5.2    | 17.49  |
-| 2.9253        | 4.0   | 1380 | 2.4899          | 18.07  | 5.22   | 17.84  |
-| 2.8201        | 5.0   | 1725 | 2.4338          | 17.52  | 4.82   | 17.11  |
-| 2.7411        | 6.0   | 2070 | 2.4187          | 18.93  | 5.67   | 18.69  |
-| 2.6735        | 7.0   | 2415 | 2.3956          | 19.17  | 5.38   | 18.86  |
-| 2.618         | 8.0   | 2760 | 2.3944          | 19.75  | 5.92   | 19.39  |
-| 2.5555        | 9.0   | 3105 | 2.3756          | 19.25  | 5.74   | 18.94  |
-| 2.5162        | 10.0  | 3450 | 2.3706          | 19.16  | 5.67   | 18.8   |
-| 2.4727        | 11.0  | 3795 | 2.3772          | 20.0   | 6.35   | 19.66  |
-| 2.4329        | 12.0  | 4140 | 2.3552          | 19.32  | 5.67   | 18.83  |
-| 2.3985        | 13.0  | 4485 | 2.3809          | 20.35  | 6.42   | 19.94  |
-| 2.3661        | 14.0  | 4830 | 2.3490          | 19.27  | 5.86   | 18.93  |
-| 2.3252        | 15.0  | 5175 | 2.3438          | 19.77  | 5.8    | 19.23  |
-| 2.3107        | 16.0  | 5520 | 2.3544          | 20.1   | 5.76   | 19.59  |
-| 2.2988        | 17.0  | 5865 | 2.3460          | 20.25  | 5.89   | 19.82  |
-| 2.2671        | 18.0  | 6210 | 2.3490          | 20.04  | 5.6    | 19.7   |
-| 2.2545        | 19.0  | 6555 | 2.3348          | 20.81  | 5.85   | 20.41  |
-| 2.241         | 20.0  | 6900 | 2.3454          | 19.78  | 6.0    | 19.36  |
-| 2.2076        | 21.0  | 7245 | 2.3371          | 20.31  | 5.86   | 19.96  |
-| 2.2056        | 22.0  | 7590 | 2.3447          | 20.5   | 6.15   | 20.13  |
-| 2.2027        | 23.0  | 7935 | 2.3384          | 20.25  | 5.86   | 19.84  |
-| 2.1831        | 24.0  | 8280 | 2.3462          | 20.44  | 5.77   | 20.08  |
-| 2.1931        | 25.0  | 8625 | 2.3537          | 20.39  | 5.96   | 19.98  |
-| 2.16          | 26.0  | 8970 | 2.3471          | 20.34  | 5.8    | 19.87  |
 ### Framework versions

 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.3994
+- Rouge1: 20.69
+- Rouge2: 6.09
+- Rougel: 20.15
 ## Model description
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 40
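As a rough illustration of what `lr_scheduler_type: linear` implies for this run, the sketch below computes the learning-rate multiplier under a linear decay to zero. The step budget of 40 × 345 = 13800 assumes the 345 steps per epoch seen in the removed table (epoch 1.0 at step 345) still applies, and zero warmup steps is also an assumption. The base learning rate is not visible in this diff, so only the multiplier is computed. The function name `linear_lr_multiplier` is hypothetical.

```python
def linear_lr_multiplier(step, total_steps, warmup_steps=0):
    """Multiplier applied to the base learning rate at a given optimizer step.

    Ramps linearly from 0 to 1 during warmup (if any), then decays
    linearly from 1 to 0 at total_steps.
    """
    if warmup_steps and step < warmup_steps:
        return step / warmup_steps
    remaining = total_steps - step
    return max(0.0, remaining / max(1, total_steps - warmup_steps))


NUM_EPOCHS = 40        # from the card: num_epochs: 40
STEPS_PER_EPOCH = 345  # assumption: inferred from the removed table
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH

# Multiplier at the last step logged in the new table (step 6000).
multiplier_at_last_logged_step = linear_lr_multiplier(6000, TOTAL_STEPS)
print(TOTAL_STEPS, round(multiplier_at_last_logged_step, 4))
```

Under these assumptions, the last logged step would still be running at a bit more than half of the base learning rate.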
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|
+| 4.7204        | 1.45  | 500  | 2.6053          | 16.93  | 4.91   | 16.71  |
+| 3.1289        | 2.9   | 1000 | 2.4878          | 18.05  | 5.26   | 17.79  |
+| 2.8862        | 4.35  | 1500 | 2.4109          | 17.45  | 5.06   | 17.04  |
+| 2.7669        | 5.8   | 2000 | 2.4006          | 18.61  | 5.28   | 18.12  |
+| 2.6433        | 7.25  | 2500 | 2.4017          | 18.81  | 5.67   | 18.5   |
+| 2.5514        | 8.7   | 3000 | 2.3917          | 19.5   | 5.88   | 19.09  |
+| 2.4947        | 10.14 | 3500 | 2.3994          | 20.69  | 6.09   | 20.15  |
+| 2.3995        | 11.59 | 4000 | 2.3608          | 20.2   | 6.51   | 19.67  |
+| 2.3798        | 13.04 | 4500 | 2.3251          | 20.1   | 6.25   | 19.71  |
+| 2.3029        | 14.49 | 5000 | 2.3387          | 19.75  | 6.11   | 19.37  |
+| 2.2563        | 15.94 | 5500 | 2.3372          | 20.28  | 6.32   | 19.74  |
+| 2.2109        | 17.39 | 6000 | 2.3410          | 20.67  | 6.38   | 20.13  |
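Two observations about the added rows can be checked mechanically. The fractional Epoch values are consistent with 345 optimizer steps per epoch (the spacing in the removed table), and the summary metrics at the top of the new card (Loss 2.3994, Rouge1 20.69) match the step 3500 row, which has the highest Rouge1 in the table. A minimal Python sketch of both checks follows. `STEPS_PER_EPOCH` is inferred from the removed rows rather than stated in the card, and the helper names are hypothetical.

```python
# Rows copied from the added table: (step, epoch, validation loss, Rouge1).
ROWS = [
    (500, 1.45, 2.6053, 16.93),
    (1000, 2.9, 2.4878, 18.05),
    (1500, 4.35, 2.4109, 17.45),
    (2000, 5.8, 2.4006, 18.61),
    (2500, 7.25, 2.4017, 18.81),
    (3000, 8.7, 2.3917, 19.5),
    (3500, 10.14, 2.3994, 20.69),
    (4000, 11.59, 2.3608, 20.2),
    (4500, 13.04, 2.3251, 20.1),
    (5000, 14.49, 2.3387, 19.75),
    (5500, 15.94, 2.3372, 20.28),
    (6000, 17.39, 2.3410, 20.67),
]

STEPS_PER_EPOCH = 345  # assumption: epoch 1.0 falls at step 345 in the removed table


def epoch_at(step, steps_per_epoch=STEPS_PER_EPOCH):
    """Fractional epoch for a step, rounded to two decimals like the table."""
    return round(step / steps_per_epoch, 2)


def best_by_rouge1(rows):
    """Return the row with the highest Rouge1 score."""
    return max(rows, key=lambda row: row[3])


# Rows whose logged epoch disagrees with step / STEPS_PER_EPOCH.
mismatches = [(step, epoch) for step, epoch, _, _ in ROWS if epoch_at(step) != epoch]
best = best_by_rouge1(ROWS)

print("epoch mismatches:", mismatches)
print("best Rouge1 row:", best)
```

Note that the lowest validation loss (2.3251 at step 4500) belongs to a different row than the best Rouge1, so the reported summary depends on which metric was used to pick the checkpoint. The card does not state this.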
 ### Framework versions