# rob2rand_chen_w_prefix_tc
This model is a fine-tuned version of [imamnurby/rob2rand_chen_w_prefix](https://huggingface.co/imamnurby/rob2rand_chen_w_prefix) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.2749
- Bleu: 83.9120
- Em: 86.2159
- Bleu Em: 85.0639
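The card does not define `Bleu Em`, but the reported values are consistent with the arithmetic mean of Bleu and Em: (83.9120 + 86.2159) / 2 ≈ 85.0639. A minimal sketch of that combination, assuming sacrebleu as the BLEU backend and hypothetical `predictions`/`references` lists (neither is specified in the card):

```python
# Minimal sketch of the assumed metric combination: exact match (EM),
# corpus BLEU via sacrebleu, and their arithmetic mean as "Bleu Em".
# `predictions` and `references` are hypothetical parallel lists of strings.
import sacrebleu

def bleu_em(predictions, references):
    em = 100.0 * sum(
        p.strip() == r.strip() for p, r in zip(predictions, references)
    ) / len(references)
    bleu = sacrebleu.corpus_bleu(predictions, [references]).score
    return {"bleu": bleu, "em": em, "bleu_em": (bleu + em) / 2}
```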
## Model description
More information needed
## Intended uses & limitations
More information needed
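No usage details are given; as a starting point, here is a hedged loading sketch, assuming the checkpoint is published as `imamnurby/rob2rand_chen_w_prefix_tc` and loads through the generic `AutoModelForSeq2SeqLM`/`AutoTokenizer` classes (neither assumption is confirmed by the card):

```python
# Hedged loading sketch; the repo id and auto class are assumptions,
# not confirmed by the card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "imamnurby/rob2rand_chen_w_prefix_tc"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("example input", return_tensors="pt")
output_ids = model.generate(**inputs, max_length=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```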
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged `Seq2SeqTrainingArguments` reconstruction follows the list):
- learning_rate: 5e-06
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 50
- mixed_precision_training: Native AMP
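These settings map directly onto Transformers training arguments; a hedged reconstruction with `Seq2SeqTrainingArguments` (the original training script is not included in the card, so `output_dir` and anything not listed above are assumptions):

```python
# Hedged reconstruction of the listed hyperparameters for
# Transformers 4.18.0; output_dir and unlisted arguments are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="rob2rand_chen_w_prefix_tc",  # assumed
    learning_rate=5e-6,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=1000,
    num_train_epochs=50,
    fp16=True,  # mixed_precision_training: Native AMP
)
```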
### Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Em | Bleu Em |
---|---|---|---|---|---|---|
0.6922 | 0.71 | 500 | 0.2425 | 68.5819 | 79.7927 | 74.1873 |
0.086 | 1.42 | 1000 | 0.2480 | 70.9791 | 79.5855 | 75.2823 |
0.0865 | 2.13 | 1500 | 0.2567 | 68.7037 | 78.8256 | 73.7646 |
0.0758 | 2.84 | 2000 | 0.2483 | 69.4605 | 80.2418 | 74.8512 |
0.0683 | 3.55 | 2500 | 0.2662 | 68.3732 | 78.4456 | 73.4094 |
0.0643 | 4.26 | 3000 | 0.2700 | 66.5413 | 78.3765 | 72.4589 |
0.0596 | 4.97 | 3500 | 0.2611 | 67.4313 | 78.9637 | 73.1975 |
0.0519 | 5.68 | 4000 | 0.2697 | 68.3717 | 79.1019 | 73.7368 |
0.0478 | 6.39 | 4500 | 0.2914 | 69.7507 | 77.7202 | 73.7354 |
0.0461 | 7.1 | 5000 | 0.2776 | 68.5387 | 79.1019 | 73.8203 |
0.04 | 7.81 | 5500 | 0.2975 | 67.6316 | 78.1693 | 72.9004 |
0.0373 | 8.52 | 6000 | 0.2922 | 68.0161 | 79.4473 | 73.7317 |
0.0345 | 9.23 | 6500 | 0.3032 | 69.4580 | 79.2401 | 74.3490 |
0.032 | 9.94 | 7000 | 0.3104 | 67.2595 | 79.0328 | 73.1462 |
0.0294 | 10.65 | 7500 | 0.3077 | 65.8142 | 78.4801 | 72.1472 |
0.0269 | 11.36 | 8000 | 0.3092 | 70.2072 | 78.8601 | 74.5337 |
0.026 | 12.07 | 8500 | 0.3117 | 70.4504 | 79.4473 | 74.9489 |
0.0229 | 12.78 | 9000 | 0.3114 | 69.4635 | 79.2401 | 74.3518 |
0.0215 | 13.49 | 9500 | 0.3143 | 67.3601 | 79.3092 | 73.3346 |
0.0205 | 14.2 | 10000 | 0.3176 | 68.4031 | 78.9983 | 73.7007 |
0.0195 | 14.91 | 10500 | 0.3253 | 66.5673 | 78.9637 | 72.7655 |
0.0173 | 15.62 | 11000 | 0.3377 | 68.7553 | 78.7219 | 73.7386 |
0.0164 | 16.34 | 11500 | 0.3377 | 69.2474 | 79.1364 | 74.1919 |
0.0161 | 17.05 | 12000 | 0.3371 | 69.0846 | 79.6200 | 74.3523 |
0.0148 | 17.76 | 12500 | 0.3457 | 70.8330 | 79.3782 | 75.1056 |
0.0137 | 18.47 | 13000 | 0.3516 | 69.5576 | 79.2401 | 74.3988 |
0.0135 | 19.18 | 13500 | 0.3573 | 70.3232 | 79.1364 | 74.7298 |
0.0127 | 19.89 | 14000 | 0.3574 | 70.2481 | 79.1019 | 74.6750 |
0.0115 | 20.6 | 14500 | 0.3694 | 65.7587 | 78.3765 | 72.0676 |
0.0107 | 21.31 | 15000 | 0.3696 | 68.7923 | 78.5838 | 73.6880 |
0.0107 | 22.02 | 15500 | 0.3607 | 69.4452 | 78.8256 | 74.1354 |
0.0101 | 22.73 | 16000 | 0.3770 | 68.6731 | 78.5492 | 73.6112 |
0.0095 | 23.44 | 16500 | 0.3648 | 69.8402 | 79.7237 | 74.7819 |
0.0088 | 24.15 | 17000 | 0.3822 | 69.6238 | 79.0328 | 74.3283 |
0.0088 | 24.86 | 17500 | 0.3816 | 68.5422 | 79.1364 | 73.8393 |
0.0079 | 25.57 | 18000 | 0.3822 | 69.1359 | 79.2401 | 74.1880 |
0.0073 | 26.28 | 18500 | 0.3742 | 69.8331 | 79.6891 | 74.7611 |
0.007 | 26.99 | 19000 | 0.3849 | 69.5048 | 79.2746 | 74.3897 |
0.0072 | 27.7 | 19500 | 0.3881 | 69.6135 | 79.2055 | 74.4095 |
0.0059 | 28.41 | 20000 | 0.3922 | 70.2656 | 79.2746 | 74.7701 |
0.0069 | 29.12 | 20500 | 0.3936 | 68.2044 | 78.7910 | 73.4977 |
0.0059 | 29.83 | 21000 | 0.3983 | 69.6257 | 79.4473 | 74.5365 |
0.0055 | 30.54 | 21500 | 0.3973 | 70.4039 | 79.5509 | 74.9774 |
0.0057 | 31.25 | 22000 | 0.3960 | 70.3015 | 79.6546 | 74.9780 |
0.0056 | 31.96 | 22500 | 0.3945 | 69.9785 | 79.5855 | 74.7820 |
0.0049 | 32.67 | 23000 | 0.3947 | 70.1822 | 79.6546 | 74.9184 |
0.0049 | 33.38 | 23500 | 0.3957 | 69.1207 | 79.3437 | 74.2322 |
0.0048 | 34.09 | 24000 | 0.4097 | 68.8815 | 78.9292 | 73.9053 |
0.0043 | 34.8 | 24500 | 0.4039 | 70.0982 | 79.4473 | 74.7727 |
0.0044 | 35.51 | 25000 | 0.4080 | 69.3472 | 79.5164 | 74.4318 |
0.0042 | 36.22 | 25500 | 0.4066 | 69.0213 | 79.0674 | 74.0443 |
0.0038 | 36.93 | 26000 | 0.4128 | 69.1452 | 79.3092 | 74.2272 |
0.0037 | 37.64 | 26500 | 0.4134 | 69.2672 | 79.5164 | 74.3918 |
0.0034 | 38.35 | 27000 | 0.4161 | 69.7751 | 79.5509 | 74.6630 |
0.0038 | 39.06 | 27500 | 0.4037 | 69.4092 | 79.6546 | 74.5319 |
0.0031 | 39.77 | 28000 | 0.4041 | 69.3912 | 79.6546 | 74.5229 |
0.0032 | 40.48 | 28500 | 0.4185 | 69.1159 | 79.4473 | 74.2816 |
0.0031 | 41.19 | 29000 | 0.4245 | 68.6867 | 78.9983 | 73.8425 |
0.003 | 41.9 | 29500 | 0.4202 | 69.4091 | 79.3092 | 74.3591 |
0.0027 | 42.61 | 30000 | 0.4249 | 68.7400 | 79.0328 | 73.8864 |
0.0026 | 43.32 | 30500 | 0.4175 | 69.9729 | 79.8273 | 74.9001 |
0.0027 | 44.03 | 31000 | 0.4189 | 69.6688 | 79.5855 | 74.6271 |
0.0027 | 44.74 | 31500 | 0.4203 | 69.4071 | 79.5855 | 74.4963 |
0.0025 | 45.45 | 32000 | 0.4265 | 69.3197 | 79.1019 | 74.2108 |
0.0023 | 46.16 | 32500 | 0.4255 | 69.7513 | 79.3437 | 74.5475 |
0.0023 | 46.88 | 33000 | 0.4227 | 69.2893 | 79.5509 | 74.4201 |
0.0023 | 47.59 | 33500 | 0.4233 | 69.6060 | 79.5509 | 74.5785 |
0.002 | 48.3 | 34000 | 0.4239 | 69.0113 | 79.4819 | 74.2466 |
0.0024 | 49.01 | 34500 | 0.4239 | 68.9754 | 79.4128 | 74.1941 |
0.0019 | 49.72 | 35000 | 0.4228 | 68.9220 | 79.3782 | 74.1501 |
### Framework versions
- Transformers 4.18.0
- Pytorch 1.7.1
- Datasets 2.1.0
- Tokenizers 0.12.1