metadata
license: cc-by-nc-4.0
base_model: >-
  davidberenstein1957/ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter1
tags:
- generated_from_trainer
model-index:
- name: ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter2
  results: []
ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter2
This model is a fine-tuned version of davidberenstein1957/ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter1. The training dataset is not recorded (the auto-generated card lists it as "None"). It achieves the following results on the evaluation set:
- Loss: 0.0162
- Rewards/real: -8.1731
- Rewards/generated: -31.3826
- Rewards/accuracies: 0.9917
- Rewards/margins: 23.2095
- Logps/generated: -956.3063
- Logps/real: -525.1735
- Logits/generated: -1.5719
- Logits/real: -1.7813
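The reported margin appears to be the reward for real responses minus the reward for generated responses. The card does not state this definition; it is an assumption inferred from the numbers, and `reward_margin` is a hypothetical helper name. A minimal sketch that checks the assumption against the final evaluation metrics:

```python
# Final evaluation metrics copied from the list above.
REWARDS_REAL = -8.1731
REWARDS_GENERATED = -31.3826
REPORTED_MARGIN = 23.2095


def reward_margin(real: float, generated: float) -> float:
    """ASSUMPTION: margin = reward(real) - reward(generated)."""
    return real - generated


margin = reward_margin(REWARDS_REAL, REWARDS_GENERATED)

# The recomputed value should agree with the reported one up to rounding.
assert abs(margin - REPORTED_MARGIN) < 1e-4
print(f"recomputed margin: {margin:.4f}")
```

A positive margin means the model scores real responses above its own generations; an accuracy of 0.9917 then reads as the fraction of evaluation pairs where that ordering holds, under the same assumption.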
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-07
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 2
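The two total batch sizes follow from the per-device sizes, the device count, and gradient accumulation. The sketch below reproduces that arithmetic and adds a plain-Python version of a linear warmup and decay schedule. The schedule function is an illustration of the general technique rather than the trainer's own code, and `TOTAL_STEPS = 1200` is an assumption taken from the last logged step in the results table, since the card does not report the exact total.

```python
import math

# Values copied from the hyperparameter list above.
TRAIN_BATCH_SIZE = 8        # per device
EVAL_BATCH_SIZE = 8         # per device
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 2
PEAK_LR = 1e-07
WARMUP_RATIO = 0.1

# Effective batch sizes.
total_train_batch_size = TRAIN_BATCH_SIZE * NUM_DEVICES * GRAD_ACCUM_STEPS
total_eval_batch_size = EVAL_BATCH_SIZE * NUM_DEVICES  # no accumulation at eval


def linear_lr(step: int, total_steps: int,
              peak: float = PEAK_LR, ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup from 0 to `peak`, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * ratio)
    if step < warmup_steps:
        return peak * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak * max(0.0, remaining / max(1, total_steps - warmup_steps))


# ASSUMPTION: the last logged step stands in for the true total step count.
TOTAL_STEPS = 1200

print(total_train_batch_size, total_eval_batch_size)
for step in (0, 60, 600, 1200):
    print(step, linear_lr(step, TOTAL_STEPS))
```

With these settings the learning rate stays at or below 1e-07 throughout, which is small even for preference-style fine-tuning.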
Training results
Training Loss | Epoch | Step | Validation Loss | Rewards/real | Rewards/generated | Rewards/accuracies | Rewards/margins | Logps/generated | Logps/real | Logits/generated | Logits/real |
---|---|---|---|---|---|---|---|---|---|---|---|
0.6097 | 0.04 | 25 | 0.4147 | -0.6192 | -1.4312 | 0.9250 | 0.8120 | -656.7919 | -449.6341 | -2.0004 | -2.0773 |
0.2137 | 0.08 | 50 | 0.1745 | -2.0300 | -5.0060 | 0.9519 | 2.9761 | -692.5404 | -463.7422 | -1.9306 | -2.0237 |
0.1292 | 0.12 | 75 | 0.1012 | -2.8227 | -7.4967 | 0.9685 | 4.6740 | -717.4471 | -471.6697 | -1.8843 | -1.9887 |
0.0665 | 0.16 | 100 | 0.0676 | -3.2936 | -9.3177 | 0.9778 | 6.0240 | -735.6567 | -476.3786 | -1.8508 | -1.9628 |
0.0429 | 0.21 | 125 | 0.0477 | -3.7328 | -11.2722 | 0.9824 | 7.5395 | -755.2025 | -480.7701 | -1.8123 | -1.9332 |
0.0299 | 0.25 | 150 | 0.0369 | -4.2161 | -13.2599 | 0.9870 | 9.0437 | -775.0787 | -485.6039 | -1.7938 | -1.9226 |
0.0252 | 0.29 | 175 | 0.0320 | -4.7201 | -15.0489 | 0.9880 | 10.3288 | -792.9691 | -490.6432 | -1.7758 | -1.9116 |
0.0249 | 0.33 | 200 | 0.0301 | -5.0757 | -16.3570 | 0.9880 | 11.2813 | -806.0497 | -494.1995 | -1.7515 | -1.8923 |
0.0175 | 0.37 | 225 | 0.0273 | -5.4299 | -17.6751 | 0.9880 | 12.2451 | -819.2310 | -497.7419 | -1.7362 | -1.8821 |
0.0183 | 0.41 | 250 | 0.0254 | -5.4183 | -18.3899 | 0.9889 | 12.9715 | -826.3791 | -497.6259 | -1.7300 | -1.8793 |
0.0182 | 0.45 | 275 | 0.0245 | -6.0900 | -20.5760 | 0.9889 | 14.4860 | -848.2401 | -504.3426 | -1.6961 | -1.8564 |
0.0253 | 0.49 | 300 | 0.0224 | -5.9239 | -20.7184 | 0.9898 | 14.7944 | -849.6640 | -502.6819 | -1.6938 | -1.8573 |
0.0075 | 0.53 | 325 | 0.0234 | -7.0436 | -24.1126 | 0.9898 | 17.0691 | -883.6064 | -513.8781 | -1.6522 | -1.8252 |
0.0141 | 0.58 | 350 | 0.0212 | -5.5696 | -20.9714 | 0.9898 | 15.4017 | -852.1937 | -499.1387 | -1.7082 | -1.8693 |
0.0135 | 0.62 | 375 | 0.0182 | -5.2646 | -20.3901 | 0.9907 | 15.1254 | -846.3809 | -496.0890 | -1.7285 | -1.8897 |
0.014 | 0.66 | 400 | 0.0182 | -5.5057 | -21.1579 | 0.9907 | 15.6522 | -854.0594 | -498.4994 | -1.7137 | -1.8783 |
0.0122 | 0.7 | 425 | 0.0172 | -5.3398 | -20.7520 | 0.9907 | 15.4122 | -849.9997 | -496.8405 | -1.7231 | -1.8857 |
0.0144 | 0.74 | 450 | 0.0164 | -4.6606 | -19.3766 | 0.9917 | 14.7160 | -836.2463 | -490.0483 | -1.7465 | -1.9042 |
0.0103 | 0.78 | 475 | 0.0160 | -4.8739 | -20.1058 | 0.9907 | 15.2319 | -843.5385 | -492.1819 | -1.7445 | -1.9064 |
0.0147 | 0.82 | 500 | 0.0156 | -5.1220 | -20.9607 | 0.9917 | 15.8387 | -852.0875 | -494.6623 | -1.7434 | -1.9092 |
0.0154 | 0.86 | 525 | 0.0155 | -5.1481 | -21.3994 | 0.9917 | 16.2513 | -856.4740 | -494.9235 | -1.7357 | -1.9040 |
0.0158 | 0.91 | 550 | 0.0151 | -5.6088 | -22.9532 | 0.9917 | 17.3444 | -872.0123 | -499.5304 | -1.7139 | -1.8881 |
0.0053 | 0.95 | 575 | 0.0149 | -5.7209 | -23.5217 | 0.9917 | 17.8008 | -877.6972 | -500.6515 | -1.7113 | -1.8888 |
0.008 | 0.99 | 600 | 0.0147 | -5.7523 | -23.7474 | 0.9917 | 17.9952 | -879.9544 | -500.9651 | -1.7086 | -1.8878 |
0.0049 | 1.03 | 625 | 0.0154 | -6.1839 | -24.8883 | 0.9907 | 18.7044 | -891.3632 | -505.2818 | -1.6731 | -1.8585 |
0.0057 | 1.07 | 650 | 0.0155 | -6.4947 | -25.8924 | 0.9917 | 19.3977 | -901.4037 | -508.3892 | -1.6592 | -1.8484 |
0.0076 | 1.11 | 675 | 0.0158 | -6.8543 | -26.9217 | 0.9917 | 20.0674 | -911.6970 | -511.9859 | -1.6407 | -1.8339 |
0.004 | 1.15 | 700 | 0.0158 | -7.1325 | -27.7743 | 0.9917 | 20.6418 | -920.2236 | -514.7678 | -1.6269 | -1.8236 |
0.0168 | 1.19 | 725 | 0.0157 | -6.9019 | -26.2791 | 0.9917 | 19.3772 | -905.2711 | -512.4611 | -1.6566 | -1.8448 |
0.0022 | 1.23 | 750 | 0.0163 | -6.9586 | -26.5145 | 0.9917 | 19.5559 | -907.6251 | -513.0281 | -1.6533 | -1.8423 |
0.0039 | 1.28 | 775 | 0.0165 | -7.5386 | -28.2224 | 0.9917 | 20.6837 | -924.7038 | -518.8289 | -1.6369 | -1.8327 |
0.002 | 1.32 | 800 | 0.0165 | -7.6568 | -28.6441 | 0.9907 | 20.9872 | -928.9208 | -520.0109 | -1.6365 | -1.8344 |
0.002 | 1.36 | 825 | 0.0165 | -7.7989 | -29.2028 | 0.9917 | 21.4038 | -934.5078 | -521.4318 | -1.6348 | -1.8352 |
0.0019 | 1.4 | 850 | 0.0165 | -7.8978 | -29.5958 | 0.9917 | 21.6980 | -938.4382 | -522.4203 | -1.6166 | -1.8169 |
0.0041 | 1.44 | 875 | 0.0162 | -7.9696 | -29.7930 | 0.9917 | 21.8234 | -940.4100 | -523.1380 | -1.6165 | -1.8176 |
0.0023 | 1.48 | 900 | 0.0164 | -8.2086 | -30.6909 | 0.9917 | 22.4823 | -949.3892 | -525.5286 | -1.6045 | -1.8093 |
0.0038 | 1.52 | 925 | 0.0166 | -8.1217 | -30.6727 | 0.9917 | 22.5510 | -949.2076 | -524.6597 | -1.5919 | -1.7978 |
0.0096 | 1.56 | 950 | 0.0162 | -7.8257 | -30.1144 | 0.9917 | 22.2887 | -943.6237 | -521.6992 | -1.5909 | -1.7956 |
0.0057 | 1.6 | 975 | 0.0166 | -8.0335 | -30.6654 | 0.9917 | 22.6319 | -949.1342 | -523.7775 | -1.5854 | -1.7919 |
0.0046 | 1.65 | 1000 | 0.0165 | -8.1757 | -31.0139 | 0.9917 | 22.8382 | -952.6191 | -525.2000 | -1.5768 | -1.7852 |
0.0009 | 1.69 | 1025 | 0.0165 | -8.0553 | -30.7565 | 0.9917 | 22.7012 | -950.0453 | -523.9951 | -1.5757 | -1.7830 |
0.002 | 1.73 | 1050 | 0.0164 | -8.1838 | -31.3365 | 0.9917 | 23.1528 | -955.8453 | -525.2800 | -1.5692 | -1.7790 |
0.0069 | 1.77 | 1075 | 0.0163 | -8.1908 | -31.4118 | 0.9917 | 23.2210 | -956.5981 | -525.3508 | -1.5749 | -1.7850 |
0.0029 | 1.81 | 1100 | 0.0166 | -8.4138 | -32.0830 | 0.9917 | 23.6692 | -963.3098 | -527.5802 | -1.5624 | -1.7752 |
0.0047 | 1.85 | 1125 | 0.0166 | -8.4223 | -32.1526 | 0.9917 | 23.7304 | -964.0065 | -527.6652 | -1.5631 | -1.7759 |
0.0037 | 1.89 | 1150 | 0.0163 | -8.1563 | -31.3209 | 0.9917 | 23.1646 | -955.6895 | -525.0057 | -1.5739 | -1.7832 |
0.0026 | 1.93 | 1175 | 0.0163 | -8.2107 | -31.5009 | 0.9917 | 23.2901 | -957.4888 | -525.5498 | -1.5708 | -1.7807 |
0.0058 | 1.98 | 1200 | 0.0162 | -8.1731 | -31.3826 | 0.9917 | 23.2095 | -956.3063 | -525.1735 | -1.5719 | -1.7813 |
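Validation loss reaches its minimum of 0.0147 at step 600 (epoch 0.99) and ends at 0.0162 at step 1200, while the reward margin keeps growing through the second epoch. The sketch below shows one way to locate that minimum from rows in the table's own pipe format. Only four rows are copied here to keep the example short, and `parse_rows` is a hypothetical helper, not part of any library.

```python
HEADER = (
    "Training Loss | Epoch | Step | Validation Loss | Rewards/real | "
    "Rewards/generated | Rewards/accuracies | Rewards/margins | "
    "Logps/generated | Logps/real | Logits/generated | Logits/real |"
)

# Four rows copied verbatim from the table above (steps 25, 600, 625, 1200).
ROWS = """
0.6097 | 0.04 | 25 | 0.4147 | -0.6192 | -1.4312 | 0.9250 | 0.8120 | -656.7919 | -449.6341 | -2.0004 | -2.0773 |
0.008 | 0.99 | 600 | 0.0147 | -5.7523 | -23.7474 | 0.9917 | 17.9952 | -879.9544 | -500.9651 | -1.7086 | -1.8878 |
0.0049 | 1.03 | 625 | 0.0154 | -6.1839 | -24.8883 | 0.9907 | 18.7044 | -891.3632 | -505.2818 | -1.6731 | -1.8585 |
0.0058 | 1.98 | 1200 | 0.0162 | -8.1731 | -31.3826 | 0.9917 | 23.2095 | -956.3063 | -525.1735 | -1.5719 | -1.7813 |
"""


def split_cells(line: str) -> list[str]:
    """Split one pipe-delimited line, dropping the outer pipes."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def parse_rows(header: str, rows: str) -> list[dict[str, float]]:
    """Turn pipe-delimited rows into dicts keyed by column name."""
    names = split_cells(header)
    records = []
    for line in rows.strip().splitlines():
        values = [float(cell) for cell in split_cells(line)]
        records.append(dict(zip(names, values)))
    return records


records = parse_rows(HEADER, ROWS)
best = min(records, key=lambda r: r["Validation Loss"])
print(int(best["Step"]), best["Validation Loss"])
```

Pasting the full table into `ROWS` gives the same result, since no other row has a validation loss below 0.0147.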
Framework versions
- Transformers 4.37.0
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2