---
license: apache-2.0
library_name: peft
tags:
  - generated_from_trainer
base_model: ahmedabdelwahed/Mojiz-sft
model-index:
  - name: test
    results: []
---

# test

This model is a fine-tuned version of [ahmedabdelwahed/Mojiz-sft](https://huggingface.co/ahmedabdelwahed/Mojiz-sft), trained with DPO on an unspecified preference dataset. It achieves the following results on the evaluation set (a hedged loading sketch follows the metrics):

- Loss: 0.0004
- Rewards/chosen: 4.9321
- Rewards/rejected: -3.1907
- Rewards/accuracies: 1.0
- Rewards/margins: 8.1228
- Logps/rejected: -102.9468
- Logps/chosen: -276.9925
- Logits/rejected: -11.5352
- Logits/chosen: -12.8189
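
Since the card ships a PEFT adapter on top of `ahmedabdelwahed/Mojiz-sft`, a minimal loading sketch might look like the following. This is hedged, not official usage: the adapter repo id (`ahmedabdelwahed/test`, guessed from the card's `name: test`), the seq2seq model class, and the generation settings are all assumptions.

```python
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Base SFT model named in the card; AutoModelForSeq2SeqLM is an assumption
# (swap in AutoModelForCausalLM if the base is a decoder-only model).
base = AutoModelForSeq2SeqLM.from_pretrained("ahmedabdelwahed/Mojiz-sft")
tokenizer = AutoTokenizer.from_pretrained("ahmedabdelwahed/Mojiz-sft")

# Attach the DPO adapter; this repo id is a hypothetical guess.
model = PeftModel.from_pretrained(base, "ahmedabdelwahed/test")

inputs = tokenizer("Summarize: ...", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```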

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged sketch of how they might map onto a DPO training setup follows the list):

- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 150
- training_steps: 4000
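
The card does not include training code, but the reward and logp metrics above are the ones trl's `DPOTrainer` reports, so the sketch below shows how these hyperparameters might plug into it. Only the values in the list come from the card; the toy dataset, `beta`, sequence lengths, and LoRA settings are illustrative assumptions.

```python
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

# Base SFT model from the card; the seq2seq class is an assumption.
model = AutoModelForSeq2SeqLM.from_pretrained("ahmedabdelwahed/Mojiz-sft")
tokenizer = AutoTokenizer.from_pretrained("ahmedabdelwahed/Mojiz-sft")

# Toy preference pairs purely for illustration; the card does not name its dataset.
train_dataset = Dataset.from_dict({
    "prompt":   ["Summarize: the long document ..."],
    "chosen":   ["A faithful short summary."],
    "rejected": ["An off-topic or low-quality summary."],
})

# Values below come from the hyperparameter list. The card's Adam betas (0.9, 0.999)
# and epsilon (1e-08) are the TrainingArguments defaults, so no explicit flags needed.
args = TrainingArguments(
    output_dir="test",
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    lr_scheduler_type="linear",
    warmup_steps=150,
    max_steps=4000,
    seed=42,
    evaluation_strategy="steps",
    eval_steps=100,  # matches the eval cadence in the results table
)

# LoRA settings and beta are illustrative guesses, not taken from the card.
peft_config = LoraConfig(task_type="SEQ_2_SEQ_LM", r=16, lora_alpha=32, lora_dropout=0.05)

trainer = DPOTrainer(
    model,
    ref_model=None,   # with peft_config set, trl derives the frozen reference model
    beta=0.1,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=train_dataset,  # toy reuse; the real run used a held-out set
    tokenizer=tokenizer,
    max_length=512,
    max_prompt_length=256,
    max_target_length=256,
    peft_config=peft_config,
)
trainer.train()
```

With `peft_config` supplied and `ref_model=None`, trl trains only the adapter and uses the base weights (adapter disabled) as the frozen DPO reference model.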

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.59 | 0.41 | 100 | 0.5689 | 0.2142 | -0.0529 | 1.0 | 0.2670 | -71.5682 | -324.1714 | -11.7019 | -12.9321 |
| 0.3343 | 0.82 | 200 | 0.2046 | 1.1503 | -0.4195 | 1.0 | 1.5698 | -75.2343 | -314.8096 | -11.5589 | -12.6981 |
| 0.1352 | 1.22 | 300 | 0.0544 | 2.1788 | -0.9611 | 1.0 | 3.1399 | -80.6501 | -304.5251 | -11.3938 | -12.4321 |
| 0.0394 | 1.63 | 400 | 0.0235 | 2.7263 | -1.2443 | 1.0 | 3.9706 | -83.4828 | -299.0500 | -11.3316 | -12.3395 |
| 0.0406 | 2.04 | 500 | 0.0129 | 3.0958 | -1.4814 | 1.0 | 4.5771 | -85.8533 | -295.3555 | -11.2949 | -12.2909 |
| 0.0288 | 2.45 | 600 | 0.0085 | 3.3507 | -1.6368 | 1.0 | 4.9875 | -87.4073 | -292.8057 | -11.2931 | -12.2987 |
| 0.0168 | 2.86 | 700 | 0.0058 | 3.5726 | -1.7858 | 1.0 | 5.3584 | -88.8972 | -290.5872 | -11.2839 | -12.2960 |
| 0.0124 | 3.27 | 800 | 0.0044 | 3.7498 | -1.8907 | 1.0 | 5.6405 | -89.9465 | -288.8147 | -11.2953 | -12.3275 |
| 0.0087 | 3.67 | 900 | 0.0035 | 3.8535 | -2.0148 | 1.0 | 5.8684 | -91.1878 | -287.7778 | -11.2906 | -12.3302 |
| 0.007 | 4.08 | 1000 | 0.0029 | 3.9744 | -2.0910 | 1.0 | 6.0654 | -91.9495 | -286.5689 | -11.3025 | -12.3602 |
| 0.0122 | 4.49 | 1100 | 0.0023 | 4.0733 | -2.1965 | 1.0 | 6.2698 | -93.0046 | -285.5799 | -11.3092 | -12.3815 |
| 0.0062 | 4.9 | 1200 | 0.0020 | 4.1431 | -2.2995 | 1.0 | 6.4426 | -94.0349 | -284.8822 | -11.3084 | -12.3870 |
| 0.0113 | 5.31 | 1300 | 0.0017 | 4.2353 | -2.3425 | 1.0 | 6.5778 | -94.4643 | -283.9596 | -11.3304 | -12.4328 |
| 0.0096 | 5.71 | 1400 | 0.0015 | 4.3008 | -2.4035 | 1.0 | 6.7044 | -95.0748 | -283.3046 | -11.3438 | -12.4598 |
| 0.0054 | 6.12 | 1500 | 0.0013 | 4.3515 | -2.4735 | 1.0 | 6.8250 | -95.7742 | -282.7977 | -11.3476 | -12.4711 |
| 0.0127 | 6.53 | 1600 | 0.0012 | 4.4101 | -2.5237 | 1.0 | 6.9338 | -96.2767 | -282.2122 | -11.3609 | -12.4973 |
| 0.0046 | 6.94 | 1700 | 0.0011 | 4.4473 | -2.5927 | 1.0 | 7.0400 | -96.9663 | -281.8404 | -11.3694 | -12.5148 |
| 0.0049 | 7.35 | 1800 | 0.0010 | 4.4882 | -2.6539 | 1.0 | 7.1421 | -97.5784 | -281.4307 | -11.3787 | -12.5361 |
| 0.0039 | 7.76 | 1900 | 0.0009 | 4.5274 | -2.7045 | 1.0 | 7.2319 | -98.0843 | -281.0386 | -11.3905 | -12.5571 |
| 0.0044 | 8.16 | 2000 | 0.0008 | 4.5889 | -2.7282 | 1.0 | 7.3171 | -98.3214 | -280.4241 | -11.4088 | -12.5904 |
| 0.0065 | 8.57 | 2100 | 0.0008 | 4.6306 | -2.7613 | 1.0 | 7.3919 | -98.6527 | -280.0075 | -11.4171 | -12.6086 |
| 0.0033 | 8.98 | 2200 | 0.0007 | 4.6508 | -2.8105 | 1.0 | 7.4613 | -99.1442 | -279.8048 | -11.4214 | -12.6158 |
| 0.0041 | 9.39 | 2300 | 0.0007 | 4.6876 | -2.8441 | 1.0 | 7.5316 | -99.4804 | -279.4375 | -11.4311 | -12.6356 |
| 0.0025 | 9.8 | 2400 | 0.0006 | 4.7079 | -2.8946 | 1.0 | 7.6026 | -99.9857 | -279.2337 | -11.4363 | -12.6466 |
| 0.0113 | 10.2 | 2500 | 0.0006 | 4.7531 | -2.9135 | 1.0 | 7.6665 | -100.1742 | -278.7823 | -11.4524 | -12.6747 |
| 0.004 | 10.61 | 2600 | 0.0006 | 4.7825 | -2.9382 | 1.0 | 7.7207 | -100.4218 | -278.4879 | -11.4644 | -12.6965 |
| 0.0035 | 11.02 | 2700 | 0.0005 | 4.7952 | -2.9793 | 1.0 | 7.7745 | -100.8324 | -278.3613 | -11.4722 | -12.7103 |
| 0.0018 | 11.43 | 2800 | 0.0005 | 4.8215 | -3.0007 | 1.0 | 7.8222 | -101.0460 | -278.0980 | -11.4836 | -12.7306 |
| 0.0036 | 11.84 | 2900 | 0.0005 | 4.8430 | -3.0254 | 1.0 | 7.8684 | -101.2935 | -277.8831 | -11.4928 | -12.7447 |
| 0.0015 | 12.24 | 3000 | 0.0005 | 4.8521 | -3.0583 | 1.0 | 7.9104 | -101.6220 | -277.7916 | -11.4961 | -12.7505 |
| 0.0052 | 12.65 | 3100 | 0.0005 | 4.8654 | -3.0809 | 1.0 | 7.9463 | -101.8486 | -277.6590 | -11.5056 | -12.7670 |
| 0.0022 | 13.06 | 3200 | 0.0005 | 4.8844 | -3.0964 | 1.0 | 7.9807 | -102.0031 | -277.4695 | -11.5118 | -12.7789 |
| 0.002 | 13.47 | 3300 | 0.0004 | 4.8951 | -3.1182 | 1.0 | 8.0133 | -102.2215 | -277.3618 | -11.5162 | -12.7874 |
| 0.0022 | 13.88 | 3400 | 0.0004 | 4.9070 | -3.1334 | 1.0 | 8.0404 | -102.3737 | -277.2430 | -11.5216 | -12.7971 |
| 0.0021 | 14.29 | 3500 | 0.0004 | 4.9110 | -3.1522 | 1.0 | 8.0632 | -102.5618 | -277.2032 | -11.5242 | -12.8018 |
| 0.0016 | 14.69 | 3600 | 0.0004 | 4.9225 | -3.1606 | 1.0 | 8.0831 | -102.6452 | -277.0878 | -11.5305 | -12.8112 |
| 0.0038 | 15.1 | 3700 | 0.0004 | 4.9251 | -3.1769 | 1.0 | 8.1020 | -102.8086 | -277.0620 | -11.5312 | -12.8121 |
| 0.0081 | 15.51 | 3800 | 0.0004 | 4.9290 | -3.1834 | 1.0 | 8.1124 | -102.8736 | -277.0229 | -11.5335 | -12.8159 |
| 0.0021 | 15.92 | 3900 | 0.0004 | 4.9308 | -3.1902 | 1.0 | 8.1210 | -102.9418 | -277.0052 | -11.5346 | -12.8179 |
| 0.0019 | 16.33 | 4000 | 0.0004 | 4.9321 | -3.1907 | 1.0 | 8.1228 | -102.9468 | -276.9925 | -11.5352 | -12.8189 |

### Framework versions

- PEFT 0.7.1
- Transformers 4.36.0
- Pytorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.15.0