---
license: apache-2.0
library_name: peft
tags:
  - generated_from_trainer
base_model: ahmedabdelwahed/Mojiz-sft
model-index:
  - name: test
    results: []
---

# test

This model is a DPO fine-tuned version of [ahmedabdelwahed/Mojiz-sft](https://huggingface.co/ahmedabdelwahed/Mojiz-sft); the preference dataset used for training is not specified. It achieves the following results on the evaluation set:

- Loss: 0.1612
- Rewards/chosen: 1.5401
- Rewards/rejected: -0.4134
- Rewards/accuracies: 1.0
- Rewards/margins: 1.9534
- Logps/rejected: -71.8662
- Logps/chosen: -323.2329
- Logits/rejected: -11.6886
- Logits/chosen: -12.9104
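The reward columns appear to follow the usual DPO convention (inferred from the metric names; the card itself does not define them): each reward is the policy-versus-reference log-probability gap on a response, scaled by the DPO temperature $\beta$, and the margin is the difference between the chosen and rejected rewards:

$$
r_\theta(x, y) = \beta \left( \log \pi_\theta(y \mid x) - \log \pi_{\mathrm{ref}}(y \mid x) \right),
\qquad
\text{margins} = r_\theta(x, y_{\text{chosen}}) - r_\theta(x, y_{\text{rejected}})
$$

Consistent with this reading, the final margin above is 1.5401 − (−0.4134) ≈ 1.9534, up to rounding.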

## Model description

More information needed

## Intended uses & limitations

More information needed
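No usage snippet is provided. The following is a minimal loading sketch, assuming the base model is a seq2seq checkpoint and that this repository (`test`) hosts the PEFT adapter; adjust the Auto class and repository id to match the actual setup.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model and attach the DPO adapter on top of it.
base = AutoModelForSeq2SeqLM.from_pretrained("ahmedabdelwahed/Mojiz-sft")
model = PeftModel.from_pretrained(base, "ahmedabdelwahed/test")  # adapter repo id is an assumption
tokenizer = AutoTokenizer.from_pretrained("ahmedabdelwahed/Mojiz-sft")

inputs = tokenizer("Input text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```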

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-06
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 150
- training_steps: 1000
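These hyperparameters map directly onto standard `TrainingArguments`. The card does not say which trainer produced this run; as a hypothetical reconstruction only, trl's `DPOTrainer` (0.7.x API) is a plausible fit given the reward metrics, with the dataset, beta, and LoRA settings below standing in as placeholders.

```python
# Hypothetical reconstruction of the training run; not the author's script.
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model = AutoModelForSeq2SeqLM.from_pretrained("ahmedabdelwahed/Mojiz-sft")
tokenizer = AutoTokenizer.from_pretrained("ahmedabdelwahed/Mojiz-sft")

# DPOTrainer expects prompt/chosen/rejected columns; real preference data
# would go here.
pairs = {"prompt": ["..."], "chosen": ["..."], "rejected": ["..."]}
train_dataset = Dataset.from_dict(pairs)
eval_dataset = Dataset.from_dict(pairs)

args = TrainingArguments(
    output_dir="test",
    learning_rate=1e-6,             # learning_rate: 1e-06
    per_device_train_batch_size=4,  # train_batch_size: 4
    per_device_eval_batch_size=8,   # eval_batch_size: 8
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=150,               # lr_scheduler_warmup_steps: 150
    max_steps=1000,                 # training_steps: 1000
    evaluation_strategy="steps",
    eval_steps=100,                 # matches the eval cadence in the results table
    # The default AdamW optimizer already uses betas=(0.9, 0.999) and epsilon=1e-08.
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,                 # with a PEFT adapter, the frozen base acts as the reference
    beta=0.1,                       # assumed default; the actual beta is not stated in the card
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    peft_config=LoraConfig(task_type="SEQ_2_SEQ_LM"),  # adapter hyperparameters are assumptions
)
trainer.train()
```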

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 1.3422        | 0.41  | 100  | 0.6447          | 0.0816         | -0.0178          | 1.0                | 0.0994          | -71.0751       | -326.1499    | -11.7285        | -12.9749      |
| 1.9054        | 0.82  | 200  | 0.5151          | 0.3250         | -0.0755          | 1.0                | 0.4005          | -71.1904       | -325.6630    | -11.7221        | -12.9647      |
| 1.7913        | 1.22  | 300  | 0.4000          | 0.5897         | -0.1444          | 1.0                | 0.7341          | -71.3283       | -325.1337    | -11.7152        | -12.9536      |
| 1.6915        | 1.63  | 400  | 0.3189          | 0.8238         | -0.2053          | 1.0                | 1.0290          | -71.4500       | -324.6655    | -11.7087        | -12.9431      |
| 1.4041        | 2.04  | 500  | 0.2596          | 1.0374         | -0.2638          | 1.0                | 1.3012          | -71.5670       | -324.2383    | -11.7029        | -12.9335      |
| 1.1989        | 2.45  | 600  | 0.2187          | 1.2160         | -0.3160          | 1.0                | 1.5320          | -71.6715       | -323.8811    | -11.6980        | -12.9256      |
| 1.2192        | 2.86  | 700  | 0.1915          | 1.3556         | -0.3570          | 1.0                | 1.7126          | -71.7533       | -323.6018    | -11.6939        | -12.9188      |
| 1.1718        | 3.27  | 800  | 0.1738          | 1.4588         | -0.3884          | 1.0                | 1.8472          | -71.8162       | -323.3954    | -11.6909        | -12.9141      |
| 0.9623        | 3.67  | 900  | 0.1643          | 1.5190         | -0.4068          | 1.0                | 1.9258          | -71.8531       | -323.2751    | -11.6892        | -12.9113      |
| 1.0207        | 4.08  | 1000 | 0.1612          | 1.5401         | -0.4134          | 1.0                | 1.9534          | -71.8662       | -323.2329    | -11.6886        | -12.9104      |

### Framework versions

- PEFT 0.7.1
- Transformers 4.36.0
- PyTorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.15.0