
WeniGPT-QA-Zephyr-7B-4.0.1-KTO

This model is a PEFT adapter fine-tuned from HuggingFaceH4/zephyr-7b-beta; the training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 0.0038
  • Rewards/chosen: 6.5225
  • Rewards/rejected: -33.7048
  • Rewards/margins: 40.2273
  • KL: 0.0
  • Logps/chosen: -111.4407
  • Logps/rejected: -539.2156
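
Note that, as in KTO-style logging, Rewards/margins is the gap between the chosen and rejected rewards: 6.5225 − (−33.7048) = 40.2273.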

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after the list):

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 786
  • mixed_precision_training: Native AMP
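
The metric names above (rewards/chosen, rewards/rejected, KL) match the logging of TRL's `KTOTrainer`, so the sketch below shows one plausible way these hyperparameters map onto a `KTOConfig`. This is a reconstruction, not the authors' training script: the use of TRL is an assumption, the LoRA settings and dataset rows are placeholders, and fp16 stands in for "Native AMP".

```python
# Hypothetical reconstruction of the training setup with TRL's KTOTrainer.
# Assumptions: TRL was the framework (inferred from the metric names),
# fp16 stands in for "Native AMP", dataset rows and LoRA config are placeholders.
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import KTOConfig, KTOTrainer

config = KTOConfig(
    output_dir="wenigpt-qa-zephyr-kto",  # hypothetical output path
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,       # 4 x 8 = 32 total train batch size
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    max_steps=786,
    seed=42,
    fp16=True,                           # "Native AMP" mixed precision
)

# KTO trains on unpaired examples labeled desirable (True) or undesirable (False).
train_dataset = Dataset.from_dict({
    "prompt": ["Example question?", "Example question?"],
    "completion": ["A good answer.", "A bad answer."],
    "label": [True, False],
})

base_id = "HuggingFaceH4/zephyr-7b-beta"
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

trainer = KTOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
    peft_config=LoraConfig(task_type="CAUSAL_LM"),  # hypothetical adapter config
)
trainer.train()
```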

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/margins | KL  | Logps/chosen | Logps/rejected |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:---------------:|:---:|:------------:|:--------------:|
| 0.204         | 0.38  | 50   | 0.0275          | 5.6328         | -20.3465         | 25.9793         | 0.0 | -120.3375    | -405.6329      |
| 0.073         | 0.76  | 100  | 0.0143          | 5.7898         | -19.2664         | 25.0562         | 0.0 | -118.7677    | -394.8320      |
| 0.0553        | 1.13  | 150  | 0.0225          | 5.8121         | -29.9815         | 35.7935         | 0.0 | -118.5453    | -501.9826      |
| 0.0232        | 1.51  | 200  | 0.0048          | 6.4515         | -27.5911         | 34.0425         | 0.0 | -112.1512    | -478.0785      |
| 0.0519        | 1.89  | 250  | 0.0081          | 6.4814         | -30.4910         | 36.9724         | 0.0 | -111.8522    | -507.0782      |
| 0.0095        | 2.27  | 300  | 0.0154          | 6.4081         | -33.8838         | 40.2919         | 0.0 | -112.5852    | -541.0063      |
| 0.0098        | 2.65  | 350  | 0.0052          | 6.5962         | -41.2733         | 47.8696         | 0.0 | -110.7035    | -614.9014      |
| 0.0038        | 3.02  | 400  | 0.0038          | 6.5225         | -33.7048         | 40.2273         | 0.0 | -111.4407    | -539.2156      |
| 0.0068        | 3.4   | 450  | 0.0080          | 6.3449         | -43.3527         | 49.6976         | 0.0 | -113.2169    | -635.6954      |
| 0.0037        | 3.78  | 500  | 0.0071          | 6.5639         | -44.5033         | 51.0672         | 0.0 | -111.0268    | -647.2004      |
| 0.0032        | 4.16  | 550  | 0.0085          | 6.6333         | -29.5095         | 36.1428         | 0.0 | -110.3333    | -497.2631      |
| 0.0029        | 4.54  | 600  | 0.0048          | 6.5574         | -42.0858         | 48.6432         | 0.0 | -111.0921    | -623.0258      |
| 0.0028        | 4.91  | 650  | 0.0041          | 6.6663         | -41.3645         | 48.0309         | 0.0 | -110.0026    | -615.8130      |
| 0.0032        | 5.29  | 700  | 0.0040          | 6.6773         | -41.2318         | 47.9091         | 0.0 | -109.8931    | -614.4858      |
| 0.003         | 5.67  | 750  | 0.0040          | 6.6870         | -41.2272         | 47.9142         | 0.0 | -109.7961    | -614.4399      |
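
The evaluation results reported at the top of this card match the step-400 row exactly, so that checkpoint appears to be the one whose metrics are reported.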

Framework versions

  • PEFT 0.10.0
  • Transformers 4.39.1
  • Pytorch 2.1.0+cu118
  • Datasets 2.18.0
  • Tokenizers 0.15.2
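
Since this repository is a PEFT adapter over zephyr-7b-beta, inference works by attaching the adapter to the base model. A minimal sketch follows; the adapter repo id below is a placeholder, as the full Hub id is not given in this card.

```python
# Minimal sketch: attach this PEFT adapter to its base model for inference.
# The adapter id below is a placeholder; substitute the full Hub repo id.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "HuggingFaceH4/zephyr-7b-beta"
adapter_id = "WeniGPT-QA-Zephyr-7B-4.0.1-KTO"  # placeholder Hub repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "What is KTO fine-tuning?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```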