metadata
license: apache-2.0
base_model: hZzy/qwen2.5-0.5b-sft-news-IFT
tags:
  - alignment-handbook
  - ndcg
  - trl
  - expo
  - generated_from_trainer
datasets:
  - hZzy/train_pairwise
model-index:
  - name: qwen2.5-0.5b-expo-L2EXPO-EXPERIMENT-5-1e6
    results: []

qwen2.5-0.5b-expo-L2EXPO-EXPERIMENT-5-1e6

This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise dataset. It achieves the following results on the evaluation set:

  • Loss: 5.9301
  • Logps: -88.3847
  • Logits: -1.2661
  • Objective: 5.9752
  • Dpo Loss: 3.0906
  • Regularize: 5.9752
  • Ranking Simple: 0.5134
  • Ranking Idealized: 0.5093
  • Ranking Idealized Expo: 0.5093
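The metrics above come from the evaluation set. To inspect the checkpoint directly, a minimal loading sketch with the Transformers library might look like the following. The repository id is an assumption (the owner of the base model and dataset combined with the model name from the metadata), and `generate` is a hypothetical helper, not part of this card.

```python
# Assumed Hub repository id: owner "hZzy" plus the model-index name.
MODEL_ID = "hZzy/qwen2.5-0.5b-expo-L2EXPO-EXPERIMENT-5-1e6"


def generate(prompt, max_new_tokens=64):
    """Hypothetical helper: load the checkpoint and complete a prompt."""
    # Imported lazily so the module can be imported without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Summarize today's headlines:")` downloads the weights on first use, so it needs network access and the `transformers` and `torch` packages.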

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 6
  • gradient_accumulation_steps: 12
  • total_train_batch_size: 288
  • total_eval_batch_size: 24
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
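
The total batch sizes follow from the per-device sizes, the device count, and gradient accumulation. The sketch below reproduces that arithmetic and illustrates a cosine schedule with linear warmup. The schedule formula is the common warmup-then-cosine form and is an assumption about how the trainer applies `lr_scheduler_type` and `lr_scheduler_warmup_ratio`; `total_steps` is a hypothetical argument, since the card does not state the total step count.

```python
import math

# Values taken from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH = 4
PER_DEVICE_EVAL_BATCH = 4
NUM_DEVICES = 6
GRAD_ACCUM_STEPS = 12
PEAK_LR = 1e-6
WARMUP_RATIO = 0.1


def effective_batch_sizes():
    """Return (total_train_batch_size, total_eval_batch_size)."""
    train = PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS
    evaluation = PER_DEVICE_EVAL_BATCH * NUM_DEVICES
    return train, evaluation


def cosine_lr(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Assumed schedule: linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_sizes())  # (288, 24)
```

The result matches the reported `total_train_batch_size` of 288 and `total_eval_batch_size` of 24.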

Training results

| Training Loss | Epoch | Step | Validation Loss | Logps | Logits | Objective | Dpo Loss | Regularize | Ranking Simple | Ranking Idealized | Ranking Idealized Expo |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.7171 | 0.2834 | 50 | 0.9452 | -91.4216 | -1.3980 | 0.9804 | 0.8391 | 0.9804 | 0.5114 | 0.5093 | 0.5093 |
| 4.4116 | 0.5668 | 100 | 2.2889 | -91.3584 | -1.3646 | 2.2847 | 1.3937 | 2.2847 | 0.5145 | 0.5093 | 0.5093 |
| 5.641 | 0.8503 | 150 | 3.6592 | -89.6013 | -1.3612 | 3.6993 | 1.8989 | 3.6993 | 0.5124 | 0.5093 | 0.5093 |
| 5.6662 | 1.1337 | 200 | 4.9017 | -91.8203 | -1.3129 | 5.1434 | 2.5622 | 5.1434 | 0.5134 | 0.5093 | 0.5093 |
| 5.0544 | 1.4171 | 250 | 4.6457 | -89.6596 | -1.2958 | 4.6981 | 2.3884 | 4.6981 | 0.5093 | 0.5093 | 0.5093 |
| 4.799 | 1.7005 | 300 | 5.0697 | -89.6459 | -1.3128 | 5.1481 | 2.5371 | 5.1481 | 0.5114 | 0.5093 | 0.5093 |
| 4.3968 | 1.9839 | 350 | 5.4045 | -88.5459 | -1.2879 | 5.3636 | 2.7971 | 5.3636 | 0.5103 | 0.5093 | 0.5093 |
| 3.8148 | 2.2674 | 400 | 5.7626 | -88.2542 | -1.2680 | 5.8200 | 2.9398 | 5.8200 | 0.5093 | 0.5093 | 0.5093 |
| 3.4169 | 2.5508 | 450 | 5.9539 | -88.0116 | -1.2897 | 6.1065 | 3.1384 | 6.1065 | 0.5145 | 0.5093 | 0.5093 |
| 2.988 | 2.8342 | 500 | 5.9854 | -87.9506 | -1.2856 | 6.0183 | 3.1318 | 6.0183 | 0.5093 | 0.5093 | 0.5093 |
| 2.4859 | 3.1176 | 550 | 6.1946 | -88.5030 | -1.2805 | 6.2029 | 3.1790 | 6.2029 | 0.5103 | 0.5093 | 0.5093 |
| 2.0539 | 3.4010 | 600 | 5.9332 | -88.1616 | -1.2651 | 6.0318 | 3.1111 | 6.0318 | 0.5114 | 0.5093 | 0.5093 |
| 1.664 | 3.6845 | 650 | 5.9239 | -88.6992 | -1.2608 | 5.9851 | 3.0968 | 5.9851 | 0.5114 | 0.5093 | 0.5093 |
| 1.3502 | 3.9679 | 700 | 5.9176 | -88.5236 | -1.2647 | 5.9571 | 3.0895 | 5.9571 | 0.5134 | 0.5093 | 0.5093 |
| 1.0052 | 4.2513 | 750 | 5.9642 | -88.3618 | -1.2630 | 6.0061 | 3.1036 | 6.0061 | 0.5134 | 0.5093 | 0.5093 |
| 0.8548 | 4.5347 | 800 | 5.9238 | -88.3534 | -1.2662 | 5.9711 | 3.0853 | 5.9711 | 0.5134 | 0.5093 | 0.5093 |
| 0.7765 | 4.8181 | 850 | 5.9323 | -88.3874 | -1.2660 | 5.9770 | 3.0916 | 5.9770 | 0.5134 | 0.5093 | 0.5093 |
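
The training log above can be inspected programmatically. The sketch below copies the epoch, step, training loss, and validation loss columns from the table, finds the logged step with the lowest validation loss, and estimates the number of optimizer steps per epoch from the first row. The estimate is only a rough derivation from the logged epoch fractions and is not stated anywhere in the card.

```python
# (epoch, step, training_loss, validation_loss), copied from the table above.
ROWS = [
    (0.2834, 50, 1.7171, 0.9452),
    (0.5668, 100, 4.4116, 2.2889),
    (0.8503, 150, 5.641, 3.6592),
    (1.1337, 200, 5.6662, 4.9017),
    (1.4171, 250, 5.0544, 4.6457),
    (1.7005, 300, 4.799, 5.0697),
    (1.9839, 350, 4.3968, 5.4045),
    (2.2674, 400, 3.8148, 5.7626),
    (2.5508, 450, 3.4169, 5.9539),
    (2.8342, 500, 2.988, 5.9854),
    (3.1176, 550, 2.4859, 6.1946),
    (3.4010, 600, 2.0539, 5.9332),
    (3.6845, 650, 1.664, 5.9239),
    (3.9679, 700, 1.3502, 5.9176),
    (4.2513, 750, 1.0052, 5.9642),
    (4.5347, 800, 0.8548, 5.9238),
    (4.8181, 850, 0.7765, 5.9323),
]

# Logged step with the lowest validation loss.
best_epoch, best_step, _, best_val_loss = min(ROWS, key=lambda row: row[3])

# Rough steps-per-epoch estimate from the first logged row.
first_epoch, first_step, _, _ = ROWS[0]
steps_per_epoch = first_step / first_epoch

# Gap between validation and training loss at the last logged step.
_, _, last_train_loss, last_val_loss = ROWS[-1]
final_gap = last_val_loss - last_train_loss

print(best_step, best_val_loss)
print(round(steps_per_epoch, 1), round(final_gap, 4))
```

In this log the lowest validation loss occurs at the first evaluation (step 50), while training loss keeps falling through the later epochs and validation loss settles near 5.9.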

Framework versions

  • Transformers 4.42.0
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1