qwen2.5-0.5b-expo-L2EXPO-ES-1000

This model is a fine-tuned version of hZzy/qwen2.5-0.5b-sft-news-IFT on the hZzy/train_pairwise dataset. It achieves the following results on the evaluation set (the values match the step-900 row of the training results table):

  • Loss: 5280.8613
  • Logps: -85.1920
  • Logits: -0.4645
  • Objective: 5329.0571
  • Dpo Loss: 2703.0312
  • Regularize: 5329.0571
  • Ranking Simple: 0.5264
  • Ranking Idealized: 0.5212
  • Ranking Idealized Expo: 0.5212
  • Wo Beta: 14.0257

Model description

More information needed

Intended uses & limitations

More information needed
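
Until this section is filled in, the following is a minimal loading sketch rather than documented usage. It assumes the repository ships tokenizer files and loads as a standard causal language model through the Transformers `AutoModelForCausalLM` and `AutoTokenizer` classes, which this card does not confirm. The helper name `generate_text` is hypothetical, and whether the model expects a plain prompt or a chat template is not stated here, so a plain prompt is assumed.

```python
MODEL_ID = "hZzy/qwen2.5-0.5b-expo-L2EXPO-ES-1000"


def generate_text(prompt: str, max_new_tokens: int = 64) -> str:
    """Hypothetical helper: greedy-decode a continuation of `prompt`."""
    # Imported lazily so that defining the helper does not require the libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repo contains tokenizer files alongside the weights.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # The card lists the stored tensor type as F32, so float32 is requested.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float32)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_text("...")` downloads the weights on first use. At 494M parameters in F32 the checkpoint is roughly 2 GB, so CPU inference is feasible.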

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 3
  • gradient_accumulation_steps: 12
  • total_train_batch_size: 144
  • total_eval_batch_size: 12
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
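
The derived batch sizes and the learning-rate schedule above can be sanity-checked with a short sketch. The batch-size arithmetic follows directly from the listed values. The schedule function is an assumption: it uses the common linear-warmup-then-cosine-decay form, with warmup steps taken as the ceiling of `warmup_ratio * total_steps`. The total step count is not stated in this card, so it is estimated here from the results table (step 50 at epoch 0.1417) and the configured 5 epochs. The function names are illustrative.

```python
import math


def effective_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Examples consumed per optimizer step (or per eval step when grad_accum=1)."""
    return per_device * num_devices * grad_accum


def cosine_lr(step: int, total_steps: int, peak_lr: float = 5e-6,
              warmup_ratio: float = 0.1) -> float:
    """Assumed schedule: linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Values from the hyperparameter list.
train_total = effective_batch_size(4, 3, 12)  # 4 * 3 * 12
eval_total = effective_batch_size(4, 3)       # 4 * 3

# Estimate, not a reported value: 50 steps cover about 0.1417 epochs.
est_total_steps = round(5 * 50 / 0.1417)

print(train_total, eval_total, est_total_steps)
print(cosine_lr(900, est_total_steps))
```

The two batch totals reproduce the listed `total_train_batch_size` and `total_eval_batch_size`. The learning rate printed for step 900 depends on the estimated step count, so it is indicative only.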

Training results

| Training Loss | Epoch | Step | Dpo Loss | Logits | Logps | Validation Loss | Objective | Ranking Idealized | Ranking Idealized Expo | Ranking Simple | Regularize | Wo Beta |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 423.0596 | 0.1417 | 50 | 269.6528 | -1.3995 | -90.7221 | 547.3513 | 547.0782 | 0.5212 | 0.5212 | 0.5238 | 547.0782 | 16.2386 |
| 1707.6314 | 0.2834 | 100 | 848.0440 | -1.3188 | -86.9761 | 1694.1810 | 1675.6274 | 0.5212 | 0.5212 | 0.5202 | 1675.6274 | 15.6407 |
| 2824.1562 | 0.4251 | 150 | 1511.3630 | -1.2862 | -82.2178 | 3014.1465 | 2968.7720 | 0.5212 | 0.5212 | 0.5274 | 2968.7720 | 15.0323 |
| 3551.5363 | 0.5668 | 200 | 1970.6587 | -0.7799 | -81.0056 | 3928.5837 | 3925.4968 | 0.5212 | 0.5212 | 0.5248 | 3925.4968 | 14.6132 |
| 3769.9247 | 0.7085 | 250 | 2167.5466 | -0.7280 | -80.5808 | 4317.6050 | 4303.1143 | 0.5212 | 0.5212 | 0.5269 | 4303.1143 | 14.5829 |
| 3591.3281 | 0.8503 | 300 | 2308.4351 | -0.5846 | -82.7713 | 4553.7632 | 4559.6348 | 0.5212 | 0.5212 | 0.5248 | 4559.6348 | 14.5914 |
| 3315.5613 | 0.9920 | 350 | 2326.0144 | -0.7541 | -80.8051 | 4667.9404 | 4670.2617 | 0.5212 | 0.5212 | 0.5331 | 4670.2617 | 14.3052 |
| 3140.2284 | 1.1337 | 400 | 2524.6191 | -0.6474 | -81.5771 | 4876.3184 | 4879.0815 | 0.5212 | 0.5212 | 0.5228 | 4879.0815 | 14.3271 |
| 2984.025 | 1.2754 | 450 | 2466.7131 | -0.7908 | -84.2705 | 4773.4326 | 4785.7534 | 0.5212 | 0.5212 | 0.5248 | 4785.7534 | 14.3213 |
| 2769.3719 | 1.4171 | 500 | 2513.8191 | -0.7098 | -81.4917 | 4863.7148 | 4866.6235 | 0.5212 | 0.5212 | 0.5192 | 4866.6235 | 14.1934 |
| 2620.0086 | 1.5588 | 550 | 2463.1169 | -0.5653 | -81.8307 | 4887.2939 | 4877.4683 | 0.5212 | 0.5212 | 0.5248 | 4877.4683 | 14.1757 |
| 2530.9462 | 1.7005 | 600 | 2522.0715 | -0.4886 | -82.8727 | 4965.4233 | 5013.2871 | 0.5212 | 0.5212 | 0.5233 | 5013.2871 | 14.2573 |
| 2445.0009 | 1.8422 | 650 | 2509.7644 | -0.5173 | -81.8303 | 4964.3994 | 4986.9541 | 0.5212 | 0.5212 | 0.5243 | 4986.9541 | 14.2557 |
| 2287.7192 | 1.9839 | 700 | 2561.1602 | -0.5354 | -83.8738 | 5034.0654 | 5065.8521 | 0.5212 | 0.5212 | 0.5217 | 5065.8521 | 14.0847 |
| 2066.9519 | 2.1256 | 750 | 2654.1794 | -0.4949 | -82.1944 | 5229.8853 | 5264.4932 | 0.5212 | 0.5212 | 0.5254 | 5264.4932 | 14.0981 |
| 1963.7713 | 2.2674 | 800 | 2636.3833 | -0.4790 | -82.2307 | 5180.2388 | 5235.7695 | 0.5212 | 0.5212 | 0.5243 | 5235.7695 | 14.0378 |
| 1854.7628 | 2.4091 | 850 | 2612.3875 | -0.4900 | -82.9664 | 5130.6069 | 5171.9189 | 0.5212 | 0.5212 | 0.5269 | 5171.9189 | 14.1142 |
| 1711.9678 | 2.5508 | 900 | 2703.0312 | -0.4645 | -85.1920 | 5280.8613 | 5329.0571 | 0.5212 | 0.5212 | 0.5264 | 5329.0571 | 14.0257 |
| 1682.3781 | 2.6925 | 950 | 2644.8484 | -0.4320 | -83.9376 | 5177.0815 | 5195.8457 | 0.5212 | 0.5212 | 0.5254 | 5195.8457 | 14.1691 |
| 1508.6941 | 2.8342 | 1000 | 2632.4006 | -0.5014 | -83.4235 | 5124.7144 | 5131.5728 | 0.5212 | 0.5212 | 0.5243 | 5131.5728 | 14.1501 |
| 1432.2169 | 2.9759 | 1050 | 2638.4963 | -0.4687 | -83.8074 | 5215.6191 | 5232.5947 | 0.5212 | 0.5212 | 0.5295 | 5232.5947 | 14.2389 |
| 1247.6562 | 3.1223 | 1100 | 2631.7661 | -0.5461 | -84.1614 | 5184.8696 | 5190.6357 | 0.5212 | 0.5212 | 0.5264 | 5190.6357 | 14.1529 |
| 1136.2859 | 3.2641 | 1150 | 2590.1838 | -0.5632 | -83.8852 | 5110.2056 | 5112.0278 | 0.5212 | 0.5212 | 0.5280 | 5112.0278 | 14.0933 |
| 1042.7762 | 3.4058 | 1200 | 2612.2661 | -0.5505 | -83.9630 | 5146.4077 | 5162.1665 | 0.5212 | 0.5212 | 0.5274 | 5162.1665 | 14.1122 |
| 978.7787 | 3.5475 | 1250 | 2605.3420 | -0.4993 | -83.8987 | 5115.5093 | 5140.9258 | 0.5212 | 0.5212 | 0.5280 | 5140.9258 | 14.1279 |
| 864.8715 | 3.6892 | 1300 | 2621.0728 | -0.5245 | -84.2929 | 5143.4609 | 5173.7549 | 0.5212 | 0.5212 | 0.5259 | 5173.7549 | 14.1584 |
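
The evaluation results reported at the top of this card coincide with the step-900 row, which is also the row with the lowest Wo Beta in the table. The sketch below checks that observation using the Step and Wo Beta values copied from the table. Whether minimum Wo Beta was the actual checkpoint-selection criterion is an assumption; the card does not say how the reported checkpoint was chosen.

```python
# Step -> Wo Beta, copied from the training results table.
wo_beta_by_step = {
    50: 16.2386, 100: 15.6407, 150: 15.0323, 200: 14.6132, 250: 14.5829,
    300: 14.5914, 350: 14.3052, 400: 14.3271, 450: 14.3213, 500: 14.1934,
    550: 14.1757, 600: 14.2573, 650: 14.2557, 700: 14.0847, 750: 14.0981,
    800: 14.0378, 850: 14.1142, 900: 14.0257, 950: 14.1691, 1000: 14.1501,
    1050: 14.2389, 1100: 14.1529, 1150: 14.0933, 1200: 14.1122,
    1250: 14.1279, 1300: 14.1584,
}

# Wo Beta reported in the evaluation summary at the top of the card.
reported_wo_beta = 14.0257

# Pick the evaluation step with the smallest Wo Beta.
best_step = min(wo_beta_by_step, key=wo_beta_by_step.get)

print(best_step, wo_beta_by_step[best_step] == reported_wo_beta)
```

Note that the same row has the highest validation loss and objective in the table, so the reported checkpoint was evidently not selected on validation loss.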

Framework versions

  • Transformers 4.42.0
  • PyTorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model size: 494M params (Safetensors, tensor type F32)