# Cerebras-GPT-111M-instruction-sft-lora-merged-dpo-lora
This model is a fine-tuned version of SebastianSchramm/Cerebras-GPT-111M-instruction-sft-lora-merged on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.6203
- Rewards/chosen: 0.8184
- Rewards/rejected: 0.4678
- Rewards/accuracies: 0.6555
- Rewards/margins: 0.3506
- Logps/rejected: -797.4490
- Logps/chosen: -1064.1462
- Logits/rejected: -2.6967
- Logits/chosen: -2.9346
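
The reward margin reported above is simply the difference between the chosen and rejected rewards (0.8184 − 0.4678 ≈ 0.3506). As a minimal usage sketch (the repo id below is assumed from the card title, and the plain-text prompt is illustrative; the upstream SFT model may expect a specific instruction template), the model can be loaded with 🤗 Transformers:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from the card title; adjust if the model lives elsewhere.
model_id = "SebastianSchramm/Cerebras-GPT-111M-instruction-sft-lora-merged-dpo-lora"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Greedy decoding of a short completion; the prompt format is a guess.
inputs = tokenizer("What is direct preference optimization?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```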
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 32
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.02
- num_epochs: 3
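
The reward and log-probability metrics above indicate DPO-style preference training (for example with trl's `DPOTrainer`). Below is a minimal sketch of the corresponding 🤗 Transformers `TrainingArguments`, assuming an AdamW-style optimizer (the Transformers default) and a hypothetical `output_dir`:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="cerebras-gpt-111m-dpo",  # hypothetical output directory
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=32,      # 2 per device x 32 accumulation steps -> effective batch of 64
    lr_scheduler_type="linear",
    warmup_ratio=0.02,
    num_train_epochs=3,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```

These arguments would then be passed to the preference trainer together with the policy and reference models and the preference dataset.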
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:--------------:|
| 0.6555 | 0.34 | 300 | 0.6536 | 0.5523 | 0.3662 | 0.6271 | 0.1862 | -798.4653 | -1066.8068 | -2.7199 | -2.9594 |
| 0.615 | 0.68 | 600 | 0.6352 | 0.7267 | 0.4534 | 0.6380 | 0.2732 | -797.5925 | -1065.0635 | -2.7194 | -2.9580 |
| 0.6313 | 1.02 | 900 | 0.6278 | 0.7792 | 0.4662 | 0.6440 | 0.3131 | -797.4653 | -1064.5378 | -2.7117 | -2.9469 |
| 0.6218 | 1.36 | 1200 | 0.6295 | 0.7738 | 0.4669 | 0.6457 | 0.3069 | -797.4579 | -1064.5920 | -2.7035 | -2.9401 |
| 0.6311 | 1.71 | 1500 | 0.6212 | 0.7817 | 0.4456 | 0.6654 | 0.3361 | -797.6708 | -1064.5128 | -2.7073 | -2.9437 |
| 0.6107 | 2.05 | 1800 | 0.6223 | 0.8065 | 0.4674 | 0.6572 | 0.3391 | -797.4526 | -1064.2653 | -2.7009 | -2.9373 |
| 0.6146 | 2.39 | 2100 | 0.6190 | 0.8141 | 0.4648 | 0.6698 | 0.3494 | -797.4793 | -1064.1887 | -2.6988 | -2.9353 |
| 0.6347 | 2.73 | 2400 | 0.6214 | 0.8118 | 0.4631 | 0.6654 | 0.3487 | -797.4959 | -1064.2124 | -2.6962 | -2.9342 |
### Framework versions
- Transformers 4.35.0
- Pytorch 2.1.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1