---
library_name: transformers
license: mit
base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: final_checkpoint
    results: []
---

final_checkpoint

This model is a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Llama-8B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4805
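
The checkpoint can be loaded like any other transformers causal LM. The sketch below assumes the repository id `tsavage68/final_checkpoint` (inferred from the card metadata); adjust it to the actual published name if it differs.

```python
# Minimal inference sketch. The repo id is assumed from the card metadata
# (author: tsavage68, model name: final_checkpoint) and may need adjusting.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "tsavage68/final_checkpoint"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```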

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch reproducing them follows the list):

  • learning_rate: 1e-06
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: Adafactor (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • training_steps: 1000
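
Below is a minimal sketch of how these hyperparameters map onto a TRL `SFTConfig`/`SFTTrainer` run. The dataset identifier and output directory are placeholders, since the training data for this checkpoint is not documented; the evaluation interval of 50 steps is inferred from the results table.

```python
# Sketch of the training setup under the hyperparameters listed above.
# Dataset name and output_dir are placeholders, not the author's actual values.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
train_dataset = load_dataset("your_dataset_here", split="train")  # placeholder
eval_dataset = load_dataset("your_dataset_here", split="test")    # placeholder

config = SFTConfig(
    output_dir="final_checkpoint",      # placeholder
    learning_rate=1e-6,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,      # effective train batch size of 4
    seed=42,
    optim="adafactor",
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=1000,
    eval_strategy="steps",
    eval_steps=50,                      # inferred from the results table
)

trainer = SFTTrainer(
    model=model_id,
    args=config,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
trainer.train()
```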

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.6773        | 0.0464 | 50   | 2.4460          |
| 0.7309        | 0.0929 | 100  | 0.7180          |
| 0.5965        | 0.1393 | 150  | 0.6000          |
| 0.5401        | 0.1857 | 200  | 0.5535          |
| 0.5055        | 0.2321 | 250  | 0.5285          |
| 0.4901        | 0.2786 | 300  | 0.5148          |
| 0.5266        | 0.3250 | 350  | 0.5047          |
| 0.4829        | 0.3714 | 400  | 0.4973          |
| 0.4825        | 0.4178 | 450  | 0.4922          |
| 0.508         | 0.4643 | 500  | 0.4886          |
| 0.503         | 0.5107 | 550  | 0.4858          |
| 0.514         | 0.5571 | 600  | 0.4835          |
| 0.492         | 0.6035 | 650  | 0.4822          |
| 0.4743        | 0.6500 | 700  | 0.4814          |
| 0.4942        | 0.6964 | 750  | 0.4809          |
| 0.4811        | 0.7428 | 800  | 0.4806          |
| 0.4645        | 0.7892 | 850  | 0.4805          |
| 0.4673        | 0.8357 | 900  | 0.4805          |
| 0.4933        | 0.8821 | 950  | 0.4805          |
| 0.5026        | 0.9285 | 1000 | 0.4805          |

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.1
  • Tokenizers 0.21.0