---
base_model: bigcode/starcoderbase-1b
library_name: peft
license: bigcode-openrail-m
tags:
  - generated_from_trainer
model-index:
  - name: peft-starcoder-finetuned
    results: []
---

# peft-starcoder-finetuned

This model is a fine-tuned version of [bigcode/starcoderbase-1b](https://huggingface.co/bigcode/starcoderbase-1b) on an unspecified dataset.
It achieves the following results on the evaluation set:

- Loss: 0.8901
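
Because this is a PEFT adapter rather than a full checkpoint, it is loaded on top of the base model. Below is a minimal inference sketch, assuming `peft` and `transformers` are installed; the adapter repo id `parth0908/peft-starcoder-finetuned` is inferred from the card name and is an assumption, not confirmed by this card.

```python
# Minimal inference sketch. The adapter repo id below is an assumption
# inferred from the model name on this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "bigcode/starcoderbase-1b"
adapter_id = "parth0908/peft-starcoder-finetuned"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

# Attach the fine-tuned PEFT adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```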

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 5e-06
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 20
- training_steps: 1000
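
The training script itself is not included on this card. As a rough sketch, the reported hyperparameters map onto `transformers.TrainingArguments` as shown below; the output directory is an assumption, and everything the card does not report (LoRA configuration, dataset, data collator) is omitted.

```python
# Sketch of a TrainingArguments setup matching the hyperparameters above.
# Only what the card reports is reconstructed; the output_dir is an
# assumption, and the LoRA config and dataset are unknown and omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="peft-starcoder-finetuned",  # hypothetical
    learning_rate=5e-6,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=8,   # effective train batch size: 2 * 8 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=20,
    max_steps=1000,                  # "training_steps" above
)
```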

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.0733        | 0.1631 | 20   | 0.9622          |
| 1.0649        | 0.3262 | 40   | 0.9528          |
| 1.0324        | 0.4893 | 60   | 0.9462          |
| 1.0216        | 0.6524 | 80   | 0.9424          |
| 1.0067        | 0.8155 | 100  | 0.9368          |
| 0.9977        | 0.9786 | 120  | 0.9329          |
| 0.97          | 1.1458 | 140  | 0.9302          |
| 0.9085        | 1.3089 | 160  | 0.9279          |
| 0.934         | 1.4720 | 180  | 0.9233          |
| 1.0061        | 1.6351 | 200  | 0.9184          |
| 0.9564        | 1.7982 | 220  | 0.9165          |
| 0.9738        | 1.9613 | 240  | 0.9126          |
| 0.8864        | 2.1284 | 260  | 0.9114          |
| 0.9144        | 2.2915 | 280  | 0.9113          |
| 0.9443        | 2.4546 | 300  | 0.9098          |
| 0.9444        | 2.6177 | 320  | 0.9083          |
| 0.887         | 2.7808 | 340  | 0.9058          |
| 0.9398        | 2.9439 | 360  | 0.9052          |
| 0.9015        | 3.1111 | 380  | 0.9031          |
| 0.8536        | 3.2742 | 400  | 0.9024          |
| 0.8765        | 3.4373 | 420  | 0.9002          |
| 0.9198        | 3.6004 | 440  | 0.8997          |
| 0.9468        | 3.7635 | 460  | 0.8989          |
| 0.8631        | 3.9266 | 480  | 0.8978          |
| 0.8777        | 4.0938 | 500  | 0.8977          |
| 0.9006        | 4.2569 | 520  | 0.8959          |
| 0.8768        | 4.4200 | 540  | 0.8957          |
| 0.8477        | 4.5831 | 560  | 0.8951          |
| 0.9061        | 4.7462 | 580  | 0.8937          |
| 0.8837        | 4.9093 | 600  | 0.8930          |
| 0.8402        | 5.0765 | 620  | 0.8939          |
| 0.8608        | 5.2396 | 640  | 0.8931          |
| 0.879         | 5.4027 | 660  | 0.8928          |
| 0.8562        | 5.5657 | 680  | 0.8922          |
| 0.8776        | 5.7288 | 700  | 0.8913          |
| 0.8464        | 5.8919 | 720  | 0.8910          |
| 0.8528        | 6.0591 | 740  | 0.8914          |
| 0.8538        | 6.2222 | 760  | 0.8910          |
| 0.8844        | 6.3853 | 780  | 0.8905          |
| 0.8652        | 6.5484 | 800  | 0.8906          |
| 0.8443        | 6.7115 | 820  | 0.8905          |
| 0.8546        | 6.8746 | 840  | 0.8899          |
| 0.8094        | 7.0418 | 860  | 0.8904          |
| 0.863         | 7.2049 | 880  | 0.8899          |
| 0.8642        | 7.3680 | 900  | 0.8902          |
| 0.8413        | 7.5311 | 920  | 0.8901          |
| 0.8119        | 7.6942 | 940  | 0.8903          |
| 0.8909        | 7.8573 | 960  | 0.8901          |
| 0.8516        | 8.0245 | 980  | 0.8900          |
| 0.8834        | 8.1876 | 1000 | 0.8901          |

### Framework versions

- PEFT 0.13.2
- Transformers 4.46.2
- PyTorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.20.3