---
base_model: bigcode/starcoderbase-1b
library_name: peft
license: bigcode-openrail-m
tags:
  - generated_from_trainer
model-index:
  - name: peft-starcoder-finetuned
    results: []
---

peft-starcoder-finetuned

This model is a fine-tuned version of bigcode/starcoderbase-1b, trained with PEFT adapters on an unspecified dataset. It achieves the following result on the evaluation set (a usage sketch follows below):

  • Loss: 1.2297
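
Since this repository contains only PEFT adapter weights, they must be loaded on top of the base model. Below is a minimal sketch; the adapter repo id is an assumption based on this card's title, so adjust it to wherever the weights are actually hosted:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "bigcode/starcoderbase-1b"
adapter_id = "parth0908/peft-starcoder-finetuned"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

# Attach the fine-tuned adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```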

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Trainer-based sketch follows the list):

  • learning_rate: 5e-06
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 1000
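
The sketch below shows how these hyperparameters map onto the transformers Trainer API. The LoRA settings are hypothetical placeholders, since the card does not record the adapter configuration, and the dataset is omitted for the same reason:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

base = AutoModelForCausalLM.from_pretrained("bigcode/starcoderbase-1b")

# Hypothetical LoRA config: the actual adapter settings are not listed on this card.
lora = LoraConfig(task_type="CAUSAL_LM", r=8, lora_alpha=16, lora_dropout=0.05)
model = get_peft_model(base, lora)

args = TrainingArguments(
    output_dir="peft-starcoder-finetuned",
    learning_rate=5e-6,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=8,   # effective batch size: 2 * 8 = 16
    optim="adamw_torch",             # AdamW; betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=1000,
    eval_strategy="steps",           # the card logs validation loss every 50 steps
    eval_steps=50,
)
# A Trainer(model=model, args=args, train_dataset=..., eval_dataset=...)
# would complete the setup once a dataset is supplied.
```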

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 1.234         | 0.0996 | 50   | 1.2381          |
| 1.0882        | 0.1992 | 100  | 1.2581          |
| 0.9431        | 0.2988 | 150  | 1.2722          |
| 0.8286        | 0.3984 | 200  | 1.2415          |
| 0.7855        | 0.4980 | 250  | 1.2368          |
| 0.7417        | 0.5976 | 300  | 1.2313          |
| 0.7189        | 0.6972 | 350  | 1.2391          |
| 0.6781        | 0.7968 | 400  | 1.2231          |
| 0.6902        | 0.8964 | 450  | 1.2234          |
| 0.653         | 0.9960 | 500  | 1.2330          |
| 0.6556        | 1.0956 | 550  | 1.2255          |
| 0.6343        | 1.1952 | 600  | 1.2197          |
| 0.606         | 1.2948 | 650  | 1.2366          |
| 0.6223        | 1.3944 | 700  | 1.2300          |
| 0.6313        | 1.4940 | 750  | 1.2340          |
| 0.6068        | 1.5936 | 800  | 1.2327          |
| 0.6012        | 1.6932 | 850  | 1.2280          |
| 0.6299        | 1.7928 | 900  | 1.2296          |
| 0.5962        | 1.8924 | 950  | 1.2293          |
| 0.6041        | 1.9920 | 1000 | 1.2297          |

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.2
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.3