finetune_starcoder2_3b

This model is a fine-tuned version of bigcode/starcoder2-3b on an unknown dataset.

Model description

The StarCoder2-3B model is a 3B parameter model trained on 17 programming languages from The Stack v2, with opt-out requests excluded. The model uses Grouped Query Attention, a context window of 16,384 tokens with a sliding window attention of 4,096 tokens, and was trained using the Fill-in-the-Middle objective on 3+ trillion tokens.
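The Fill-in-the-Middle objective lets the model complete code between a given prefix and suffix. Below is a minimal sketch of a FIM-style prompt, assuming the StarCoder-family special tokens (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`); check the base model's tokenizer for the exact token strings.

```python
# Minimal FIM prompt sketch, assuming StarCoder-family special tokens;
# verify the exact token strings against the bigcode/starcoder2-3b tokenizer.
prefix = "function add(a, b) {\n"
suffix = "\n}"
fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"
# The model is expected to generate the missing body after <fim_middle>,
# e.g. "  return a + b;"
```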

Intended uses & limitations

finetune_starcoder2_3b is a fine-tuned model capable of generating JavaScript code snippets for tasks such as code completion, syntax suggestion, and code generation. It has limitations when generating complex code, so users should review the generated output and use it at their own risk. A usage sketch is shown below.
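A minimal inference sketch using PEFT to load the adapter on top of the base model. The adapter repository id, prompt text, and generation settings are assumptions for illustration; adjust them to your setup.

```python
# Hedged sketch: load the fine-tuned adapter on top of bigcode/starcoder2-3b.
# The adapter id "Pranabit/finetune_starcoder2_3b" and the prompt are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "bigcode/starcoder2-3b"
adapter_id = "Pranabit/finetune_starcoder2_3b"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "// JavaScript: debounce a function\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```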

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 0
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • training_steps: 500
  • mixed_precision_training: Native AMP
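
One way to express these hyperparameters with `transformers.TrainingArguments` is sketched below; the output directory and the choice of fp16 for "Native AMP" are assumptions, and only the numeric values come from this card.

```python
# Hedged sketch: the hyperparameters above expressed as TrainingArguments.
# Adam betas=(0.9, 0.999) and epsilon=1e-08 match the TrainingArguments defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="finetune_starcoder2_3b",  # assumed output directory
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=0,
    gradient_accumulation_steps=4,        # total train batch size = 1 * 4 = 4
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=500,
    fp16=True,                            # assumed mapping for "Native AMP"
)
```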

Training results

Framework versions

  • PEFT 0.10.0
  • Transformers 4.40.0.dev0
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2