collapse_gemma-2-2b_hs2_replace_iter20_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5816
  • Num Input Tokens Seen: 4622064
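
As a rough aid to reading the loss figure, the sketch below converts it to perplexity. It assumes the reported loss is a mean per-token cross-entropy measured in nats, which is the usual convention for causal language model training but is not stated in this card. The variable names are illustrative only.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 2.5816

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"eval loss {eval_loss:.4f} -> perplexity {perplexity:.2f}")
```

Under that assumption the final loss corresponds to a perplexity of roughly 13.2.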

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 2
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
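
The relationship between some of these values can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the training code: `num_devices = 1` is assumed because 8 × 16 already equals the reported total of 128, and `total_steps = 98` is an estimate inferred from the training log (step 95 falls at epoch 0.97), not a value given in the card. The warmup rule shown (linear ramp from 0, then constant) is the usual behaviour of a constant-with-warmup schedule.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 8e-06
train_batch_size = 8
gradient_accumulation_steps = 16
warmup_ratio = 0.05

# Assumptions (not stated in the card).
num_devices = 1    # single device, since 8 * 16 already gives 128
total_steps = 98   # estimated: step 95 corresponds to epoch 0.97

# Effective batch size per optimizer step.
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)

# Number of warmup steps implied by the warmup ratio.
num_warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then constant."""
    if step < num_warmup_steps:
        return learning_rate * step / max(1, num_warmup_steps)
    return learning_rate


print(total_train_batch_size, num_warmup_steps)
print([lr_at(s) for s in range(0, 8)])
```

With these assumptions the warmup lasts about 5 optimizer steps, after which the learning rate stays at 8e-06 for the rest of the epoch.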

Training results

Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen
------------- | ------ | ---- | --------------- | -----------------
No log        | 0      | 0    | 1.3909          | 0
1.6771        | 0.0511 | 5    | 1.2784          | 236976
0.8302        | 0.1021 | 10   | 1.3133          | 476992
0.3953        | 0.1532 | 15   | 1.5940          | 717168
0.19          | 0.2042 | 20   | 1.8024          | 960944
0.0832        | 0.2553 | 25   | 2.0949          | 1203928
0.0476        | 0.3063 | 30   | 2.2677          | 1447144
0.0262        | 0.3574 | 35   | 2.4137          | 1684688
0.0237        | 0.4084 | 40   | 2.5160          | 1918712
0.0199        | 0.4595 | 45   | 2.5667          | 2154640
0.02          | 0.5105 | 50   | 2.5725          | 2387960
0.0241        | 0.5616 | 55   | 2.5581          | 2624240
0.0203        | 0.6126 | 60   | 2.5694          | 2863504
0.0216        | 0.6637 | 65   | 2.5731          | 3090440
0.0199        | 0.7147 | 70   | 2.5817          | 3328208
0.0197        | 0.7658 | 75   | 2.5885          | 3573152
0.0234        | 0.8168 | 80   | 2.5827          | 3806824
0.0198        | 0.8679 | 85   | 2.5798          | 4048544
0.0213        | 0.9190 | 90   | 2.5791          | 4285416
0.0199        | 0.9700 | 95   | 2.5799          | 4527992
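
One way to read the results above is to look for the step with the lowest validation loss and compare it with the end of the run. The sketch below does this on a hand-copied subset of the rows; the `rows` list and variable names are illustrative, and only values that appear in the results are used.

```python
# (step, training_loss, validation_loss) copied from a subset of the rows.
# training_loss is None where the log shows "No log".
rows = [
    (0, None, 1.3909),
    (5, 1.6771, 1.2784),
    (10, 0.8302, 1.3133),
    (15, 0.3953, 1.5940),
    (20, 0.19, 1.8024),
    (50, 0.02, 2.5725),
    (95, 0.0199, 2.5799),
]

# Step with the lowest validation loss.
best_step, _, best_val = min(rows, key=lambda r: r[2])

# Last logged row, for comparison.
last_step, last_train, last_val = rows[-1]

print(f"lowest validation loss {best_val} at step {best_step}")
print(f"step {last_step}: training loss {last_train}, validation loss {last_val}")
```

In these rows the validation loss is lowest at step 5 and rises afterwards while the training loss keeps falling toward about 0.02, a pattern that typically indicates the model is fitting the training data at the expense of held-out performance.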

Framework versions

  • Transformers 4.44.0
  • PyTorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Safetensors

  • Model size: 2.61B params
  • Tensor type: BF16