collapse_gemma-2-27b_hs2_accumulate_iter1_sftsd0

This model is a fine-tuned version of google/gemma-2-27b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9043
  • Num Input Tokens Seen: 5253020
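For intuition, a cross-entropy loss of 0.9043 corresponds to a perplexity of roughly exp(0.9043) ≈ 2.47. A quick check (plain Python, using only the loss reported above):

```python
import math

eval_loss = 0.9043  # final evaluation loss reported above
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.2f}")  # ≈ 2.47
```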

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 4
  • eval_batch_size: 16
  • seed: 0
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
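The reported total_train_batch_size is consistent with the per-device batch size and gradient accumulation (assuming a single device, which the card does not state explicitly):

```python
train_batch_size = 4            # per-device batch size
gradient_accumulation_steps = 32
total_train_batch_size = 128    # as reported above

# effective batch = per-device batch * accumulation steps (single device assumed)
effective_batch = train_batch_size * gradient_accumulation_steps
assert effective_batch == total_train_batch_size
print(effective_batch)  # 128
```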

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|---------------|--------|------|-----------------|-------------------|
| No log        | 0      | 0    | 1.1282          | 0                 |
| 1.0178        | 0.0511 | 5    | 0.9807          | 272284            |
| 0.9631        | 0.1021 | 10   | 0.9519          | 541544            |
| 0.933         | 0.1532 | 15   | 0.9407          | 815788            |
| 0.911         | 0.2043 | 20   | 0.9333          | 1087364           |
| 0.9322        | 0.2553 | 25   | 0.9283          | 1353816           |
| 0.9306        | 0.3064 | 30   | 0.9250          | 1626852           |
| 0.93          | 0.3575 | 35   | 0.9222          | 1893740           |
| 0.9036        | 0.4086 | 40   | 0.9192          | 2168380           |
| 0.9166        | 0.4596 | 45   | 0.9175          | 2438380           |
| 0.9158        | 0.5107 | 50   | 0.9154          | 2708844           |
| 0.9438        | 0.5618 | 55   | 0.9137          | 2978352           |
| 0.9321        | 0.6128 | 60   | 0.9119          | 3244148           |
| 0.9048        | 0.6639 | 65   | 0.9103          | 3518100           |
| 1.0015        | 0.7150 | 70   | 0.9100          | 3784544           |
| 0.8605        | 0.7660 | 75   | 0.9086          | 4055360           |
| 0.9524        | 0.8171 | 80   | 0.9077          | 4326216           |
| 0.9025        | 0.8682 | 85   | 0.9069          | 4595508           |
| 0.8468        | 0.9192 | 90   | 0.9062          | 4869076           |
| 0.8756        | 0.9703 | 95   | 0.9047          | 5142272           |
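The table also lets one estimate the training-set size: 5 optimizer steps advance the epoch counter by 0.0511, so one epoch is about 98 steps, i.e. roughly 12.5k examples at a total batch size of 128. This is a back-of-the-envelope estimate, not a figure stated in the card:

```python
steps = 5
epoch_fraction = 0.0511        # epoch reached after 5 steps (from the table)
total_batch = 128              # total_train_batch_size

steps_per_epoch = steps / epoch_fraction          # ≈ 97.8
approx_examples = steps_per_epoch * total_batch   # ≈ 12.5k training examples
print(round(steps_per_epoch), round(approx_examples))
```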

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
