---
license: gemma
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
base_model: google/gemma-2b
model-index:
  - name: gemma-2b-it-gee
    results: []
---

# gemma-2b-it-gee

This model is a fine-tuned version of [google/gemma-2b](https://huggingface.co/google/gemma-2b) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.9609
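
Since the usage sections below are still unfilled, here is a minimal inference sketch. It assumes this repository hosts a PEFT (LoRA-style) adapter for google/gemma-2b, that you have accepted the gated base model's license, and that the repo id in the snippet (inferred from the page) is correct:

```python
# Minimal inference sketch (not from the original card).
# Assumptions: this repo hosts a PEFT adapter for google/gemma-2b, and the
# adapter repo id below is inferred from the page; adjust it if it differs.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "google/gemma-2b"
ADAPTER_ID = "himanshue2e/gemma-2b-it-gee"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, ADAPTER_ID)  # attach the fine-tuned adapter
model.eval()

inputs = tokenizer("Explain gradient accumulation in one sentence.", return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```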

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 2.5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 200
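
These settings map onto `transformers.TrainingArguments` roughly as below. This is a hedged reconstruction, not the original script: the `trl`/`sft` tags suggest TRL's `SFTTrainer` was used, and the dataset, prompt formatting, and LoRA configuration are not documented in this card.

```python
# Sketch of TrainingArguments matching the hyperparameters above.
# Assumption: training used TRL's SFTTrainer (per the trl/sft tags); the
# dataset and adapter config are unknown, so only the listed values appear here.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="gemma-2b-it-gee",
    learning_rate=2.5e-5,
    per_device_train_batch_size=8,  # train_batch_size: 8
    per_device_eval_batch_size=8,   # eval_batch_size: 8
    gradient_accumulation_steps=2,  # total_train_batch_size: 8 * 2 = 16
    seed=42,
    lr_scheduler_type="linear",
    max_steps=200,                  # training_steps: 200
    optim="adamw_torch",            # betas=(0.9, 0.999), epsilon=1e-08 are the defaults
)
```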

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 0.016 | 2    | 1.1482          |
| No log        | 0.032 | 4    | 1.1330          |
| No log        | 0.048 | 6    | 1.1206          |
| No log        | 0.064 | 8    | 1.1111          |
| No log        | 0.08  | 10   | 1.1005          |
| No log        | 0.096 | 12   | 1.0913          |
| No log        | 0.112 | 14   | 1.0823          |
| No log        | 0.128 | 16   | 1.0734          |
| No log        | 0.144 | 18   | 1.0660          |
| No log        | 0.16  | 20   | 1.0590          |
| No log        | 0.176 | 22   | 1.0520          |
| No log        | 0.192 | 24   | 1.0470          |
| No log        | 0.208 | 26   | 1.0425          |
| No log        | 0.224 | 28   | 1.0387          |
| No log        | 0.24  | 30   | 1.0353          |
| No log        | 0.256 | 32   | 1.0320          |
| No log        | 0.272 | 34   | 1.0288          |
| No log        | 0.288 | 36   | 1.0256          |
| No log        | 0.304 | 38   | 1.0232          |
| No log        | 0.32  | 40   | 1.0204          |
| No log        | 0.336 | 42   | 1.0183          |
| No log        | 0.352 | 44   | 1.0159          |
| No log        | 0.368 | 46   | 1.0137          |
| No log        | 0.384 | 48   | 1.0115          |
| 1.0621        | 0.4   | 50   | 1.0095          |
| 1.0621        | 0.416 | 52   | 1.0075          |
| 1.0621        | 0.432 | 54   | 1.0054          |
| 1.0621        | 0.448 | 56   | 1.0034          |
| 1.0621        | 0.464 | 58   | 1.0014          |
| 1.0621        | 0.48  | 60   | 0.9994          |
| 1.0621        | 0.496 | 62   | 0.9976          |
| 1.0621        | 0.512 | 64   | 0.9960          |
| 1.0621        | 0.528 | 66   | 0.9944          |
| 1.0621        | 0.544 | 68   | 0.9928          |
| 1.0621        | 0.56  | 70   | 0.9911          |
| 1.0621        | 0.576 | 72   | 0.9896          |
| 1.0621        | 0.592 | 74   | 0.9882          |
| 1.0621        | 0.608 | 76   | 0.9868          |
| 1.0621        | 0.624 | 78   | 0.9853          |
| 1.0621        | 0.64  | 80   | 0.9836          |
| 1.0621        | 0.656 | 82   | 0.9822          |
| 1.0621        | 0.672 | 84   | 0.9809          |
| 1.0621        | 0.688 | 86   | 0.9797          |
| 1.0621        | 0.704 | 88   | 0.9785          |
| 1.0621        | 0.72  | 90   | 0.9775          |
| 1.0621        | 0.736 | 92   | 0.9764          |
| 1.0621        | 0.752 | 94   | 0.9754          |
| 1.0621        | 0.768 | 96   | 0.9745          |
| 1.0621        | 0.784 | 98   | 0.9737          |
| 1.0035        | 0.8   | 100  | 0.9729          |
| 1.0035        | 0.816 | 102  | 0.9723          |
| 1.0035        | 0.832 | 104  | 0.9716          |
| 1.0035        | 0.848 | 106  | 0.9710          |
| 1.0035        | 0.864 | 108  | 0.9705          |
| 1.0035        | 0.88  | 110  | 0.9700          |
| 1.0035        | 0.896 | 112  | 0.9695          |
| 1.0035        | 0.912 | 114  | 0.9690          |
| 1.0035        | 0.928 | 116  | 0.9685          |
| 1.0035        | 0.944 | 118  | 0.9681          |
| 1.0035        | 0.96  | 120  | 0.9677          |
| 1.0035        | 0.976 | 122  | 0.9674          |
| 1.0035        | 0.992 | 124  | 0.9671          |
| 1.0035        | 1.008 | 126  | 0.9668          |
| 1.0035        | 1.024 | 128  | 0.9665          |
| 1.0035        | 1.04  | 130  | 0.9661          |
| 1.0035        | 1.056 | 132  | 0.9657          |
| 1.0035        | 1.072 | 134  | 0.9654          |
| 1.0035        | 1.088 | 136  | 0.9652          |
| 1.0035        | 1.104 | 138  | 0.9650          |
| 1.0035        | 1.12  | 140  | 0.9648          |
| 1.0035        | 1.136 | 142  | 0.9645          |
| 1.0035        | 1.152 | 144  | 0.9644          |
| 1.0035        | 1.168 | 146  | 0.9642          |
| 1.0035        | 1.184 | 148  | 0.9639          |
| 0.9687        | 1.2   | 150  | 0.9637          |
| 0.9687        | 1.216 | 152  | 0.9634          |
| 0.9687        | 1.232 | 154  | 0.9632          |
| 0.9687        | 1.248 | 156  | 0.9630          |
| 0.9687        | 1.264 | 158  | 0.9628          |
| 0.9687        | 1.28  | 160  | 0.9627          |
| 0.9687        | 1.296 | 162  | 0.9625          |
| 0.9687        | 1.312 | 164  | 0.9623          |
| 0.9687        | 1.328 | 166  | 0.9622          |
| 0.9687        | 1.344 | 168  | 0.9620          |
| 0.9687        | 1.36  | 170  | 0.9619          |
| 0.9687        | 1.376 | 172  | 0.9618          |
| 0.9687        | 1.392 | 174  | 0.9617          |
| 0.9687        | 1.408 | 176  | 0.9616          |
| 0.9687        | 1.424 | 178  | 0.9615          |
| 0.9687        | 1.44  | 180  | 0.9614          |
| 0.9687        | 1.456 | 182  | 0.9613          |
| 0.9687        | 1.472 | 184  | 0.9613          |
| 0.9687        | 1.488 | 186  | 0.9612          |
| 0.9687        | 1.504 | 188  | 0.9611          |
| 0.9687        | 1.52  | 190  | 0.9611          |
| 0.9687        | 1.536 | 192  | 0.9610          |
| 0.9687        | 1.552 | 194  | 0.9610          |
| 0.9687        | 1.568 | 196  | 0.9609          |
| 0.9687        | 1.584 | 198  | 0.9609          |
| 0.982         | 1.6   | 200  | 0.9609          |

### Framework versions

- PEFT 0.10.1.dev0
- Transformers 4.41.0.dev0
- Pytorch 2.2.2+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1
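
A quick way to compare a local environment against this list (note that the PEFT and Transformers entries are `.dev0` builds, so exact matches require installing from source):

```python
# Print installed versions to compare against the list above.
import datasets
import peft
import tokenizers
import torch
import transformers

for name, module in [
    ("PEFT", peft),
    ("Transformers", transformers),
    ("Pytorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(name, module.__version__)
```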