gemma-2-9b_pct_default_r32

This model is a PEFT adapter fine-tuned from google/gemma-2-9b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 9.9020
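
For context, assuming the reported value is a mean per-token cross-entropy in nats (the usual convention for causal-LM evaluation), it converts to perplexity via exp(loss); a quick sanity check:

```python
import math

eval_loss = 9.9020                    # validation loss reported above
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:,.0f}")  # ≈ 19,970
```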

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 64
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.02
  • num_epochs: 1
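
As a minimal sketch, here is how these values map onto transformers.TrainingArguments. The output_dir is a placeholder, and the eval/logging cadence is inferred from the results table below (evaluations at steps 8, 16, 24, …); everything else mirrors the list above:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-2-9b_pct_default_r32",  # placeholder; not stated in the card
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=64,   # 1 x 64 = total train batch size of 64
    lr_scheduler_type="cosine",
    warmup_ratio=0.02,
    num_train_epochs=1,
    eval_strategy="steps",            # inferred from the results table
    eval_steps=8,
    logging_steps=8,
)
# The default AdamW optimizer with betas=(0.9, 0.999) and eps=1e-8
# already matches the optimizer listed above.
```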

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.5978        | 0.0206 | 8    | 4.9799          |
| 11.0056       | 0.0412 | 16   | 11.4640         |
| 11.7215       | 0.0618 | 24   | 11.8765         |
| 11.8793       | 0.0824 | 32   | 11.9059         |
| 11.8739       | 0.1030 | 40   | 11.9781         |
| 11.8763       | 0.1236 | 48   | 11.9487         |
| 11.8231       | 0.1442 | 56   | 11.8282         |
| 11.7758       | 0.1648 | 64   | 11.7664         |
| 11.8011       | 0.1854 | 72   | 11.4372         |
| 11.6991       | 0.2060 | 80   | 11.7334         |
| 11.8108       | 0.2266 | 88   | 11.5005         |
| 11.6519       | 0.2472 | 96   | 11.5944         |
| 11.6905       | 0.2678 | 104  | 11.5944         |
| 11.6003       | 0.2885 | 112  | 11.5914         |
| 11.5813       | 0.3091 | 120  | 11.5684         |
| 11.5493       | 0.3297 | 128  | 11.7560         |
| 11.5458       | 0.3503 | 136  | 11.4566         |
| 11.5838       | 0.3709 | 144  | 11.4331         |
| 11.4815       | 0.3915 | 152  | 11.5174         |
| 11.5369       | 0.4121 | 160  | 11.5271         |
| 11.4617       | 0.4327 | 168  | 11.5392         |
| 11.4399       | 0.4533 | 176  | 11.3691         |
| 11.3199       | 0.4739 | 184  | 10.9823         |
| 10.6547       | 0.4945 | 192  | 10.0666         |
| 8.8163        | 0.5151 | 200  | 8.7638          |
| 9.5635        | 0.5357 | 208  | 8.5153          |
| 8.7862        | 0.5563 | 216  | 9.2186          |
| 10.2774       | 0.5769 | 224  | 10.3834         |
| 9.7932        | 0.5975 | 232  | 9.7972          |
| 9.5421        | 0.6181 | 240  | 9.6845          |
| 9.5401        | 0.6387 | 248  | 9.3438          |
| 10.9001       | 0.6593 | 256  | 10.6070         |
| 9.959         | 0.6799 | 264  | 9.6172          |
| 9.5409        | 0.7005 | 272  | 10.4762         |
| 10.8074       | 0.7211 | 280  | 10.4872         |
| 9.1645        | 0.7417 | 288  | 7.6567          |
| 8.1072        | 0.7623 | 296  | 8.7911          |
| 9.7069        | 0.7829 | 304  | 10.0043         |
| 10.0752       | 0.8035 | 312  | 10.1655         |
| 9.9734        | 0.8241 | 320  | 9.9644          |
| 9.6722        | 0.8447 | 328  | 9.7803          |
| 9.8279        | 0.8654 | 336  | 9.5826          |
| 9.6714        | 0.8860 | 344  | 9.5533          |
| 9.655         | 0.9066 | 352  | 9.6339          |
| 9.7184        | 0.9272 | 360  | 9.7651          |
| 9.6142        | 0.9478 | 368  | 9.8536          |
| 9.9249        | 0.9684 | 376  | 9.8906          |
| 9.8654        | 0.9890 | 384  | 9.9020          |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1
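
With these versions installed, loading the adapter might look like the sketch below. It assumes a causal-LM adapter whose config points at google/gemma-2-9b; the dtype choice is also an assumption:

```python
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Loads the google/gemma-2-9b base weights and applies the adapter on top.
model = AutoPeftModelForCausalLM.from_pretrained(
    "imdatta0/gemma-2-9b_pct_default_r32",
    torch_dtype=torch.bfloat16,   # assumption; pick a dtype your hardware supports
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b")

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```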