Edit model card

Gemma-2-9B_task-2_60-samples_config_1

This model is a fine-tuned version of google/gemma-2-9b-it on the GaetanMichelet/chat_60_ft_t2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9765

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 1
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 50

Training results

Training Loss Epoch Step Validation Loss
1.3686 0.9362 11 1.2919
1.1305 1.9574 23 1.1233
1.0025 2.9787 35 1.0144
0.8869 4.0 47 0.9765
0.8268 4.9362 58 0.9949
0.7453 5.9574 70 1.0588
0.5888 6.9787 82 1.2111
0.4427 8.0 94 1.4423
0.2953 8.9362 105 1.6260
0.2186 9.9574 117 1.7576
0.1719 10.9787 129 1.8585

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.0
  • Pytorch 2.1.2+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
Downloads last month
5
Inference API
Unable to determine this model’s pipeline type. Check the docs .

Model tree for GaetanMichelet/Gemma-2-9B_task-2_60-samples_config_1

Adapter
this model