
cls_alldata_mistral_v1

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4126
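
Since PEFT is listed among the framework versions below, this checkpoint is presumably a LoRA-style adapter trained on top of the base model rather than a full fine-tune. The following is a minimal loading sketch, assuming the adapter is published under the repo id `cls_alldata_mistral_v1` (the actual hub path may differ):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_id = "cls_alldata_mistral_v1"  # placeholder: substitute the actual hub repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
# Attach the fine-tuned adapter weights on top of the frozen base model
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```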

Model description

More information needed

Intended uses & limitations

More information needed
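
The card does not document the intended task or prompt format, so the snippet below only shows generic instruction-style inference with the base model's chat template, continuing from the loading sketch above. The prompt is a placeholder: the `cls_` prefix in the model name hints at a classification task, but this is an assumption.

```python
# Placeholder prompt: the expected input format is not documented in this card.
messages = [{"role": "user", "content": "Classify: 'The service was excellent.'"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```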

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; an equivalent TrainingArguments sketch follows the list:

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 2
  • mixed_precision_training: Native AMP
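
A sketch of an equivalent Hugging Face TrainingArguments configuration. The trainer class and data pipeline are not documented here, so this only mirrors the listed values; the output directory is a placeholder.

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; output_dir is a placeholder path.
# Adam betas=(0.9, 0.999) and epsilon=1e-8 are the Trainer defaults.
args = TrainingArguments(
    output_dir="cls_alldata_mistral_v1",
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,  # total train batch size: 2 * 4 = 8
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    num_train_epochs=2,
    fp16=True,  # Native AMP mixed precision
)
```

Note that with `lr_scheduler_type="constant"`, Transformers uses a schedule that does not apply warmup, so the listed warmup ratio likely had no effect; `constant_with_warmup` would be needed for it to apply.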

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.5676        | 0.1091 | 20   | 0.5817          |
| 0.5158        | 0.2183 | 40   | 0.5408          |
| 0.5124        | 0.3274 | 60   | 0.5162          |
| 0.4791        | 0.4366 | 80   | 0.4999          |
| 0.4762        | 0.5457 | 100  | 0.4850          |
| 0.4724        | 0.6548 | 120  | 0.4737          |
| 0.4423        | 0.7640 | 140  | 0.4611          |
| 0.4453        | 0.8731 | 160  | 0.4508          |
| 0.4179        | 0.9823 | 180  | 0.4412          |
| 0.3243        | 1.0914 | 200  | 0.4479          |
| 0.3198        | 1.2005 | 220  | 0.4383          |
| 0.3012        | 1.3097 | 240  | 0.4335          |
| 0.3135        | 1.4188 | 260  | 0.4315          |
| 0.3081        | 1.5280 | 280  | 0.4247          |
| 0.3048        | 1.6371 | 300  | 0.4193          |
| 0.3220        | 1.7462 | 320  | 0.4150          |
| 0.3034        | 1.8554 | 340  | 0.4136          |
| 0.3188        | 1.9645 | 360  | 0.4126          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1