Mistral-7B-Instruct-v0.2-miracl-raft-sft-v2.0

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2 on the nthakur/miracl-raft-sft-instruct-v0.2 dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2086
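
PEFT appears under the framework versions below, so the published weights are presumably a LoRA-style adapter on top of mistralai/Mistral-7B-Instruct-v0.2. The following is a minimal loading sketch under that assumption; the dtype, device placement, and prompt are illustrative rather than confirmed.

```python
# Minimal sketch, assuming this repo hosts PEFT adapter weights for
# mistralai/Mistral-7B-Instruct-v0.2. Dtype and prompt are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_id = "nthakur/Mistral-7B-Instruct-v0.2-miracl-raft-sft-v2.0"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

messages = [{"role": "user", "content": "Answer the question using the passages below."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```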

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an illustrative configuration mirroring them is sketched after this list):

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 3
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 24
  • total_eval_batch_size: 12
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
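
The exact training script is not part of this card; the snippet below is only an illustrative Transformers TrainingArguments configuration that mirrors the values listed above (per-device batch size 4 across 3 GPUs with gradient accumulation 2 gives the effective train batch size of 24). The output directory, precision, and optimizer name are assumptions.

```python
# Illustrative only: a TrainingArguments setup mirroring the listed hyperparameters.
# output_dir, bf16, and optim are assumptions; they are not stated in the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Mistral-7B-Instruct-v0.2-miracl-raft-sft-v2.0",  # placeholder path
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # 4 per device x 3 GPUs x 2 steps = 24 effective
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    bf16=True,                       # assumption; training dtype is not stated
    optim="adamw_torch",             # Adam with betas=(0.9, 0.999), eps=1e-8 (defaults)
)
```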

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.3095        | 0.0987 | 200  | 1.2800          |
| 1.249         | 0.1975 | 400  | 1.2516          |
| 1.2514        | 0.2962 | 600  | 1.2369          |
| 1.275         | 0.3950 | 800  | 1.2263          |
| 1.1984        | 0.4937 | 1000 | 1.2197          |
| 1.1556        | 0.5924 | 1200 | 1.2149          |
| 1.2386        | 0.6912 | 1400 | 1.2116          |
| 1.2661        | 0.7899 | 1600 | 1.2096          |
| 1.2752        | 0.8887 | 1800 | 1.2088          |
| 1.2701        | 0.9874 | 2000 | 1.2086          |

Framework versions

  • PEFT 0.7.1
  • Transformers 4.40.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1