
Uploaded model

  • Developed by: Angelectronic
  • License: apache-2.0
  • Finetuned from model: unsloth/mistral-7b-instruct-v0.2-bnb-4bit

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.
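
For reference, the adapter can be loaded with plain Transformers + PEFT (framework versions below). This is a minimal sketch, not a verbatim recipe from the training run: the adapter repo id is a placeholder for this repository's id, and loading the 4-bit base requires bitsandbytes.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "unsloth/mistral-7b-instruct-v0.2-bnb-4bit"  # 4-bit base model named above
ADAPTER_ID = "Angelectronic/<this-repo>"  # placeholder: replace with this repository's id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base_model = AutoModelForCausalLM.from_pretrained(BASE_ID, device_map="auto")  # needs bitsandbytes
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)  # attach the LoRA adapter
model.eval()
```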

Evaluation

  • ViMMRC test set: 0.7913 accuracy
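
ViMMRC is a Vietnamese multiple-choice reading-comprehension benchmark, so accuracy here is the share of questions whose predicted option matches the gold answer. The exact prompt format and answer-extraction rule behind the 0.7913 figure are not documented on this card; the sketch below shows one plausible scoring loop, using the model and tokenizer loaded above (all names are illustrative).

```python
import re
import torch

def predict_choice(model, tokenizer, question, options):
    """Prompt the instruct model with a multiple-choice question and parse the
    first option letter from its reply. Hypothetical format; the prompt used
    for the reported ViMMRC accuracy may differ."""
    letters = "ABCD"[: len(options)]
    body = question + "\n" + "\n".join(f"{l}. {o}" for l, o in zip(letters, options))
    inputs = tokenizer.apply_chat_template(
        [{"role": "user", "content": body}],
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    with torch.no_grad():
        out = model.generate(inputs, max_new_tokens=8, do_sample=False)
    reply = tokenizer.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True)
    match = re.search(rf"[{letters}]", reply)
    return match.group(0) if match else None
```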

Training results

Step   Training Loss   Validation Loss   Accuracy
240    1.031300        1.471467          0.764065
480    0.854200        1.459051          0.767695
720    0.753000        1.521693          0.791288
960    0.669600        1.521454          0.767695
1200   0.592300        1.590301          0.771324
1440   0.496500        1.608687          0.780399
1680   0.381800        1.641979          0.785843
1920   0.334100        1.629696          0.769510
2160   0.285500        1.715881          0.769510
2400   0.242200        1.747410          0.765880
2640   0.200000        1.813693          0.773140
2880   0.146800        1.937426          0.765880
3120   0.112200        1.937926          0.776769
3360   0.101500        1.997301          0.765880
3600   0.094200        1.968903          0.764065
3840   0.087000        2.004644          0.758621
4080   0.084600        2.010856          0.762250

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 3407
  • gradient_accumulation_steps: 4
  • eval_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 5
  • num_epochs: 3
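
For reproduction, these settings map onto Hugging Face TrainingArguments roughly as below. This is a sketch under two stated assumptions: output_dir is a placeholder not given on this card, and the Adam betas/epsilon listed above are the library defaults, so no explicit optimizer overrides appear. Note that 16 per-device samples x 4 accumulation steps gives the total train batch size of 64.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="outputs",            # placeholder, not stated on this card
    learning_rate=2e-4,
    per_device_train_batch_size=16,  # 16 x 4 accumulation steps = total batch 64
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    eval_accumulation_steps=4,
    seed=3407,
    lr_scheduler_type="cosine",
    warmup_steps=5,
    num_train_epochs=3,
    # Adam betas=(0.9, 0.999) and epsilon=1e-8 are the Transformers defaults,
    # matching the optimizer line above.
)
```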

Framework versions

  • PEFT 0.10.0
  • Transformers 4.40.2
  • Pytorch 2.3.0
  • Datasets 2.19.1
  • Tokenizers 0.19.1