---
license: apache-2.0
library_name: peft
tags:
- generated_from_trainer
base_model: mistralai/Mistral-7B-v0.1
datasets:
- mbe
metrics:
- accuracy
model-index:
- name: Mistral-7B-v0.1_mbe_no
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# Mistral-7B-v0.1_mbe_no
This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the mbe dataset.
It achieves the following results on the evaluation set at the final logged step (step 440, epoch 2.94):
- Loss: 0.7827
- Accuracy: 0.5493
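
The metadata lists `library_name: peft` and `mistralai/Mistral-7B-v0.1` as the base model, so the weights here are presumably an adapter that is loaded on top of the base model. The following is a minimal sketch of how that might look, not a confirmed recipe. It assumes the adapter was trained on a causal language modeling head, which this card does not state; `load_adapter` and the `adapter_path` argument are hypothetical names introduced for illustration.

```python
def load_adapter(adapter_path: str, base_id: str = "mistralai/Mistral-7B-v0.1"):
    """Load the base model and attach PEFT adapter weights from adapter_path.

    ASSUMPTION: the adapter targets a causal-LM head. If it was trained with a
    different head (for example sequence classification), the matching
    AutoModel class would be needed instead.
    """
    # Imported lazily so that defining this helper does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)

    # Wrap the base model with the adapter weights stored at adapter_path.
    model = PeftModel.from_pretrained(base_model, adapter_path)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer


# Hypothetical usage; replace the path with the local directory or Hub id of this adapter:
# model, tokenizer = load_adapter("path/to/adapter")
```

Loading the full 7B base model needs substantial memory, so in practice a reduced-precision or quantized load may be preferable; that choice is left out of the sketch.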
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 3.0
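
The batch-size entries above are related by simple arithmetic, sketched below. The device count is an assumption (the card does not state it); a single device is the value that makes the listed numbers agree. The estimate of the training-set size is likewise only a rough inference from the results table, where step 150 is logged at epoch 1.0.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 4               # per-device batch size
gradient_accumulation_steps = 4

# ASSUMPTION: one training device; the card does not report the device count.
num_devices = 1

# Number of examples that contribute to each optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 16, matching the listed total_train_batch_size

# ASSUMPTION: about 150 optimizer steps per epoch, read off the results table
# (step 150 is logged at epoch 1.0). The true value may differ slightly.
approx_steps_per_epoch = 150
approx_train_examples = approx_steps_per_epoch * total_train_batch_size
print(approx_train_examples)  # 2400, a rough estimate only
```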
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.5293 | 0.07 | 10 | 0.6504 | 0.3520 |
| 0.6652 | 0.13 | 20 | 0.6469 | 0.3783 |
| 0.6523 | 0.2 | 30 | 0.6430 | 0.3651 |
| 0.613 | 0.27 | 40 | 0.6341 | 0.4079 |
| 0.6586 | 0.33 | 50 | 0.6206 | 0.3882 |
| 0.586 | 0.4 | 60 | 0.6269 | 0.4178 |
| 0.594 | 0.47 | 70 | 0.6046 | 0.4276 |
| 0.6063 | 0.53 | 80 | 0.6135 | 0.4178 |
| 0.5988 | 0.6 | 90 | 0.6097 | 0.4276 |
| 0.6217 | 0.67 | 100 | 0.6098 | 0.4539 |
| 0.5817 | 0.73 | 110 | 0.6022 | 0.4539 |
| 0.6219 | 0.8 | 120 | 0.5926 | 0.4572 |
| 0.559 | 0.87 | 130 | 0.5816 | 0.4605 |
| 0.5514 | 0.93 | 140 | 0.5783 | 0.4737 |
| 0.59 | 1.0 | 150 | 0.5622 | 0.4868 |
| 0.46 | 1.07 | 160 | 0.5868 | 0.4803 |
| 0.4484 | 1.14 | 170 | 0.5667 | 0.4868 |
| 0.4162 | 1.2 | 180 | 0.5820 | 0.4803 |
| 0.4716 | 1.27 | 190 | 0.5904 | 0.4638 |
| 0.4486 | 1.34 | 200 | 0.5777 | 0.5099 |
| 0.4264 | 1.4 | 210 | 0.6482 | 0.4967 |
| 0.4236 | 1.47 | 220 | 0.5741 | 0.5033 |
| 0.4141 | 1.54 | 230 | 0.5608 | 0.5164 |
| 0.4308 | 1.6 | 240 | 0.5539 | 0.5099 |
| 0.4505 | 1.67 | 250 | 0.5495 | 0.5033 |
| 0.3958 | 1.74 | 260 | 0.5594 | 0.5099 |
| 0.4432 | 1.8 | 270 | 0.5492 | 0.5164 |
| 0.4067 | 1.87 | 280 | 0.6024 | 0.5066 |
| 0.3988 | 1.94 | 290 | 0.5607 | 0.5099 |
| 0.3992 | 2.0 | 300 | 0.5670 | 0.5164 |
| 0.2304 | 2.07 | 310 | 0.8200 | 0.5362 |
| 0.1696 | 2.14 | 320 | 0.9087 | 0.5296 |
| 0.2255 | 2.2 | 330 | 0.7566 | 0.5362 |
| 0.1923 | 2.27 | 340 | 0.7020 | 0.5197 |
| 0.281 | 2.34 | 350 | 0.6653 | 0.5033 |
| 0.2311 | 2.4 | 360 | 0.6412 | 0.5132 |
| 0.1523 | 2.47 | 370 | 0.8846 | 0.5230 |
| 0.2451 | 2.54 | 380 | 0.9252 | 0.5164 |
| 0.2022 | 2.6 | 390 | 0.7422 | 0.5197 |
| 0.217 | 2.67 | 400 | 0.7558 | 0.5329 |
| 0.165 | 2.74 | 410 | 0.7846 | 0.5428 |
| 0.2025 | 2.8 | 420 | 0.7254 | 0.5230 |
| 0.2201 | 2.87 | 430 | 0.6531 | 0.5296 |
| 0.2037 | 2.94 | 440 | 0.7827 | 0.5493 |
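
In the table above, training loss falls to roughly 0.2 during the third epoch while validation loss rises above its second-epoch values, a pattern that is consistent with overfitting, even though accuracy continues to edge upward. Which checkpoint counts as "best" therefore depends on the metric. The sketch below, using only the step, validation loss, and accuracy columns copied from the table, picks out the checkpoint with the lowest validation loss and the one with the highest accuracy.

```python
# (step, validation_loss, accuracy) copied from the results table above.
rows = [
    (10, 0.6504, 0.3520), (20, 0.6469, 0.3783),
    (30, 0.6430, 0.3651), (40, 0.6341, 0.4079),
    (50, 0.6206, 0.3882), (60, 0.6269, 0.4178),
    (70, 0.6046, 0.4276), (80, 0.6135, 0.4178),
    (90, 0.6097, 0.4276), (100, 0.6098, 0.4539),
    (110, 0.6022, 0.4539), (120, 0.5926, 0.4572),
    (130, 0.5816, 0.4605), (140, 0.5783, 0.4737),
    (150, 0.5622, 0.4868), (160, 0.5868, 0.4803),
    (170, 0.5667, 0.4868), (180, 0.5820, 0.4803),
    (190, 0.5904, 0.4638), (200, 0.5777, 0.5099),
    (210, 0.6482, 0.4967), (220, 0.5741, 0.5033),
    (230, 0.5608, 0.5164), (240, 0.5539, 0.5099),
    (250, 0.5495, 0.5033), (260, 0.5594, 0.5099),
    (270, 0.5492, 0.5164), (280, 0.6024, 0.5066),
    (290, 0.5607, 0.5099), (300, 0.5670, 0.5164),
    (310, 0.8200, 0.5362), (320, 0.9087, 0.5296),
    (330, 0.7566, 0.5362), (340, 0.7020, 0.5197),
    (350, 0.6653, 0.5033), (360, 0.6412, 0.5132),
    (370, 0.8846, 0.5230), (380, 0.9252, 0.5164),
    (390, 0.7422, 0.5197), (400, 0.7558, 0.5329),
    (410, 0.7846, 0.5428), (420, 0.7254, 0.5230),
    (430, 0.6531, 0.5296), (440, 0.7827, 0.5493),
]

# Checkpoint with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[1])
# Checkpoint with the highest accuracy.
best_by_accuracy = max(rows, key=lambda r: r[2])

print(best_by_loss)      # (270, 0.5492, 0.5164)
print(best_by_accuracy)  # (440, 0.7827, 0.5493)
```

The two criteria disagree: validation loss is lowest at step 270, while accuracy peaks at the final step 440, which is the checkpoint reported at the top of this card.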
### Framework versions
- PEFT 0.7.1
- Transformers 4.37.2
- Pytorch 2.1.2+cu121
- Datasets 2.17.1
- Tokenizers 0.15.1