	---
	license: apache-2.0
	library_name: peft
	tags:
	- generated_from_trainer
	base_model: mistralai/Mistral-7B-v0.1
	datasets:
	- mbe
	metrics:
	- accuracy
	model-index:
	- name: Mistral-7B-v0.1_mbe_no
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# Mistral-7B-v0.1_mbe_no

	This model is a PEFT adapter fine-tuned from [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the mbe dataset.
	At the final evaluation (step 440, epoch 2.94), it achieves the following results on the evaluation set:
	- Loss: 0.7827
	- Accuracy: 0.5493

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 3e-05
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 16
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: constant
	- lr_scheduler_warmup_ratio: 0.03
	- num_epochs: 3.0
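
The relationship between the batch-size settings above can be checked with a little arithmetic. The sketch below assumes the usual convention that the total train batch size is the per-device batch size times the gradient accumulation steps times the number of devices; the device count and the steps-per-epoch figure are inferred, not stated in this card.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 4
gradient_accumulation_steps = 4
total_train_batch_size = 16

# Assumption: total = per-device batch * accumulation steps * device count,
# so the device count can be inferred from the three reported values.
num_devices = total_train_batch_size // (train_batch_size * gradient_accumulation_steps)

# Assumption: about 150 optimizer steps per epoch, read off the training
# results table (step 150 is logged at epoch 1.0; logged epochs are rounded).
approx_steps_per_epoch = 150
approx_examples_per_epoch = approx_steps_per_epoch * total_train_batch_size

print(num_devices)                # 1
print(approx_examples_per_epoch)  # 2400
```

Under these assumptions the run used a single device and a training split of roughly 2,400 examples; the exact split size is not given in this card.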

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|
	| 1.5293 | 0.07 | 10 | 0.6504 | 0.3520 |
	| 0.6652 | 0.13 | 20 | 0.6469 | 0.3783 |
	| 0.6523 | 0.2 | 30 | 0.6430 | 0.3651 |
	| 0.613 | 0.27 | 40 | 0.6341 | 0.4079 |
	| 0.6586 | 0.33 | 50 | 0.6206 | 0.3882 |
	| 0.586 | 0.4 | 60 | 0.6269 | 0.4178 |
	| 0.594 | 0.47 | 70 | 0.6046 | 0.4276 |
	| 0.6063 | 0.53 | 80 | 0.6135 | 0.4178 |
	| 0.5988 | 0.6 | 90 | 0.6097 | 0.4276 |
	| 0.6217 | 0.67 | 100 | 0.6098 | 0.4539 |
	| 0.5817 | 0.73 | 110 | 0.6022 | 0.4539 |
	| 0.6219 | 0.8 | 120 | 0.5926 | 0.4572 |
	| 0.559 | 0.87 | 130 | 0.5816 | 0.4605 |
	| 0.5514 | 0.93 | 140 | 0.5783 | 0.4737 |
	| 0.59 | 1.0 | 150 | 0.5622 | 0.4868 |
	| 0.46 | 1.07 | 160 | 0.5868 | 0.4803 |
	| 0.4484 | 1.14 | 170 | 0.5667 | 0.4868 |
	| 0.4162 | 1.2 | 180 | 0.5820 | 0.4803 |
	| 0.4716 | 1.27 | 190 | 0.5904 | 0.4638 |
	| 0.4486 | 1.34 | 200 | 0.5777 | 0.5099 |
	| 0.4264 | 1.4 | 210 | 0.6482 | 0.4967 |
	| 0.4236 | 1.47 | 220 | 0.5741 | 0.5033 |
	| 0.4141 | 1.54 | 230 | 0.5608 | 0.5164 |
	| 0.4308 | 1.6 | 240 | 0.5539 | 0.5099 |
	| 0.4505 | 1.67 | 250 | 0.5495 | 0.5033 |
	| 0.3958 | 1.74 | 260 | 0.5594 | 0.5099 |
	| 0.4432 | 1.8 | 270 | 0.5492 | 0.5164 |
	| 0.4067 | 1.87 | 280 | 0.6024 | 0.5066 |
	| 0.3988 | 1.94 | 290 | 0.5607 | 0.5099 |
	| 0.3992 | 2.0 | 300 | 0.5670 | 0.5164 |
	| 0.2304 | 2.07 | 310 | 0.8200 | 0.5362 |
	| 0.1696 | 2.14 | 320 | 0.9087 | 0.5296 |
	| 0.2255 | 2.2 | 330 | 0.7566 | 0.5362 |
	| 0.1923 | 2.27 | 340 | 0.7020 | 0.5197 |
	| 0.281 | 2.34 | 350 | 0.6653 | 0.5033 |
	| 0.2311 | 2.4 | 360 | 0.6412 | 0.5132 |
	| 0.1523 | 2.47 | 370 | 0.8846 | 0.5230 |
	| 0.2451 | 2.54 | 380 | 0.9252 | 0.5164 |
	| 0.2022 | 2.6 | 390 | 0.7422 | 0.5197 |
	| 0.217 | 2.67 | 400 | 0.7558 | 0.5329 |
	| 0.165 | 2.74 | 410 | 0.7846 | 0.5428 |
	| 0.2025 | 2.8 | 420 | 0.7254 | 0.5230 |
	| 0.2201 | 2.87 | 430 | 0.6531 | 0.5296 |
	| 0.2037 | 2.94 | 440 | 0.7827 | 0.5493 |
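
The table is easier to interpret once the best evaluation points are located. The following is a minimal sketch that copies the validation loss and accuracy columns from the table and finds the step with the lowest validation loss and the step with the highest accuracy. The variable names are illustrative and not part of the training code.

```python
# Validation loss and accuracy copied from the results table, in row order.
# Evaluation ran every 10 optimizer steps, so row i is step 10 * (i + 1).
VAL_LOSS = [
    0.6504, 0.6469, 0.6430, 0.6341, 0.6206, 0.6269, 0.6046, 0.6135, 0.6097, 0.6098,
    0.6022, 0.5926, 0.5816, 0.5783, 0.5622, 0.5868, 0.5667, 0.5820, 0.5904, 0.5777,
    0.6482, 0.5741, 0.5608, 0.5539, 0.5495, 0.5594, 0.5492, 0.6024, 0.5607, 0.5670,
    0.8200, 0.9087, 0.7566, 0.7020, 0.6653, 0.6412, 0.8846, 0.9252, 0.7422, 0.7558,
    0.7846, 0.7254, 0.6531, 0.7827,
]
ACCURACY = [
    0.3520, 0.3783, 0.3651, 0.4079, 0.3882, 0.4178, 0.4276, 0.4178, 0.4276, 0.4539,
    0.4539, 0.4572, 0.4605, 0.4737, 0.4868, 0.4803, 0.4868, 0.4803, 0.4638, 0.5099,
    0.4967, 0.5033, 0.5164, 0.5099, 0.5033, 0.5099, 0.5164, 0.5066, 0.5099, 0.5164,
    0.5362, 0.5296, 0.5362, 0.5197, 0.5033, 0.5132, 0.5230, 0.5164, 0.5197, 0.5329,
    0.5428, 0.5230, 0.5296, 0.5493,
]

# Each row is (step, validation_loss, accuracy).
rows = [
    (10 * (i + 1), loss, acc)
    for i, (loss, acc) in enumerate(zip(VAL_LOSS, ACCURACY))
]

best_by_loss = min(rows, key=lambda r: r[1])
best_by_accuracy = max(rows, key=lambda r: r[2])

print(best_by_loss)      # (270, 0.5492, 0.5164)
print(best_by_accuracy)  # (440, 0.7827, 0.5493)
```

The lowest validation loss occurs at step 270, in the second epoch, while the highest accuracy occurs at the final step 440, where validation loss is noticeably higher. Training loss keeps falling through the third epoch while validation loss rises, which is consistent with overfitting, so which checkpoint is preferable likely depends on whether loss or accuracy matters more for the intended use.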


	### Framework versions

	- PEFT 0.7.1
	- Transformers 4.37.2
	- PyTorch 2.1.2+cu121
	- Datasets 2.17.1
	- Tokenizers 0.15.1
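
To reproduce results it can help to compare a local environment against the versions listed above. The sketch below uses only the standard library. The distribution names (`peft`, `transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the PyPI names of the listed frameworks, and `check_versions` is a hypothetical helper, not part of any of these libraries.

```python
from importlib import metadata

# Versions copied from the list above, keyed by assumed PyPI distribution name.
EXPECTED = {
    "peft": "0.7.1",
    "transformers": "4.37.2",
    "torch": "2.1.2+cu121",
    "datasets": "2.17.1",
    "tokenizers": "0.15.1",
}


def check_versions(expected, get_version=metadata.version):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for package, want in expected.items():
        try:
            have = get_version(package)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed in this environment
        report[package] = (want, have, have == want)
    return report


if __name__ == "__main__":
    for package, (want, have, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "differs"
        print(f"{package}: expected {want}, installed {have} ({status})")
```

A mismatch does not necessarily mean the adapter will fail to load; exact string comparison is simply the strictest check and is used here for simplicity.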