base_model: mistralai/Mixtral-8x7B-v0.1
datasets:
  - generator
library_name: peft
license: apache-2.0
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: Mixtral_Alpace_v2
    results: []

Mixtral_Alpace_v2

This model is a PEFT adapter fine-tuned from mistralai/Mixtral-8x7B-v0.1 on the generator dataset (a minimal loading sketch is included below). It achieves the following result on the evaluation set:

  • Loss: 0.5617
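
Because this repository holds a PEFT adapter rather than a full checkpoint, it has to be loaded on top of the base model. Below is a minimal loading sketch, assuming the adapter is published at cem13/complaint_to_sythoms_mix_8x7b; the prompt format used during training is not documented on this card, so the prompt is a placeholder.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "mistralai/Mixtral-8x7B-v0.1"
# Assumption: the adapter weights live in this repository; point this at
# wherever the checkpoint from this training run was actually pushed.
ADAPTER_ID = "cem13/complaint_to_sythoms_mix_8x7b"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# Load the (large) base model first; bfloat16 + device_map="auto" keeps memory
# usage manageable, but quantized loading would also work.
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the PEFT adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base, ADAPTER_ID)
model.eval()

# Placeholder prompt: the training prompt format is not documented here.
prompt = "Your prompt here"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```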

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2.5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 15
  • training_steps: 1000
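
For reference, here is a minimal sketch of how these hyperparameters map onto a TRL SFTTrainer run with a PEFT/LoRA adapter. Only the values listed above are taken from this card; the LoRA rank and targets, the dataset loading, and the 10-step evaluation interval (inferred from the results table below) are assumptions.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_id = "mistralai/Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Illustrative LoRA settings; the actual adapter rank/targets are not documented.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# Hyperparameters from the list above; eval/logging every 10 steps is inferred
# from the training-results table.
args = TrainingArguments(
    output_dir="Mixtral_Alpace_v2",
    learning_rate=2.5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=15,
    max_steps=1000,
    eval_strategy="steps",
    eval_steps=10,
    logging_steps=10,
)

# Placeholder dataset: the card only records the auto-generated "generator" name.
# Assumes each record exposes a "text" field (otherwise pass a formatting_func).
data = load_dataset("json", data_files="data.jsonl")["train"]
splits = data.train_test_split(test_size=0.1, seed=42)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    peft_config=peft_config,
    tokenizer=tokenizer,
)
trainer.train()
```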

Training results

Training Loss Epoch Step Validation Loss
1.5577 0.0813 10 1.5534
1.4512 0.1626 20 1.4827
1.4106 0.2439 30 1.4104
1.3419 0.3252 40 1.3460
1.2361 0.4065 50 1.2827
1.2298 0.4878 60 1.2097
1.1468 0.5691 70 1.1400
1.0874 0.6504 80 1.0724
1.0372 0.7317 90 1.0088
0.9185 0.8130 100 0.9566
0.8927 0.8943 110 0.9139
0.8264 0.9756 120 0.8724
0.8799 1.0569 130 0.8329
0.8233 1.1382 140 0.7947
0.7761 1.2195 150 0.7633
0.7568 1.3008 160 0.7407
0.6957 1.3821 170 0.7224
0.6712 1.4634 180 0.7048
0.6738 1.5447 190 0.6908
0.7165 1.6260 200 0.6781
0.5913 1.7073 210 0.6673
0.6992 1.7886 220 0.6584
0.6438 1.8699 230 0.6497
0.6649 1.9512 240 0.6425
0.5907 2.0325 250 0.6358
0.6014 2.1138 260 0.6302
0.5605 2.1951 270 0.6250
0.5893 2.2764 280 0.6209
0.5761 2.3577 290 0.6166
0.6083 2.4390 300 0.6132
0.6404 2.5203 310 0.6100
0.5949 2.6016 320 0.6076
0.6208 2.6829 330 0.6047
0.6083 2.7642 340 0.6025
0.5922 2.8455 350 0.5998
0.6377 2.9268 360 0.5980
0.6059 3.0081 370 0.5960
0.6697 3.0894 380 0.5940
0.5813 3.1707 390 0.5925
0.5442 3.2520 400 0.5911
0.506 3.3333 410 0.5889
0.5806 3.4146 420 0.5878
0.5504 3.4959 430 0.5868
0.6051 3.5772 440 0.5849
0.5952 3.6585 450 0.5838
0.5128 3.7398 460 0.5825
0.5779 3.8211 470 0.5813
0.5448 3.9024 480 0.5802
0.5559 3.9837 490 0.5796
0.6136 4.0650 500 0.5787
0.5329 4.1463 510 0.5776
0.5267 4.2276 520 0.5767
0.5492 4.3089 530 0.5763
0.5206 4.3902 540 0.5758
0.5088 4.4715 550 0.5747
0.5811 4.5528 560 0.5739
0.5865 4.6341 570 0.5728
0.5563 4.7154 580 0.5729
0.5692 4.7967 590 0.5719
0.5827 4.8780 600 0.5713
0.5551 4.9593 610 0.5715
0.5059 5.0407 620 0.5708
0.5132 5.1220 630 0.5700
0.5314 5.2033 640 0.5698
0.5614 5.2846 650 0.5696
0.5489 5.3659 660 0.5688
0.5404 5.4472 670 0.5680
0.5745 5.5285 680 0.5672
0.5083 5.6098 690 0.5673
0.5565 5.6911 700 0.5670
0.5515 5.7724 710 0.5664
0.5448 5.8537 720 0.5664
0.5276 5.9350 730 0.5657
0.5436 6.0163 740 0.5656
0.5988 6.0976 750 0.5650
0.4929 6.1789 760 0.5652
0.5957 6.2602 770 0.5645
0.4968 6.3415 780 0.5645
0.4822 6.4228 790 0.5645
0.5527 6.5041 800 0.5642
0.5663 6.5854 810 0.5640
0.493 6.6667 820 0.5634
0.4992 6.7480 830 0.5630
0.5618 6.8293 840 0.5630
0.568 6.9106 850 0.5626
0.4869 6.9919 860 0.5626
0.5418 7.0732 870 0.5625
0.5364 7.1545 880 0.5621
0.5675 7.2358 890 0.5621
0.491 7.3171 900 0.5620
0.5555 7.3984 910 0.5621
0.6093 7.4797 920 0.5621
0.5529 7.5610 930 0.5620
0.5252 7.6423 940 0.5620
0.5024 7.7236 950 0.5620
0.5639 7.8049 960 0.5616
0.4676 7.8862 970 0.5618
0.5236 7.9675 980 0.5617
0.4902 8.0488 990 0.5616
0.486 8.1301 1000 0.5617

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
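
To reproduce this environment, the versions above translate roughly into the following pins. The CUDA 12.1 PyTorch build matches the "+cu121" tag; TRL is required by the training setup but its version is not recorded on this card.

```
# requirements.txt (sketch; install the cu121 PyTorch build to match 2.4.0+cu121)
peft==0.12.0
transformers==4.44.0
torch==2.4.0
datasets==2.20.0
tokenizers==0.19.1
trl  # version not recorded on this card
```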