
mistral_docs_sum_p1_full

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.1 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5829
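
Since the base model is mistralai/Mistral-7B-Instruct-v0.1 and the model name suggests document summarization, a minimal inference sketch with transformers might look like the following. The repo id `your-username/mistral_docs_sum_p1_full` is a placeholder, as the card does not give the full Hub path for the fine-tuned weights.

```python
# Minimal inference sketch. The repo id below is hypothetical; the card
# does not state where the fine-tuned weights are hosted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/mistral_docs_sum_p1_full"  # placeholder Hub path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,  # the card lists F32 tensors; float16/bfloat16 would save memory
    device_map="auto",
)

# Mistral-Instruct models expect the [INST] chat format; apply_chat_template handles it.
messages = [{"role": "user", "content": "Summarize the following document:\n\n<document text>"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```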

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch mapping them onto TrainingArguments follows the list:

  • learning_rate: 3.6e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
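
For reproducibility, here is a sketch of how the listed values map onto `transformers.TrainingArguments`. Only the values above come from the card; the output directory and the evaluation/logging cadence (inferred from the 200-step intervals in the results table) are assumptions.

```python
# Sketch mapping the listed hyperparameters onto TrainingArguments.
# Everything not in the list above (output_dir, eval/logging cadence) is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral_docs_sum_p1_full",  # hypothetical
    learning_rate=3.6e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,               # Adam betas and epsilon as listed on the card
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,                    # "Native AMP" mixed-precision training
    evaluation_strategy="steps",  # the results table logs validation loss every 200 steps
    eval_steps=200,
    logging_steps=200,
)
```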

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.1167        | 0.0277 | 200  | 2.1333          |
| 2.3428        | 0.0553 | 400  | 1.6966          |
| 1.3784        | 0.0830 | 600  | 1.4972          |
| 1.456         | 0.1107 | 800  | 1.3942          |
| 1.3227        | 0.1383 | 1000 | 1.3084          |
| 1.2535        | 0.1660 | 1200 | 1.2001          |
| 1.0612        | 0.1937 | 1400 | 1.0451          |
| 0.8815        | 0.2213 | 1600 | 0.9632          |
| 0.8971        | 0.2490 | 1800 | 0.9132          |
| 0.7908        | 0.2767 | 2000 | 0.8712          |
| 0.7549        | 0.3043 | 2200 | 0.8309          |
| 0.8099        | 0.3320 | 2400 | 0.8058          |
| 0.6891        | 0.3597 | 2600 | 0.7879          |
| 0.5204        | 0.3873 | 2800 | 0.7684          |
| 0.6249        | 0.4150 | 3000 | 0.7515          |
| 0.6764        | 0.4427 | 3200 | 0.7342          |
| 0.6996        | 0.4703 | 3400 | 0.7214          |
| 0.6371        | 0.4980 | 3600 | 0.7084          |
| 0.6694        | 0.5257 | 3800 | 0.6951          |
| 0.7048        | 0.5533 | 4000 | 0.6845          |
| 0.7265        | 0.5810 | 4200 | 0.6778          |
| 0.5663        | 0.6087 | 4400 | 0.6657          |
| 0.6222        | 0.6363 | 4600 | 0.6595          |
| 0.6463        | 0.6640 | 4800 | 0.6488          |
| 0.5754        | 0.6917 | 5000 | 0.6410          |
| 0.6208        | 0.7193 | 5200 | 0.6363          |
| 0.5613        | 0.7470 | 5400 | 0.6275          |
| 0.6316        | 0.7747 | 5600 | 0.6227          |
| 0.6564        | 0.8023 | 5800 | 0.6159          |
| 0.633         | 0.8300 | 6000 | 0.6077          |
| 0.5268        | 0.8577 | 6200 | 0.6022          |
| 0.4166        | 0.8853 | 6400 | 0.5978          |
| 0.6539        | 0.9130 | 6600 | 0.5926          |
| 0.5695        | 0.9407 | 6800 | 0.5875          |
| 0.6358        | 0.9683 | 7000 | 0.5845          |
| 0.5318        | 0.9960 | 7200 | 0.5829          |

Framework versions

  • Transformers 4.40.1
  • PyTorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1
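
To reproduce the training environment, the pinned versions above can be verified at import time; a minimal sketch:

```python
# Quick environment check against the versions listed above.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": (transformers.__version__, "4.40.1"),
    "torch": (torch.__version__, "2.2.1+cu121"),
    "datasets": (datasets.__version__, "2.19.0"),
    "tokenizers": (tokenizers.__version__, "0.19.1"),
}
for name, (got, want) in expected.items():
    assert got == want, f"{name}: expected {want}, found {got}"
```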