ikura31/mistral_docs_sum_p1_full

	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	base_model: mistralai/Mistral-7B-Instruct-v0.1
	model-index:
	- name: mistral_docs_sum_p1_full
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# mistral_docs_sum_p1_full

	This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.5829
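
For intuition, the evaluation loss can be converted to a perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language models trained with the Trainer; the card itself does not state this.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 0.5829

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity: {perplexity:.3f}")
```

Under that assumption the perplexity comes out at roughly 1.79. Since the evaluation set is unknown, the figure is not comparable across models or datasets.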

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 3.6e-05
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 1
	- mixed_precision_training: Native AMP
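
To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes no warmup steps (none are listed above) and estimates the total step count from the training log in this card, where step 7200 corresponds to epoch 0.9960. `linear_lr` is a hypothetical helper written for illustration, not a Trainer API.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 3.6e-05,
              warmup_steps: int = 0) -> float:
    """Learning rate under optional linear warmup, then linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: total steps inferred from the log (step 7200 at epoch 0.9960).
total_steps = round(7200 / 0.9960)

for step in (0, 3600, 7200):
    print(step, f"{linear_lr(step, total_steps):.2e}")
```

With a batch size of 4, about 7,229 steps per epoch would correspond to roughly 28,900 training examples, assuming a single device and no gradient accumulation; neither is stated in the card.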

	### Training results

	| Training Loss | Epoch  | Step | Validation Loss |
	|:-------------:|:------:|:----:|:---------------:|
	| 2.1167        | 0.0277 | 200  | 2.1333          |
	| 2.3428        | 0.0553 | 400  | 1.6966          |
	| 1.3784        | 0.0830 | 600  | 1.4972          |
	| 1.456         | 0.1107 | 800  | 1.3942          |
	| 1.3227        | 0.1383 | 1000 | 1.3084          |
	| 1.2535        | 0.1660 | 1200 | 1.2001          |
	| 1.0612        | 0.1937 | 1400 | 1.0451          |
	| 0.8815        | 0.2213 | 1600 | 0.9632          |
	| 0.8971        | 0.2490 | 1800 | 0.9132          |
	| 0.7908        | 0.2767 | 2000 | 0.8712          |
	| 0.7549        | 0.3043 | 2200 | 0.8309          |
	| 0.8099        | 0.3320 | 2400 | 0.8058          |
	| 0.6891        | 0.3597 | 2600 | 0.7879          |
	| 0.5204        | 0.3873 | 2800 | 0.7684          |
	| 0.6249        | 0.4150 | 3000 | 0.7515          |
	| 0.6764        | 0.4427 | 3200 | 0.7342          |
	| 0.6996        | 0.4703 | 3400 | 0.7214          |
	| 0.6371        | 0.4980 | 3600 | 0.7084          |
	| 0.6694        | 0.5257 | 3800 | 0.6951          |
	| 0.7048        | 0.5533 | 4000 | 0.6845          |
	| 0.7265        | 0.5810 | 4200 | 0.6778          |
	| 0.5663        | 0.6087 | 4400 | 0.6657          |
	| 0.6222        | 0.6363 | 4600 | 0.6595          |
	| 0.6463        | 0.6640 | 4800 | 0.6488          |
	| 0.5754        | 0.6917 | 5000 | 0.6410          |
	| 0.6208        | 0.7193 | 5200 | 0.6363          |
	| 0.5613        | 0.7470 | 5400 | 0.6275          |
	| 0.6316        | 0.7747 | 5600 | 0.6227          |
	| 0.6564        | 0.8023 | 5800 | 0.6159          |
	| 0.633         | 0.8300 | 6000 | 0.6077          |
	| 0.5268        | 0.8577 | 6200 | 0.6022          |
	| 0.4166        | 0.8853 | 6400 | 0.5978          |
	| 0.6539        | 0.9130 | 6600 | 0.5926          |
	| 0.5695        | 0.9407 | 6800 | 0.5875          |
	| 0.6358        | 0.9683 | 7000 | 0.5845          |
	| 0.5318        | 0.9960 | 7200 | 0.5829          |
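
As a quick sanity check on the log above, the following sketch confirms that validation loss falls at every evaluation and measures how much it moved overall and at the final step. The values are copied from the table; the variable names are illustrative only.

```python
# Validation losses copied from the table above, one per evaluation (every 200 steps).
val_losses = [
    2.1333, 1.6966, 1.4972, 1.3942, 1.3084, 1.2001, 1.0451, 0.9632, 0.9132,
    0.8712, 0.8309, 0.8058, 0.7879, 0.7684, 0.7515, 0.7342, 0.7214, 0.7084,
    0.6951, 0.6845, 0.6778, 0.6657, 0.6595, 0.6488, 0.6410, 0.6363, 0.6275,
    0.6227, 0.6159, 0.6077, 0.6022, 0.5978, 0.5926, 0.5875, 0.5845, 0.5829,
]
steps = [200 * (i + 1) for i in range(len(val_losses))]

# True if every evaluation improved on the previous one.
strictly_decreasing = all(b < a for a, b in zip(val_losses, val_losses[1:]))

# Improvement over the whole run and over the last 200 steps.
total_drop = val_losses[0] - val_losses[-1]
last_delta = val_losses[-2] - val_losses[-1]

print(f"evaluations: {len(val_losses)}, final step: {steps[-1]}")
print(f"strictly decreasing: {strictly_decreasing}")
print(f"total drop: {total_drop:.4f}, last improvement: {last_delta:.4f}")
```

The final evaluation improves validation loss by only 0.0016, which suggests the curve was flattening by the end of the epoch, although a single epoch cannot show where it would plateau.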


	### Framework versions

	- Transformers 4.40.1
	- Pytorch 2.2.1+cu121
	- Datasets 2.19.0
	- Tokenizers 0.19.1
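
To check whether a local environment matches the versions listed above, a small standard-library sketch such as the following may help. The distribution names are assumptions (for example, "Pytorch" is published on PyPI as `torch`), and `compare_versions` is a hypothetical helper, not part of any of these libraries.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions listed in this card; distribution names are assumed PyPI names.
EXPECTED = {
    "transformers": "4.40.1",
    "torch": "2.2.1+cu121",
    "datasets": "2.19.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package to (expected version, installed version or None)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(EXPECTED).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the environment differs from the one used for training.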