	---
	base_model: microsoft/mpnet-base
	tags:
	- generated_from_trainer
	metrics:
	- f1
	model-index:
	- name: mpnet-base-airlines-news-multi-label
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# mpnet-base-airlines-news-multi-label

	This model is a fine-tuned version of [microsoft/mpnet-base](https://huggingface.co/microsoft/mpnet-base) on an unspecified dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.2478
	- F1: 0.8938
	- ROC AUC: 0.6465
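
The card does not state how F1 was computed for the multi-label outputs. The following is a minimal sketch of one common approach, assuming a sigmoid over each logit, a 0.5 decision threshold, and micro averaging across all labels. All three choices are assumptions, and the function names `sigmoid_threshold` and `micro_f1` are hypothetical, not taken from the training script.

```python
import math


def sigmoid_threshold(logits, threshold=0.5):
    """Turn per-label logits into 0/1 predictions.

    The sigmoid activation and the 0.5 threshold are assumptions;
    the card does not document either.
    """
    return [
        [1 if 1.0 / (1.0 + math.exp(-z)) >= threshold else 0 for z in row]
        for row in logits
    ]


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over every (example, label) pair."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data for illustration only; these are not the model's outputs.
toy_labels = [[1, 0, 1], [0, 1, 0]]
toy_logits = [[2.0, -1.0, -0.5], [-3.0, 0.2, 1.5]]
toy_preds = sigmoid_threshold(toy_logits)
toy_score = micro_f1(toy_labels, toy_preds)
```

Micro averaging weights frequent labels more heavily, so a different averaging mode (macro or weighted, for example) could give a noticeably different number on the same predictions.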

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 7e-05
	- train_batch_size: 32
	- eval_batch_size: 32
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 40
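
The hyperparameters above, together with the step counts in the results table (57 steps per epoch, 2280 steps in total), are enough to sketch the learning-rate curve and to bound the training-set size. This is an illustrative sketch: zero warmup steps, a single device, no gradient accumulation, and a kept partial last batch are all assumptions, since the card lists none of these settings. The helper names are hypothetical.

```python
# Values taken from the model card.
PEAK_LR = 7e-05
TRAIN_BATCH_SIZE = 32
NUM_EPOCHS = 40
STEPS_PER_EPOCH = 57  # the results table shows step 57 at epoch 1.0
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 2280, the last step in the table


def linear_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


def train_set_size_bounds(steps_per_epoch=STEPS_PER_EPOCH,
                          batch_size=TRAIN_BATCH_SIZE):
    """Smallest and largest training-set sizes consistent with the step count.

    Assumes one device, no gradient accumulation, and that the final
    partial batch of each epoch is kept rather than dropped.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


lr_start = linear_lr(0)
lr_midpoint = linear_lr(TOTAL_STEPS // 2)
lr_end = linear_lr(TOTAL_STEPS)
size_low, size_high = train_set_size_bounds()
```

Under these assumptions the learning rate falls linearly from 7e-05 at step 0 to 0 at step 2280, and the training set holds between 1793 and 1824 examples.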

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | F1     | ROC AUC |
	|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
	| No log        | 1.0   | 57   | 0.3726          | 0.8319 | 0.5     |
	| No log        | 2.0   | 114  | 0.3361          | 0.8319 | 0.5     |
	| No log        | 3.0   | 171  | 0.3303          | 0.8319 | 0.5     |
	| No log        | 4.0   | 228  | 0.3249          | 0.8319 | 0.5     |
	| No log        | 5.0   | 285  | 0.3188          | 0.8319 | 0.5     |
	| No log        | 6.0   | 342  | 0.3141          | 0.8319 | 0.5     |
	| No log        | 7.0   | 399  | 0.3089          | 0.8319 | 0.5     |
	| No log        | 8.0   | 456  | 0.3042          | 0.8319 | 0.5     |
	| 0.3595        | 9.0   | 513  | 0.2997          | 0.8319 | 0.5     |
	| 0.3595        | 10.0  | 570  | 0.2940          | 0.8319 | 0.5     |
	| 0.3595        | 11.0  | 627  | 0.2898          | 0.8319 | 0.5     |
	| 0.3595        | 12.0  | 684  | 0.2856          | 0.8463 | 0.5032  |
	| 0.3595        | 13.0  | 741  | 0.2819          | 0.8593 | 0.5096  |
	| 0.3595        | 14.0  | 798  | 0.2789          | 0.8600 | 0.5128  |
	| 0.3595        | 15.0  | 855  | 0.2757          | 0.8701 | 0.5220  |
	| 0.3595        | 16.0  | 912  | 0.2723          | 0.8733 | 0.5312  |
	| 0.3595        | 17.0  | 969  | 0.2698          | 0.8733 | 0.5312  |
	| 0.2983        | 18.0  | 1026 | 0.2670          | 0.8808 | 0.5629  |
	| 0.2983        | 19.0  | 1083 | 0.2652          | 0.8814 | 0.5661  |
	| 0.2983        | 20.0  | 1140 | 0.2630          | 0.8786 | 0.5744  |
	| 0.2983        | 21.0  | 1197 | 0.2612          | 0.8807 | 0.5840  |
	| 0.2983        | 22.0  | 1254 | 0.2596          | 0.8818 | 0.5900  |
	| 0.2983        | 23.0  | 1311 | 0.2580          | 0.8841 | 0.6024  |
	| 0.2983        | 24.0  | 1368 | 0.2562          | 0.8878 | 0.6153  |
	| 0.2983        | 25.0  | 1425 | 0.2555          | 0.8851 | 0.6056  |
	| 0.2983        | 26.0  | 1482 | 0.2544          | 0.8860 | 0.6088  |
	| 0.2747        | 27.0  | 1539 | 0.2535          | 0.8868 | 0.6148  |
	| 0.2747        | 28.0  | 1596 | 0.2527          | 0.8878 | 0.6153  |
	| 0.2747        | 29.0  | 1653 | 0.2519          | 0.8869 | 0.6121  |
	| 0.2747        | 30.0  | 1710 | 0.2512          | 0.8875 | 0.6180  |
	| 0.2747        | 31.0  | 1767 | 0.2501          | 0.8900 | 0.6277  |
	| 0.2747        | 32.0  | 1824 | 0.2495          | 0.8923 | 0.6401  |
	| 0.2747        | 33.0  | 1881 | 0.2492          | 0.8907 | 0.6337  |
	| 0.2747        | 34.0  | 1938 | 0.2488          | 0.8922 | 0.6401  |
	| 0.2747        | 35.0  | 1995 | 0.2485          | 0.8915 | 0.6369  |
	| 0.2633        | 36.0  | 2052 | 0.2480          | 0.8922 | 0.6401  |
	| 0.2633        | 37.0  | 2109 | 0.2478          | 0.8938 | 0.6465  |
	| 0.2633        | 38.0  | 2166 | 0.2477          | 0.8930 | 0.6433  |
	| 0.2633        | 39.0  | 2223 | 0.2476          | 0.8938 | 0.6465  |
	| 0.2633        | 40.0  | 2280 | 0.2476          | 0.8938 | 0.6465  |


	### Framework versions

	- Transformers 4.41.1
	- PyTorch 2.3.0+cu121
	- Datasets 2.19.1
	- Tokenizers 0.19.1
