	---
	license: mit
	base_model: microsoft/Multilingual-MiniLM-L12-H384
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	model-index:
	- name: intent_trading
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# intent_trading

	This model is a fine-tuned version of [microsoft/Multilingual-MiniLM-L12-H384](https://huggingface.co/microsoft/Multilingual-MiniLM-L12-H384) on an unspecified dataset (the Trainer did not record a dataset name).
	It achieves the following results on the evaluation set:
	- Loss: 0.1741
	- Accuracy: 0.9548
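
A minimal inference sketch follows. It assumes this checkpoint carries a sequence-classification head (suggested by the model name and the accuracy metric, but not confirmed by this card) and that it is published under some Hub repository id. The helper name `classify_intent` and the `model_id` argument are hypothetical; the card does not give the repository namespace.

```python
def classify_intent(texts, model_id):
    """Classify a list of strings with a fine-tuned checkpoint.

    `model_id` is a Hub repository id or a local directory. The real id of
    this model is not stated in the card, so the caller must supply it.
    """
    # Imported lazily so that defining this helper does not require the
    # transformers package to be installed.
    from transformers import pipeline

    # Assumption: the checkpoint was saved with a text-classification head.
    classifier = pipeline("text-classification", model=model_id)
    return classifier(texts)


# Example call (not executed here, since it downloads the model):
# classify_intent(["some user message"], model_id="<namespace>/intent_trading")
```

Each prediction is a dict with `label` and `score` keys. If no `id2label` mapping was saved with the checkpoint, the labels appear as generic names such as `LABEL_0`.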

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 40
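
As a rough illustration of what these settings imply, the sketch below computes the learning rate under linear decay and bounds the training set size from the step counts. The total of 9080 steps is taken from the last row of the training results table. Everything else is an assumption not stated in this card: zero warmup steps, decay to exactly zero, a single device, no gradient accumulation, and the last partial batch being kept.

```python
# Values copied from the model card.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 64
NUM_EPOCHS = 40
TOTAL_STEPS = 9080  # final step in the training results table

# Assumption: the card does not report warmup, so none is modelled here.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate after `step` updates under linear warmup then decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * (step / max(1, WARMUP_STEPS))
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * (remaining / max(1, TOTAL_STEPS - WARMUP_STEPS))


# 9080 steps over 40 epochs gives 227 optimizer steps per epoch.
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# ceil(n / 64) == 227 holds for every n in this inclusive range, so the
# training set size is bounded (under the assumptions listed above).
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("training examples between", min_examples, "and", max_examples)
    print("lr at the halfway step:", linear_lr(TOTAL_STEPS // 2))
```

Under these assumptions the learning rate halves by step 4540 and reaches zero at step 9080. If training used several devices or gradient accumulation, the example-count bounds scale accordingly.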

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|
	| No log | 1.0 | 227 | 1.5904 | 0.7689 |
	| No log | 2.0 | 454 | 1.0086 | 0.8670 |
	| 1.6528 | 3.0 | 681 | 0.6706 | 0.9055 |
	| 1.6528 | 4.0 | 908 | 0.4376 | 0.9518 |
	| 0.6124 | 5.0 | 1135 | 0.2966 | 0.9551 |
	| 0.6124 | 6.0 | 1362 | 0.2373 | 0.9504 |
	| 0.2536 | 7.0 | 1589 | 0.1967 | 0.9537 |
	| 0.2536 | 8.0 | 1816 | 0.1666 | 0.9565 |
	| 0.1476 | 9.0 | 2043 | 0.1642 | 0.9543 |
	| 0.1476 | 10.0 | 2270 | 0.1570 | 0.9551 |
	| 0.1476 | 11.0 | 2497 | 0.1500 | 0.9543 |
	| 0.1067 | 12.0 | 2724 | 0.1469 | 0.9548 |
	| 0.1067 | 13.0 | 2951 | 0.1458 | 0.9557 |
	| 0.0817 | 14.0 | 3178 | 0.1409 | 0.9540 |
	| 0.0817 | 15.0 | 3405 | 0.1426 | 0.9595 |
	| 0.0709 | 16.0 | 3632 | 0.1418 | 0.9540 |
	| 0.0709 | 17.0 | 3859 | 0.1416 | 0.9557 |
	| 0.0631 | 18.0 | 4086 | 0.1373 | 0.9581 |
	| 0.0631 | 19.0 | 4313 | 0.1458 | 0.9559 |
	| 0.0557 | 20.0 | 4540 | 0.1391 | 0.9559 |
	| 0.0557 | 21.0 | 4767 | 0.1526 | 0.9518 |
	| 0.0557 | 22.0 | 4994 | 0.1511 | 0.9529 |
	| 0.0495 | 23.0 | 5221 | 0.1578 | 0.9526 |
	| 0.0495 | 24.0 | 5448 | 0.1360 | 0.9568 |
	| 0.0443 | 25.0 | 5675 | 0.1451 | 0.9565 |
	| 0.0443 | 26.0 | 5902 | 0.1477 | 0.9562 |
	| 0.0419 | 27.0 | 6129 | 0.1624 | 0.9540 |
	| 0.0419 | 28.0 | 6356 | 0.1659 | 0.9537 |
	| 0.0371 | 29.0 | 6583 | 0.1607 | 0.9548 |
	| 0.0371 | 30.0 | 6810 | 0.1638 | 0.9543 |
	| 0.035 | 31.0 | 7037 | 0.1655 | 0.9529 |
	| 0.035 | 32.0 | 7264 | 0.1662 | 0.9562 |
	| 0.035 | 33.0 | 7491 | 0.1702 | 0.9532 |
	| 0.033 | 34.0 | 7718 | 0.1662 | 0.9562 |
	| 0.033 | 35.0 | 7945 | 0.1667 | 0.9532 |
	| 0.0309 | 36.0 | 8172 | 0.1794 | 0.9554 |
	| 0.0309 | 37.0 | 8399 | 0.1756 | 0.9546 |
	| 0.0292 | 38.0 | 8626 | 0.1722 | 0.9559 |
	| 0.0292 | 39.0 | 8853 | 0.1706 | 0.9559 |
	| 0.0281 | 40.0 | 9080 | 0.1741 | 0.9548 |
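
The headline loss and accuracy on this card match the epoch 40 row, while validation loss bottoms out earlier. The sketch below uses the rows from the table above to locate the epochs with the lowest validation loss and the highest accuracy. The card does not say which checkpoint was kept, so treating epoch 40 as the published one is an assumption.

```python
# (epoch, validation_loss, accuracy) copied from the training results table.
ROWS = [
    (1, 1.5904, 0.7689), (2, 1.0086, 0.8670), (3, 0.6706, 0.9055),
    (4, 0.4376, 0.9518), (5, 0.2966, 0.9551), (6, 0.2373, 0.9504),
    (7, 0.1967, 0.9537), (8, 0.1666, 0.9565), (9, 0.1642, 0.9543),
    (10, 0.1570, 0.9551), (11, 0.1500, 0.9543), (12, 0.1469, 0.9548),
    (13, 0.1458, 0.9557), (14, 0.1409, 0.9540), (15, 0.1426, 0.9595),
    (16, 0.1418, 0.9540), (17, 0.1416, 0.9557), (18, 0.1373, 0.9581),
    (19, 0.1458, 0.9559), (20, 0.1391, 0.9559), (21, 0.1526, 0.9518),
    (22, 0.1511, 0.9529), (23, 0.1578, 0.9526), (24, 0.1360, 0.9568),
    (25, 0.1451, 0.9565), (26, 0.1477, 0.9562), (27, 0.1624, 0.9540),
    (28, 0.1659, 0.9537), (29, 0.1607, 0.9548), (30, 0.1638, 0.9543),
    (31, 0.1655, 0.9529), (32, 0.1662, 0.9562), (33, 0.1702, 0.9532),
    (34, 0.1662, 0.9562), (35, 0.1667, 0.9532), (36, 0.1794, 0.9554),
    (37, 0.1756, 0.9546), (38, 0.1722, 0.9559), (39, 0.1706, 0.9559),
    (40, 0.1741, 0.9548),
]

# Epoch with the lowest validation loss, and epoch with the highest accuracy.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_accuracy = max(ROWS, key=lambda row: row[2])
final = ROWS[-1]

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest accuracy:", best_by_accuracy)
    print("final epoch:", final)
```

Validation loss is lowest at epoch 24 (0.1360) and accuracy peaks at epoch 15 (0.9595), while training loss keeps falling through epoch 40. That pattern is consistent with mild overfitting in the later epochs. In a rerun, the `load_best_model_at_end` and `metric_for_best_model` options of `TrainingArguments` could be used to keep an earlier checkpoint.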


	### Framework versions

	- Transformers 4.40.2
	- Pytorch 2.1.0+cu121
	- Datasets 2.14.5
	- Tokenizers 0.19.1
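
To compare a local environment against the versions listed above, a small check along these lines may help. The helper name `compare_versions` is hypothetical, and the mapping from "Pytorch" to the `torch` distribution name is the only detail not copied from the list.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above; keys are PyPI distribution names.
PINNED = {
    "transformers": "4.40.2",
    "torch": "2.1.0+cu121",
    "datasets": "2.14.5",
    "tokenizers": "0.19.1",
}


def compare_versions(pinned=PINNED):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for package, expected in pinned.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (expected, installed, installed == expected)
    return report


if __name__ == "__main__":
    for package, (expected, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{package}: expected {expected}, found {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the check only flags where the environment differs from the one used for training.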
