	---
	language:
	- en
	license: apache-2.0
	tags:
	- generated_from_trainer
	datasets:
	- glue
	metrics:
	- accuracy
	base_model: bert-base-uncased
	model-index:
	- name: jpqd-bert-base-ft-sst2
	  results:
	  - task:
	      type: text-classification
	      name: Text Classification
	    dataset:
	      name: GLUE SST2
	      type: glue
	      config: sst2
	      split: validation
	      args: sst2
	    metrics:
	    - type: accuracy
	      value: 0.9162844036697247
	      name: Accuracy
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# jpqd-bert-base-ft-sst2

	This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the GLUE SST2 dataset.

	It was compressed with [NNCF](https://github.com/openvinotoolkit/nncf) using joint pruning, quantization and distillation (JPQD), following the [Optimum JPQD text-classification
	example](https://github.com/huggingface/optimum-intel/tree/main/examples/openvino/text-classification).

	It achieves the following results on the evaluation set:
	- Loss: 0.2798
	- Accuracy: 0.9163
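
The accuracy above can be read as a count of correct predictions. The short sketch below assumes the evaluation set is the standard GLUE SST-2 validation split with 872 examples (the card names the split but not its size) and recovers the implied number of correct and incorrect predictions from the full-precision value in the metadata.

```python
# Assumption: the GLUE SST-2 validation split has 872 examples
# (the model card names the split but does not state its size).
VALIDATION_EXAMPLES = 872

# Full-precision accuracy as reported in the card metadata.
reported_accuracy = 0.9162844036697247

# Number of correct predictions implied by the reported accuracy.
correct = round(reported_accuracy * VALIDATION_EXAMPLES)
incorrect = VALIDATION_EXAMPLES - correct

print(f"correct={correct}, incorrect={incorrect}")
print(f"accuracy={correct / VALIDATION_EXAMPLES:.4f}")
```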

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 32
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 5.0
	- mixed_precision_training: Native AMP
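
To make the `learning_rate` and `lr_scheduler_type: linear` entries concrete, here is a minimal sketch of a linear decay schedule in plain Python. It is not the training code used for this model: the function name `linear_lr` is hypothetical, the absence of warmup is an assumption (the card lists no warmup setting), and `TOTAL_STEPS` assumes 5 epochs of 2,105 optimizer steps each.

```python
BASE_LR = 2e-05       # learning_rate from the list above
TOTAL_STEPS = 10_525  # assumption: 5 epochs x 2,105 steps per epoch


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` for linear warmup followed by linear decay to zero.

    `warmup_steps=0` is an assumption; the card does not list a warmup value.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 2500, 5000, 7500, 10_000):
    print(step, f"{linear_lr(step):.3e}")
```

Under these assumptions the rate falls steadily from 2e-05 at step 0 to zero at the final step.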

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:-----:|:---------------:|:--------:|
	| 0.392 | 0.12 | 250 | 0.4535 | 0.8888 |
	| 0.4413 | 0.24 | 500 | 0.4671 | 0.8899 |
	| 0.29 | 0.36 | 750 | 0.3285 | 0.9128 |
	| 0.2851 | 0.48 | 1000 | 0.2498 | 0.9151 |
	| 0.3717 | 0.59 | 1250 | 0.2037 | 0.9243 |
	| 0.2467 | 0.71 | 1500 | 0.2840 | 0.9174 |
	| 0.2114 | 0.83 | 1750 | 0.2239 | 0.9243 |
	| 0.1777 | 0.95 | 2000 | 0.1968 | 0.9266 |
	| 2.6501 | 1.07 | 2250 | 2.8219 | 0.9255 |
	| 6.4768 | 1.19 | 2500 | 6.5765 | 0.8979 |
	| 9.3594 | 1.31 | 2750 | 9.4648 | 0.8819 |
	| 11.5481 | 1.43 | 3000 | 11.5391 | 0.8567 |
	| 12.7541 | 1.54 | 3250 | 12.8359 | 0.8578 |
	| 13.6184 | 1.66 | 3500 | 13.6519 | 0.8429 |
	| 13.9171 | 1.78 | 3750 | 14.0734 | 0.8475 |
	| 13.9601 | 1.9 | 4000 | 14.1024 | 0.8578 |
	| 0.2701 | 2.02 | 4250 | 0.3354 | 0.9048 |
	| 0.2689 | 2.14 | 4500 | 0.3320 | 0.9048 |
	| 0.1775 | 2.26 | 4750 | 0.2838 | 0.9163 |
	| 0.1648 | 2.38 | 5000 | 0.2842 | 0.9128 |
	| 0.1316 | 2.49 | 5250 | 0.2750 | 0.9163 |
	| 0.2349 | 2.61 | 5500 | 0.2405 | 0.9232 |
	| 0.066 | 2.73 | 5750 | 0.2695 | 0.9174 |
	| 0.1285 | 2.85 | 6000 | 0.3017 | 0.9094 |
	| 0.1813 | 2.97 | 6250 | 0.3472 | 0.9106 |
	| 0.078 | 3.09 | 6500 | 0.2915 | 0.9140 |
	| 0.0886 | 3.21 | 6750 | 0.2853 | 0.9151 |
	| 0.117 | 3.33 | 7000 | 0.2689 | 0.9186 |
	| 0.0894 | 3.44 | 7250 | 0.2748 | 0.9174 |
	| 0.1023 | 3.56 | 7500 | 0.3279 | 0.9094 |
	| 0.0495 | 3.68 | 7750 | 0.2988 | 0.9151 |
	| 0.0899 | 3.8 | 8000 | 0.2796 | 0.9174 |
	| 0.1102 | 3.92 | 8250 | 0.2667 | 0.9163 |
	| 0.061 | 4.04 | 8500 | 0.2837 | 0.9174 |
	| 0.0594 | 4.16 | 8750 | 0.2766 | 0.9151 |
	| 0.1062 | 4.28 | 9000 | 0.2777 | 0.9140 |
	| 0.0751 | 4.39 | 9250 | 0.2690 | 0.9220 |
	| 0.0386 | 4.51 | 9500 | 0.2668 | 0.9163 |
	| 0.0284 | 4.63 | 9750 | 0.2812 | 0.9186 |
	| 0.1016 | 4.75 | 10000 | 0.2825 | 0.9163 |
	| 0.0507 | 4.87 | 10250 | 0.2805 | 0.9140 |
	| 0.0709 | 4.99 | 10500 | 0.2855 | 0.9140 |
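
The Epoch and Step columns in the table are related through the number of optimizer steps per epoch. The sketch below reproduces that mapping under two assumptions that the card does not state: the GLUE SST-2 training split has 67,349 examples, and training ran on a single device without gradient accumulation, so one step consumes one batch of 32. The helper name `epoch_at` is hypothetical.

```python
import math

# Assumption: size of the GLUE SST-2 training split (not stated in the card).
TRAIN_EXAMPLES = 67_349
# train_batch_size from the hyperparameters; assumes one device and
# no gradient accumulation, so each optimizer step consumes one batch.
TRAIN_BATCH_SIZE = 32

steps_per_epoch = math.ceil(TRAIN_EXAMPLES / TRAIN_BATCH_SIZE)


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps, rounded to 2 places."""
    return round(step / steps_per_epoch, 2)


print("steps per epoch:", steps_per_epoch)
for step in (250, 2000, 4250, 10_500):
    print(step, epoch_at(step))
```

With these assumptions the computed epochs agree with the logged values, for example 0.12 at step 250 and 4.99 at step 10500.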


	### Framework versions

	- Transformers 4.26.1
	- PyTorch 1.13.1+cu117
	- Datasets 2.8.0
	- Tokenizers 0.13.2