Intel/roberta-base-mrpc-int8-static-inc

	---
	language:
	- en
	license: mit
	tags:
	- text-classification
	- int8
	- Intel® Neural Compressor
	- PostTrainingStatic
	datasets:
	- glue
	metrics:
	- f1
	model-index:
	- name: roberta-base-mrpc-int8-static
	  results:
	  - task:
	      name: Text Classification
	      type: text-classification
	    dataset:
	      name: GLUE MRPC
	      type: glue
	      args: mrpc
	    metrics:
	    - name: F1
	      type: f1
	      value: 0.924693520140105
	---
	# INT8 roberta-base-mrpc

	### Post-training static quantization

	This is an INT8 PyTorch model quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor).

	The original FP32 model is the fine-tuned [roberta-base-mrpc](https://huggingface.co/Intel/roberta-base-mrpc).

	The calibration dataloader is the train dataloader. The default calibration sampling size of 300 is not a multiple of the batch size 8, so calibration runs 38 full batches and the effective sampling size is 304.
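The rounding described above can be checked with a few lines of arithmetic. This is only a sketch of the calculation, assuming calibration always consumes whole batches; the variable names are illustrative and are not Intel® Neural Compressor parameters.

```python
import math

requested_samples = 300  # default calibration sampling size (from the text)
batch_size = 8           # calibration batch size (from the text)

# Assumption: calibration consumes whole batches, so the batch count rounds up.
num_batches = math.ceil(requested_samples / batch_size)
effective_samples = num_batches * batch_size

print(num_batches, effective_samples)  # 38 304
```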

	The embedding module `roberta.embeddings.token_type_embeddings` falls back to FP32 because quantizing it raises `RuntimeError('Expect weight, indices, and offsets to be contiguous.')`.
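For readers unfamiliar with the technique, the following is a minimal pure-Python sketch of the idea behind post-training static quantization: a value range is observed on calibration data, then a fixed scale and zero point are derived from it and reused at inference time. It is an illustration only, not how Intel® Neural Compressor or PyTorch implement it; the helper names (`calibrate`, `quantize`, `dequantize`), the sample values, and the unsigned 8-bit range are all assumptions made for the example.

```python
def calibrate(samples):
    """Derive a fixed (scale, zero_point) from observed calibration values.

    Assumption: asymmetric quantization onto the unsigned range [0, 255].
    """
    lo = min(min(samples), 0.0)  # the range must include zero
    hi = max(max(samples), 0.0)
    scale = (hi - lo) / 255 or 1.0  # avoid a zero scale for constant inputs
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point):
    """Map a float to an integer in [0, 255] using the calibrated parameters."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))


def dequantize(q, scale, zero_point):
    """Map the integer back to an approximate float."""
    return (q - zero_point) * scale


# Hypothetical activation values seen during calibration.
calibration_samples = [-1.0, -0.25, 0.5, 3.0]
scale, zero_point = calibrate(calibration_samples)

restored = [
    dequantize(quantize(x, scale, zero_point), scale, zero_point)
    for x in calibration_samples
]
```

For values inside the calibrated range, the round-trip error is at most half of `scale`, which is why a representative calibration set matters.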

	### Test result

	|   | INT8 | FP32 |
	|---|:---:|:---:|
	| Accuracy (eval-f1) | 0.9247 | 0.9138 |
	| Model size (MB) | 121 | 476 |
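To put the table in perspective, the size reduction and the F1 difference can be computed directly from the reported numbers. The snippet below is just that arithmetic; the variable names are illustrative.

```python
# Values copied from the table above.
int8_f1, fp32_f1 = 0.9247, 0.9138
int8_mb, fp32_mb = 121, 476

size_ratio = fp32_mb / int8_mb  # how many times smaller the INT8 model is
f1_delta = int8_f1 - fp32_f1    # positive means INT8 scored higher

print(f"size reduction: {size_ratio:.2f}x")  # size reduction: 3.93x
print(f"F1 change: {f1_delta:+.4f}")         # F1 change: +0.0109
```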

	### Load with Intel® Neural Compressor:

	```python
	from neural_compressor.utils.load_huggingface import OptimizedModel

	# Download the quantized checkpoint from the Hugging Face Hub and load it.
	int8_model = OptimizedModel.from_pretrained(
	    'Intel/roberta-base-mrpc-int8-static',
	)
	```