---
language:
- en
license: apache-2.0
tags:
- text-classification
- int8
- Intel® Neural Compressor
- PostTrainingDynamic
- onnx
datasets:
- glue
metrics:
- f1
model-index:
- name: bart-large-mrpc-int8-dynamic
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: GLUE MRPC
      type: glue
      args: mrpc
    metrics:
    - name: F1
      type: f1
      value: 0.9050847457627118
---
# INT8 bart-large-mrpc

## Post-training dynamic quantization

### PyTorch

This is an INT8 PyTorch model quantized with [huggingface/optimum-intel](https://github.com/huggingface/optimum-intel) using [Intel® Neural Compressor](https://github.com/intel/neural-compressor).

The original fp32 model comes from the fine-tuned model [bart-large-mrpc](https://huggingface.co/Intel/bart-large-mrpc).
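Post-training dynamic quantization converts the FP32 weights to INT8 offline and computes activation scales on the fly at inference time, so no calibration dataset is required. The weight step can be sketched as a symmetric per-tensor mapping; this is an illustration only, and Neural Compressor's actual scheme may differ (e.g. per-channel scales or asymmetric zero-points):

```python
import numpy as np

def quantize_dynamic(w: np.ndarray):
    """Symmetric per-tensor INT8 quantization: scale from the tensor's range."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an FP32 approximation of the original tensor."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_dynamic(w)
w_hat = dequantize(q, scale)
# Rounding bounds the per-element reconstruction error by about scale / 2.
print(np.abs(w - w_hat).max())
```

Storing INT8 weights instead of FP32 is what yields the roughly 4x size reduction reported in the tables below, at the cost of a small accuracy change.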

#### Test result

|   |INT8|FP32|
|---|:---:|:---:|
| **Accuracy (eval-f1)** |0.9051|0.9120|
| **Model size (MB)**  |547|1556.48|

#### Load with optimum:

```python
from optimum.intel import INCModelForSequenceClassification

model_id = "Intel/bart-large-mrpc-int8-dynamic"
int8_model = INCModelForSequenceClassification.from_pretrained(model_id)
```

### ONNX

This is an INT8 ONNX model quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor).

The original fp32 model comes from the fine-tuned model [bart-large-mrpc](https://huggingface.co/Intel/bart-large-mrpc).

#### Test result

|   |INT8|FP32|
|---|:---:|:---:|
| **Accuracy (eval-f1)** |0.9236|0.9120|
| **Model size (MB)**  |764|1555|


#### Load ONNX model:

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
model = ORTModelForSequenceClassification.from_pretrained('Intel/bart-large-mrpc-int8-dynamic')
```