---
language: en
license: apache-2.0
datasets:
  - sst2
  - glue
metrics:
  - accuracy
tags:
  - text-classification
  - neural-compressor
  - int8
---

# Dynamically quantized and pruned DistilBERT base uncased finetuned SST-2

## Table of Contents

- [Model Details](#model-details)
- [How to Get Started With the Model](#how-to-get-started-with-the-model)

## Model Details

**Model Description:** This model is a DistilBERT model fine-tuned on SST-2, then dynamically quantized and pruned with a magnitude pruning strategy to reach 10% sparsity, using optimum-intel and Intel® Neural Compressor. A minimal sketch of the dynamic quantization step is given after the list below.

- **Model Type:** Text Classification
- **Language(s):** English
- **License:** Apache-2.0
- **Parent Model:** For more details on the original model, we encourage users to check out this model card.
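
As an illustration only (not the exact recipe used to produce this checkpoint), the sketch below shows how the dynamic quantization step can be applied with optimum-intel's `INCQuantizer` and Neural Compressor's `PostTrainingQuantConfig`. The starting checkpoint name and the output directory are assumptions, and the magnitude pruning performed during fine-tuning is not shown here.

```python
from transformers import AutoModelForSequenceClassification
from neural_compressor.config import PostTrainingQuantConfig
from optimum.intel import INCQuantizer

# Assumption: start from the standard SST-2 fine-tuned DistilBERT checkpoint.
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Post-training dynamic quantization: weights are quantized ahead of time,
# activations are quantized on the fly at inference time.
quantization_config = PostTrainingQuantConfig(approach="dynamic")

quantizer = INCQuantizer.from_pretrained(model)
# Apply quantization and save the resulting INT8 model (directory name is illustrative).
quantizer.quantize(
    quantization_config=quantization_config,
    save_directory="distilbert-sst2-int8-dynamic",
)
```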

## How to Get Started With the Model

This requires Optimum to be installed: `pip install optimum[neural-compressor]`

To load the quantized model and run inference through the Transformers `pipeline` API, you can do the following:

```python
from transformers import AutoTokenizer, pipeline
from optimum.intel import INCModelForSequenceClassification

model_id = "echarlaix/distilbert-sst2-inc-dynamic-quantization-magnitude-pruning-0.1"
model = INCModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
cls_pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
text = "He's a dreadful magician."
outputs = cls_pipe(text)
```
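
The pipeline returns a list with one dictionary per input text, each containing a predicted `label` and its `score`; for the example sentence above, the model is expected to predict the `NEGATIVE` label.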