---
license: apache-2.0
---
|
|
|
This repo contains an optimized version of [Detoxify](https://github.com/unitaryai/detoxify/), which requires less disk space and less memory at the cost of a small drop in accuracy.
|
|
|
This is an experiment for me to learn how to use [🤗 Optimum](https://huggingface.co/docs/optimum/index). |
|
|
|
# Usage |
|
|
|
Loading the model requires the [🤗 Optimum](https://huggingface.co/docs/optimum/index) library to be installed.
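
If it isn't installed yet, both Optimum and ONNX Runtime come in through a single pip extra, e.g. `pip install optimum[onnxruntime]` (or `optimum[onnxruntime-gpu]` to run on GPU).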
|
|
|
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from optimum.pipelines import pipeline as opt_pipeline
from transformers import AutoTokenizer

# load the tokenizer and the ONNX model from the Hub
tokenizer = AutoTokenizer.from_pretrained("dcferreira/detoxify-optimized")
model = ORTModelForSequenceClassification.from_pretrained("dcferreira/detoxify-optimized")

pipe = opt_pipeline(
    model=model,
    task="text-classification",
    function_to_apply="sigmoid",
    accelerator="ort",
    tokenizer=tokenizer,
    top_k=None,  # return scores for all the labels, model was trained as multilabel
)

# the model is multilingual, so inputs may mix languages
print(pipe(['example text','exemple de texte','texto de ejemplo','testo di esempio','texto de exemplo','örnek metin','пример текста']))
```
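
With `top_k=None`, each input yields a list of `{'label': ..., 'score': ...}` dicts, one per toxicity label, and since `function_to_apply="sigmoid"` the scores are independent probabilities in [0, 1] rather than a softmax distribution.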
|
|
|
# Performance |
|
|
|
The table below compares some statistics of running the original model, the original model on [onnxruntime](https://onnxruntime.ai/), and the model optimized with onnxruntime.
|
|
|
|
|
| model          | Accuracy (%) | Samples/second (CPU) | Samples/second (GPU) | GPU VRAM | Disk Space |
|----------------|--------------|----------------------|----------------------|----------|------------|
| original       | 92.1083      | 16                   | 250                  | 3GB      | 1.1GB      |
| ort            | 92.1067      | 19                   | 340                  | 4GB      | 1.1GB      |
| optimized (O4) | 92.1031      | 14                   | 650                  | 2GB      | 540MB      |
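
For context, "O4" refers to Optimum's highest auto-optimization level (maximal graph optimizations plus fp16, GPU-only). Below is a minimal sketch of producing such a model with Optimum's `ORTOptimizer`; the source checkpoint named here is an assumption for illustration, not necessarily the exact one used for this repo:

```python
from optimum.onnxruntime import (
    AutoOptimizationConfig,
    ORTModelForSequenceClassification,
    ORTOptimizer,
)

# export a PyTorch checkpoint to ONNX; the multilingual Detoxify
# checkpoint below is an assumption for illustration
model = ORTModelForSequenceClassification.from_pretrained(
    "unitary/multilingual-toxic-xlm-roberta", export=True
)

optimizer = ORTOptimizer.from_pretrained(model)
optimizer.optimize(
    save_dir="detoxify-optimized-o4",
    optimization_config=AutoOptimizationConfig.O4(),  # fp16, GPU-only optimizations
)
```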
|
|
|
For details on how these numbers were obtained, check out `evaluate.py` in this repo.
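
The GPU numbers above assume inference runs on CUDA. With Optimum this comes down to picking the CUDA execution provider when loading the model; a minimal sketch, assuming `onnxruntime-gpu` is installed in place of `onnxruntime`:

```python
from optimum.onnxruntime import ORTModelForSequenceClassification

# select the CUDA execution provider (requires the onnxruntime-gpu package)
model = ORTModelForSequenceClassification.from_pretrained(
    "dcferreira/detoxify-optimized",
    provider="CUDAExecutionProvider",
)
```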
|
|