---
license: apache-2.0
---

This repo has an optimized version of [Detoxify](https://github.com/unitaryai/detoxify/), which needs less disk space and less memory at the cost of a small drop in accuracy. This is an experiment for me to learn how to use [🤗 Optimum](https://huggingface.co/docs/optimum/index).

# Usage

Loading the model requires the [🤗 Optimum](https://huggingface.co/docs/optimum/index) library to be installed.

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from optimum.pipelines import pipeline as opt_pipeline
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dcferreira/detoxify-optimized")
model = ORTModelForSequenceClassification.from_pretrained("dcferreira/detoxify-optimized")
pipe = opt_pipeline(
    model=model,
    task="text-classification",
    function_to_apply="sigmoid",
    accelerator="ort",
    tokenizer=tokenizer,
    top_k=None,  # return scores for all the labels; the model was trained as multilabel
)

print(pipe([
    "example text",
    "exemple de texte",
    "texto de ejemplo",
    "testo di esempio",
    "texto de exemplo",
    "örnek metin",
    "пример текста",
]))
```

# Performance

The table below compares some statistics on running the original model, the original model with [onnxruntime](https://onnxruntime.ai/), and the model optimized with onnxruntime.

| model          | Accuracy (%) | Samples p/ second (CPU) | Samples p/ second (GPU) | GPU VRAM | Disk Space |
|----------------|--------------|-------------------------|-------------------------|----------|------------|
| original       | 92.1083      | 16                      | 250                     | 3GB      | 1.1GB      |
| ort            | 92.1067      | 19                      | 340                     | 4GB      | 1.1GB      |
| optimized (O4) | 92.1031      | 14                      | 650                     | 2GB      | 540MB      |

For details on how these numbers were reached, check out `evaluate.py` in this repo.