How to quantize

#10
by supercharge19 - opened

Is there example code to quantize this model, or is there a quantized version already available?

I found this: https://huggingface.co/WhereIsAI/UAE-Large-V1/blob/main/onnx/model_quantized.onnx

But how do I use it? Please give an example.

WhereIsAI org

@supercharge19 hi, you can use optimum to load the quantized ONNX model as follows:

from optimum.onnxruntime import ORTModelForFeatureExtraction
from optimum.pipelines import pipeline
from transformers import AutoTokenizer

# load the pre-quantized ONNX weights shipped in the repo's onnx/ folder
model = ORTModelForFeatureExtraction.from_pretrained('WhereIsAI/UAE-Large-V1', file_name="onnx/model_quantized.onnx")
# pass the tokenizer explicitly so the pipeline doesn't have to guess it
tokenizer = AutoTokenizer.from_pretrained('WhereIsAI/UAE-Large-V1')

extractor = pipeline('feature-extraction', model=model, tokenizer=tokenizer)
output = extractor('hello world')
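
If you'd rather quantize the model yourself instead of downloading the pre-quantized file, optimum's ORTQuantizer can apply dynamic int8 quantization to the exported ONNX graph. This is only a rough sketch, not a recipe from the model authors: the save directory name and the avx512_vnni config are assumptions you should adapt to your own CPU.

from optimum.onnxruntime import ORTModelForFeatureExtraction, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# export the PyTorch checkpoint to ONNX first
model = ORTModelForFeatureExtraction.from_pretrained('WhereIsAI/UAE-Large-V1', export=True)

# dynamic int8 quantization; avx512_vnni is an assumption, pick a config matching your hardware
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)

quantizer = ORTQuantizer.from_pretrained(model)
# writes the quantized graph (model_quantized.onnx by default) into save_dir
quantizer.quantize(save_dir="uae-large-v1-quantized", quantization_config=qconfig)

# reload the quantized model from the local directory
quantized = ORTModelForFeatureExtraction.from_pretrained("uae-large-v1-quantized", file_name="model_quantized.onnx")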

Thanks man.
