Optimum Inference with Furiosa NPU
Optimum Furiosa is a utility package for building and running inference with Furiosa NPUs. Optimum can be used to load optimized models from the Hugging Face Hub and create pipelines to run accelerated inference without rewriting your APIs.
Switching from Transformers to Optimum Furiosa
The optimum.furiosa.FuriosaAIModelForXXX model classes are API-compatible with Hugging Face models. This means you can replace your AutoModelForXXX class with the corresponding FuriosaAIModelForXXX class from optimum.furiosa.
You do not need to adapt the rest of your code to get it to work with FuriosaAIModelForXXX classes.
Because the model you want to work with might not already be converted to ONNX, FuriosaAIModel includes a method to convert vanilla Hugging Face models to ONNX. Pass export=True to the from_pretrained method, and your model will be loaded and converted to ONNX on the fly:
Loading and inference of a vanilla Transformers model
import requests
from PIL import Image
- from transformers import AutoModelForImageClassification
+ from optimum.furiosa import FuriosaAIModelForImageClassification
from transformers import AutoFeatureExtractor, pipeline
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
model_id = "microsoft/resnet-50"
- model = AutoModelForImageClassification.from_pretrained(model_id)
+ model = FuriosaAIModelForImageClassification.from_pretrained(model_id, export=True, input_shape_dict={"pixel_values": [1, 3, 224, 224]}, output_shape_dict={"logits": [1, 1000]},)
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
cls_pipe = pipeline("image-classification", model=model, feature_extractor=feature_extractor)
outputs = cls_pipe(image)
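The export call above passes explicit shapes for the model's input and output tensors. The following is a minimal sketch of building those two dictionaries in one place, so that the batch size stays consistent between them. The helper name image_classification_shapes is hypothetical and is not part of optimum.furiosa. The tensor names pixel_values and logits and the default sizes are taken from the ResNet-50 example above, and other models may use different names or layouts.

```python
from typing import Dict, List, Tuple

ShapeDict = Dict[str, List[int]]


def image_classification_shapes(
    num_labels: int,
    batch_size: int = 1,
    channels: int = 3,
    height: int = 224,
    width: int = 224,
) -> Tuple[ShapeDict, ShapeDict]:
    """Hypothetical helper: build fixed input/output shape dicts.

    Assumes an image classifier whose input tensor is named "pixel_values"
    (batch, channels, height, width) and whose output tensor is named
    "logits" (batch, num_labels), as in the ResNet-50 example.
    """
    dims = {
        "num_labels": num_labels,
        "batch_size": batch_size,
        "channels": channels,
        "height": height,
        "width": width,
    }
    # Reject zero or negative dimensions before they reach the export step.
    for name, value in dims.items():
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value}")

    input_shape_dict = {"pixel_values": [batch_size, channels, height, width]}
    output_shape_dict = {"logits": [batch_size, num_labels]}
    return input_shape_dict, output_shape_dict


input_shapes, output_shapes = image_classification_shapes(num_labels=1000)
print(input_shapes)   # {'pixel_values': [1, 3, 224, 224]}
print(output_shapes)  # {'logits': [1, 1000]}
```

The two results can then be passed to from_pretrained as input_shape_dict=input_shapes and output_shape_dict=output_shapes, in place of the literal dictionaries used in the example above.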
Pushing compiled models to the Hugging Face Hub
Just as with regular PreTrainedModels, you can push your FuriosaAIModelForXXX to the Hugging Face Model Hub:
>>> from optimum.furiosa import FuriosaAIModelForImageClassification
>>> # Load the model from the hub
>>> model = FuriosaAIModelForImageClassification.from_pretrained(
... "microsoft/resnet-50", export=True, input_shape_dict={"pixel_values": [1, 3, 224, 224]}, output_shape_dict={"logits": [1, 1000]},
... )
>>> # Save the converted model
>>> model.save_pretrained("a_local_path_for_compiled_model")
>>> # Push the compiled model to the HF Hub
>>> model.push_to_hub(
... "a_local_path_for_compiled_model", repository_id="my-furiosa-repo", use_auth_token=True
... )