s-emanuilov committed on
Commit
78010d6
1 Parent(s): 65c75be

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +58 -1
README.md CHANGED
@@ -27,4 +27,61 @@ The results indicate that the OpenVINO™ optimization provides a consistent imp
27
 
28
  ## Usage
29
 
30
- You can utilize this optimized model for faster inferences in environments where time is a critical factor. Ensure you have the necessary libraries and dependencies installed to leverage the power of OpenVINO™.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
27
 
28
  ## Usage
29
 
30
+ You can utilize this optimized model for faster inferences in environments where time is a critical factor. Ensure you have the necessary libraries and dependencies installed to leverage the capabilities of OpenVINO™.
31
+
32
+ ```bash
33
+ pip install transformers huggingface_hub openvino-dev
34
+ ```
35
+
36
+ Then use it for inference:
37
+
38
+ ```python
39
+ import os
40
+
41
+ import numpy as np
42
+ from PIL import Image
43
+ from huggingface_hub import snapshot_download
44
+ from openvino.runtime import Core
45
+ from scipy.special import softmax
46
+ from transformers import CLIPProcessor
47
+
48
+ # Download the OV model
49
+ ov_path = snapshot_download(repo_id="scaleflex/clip-vit-base-patch32-openvino")
50
+ # Load preprocessor for model input
51
+ processor = CLIPProcessor.from_pretrained("scaleflex/clip-vit-base-patch32-openvino")
52
+ ov_model_xml = os.path.join(ov_path, "clip-vit-base-patch32.xml")
53
+
54
+ image = Image.open("face.png") # download this example image: http://sample.li/face.png
55
+ input_labels = [
56
+ "businessman",
57
+ "dog playing in the garden",
58
+ "beautiful woman",
59
+ "big city",
60
+ "lake in the mountain",
61
+ ]
62
+ text_descriptions = [f"This is a photo of a {label}" for label in input_labels]
63
+ inputs = processor(
64
+ text=text_descriptions, images=[image], return_tensors="pt", padding=True
65
+ )
66
+
67
+ # Create OpenVINO core object instance
68
+ core = Core()
69
+
70
+ ov_model = core.read_model(model=ov_model_xml)
71
+ # Compile model for loading on device
72
+ compiled_model = core.compile_model(ov_model)
73
+ # Obtain output tensor for getting predictions
74
+ logits_per_image_out = compiled_model.output(0)
75
+ # Run inference on preprocessed data and get image-text similarity score
76
+ ov_logits_per_image = compiled_model(dict(inputs))[logits_per_image_out]
77
+ # Perform softmax on score
78
+ probs = softmax(ov_logits_per_image, axis=1)
79
+ max_index = np.argmax(probs)
80
+
81
+ # Use the index to get the corresponding label
82
+ label_with_max_prob = input_labels[max_index]
83
+ print(
84
+ f"The label with the highest probability is: '{label_with_max_prob}' with a probability of {probs[0][max_index] * 100:.2f}%"
85
+ )
86
+ # The label with the highest probability is: 'beautiful woman' with a probability of 97.87%
87
+ ```