tiiuae/falcon-7b-instruct

This is the tiiuae/falcon-7b-instruct model converted to OpenVINO, with INT8 weight compression for accelerated inference.

An example of how to run inference with this model:

from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline

# Set model_id to either a local directory or a model available on the Hugging Face Hub.
model_id = "helenai/tiiuae-falcon-7b-instruct-ov"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OVModelForCausalLM.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = pipe("hello world")
print(result)
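
The text-generation pipeline returns a list of dictionaries, each with a "generated_text" key, so the printed result above is not just a plain string. The sketch below shows one way to pull out the generated strings and to cap the output length. The helper names `generated_texts` and `run` and the default of 32 new tokens are illustrative assumptions, not part of this model card.

```python
def generated_texts(result):
    """Return the generated strings from a text-generation pipeline result.

    The pipeline returns a list of dicts, each with a "generated_text" key.
    """
    return [item["generated_text"] for item in result]


def run(prompt, model_id="helenai/tiiuae-falcon-7b-instruct-ov", max_new_tokens=32):
    """Load the model, generate a continuation, and return the text(s).

    `run` is a hypothetical convenience wrapper; max_new_tokens=32 is an
    arbitrary illustrative limit on the number of generated tokens.
    """
    # Imported here so generated_texts() can be used without these packages.
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = OVModelForCausalLM.from_pretrained(model_id)
    pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
    return generated_texts(pipe(prompt, max_new_tokens=max_new_tokens))


# Example call (downloads the model on first use):
# for text in run("hello world"):
#     print(text)
```

Building the pipeline once and reusing it across prompts avoids reloading the model on every call; the wrapper above reloads it each time only to keep the sketch short.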