katuni4ka committed on
Commit
58aa9ce
1 Parent(s): 0bb9182

Update README.md

  1. README.md +43 -3
README.md CHANGED
@@ -4,7 +4,7 @@ language:
4
  - en
5
  ---
6
 
7
- # codegen25-7b-multi
8
 
9
  * Model creator: [Salesforce](https://huggingface.co/Salesforce)
10
  * Original model: [CodeGen2.5-7B-multi](https://huggingface.co/Salesforce/codegen25-7b-multi_P)
@@ -13,6 +13,14 @@ language:
13
 
14
  This is [CodeGen2.5-7B-multi](https://huggingface.co/Salesforce/codegen25-7b-multi_P) model converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to FP16.
15
 
16
  ## Compatibility
17
 
18
  The provided OpenVINO™ IR model is compatible with:
@@ -20,7 +28,7 @@ The provided OpenVINO™ IR model is compatible with:
20
  * OpenVINO version 2024.1.0 and higher
21
  * Optimum Intel 1.16.0 and higher
22
 
23
- ## Running Model Inference
24
 
25
  1. Install packages required for using [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:
26
 
@@ -34,7 +42,7 @@ pip install optimum[openvino] tiktoken
34
  from transformers import AutoTokenizer
35
  from optimum.intel.openvino import OVModelForCausalLM
36
 
37
- model_id = "OpenVINO/codegen25-7b-multi-fp16-ov"
38
  tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
39
  model = OVModelForCausalLM.from_pretrained(model_id)
40
  text = "def hello_world():"
@@ -45,6 +53,38 @@ print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
45
 
46
  For more examples and possible optimizations, refer to the [OpenVINO Large Language Model Inference Guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html).
47
 
48
  ## Limitations
49
 
50
  Check the original model card for [limitations](https://huggingface.co/Salesforce/codegen25-7b-instruct_P#intended-use-and-limitations).
 
4
  - en
5
  ---
6
 
7
+ # codegen25-7b-multi-int8-ov
8
 
9
  * Model creator: [Salesforce](https://huggingface.co/Salesforce)
10
  * Original model: [CodeGen2.5-7B-multi](https://huggingface.co/Salesforce/codegen25-7b-multi_P)
 
13
 
14
  This is [CodeGen2.5-7B-multi](https://huggingface.co/Salesforce/codegen25-7b-multi_P) model converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to FP16.
15
 
16
+ ## Quantization Parameters
17
+
18
+ Weight compression was performed using `nncf.compress_weights` with the following parameters:
19
+
20
+ * mode: **INT8_ASYM**
21
+
22
+ For more information on quantization, check the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/weight-compression.html).
23
+
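To make the `INT8_ASYM` mode above more concrete, here is a minimal pure-Python sketch of asymmetric 8-bit quantization in general: each group of weights is mapped to unsigned 8-bit codes using a scale and a zero point. This is an illustration under stated assumptions, not NNCF's actual implementation; the function names `quantize_int8_asym` and `dequantize` and the sample row are hypothetical.

```python
def quantize_int8_asym(weights):
    """Map a list of floats to codes in [0, 255] plus a scale and zero point."""
    w_min, w_max = min(weights), max(weights)
    # Fall back to 1.0 to avoid dividing by zero when all values are equal.
    scale = (w_max - w_min) / 255 or 1.0
    # The zero point is the code that represents the real value 0.0.
    zero_point = round(-w_min / scale)
    codes = [min(255, max(0, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Recover approximate float weights from the 8-bit codes."""
    return [(c - zero_point) * scale for c in codes]


# Hypothetical weight row, used only for illustration.
row = [-0.5, 0.0, 0.25, 1.0]
codes, scale, zero_point = quantize_int8_asym(row)
restored = dequantize(codes, scale, zero_point)
print(codes)
print(restored)
```

Because the zero point shifts the range, a row whose values are not centered on zero can still use all 256 codes, which is the usual motivation for an asymmetric scheme over a symmetric one.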
24
  ## Compatibility
25
 
26
  The provided OpenVINO™ IR model is compatible with:
 
28
  * OpenVINO version 2024.1.0 and higher
29
  * Optimum Intel 1.16.0 and higher
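
The version requirements above can be checked before loading the model. The following is a rough sketch that uses only the standard library; it assumes the packages are installed under the PyPI distribution names `openvino` and `optimum-intel`, and the helper names `parse` and `meets_minimum` are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions taken from the compatibility list above.
# The distribution names are assumptions about how the packages were installed.
MINIMUMS = {"openvino": "2024.1.0", "optimum-intel": "1.16.0"}


def parse(v):
    """Turn '2024.1.0' into (2024, 1, 0), ignoring any non-numeric suffix."""
    parts = []
    for piece in v.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '2024.1' so tuples compare element by element.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse(installed) >= parse(minimum)


for name, minimum in MINIMUMS.items():
    try:
        installed = version(name)
    except PackageNotFoundError:
        print(f"{name}: not installed (need >= {minimum})")
        continue
    status = "ok" if meets_minimum(installed, minimum) else "too old"
    print(f"{name}: {installed} ({status}, need >= {minimum})")
```

This comparison looks only at the first three numeric components, so pre-release suffixes such as `.dev0` are ignored; a stricter check would need a full version parser.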
30
 
31
+ ## Running Model Inference with [Optimum Intel](https://huggingface.co/docs/optimum/intel/index)
32
 
33
  1. Install packages required for using [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:
34
 
 
42
  from transformers import AutoTokenizer
43
  from optimum.intel.openvino import OVModelForCausalLM
44
 
45
+ model_id = "OpenVINO/codegen25-7b-multi-int8-ov"
46
  tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
47
  model = OVModelForCausalLM.from_pretrained(model_id)
48
  text = "def hello_world():"
 
53
 
54
  For more examples and possible optimizations, refer to the [OpenVINO Large Language Model Inference Guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html).
55
 
56
+ ## Running Model Inference with [OpenVINO GenAI](https://github.com/openvinotoolkit/openvino.genai)
57
+
58
+ 1. Install packages required for using OpenVINO GenAI.
59
+ ```
60
+ pip install openvino-genai huggingface_hub
61
+ ```
62
+
63
+ 2. Download model from HuggingFace Hub
64
+
65
+ ```
66
+ import huggingface_hub as hf_hub
67
+
68
+ model_id = "OpenVINO/codegen25-7b-multi-int8-ov"
69
+ model_path = "codegen25-7b-multi-int8-ov"
70
+
71
+ hf_hub.snapshot_download(model_id, local_dir=model_path)
72
+
73
+ ```
74
+
75
+ 3. Run model inference:
76
+
77
+ ```
78
+ import openvino_genai as ov_genai
79
+
80
+ device = "CPU"
81
+ pipe = ov_genai.LLMPipeline(model_path, device)
82
+ print(pipe.generate("def hello_world():"))
83
+ ```
84
+
85
+ More GenAI usage examples can be found in OpenVINO GenAI library [docs](https://github.com/openvinotoolkit/openvino.genai/blob/master/src/README.md) and [samples](https://github.com/openvinotoolkit/openvino.genai?tab=readme-ov-file#openvino-genai-samples)
86
+
87
+
88
  ## Limitations
89
 
90
  Check the original model card for [limitations](https://huggingface.co/Salesforce/codegen25-7b-instruct_P#intended-use-and-limitations).