parinitarahi committed on
Commit b18be0d · verified · 1 Parent(s): 13739cf

Update README.md

Files changed (1):
  1. README.md +43 -0
README.md CHANGED
@@ -21,8 +21,51 @@ Here are some of the optimized configurations we have added:
  1. ONNX model for int4 CPU and Mobile: ONNX model for CPU and mobile using int4 quantization via RTN.
  2. ONNX model for int4 CUDA and DML GPU devices using int4 quantization via RTN.

+ ## Model Run
  You can see how to run examples with ORT GenAI [here](https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/phi-3-tutorial.md)

+ For CPU:
+
+ ```bash
+ # Download the model directly using the Hugging Face CLI
+ huggingface-cli download microsoft/Phi-4-onnx --include Phi-4-onnx/cpu_and_mobile/* --local-dir .
+
+ # Install the CPU package of ONNX Runtime GenAI
+ pip install --pre onnxruntime-genai
+
+ # Please adjust the model directory (-m) accordingly
+ curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py -o phi3-qa.py
+ python phi3-qa.py -m cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4 -e cpu
+ ```
+
+ For CUDA:
+
+ ```bash
+ # Download the model directly using the Hugging Face CLI
+ huggingface-cli download onnxruntime/Phi-4-onnx --include Phi-4-onnx/gpu/* --local-dir .
+
+ # Install the CUDA package of ONNX Runtime GenAI
+ pip install --pre onnxruntime-genai-cuda
+
+ # Please adjust the model directory (-m) accordingly
+ curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py -o phi3-qa.py
+ python phi3-qa.py -m gpu/gpu-int4-rtn-block-32 -e cuda
+ ```
+
+ For DirectML:
+
+ ```bash
+ # Download the model directly using the Hugging Face CLI
+ huggingface-cli download onnxruntime/Phi-4-onnx --include Phi-4-onnx/gpu/* --local-dir .
+
+ # Install the DirectML package of ONNX Runtime GenAI
+ pip install --pre onnxruntime-genai-directml
+
+ # Please adjust the model directory (-m) accordingly
+ curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py -o phi3-qa.py
+ python phi3-qa.py -m gpu/gpu-int4-rtn-block-32 -e dml
+ ```
+
  ## Model Description
  - Developed by: Microsoft
  - Model type: ONNX
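The `phi3-qa.py` script used in the steps above is a thin loop over the ONNX Runtime GenAI Python API. As a rough sketch of what it does (assuming a recent `onnxruntime-genai` build; the function name and model path here are illustrative, not part of the package):

```python
# Minimal sketch of the generation loop behind phi3-qa.py.
# Assumes the onnxruntime-genai package installed above; the model
# directory is one of the folders downloaded with huggingface-cli.
def run_phi4(model_dir, prompt, max_length=1024):
    # Imported lazily so the sketch can be loaded without the package installed.
    import onnxruntime_genai as og

    model = og.Model(model_dir)            # loads the ONNX model and its genai config
    tokenizer = og.Tokenizer(model)
    stream = tokenizer.create_stream()     # incremental detokenizer for streaming output

    params = og.GeneratorParams(model)
    params.set_search_options(max_length=max_length)

    generator = og.Generator(model, params)
    generator.append_tokens(tokenizer.encode(prompt))

    pieces = []
    while not generator.is_done():
        generator.generate_next_token()
        pieces.append(stream.decode(generator.get_next_tokens()[0]))
    return "".join(pieces)

# Example (requires one of the models downloaded above):
# print(run_phi4("cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4", "What is ONNX?"))
```

The same loop works for all three execution providers; only the installed package (`onnxruntime-genai`, `-cuda`, or `-directml`) and the model directory differ.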