bpawar committed on
Commit
1d8d991
·
verified ·
1 Parent(s): 1b438fb

Update README.md

Files changed (1)
  1. README.md +13 -2
README.md CHANGED
@@ -13,12 +13,23 @@ base_model:
13
 
14
  # Mistral-Nemo-12B-Instruct-ONNX-INT4
15
 
 
16
 
17
  ### Model Description
18
 
19
  Mistral-NeMo is a Large Language Model (LLM) composed of 12B parameters. This model leads accuracy on popular benchmarks across common sense reasoning, coding, math, multilingual and multi-turn chat tasks; it significantly outperforms existing models smaller or similar in size.
20
-
21
- We downloaded Mistral Nemo 12B instruct model in Pytorch bfloat16 format from HuggingFace. We used Onnxruntime-GenAI to convert the model from Pytorch FP16 format to ONNX FP16 format. We used TensorRT Model Optimizer - Windows tool to convert the model from ONNX FP16 format to ONNX INT4 fomat.  We have posted the Mistral Nemo 12B ONNX INT4 model files here. 
22
 
23
  This model is ready for commercial/non-commercial use. 
24
 
 
13
 
14
  # Mistral-Nemo-12B-Instruct-ONNX-INT4
15
 
16
+ ## Model Developer: Mistral
17
 
18
  ### Model Description
19
 
20
  Mistral-NeMo is a Large Language Model (LLM) composed of 12B parameters. This model leads accuracy on popular benchmarks across common sense reasoning, coding, math, multilingual and multi-turn chat tasks; it significantly outperforms existing models smaller or similar in size.
21
+
22
+ The NVIDIA Mistral-Nemo-12B Instruct ONNX INT4 model is quantized with [TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer).
23
+
24
+
25
+ Steps followed to generate this quantized model:
26
+
27
+ * 1. Download the Mistral-Nemo-12B Instruct model in PyTorch bfloat16 format from Hugging Face.
28
+
29
+ * 2. Convert the PyTorch model to ONNX FP16 using the onnxruntime-genai model builder.
30
+
31
+ * 3. Quantize the Mistral-Nemo-12B Instruct ONNX FP16 model to an ONNX INT4 AWQ model using TensorRT Model Optimizer – Windows.
32
+
33
 
34
  This model is ready for commercial/non-commercial use. 
35
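
Steps 1 and 2 of the added README text can be sketched as a single call to the onnxruntime-genai model builder, which can fetch a Hugging Face model by name and export it to ONNX. The snippet below is a minimal sketch, not the exact command used for this commit: the model ID `mistralai/Mistral-Nemo-Instruct-2407`, the output folder name, and the `dml` execution provider are assumptions that do not appear in the README. Step 3 (INT4 AWQ quantization) is left out because its exact API depends on the installed TensorRT Model Optimizer version.

```python
import subprocess
import sys
from pathlib import Path


def build_export_command(model_id, output_dir, precision="fp16", execution_provider="dml"):
    """Return the argv list that invokes the onnxruntime-genai model builder.

    -m: Hugging Face model name (the builder downloads it if not cached)
    -o: folder that receives the exported ONNX model and its config files
    -p: target precision of the exported model
    -e: execution provider the exported graph is built for (assumed: dml)
    """
    return [
        sys.executable, "-m", "onnxruntime_genai.models.builder",
        "-m", model_id,
        "-o", str(Path(output_dir)),
        "-p", precision,
        "-e", execution_provider,
    ]


if __name__ == "__main__":
    # Hypothetical model ID and output folder, chosen for illustration only.
    cmd = build_export_command(
        "mistralai/Mistral-Nemo-Instruct-2407",
        "mistral-nemo-12b-onnx-fp16",
    )
    print(" ".join(cmd))
    # Uncomment to run the export; this needs onnxruntime-genai installed
    # and downloads a checkpoint of roughly 24 GB.
    # subprocess.run(cmd, check=True)
```

Building the command as a list and printing it first makes it easy to review the arguments before starting a long download and export.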