noneUsername committed: Update README.md
Note: This model is no longer the optimal W8A8 quantization. Please consider using the better quantized model I made later:

noneUsername/Mistral-Nemo-Instruct-2407-W8A8-Dynamic-Per-Token-better

My first quantization uses the quantization method provided by vLLM:

https://docs.vllm.ai/en/latest/quantization/int8.html
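For context on what "W8A8-Dynamic-Per-Token" means: weights and activations are both stored in INT8, and each activation row (one token) gets its own scale computed at runtime from that row's maximum absolute value. The sketch below is a minimal pure-Python illustration of that per-token symmetric quantization step, not vLLM's or llm-compressor's actual kernel; the helper names are mine.

```python
def quantize_per_token(row):
    """Symmetric INT8 quantization of one activation row (one token).

    The scale is chosen dynamically per row so that the row's largest
    magnitude maps to 127; an all-zero row falls back to scale 1.0.
    """
    scale = max(abs(v) for v in row) / 127.0 or 1.0
    q = [max(-128, min(127, round(v / scale))) for v in row]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from INT8 codes and a scale."""
    return [v * scale for v in q]


if __name__ == "__main__":
    # Two "tokens" with very different dynamic ranges: per-token scaling
    # lets each row use the full INT8 range independently.
    activations = [[0.5, -1.27, 0.03], [12.7, -6.35, 0.0]]
    for row in activations:
        q, s = quantize_per_token(row)
        print(q, s, dequantize(q, s))
```

The per-token variant trades a little runtime cost (one max-reduction per row) for much lower quantization error than a single per-tensor scale when token magnitudes vary widely.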