Update README.md
README.md
CHANGED
@@ -1,3 +1,11 @@
 ---
 license: cc-by-nc-sa-4.0
 ---
+
+Original model: https://huggingface.co/SakuraLLM/Sakura-13B-Qwen2beta-v0.9
+
+4-bit AWQ quantization. Untested; not recommended for use.
+
+GroupSize=64
+
+vLLM dual-GPU inference is not compatible with AWQ; an issue report suggests that setting GroupSize to 64 at quantization time may fix this.
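
The README does not say why a group size of 64 would help, so the following is only a sketch of one plausible explanation: under tensor parallelism each GPU holds a slice of a layer's input dimension, and that slice presumably has to contain a whole number of quantization groups. The helper names `awq_quant_config` and `shards_align` are hypothetical, the dict keys follow the layout AutoAWQ uses (the README does not name the quantization tool, so that is an assumption), and the layer width `13696` is an assumed illustrative value that should be read from the model's `config.json` rather than trusted here.

```python
def awq_quant_config(group_size: int = 64, bits: int = 4) -> dict:
    """Build quantization settings in the dict layout AutoAWQ uses (assumed tool)."""
    return {
        "zero_point": True,
        "q_group_size": group_size,  # GroupSize=64, as stated in the README
        "w_bit": bits,               # 4-bit weights, as stated in the README
        "version": "GEMM",
    }


def shards_align(in_features: int, tp_size: int, group_size: int) -> bool:
    """Return True if each tensor-parallel shard holds a whole number of groups."""
    if in_features % tp_size != 0:
        return False
    return (in_features // tp_size) % group_size == 0


if __name__ == "__main__":
    # Assumed value for illustration only; check config.json for the real width.
    INTERMEDIATE_SIZE = 13696
    print(awq_quant_config())
    for group_size in (128, 64):
        ok = shards_align(INTERMEDIATE_SIZE, tp_size=2, group_size=group_size)
        print(f"group_size={group_size}, tp_size=2, aligned={ok}")
```

If the assumed width is right, a two-GPU split leaves 6848 input features per shard, which is a multiple of 64 but not of 128. That would be consistent with the issue report mentioned above, though the model remains untested.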