MohamedRashad committed • Commit 02856c9 • Parent(s): 8491878

Update README.md
This repo contains AWQ model files for [FreedomIntelligence's AceGPT 7B Chat](https://huggingface.co/FreedomIntelligence/AceGPT-7B-chat).

In my effort to make Arabic LLMs available to consumers with simple GPUs, I have quantized two important models:

- [AceGPT 13B Chat AWQ](https://huggingface.co/MohamedRashad/AceGPT-13B-chat-AWQ)
- [AceGPT 7B Chat AWQ](https://huggingface.co/MohamedRashad/AceGPT-7B-chat-AWQ) **(We are here)**

### About AWQ
AWQ is an efficient, accurate, and fast low-bit weight quantization method, currently supporting 4-bit quantization. It offers faster Transformers-based inference than GPTQ, with quality equivalent to or better than the most commonly used GPTQ settings.
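As a rough illustration of what 4-bit weight quantization means, here is a simplified absmax round-trip sketch in plain Python. This is *not* AWQ's actual algorithm (AWQ additionally chooses per-channel scales from activation statistics to protect salient weights), and the example weights are made up:

```python
# Simplified sketch of 4-bit weight quantization (absmax scaling).
# NOT AWQ itself: real AWQ picks scales using activation statistics.

def quantize_4bit(weights, scale):
    """Map float weights to 4-bit signed integers in [-8, 7] with one shared scale."""
    return [max(-8, min(7, round(w / scale))) for w in weights]

def dequantize(quantized, scale):
    """Recover approximate float weights from the 4-bit integers."""
    return [q * scale for q in quantized]

# Hypothetical weights for illustration only.
weights = [0.12, -0.5, 0.33, 0.07]
scale = max(abs(w) for w in weights) / 7  # absmax: largest weight maps to +/-7

q = quantize_4bit(weights, scale)        # -> [2, -7, 5, 1]
approx = dequantize(q, scale)            # close to the originals, stored in 4 bits
```

The point of methods like AWQ is to choose the scales so that this rounding error hurts model quality as little as possible, while the weights shrink from 16 bits to 4 bits each.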