monsoon-nlp committed
Commit f70ac82 • 1 Parent(s): 3c3432f
Update README.md
README.md CHANGED
@@ -22,11 +22,13 @@ The concept: 8-bit quantized version of [mGPT](https://huggingface.co/ai-forever/mGPT)
 
 On the GPT scale, it has a similar number of parameters to GPT2-XL, but covers 60+ languages.
 
+AI-Forever also released a 13B-parameter model. I made an 8-bit quantized version with weights available here: https://huggingface.co/monsoon-nlp/mGPT-13B-quantized
+
 My goal is to evaluate this on Arabic, Hindi, and Indonesian tasks, where there are fewer autoregressive language models in this size range.
 
 For English: use a GPT model or LLaMa2-7B
 
-[AI-Forever](https://huggingface.co/ai-forever)
+In August 2023 [AI-Forever](https://huggingface.co/ai-forever) added 1.3B-param models for about 1/3 of the model's languages. If your language is Mongolian, for example, use mGPT-1.3B-mongol and not this one.
 
 ## How was the model created?
 
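The hunk above points readers to ready-made 8-bit weights. As a minimal sketch of how such a model can be loaded, assuming the standard `transformers` + `bitsandbytes` integration (the repo ID `ai-forever/mGPT` is from the README; exact usage may differ from the model cards):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "ai-forever/mGPT"  # original full-precision weights (the README's base model)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # quantize to 8-bit on load
    device_map="auto",  # requires accelerate; places layers on the available GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Quick generation check in one of the target languages (Indonesian)
inputs = tokenizer("Halo, apa kabar?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The pre-quantized `monsoon-nlp/mGPT-13B-quantized` repo linked in the added line is presumably loaded the same way via `from_pretrained`; its model card is authoritative.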
@@ -55,5 +57,4 @@ qmodel.save_pretrained("model_name")
 
 ## Future steps
 
-- mGPT could be further quantized (4-bit), but `model.save_pretrained()` currently throws a `NotImplementedError`.
-- It would be great to load and quantize the 10x larger mGPT-13B, but that would take more resources.
+- mGPT could be further quantized (4-bit), but `model.save_pretrained()` currently throws a `NotImplementedError`.
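On the 4-bit bullet: the failing step would look roughly like the sketch below. This is an assumption, not the author's exact code; `qmodel` mirrors the variable name in the hunk context above, and at the time of this commit `transformers` could not serialize 4-bit bitsandbytes models.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hypothetical reconstruction: load mGPT in 4-bit via bitsandbytes...
qmodel = AutoModelForCausalLM.from_pretrained(
    "ai-forever/mGPT",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

# ...then try to save; serializing 4-bit bitsandbytes models raised
# NotImplementedError in transformers at the time of this commit.
qmodel.save_pretrained("model_name")
```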