monsoon-nlp committed
Commit f70ac82 • 1 Parent(s): 3c3432f
Update README.md
README.md CHANGED
@@ -22,11 +22,13 @@ The concept: 8-bit quantized version of [mGPT](https://huggingface.co/ai-forever/mGPT)
 
 On the GPT scale, it has a similar number of parameters to GPT2-XL, but covers 60+ languages.
 
+AI-Forever also released a 13B-parameter model. I made an 8-bit quantized version with weights available here: https://huggingface.co/monsoon-nlp/mGPT-13B-quantized
+
 My goal is to evaluate this on Arabic, Hindi, and Indonesian tasks, where there are fewer autoregressive language models in this size range.
 
 For English: use a GPT model or LLaMa2-7B
 
-[AI-Forever](https://huggingface.co/ai-forever)
+In August 2023 [AI-Forever](https://huggingface.co/ai-forever) added 1.3B-param models for about 1/3 of the model's languages. If your language is Mongolian, for example, use mGPT-1.3B-mongol and not this one.
 
 ## How was the model created?
 
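The hunk above points readers to ready-made 8-bit weights. As a minimal sketch of how such a model can be loaded, assuming the standard `transformers` + `bitsandbytes` integration (the repo ID `ai-forever/mGPT` is from the README; exact usage may differ from the model cards):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "ai-forever/mGPT"  # original full-precision weights (the README's base model)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # quantize to 8-bit on load
    device_map="auto",  # requires accelerate; places layers on the available GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Quick generation check in one of the target languages (Indonesian)
inputs = tokenizer("Halo, apa kabar?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The pre-quantized `monsoon-nlp/mGPT-13B-quantized` repo linked in the added line is presumably loaded the same way via `from_pretrained`; its model card is authoritative.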
@@ -55,5 +57,4 @@ qmodel.save_pretrained("model_name")
 
 ## Future steps
 
-- mGPT could be further quantized (4-bit), but `model.save_pretrained()` currently throws a `NotImplementedError`.
-- It would be great to load and quantize the 10x larger mGPT-13B, but that would take more resources.
+- mGPT could be further quantized (4-bit), but `model.save_pretrained()` currently throws a `NotImplementedError`.
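On the 4-bit bullet: the failing step would look roughly like the sketch below. This is an assumption, not the author's exact code; `qmodel` mirrors the variable name in the hunk context above, and at the time of this commit `transformers` could not serialize 4-bit bitsandbytes models.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hypothetical reconstruction: load mGPT in 4-bit via bitsandbytes...
qmodel = AutoModelForCausalLM.from_pretrained(
    "ai-forever/mGPT",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

# ...then try to save; serializing 4-bit bitsandbytes models raised
# NotImplementedError in transformers at the time of this commit.
qmodel.save_pretrained("model_name")
```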