monsoon-nlp committed
Commit e39d68d
1 Parent(s): 882fa80

Update README.md

Files changed (1)
  1. README.md +59 -0
README.md CHANGED
---
license: apache-2.0
language:
- ar
- hi
- id
pipeline_tag: text-generation
tags:
- multilingual
widget:
- text: 'في مدرستي السابقة'
  example_title: Arabic prompt
- text: 'आप समुद्री लुटेरों के बारे में क्या जानते हैं?'
  example_title: Hindi prompt
- text: 'Kucing saya suka'
  example_title: Indonesian prompt
---

# mGPT-quantized

The concept: an 8-bit quantized version of [mGPT-13B](https://huggingface.co/ai-forever/mGPT-13B), an LLM released by AI-Forever / Sberbank AI in 2022-2023.

On the GPT scale, it has a similar number of parameters to the 13B variant of GPT-3, but it was trained on 60+ languages.

My goal is to evaluate this model on Hindi and Indonesian tasks, where there are fewer autoregressive language models in this size range.

For English, use a GPT model or Llama-2-7B instead.

For Arabic, as of August 2023 I would recommend the bilingual [JAIS model](https://huggingface.co/inception-mbzuai/jais-13b), which also has 13B parameters and can be quantized.

In August 2023, AI-Forever added 1.3B-parameter models for 20+ individual languages. If your language is Mongolian, for example, it might be better to use mGPT-1.3B-mongol and not this one.

They also have a 1.3B-parameter model covering all of the languages, which I further quantized here: [monsoon-nlp/mGPT-quantized](https://huggingface.co/monsoon-nlp/mGPT-quantized)
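
As a quick usage sketch: the quantized checkpoint should load like any other `transformers` model. This assumes a CUDA GPU with `bitsandbytes` and `accelerate` installed, and `monsoon-nlp/mGPT-13B-quantized` below is a placeholder for this repo's actual id.

```python
from transformers import AutoTokenizer, GPT2LMHeadModel

# quantization does not change the tokenizer, so the original mGPT-13B one works
tokenizer = AutoTokenizer.from_pretrained("ai-forever/mGPT-13B")

# the weights were saved in 8-bit, so no quantization_config is needed at load time
model = GPT2LMHeadModel.from_pretrained(
    "monsoon-nlp/mGPT-13B-quantized",  # placeholder: use this repo's id
    device_map="auto",
)

# one of the widget prompts above (Indonesian: "My cat likes")
inputs = tokenizer("Kucing saya suka", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```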

## How was the model created?

Quantization of mGPT-13B was done using the `bitsandbytes` library, Colab Pro with an A100 GPU, and a lot of space on Google Drive.

```python
import torch
from transformers import BitsAndBytesConfig, GPT2LMHeadModel

# note: the nf4 / double-quantization options in bitsandbytes apply only to
# 4-bit loading (the bnb_4bit_* arguments); for 8-bit quantization,
# load_in_8bit is the relevant flag
quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
)

qmodel = GPT2LMHeadModel.from_pretrained(
    "ai-forever/mGPT-13B",
    torch_dtype=torch.bfloat16,
    quantization_config=quantization_config,
    device_map="auto",
)

qmodel.save_pretrained("model_name")
```
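
As a rough sanity check that the 8-bit load worked, `transformers` can report the model's in-memory weight size; the numbers in the comment are back-of-the-envelope expectations, not measurements.

```python
# ~13B parameters at ~1 byte each should report on the order of 13-14 GB,
# versus roughly double that for bf16 weights
print(f"{qmodel.get_memory_footprint() / 1e9:.1f} GB")
```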

## Future steps

- mGPT could be further quantized (4-bit), but `model.save_pretrained()` currently raises a `NotImplementedError`; see the sketch below.
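
For reference, a minimal sketch of the 4-bit attempt, assuming the same environment as above. The 4-bit model loads and can run inference; it is the save step that fails.

```python
import torch
from transformers import BitsAndBytesConfig, GPT2LMHeadModel

# nf4 with double quantization: the 4-bit analogue of the 8-bit setup above
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_4bit = GPT2LMHeadModel.from_pretrained(
    "ai-forever/mGPT-13B",
    quantization_config=nf4_config,
    device_map="auto",
)

# generation works on the in-memory model; saving is what raises the error:
# model_4bit.save_pretrained("mGPT-13B-4bit")  # -> NotImplementedError
```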