limcheekin committed on
Commit
cc85c06
·
1 Parent(s): b967b23

Upload folder using huggingface_hub

README.md ADDED
@@ -0,0 +1,56 @@
---
license: apache-2.0
language:
- en
tags:
- ctranslate2
- mpt-7b-instruct
- quantization
- int8
---

# MPT-7B-Instruct Q8

This model is an int8-quantized version of [mosaicml/mpt-7b-instruct](https://huggingface.co/mosaicml/mpt-7b-instruct).
## Model Details

### Model Description

The model was quantized using [CTranslate2](https://opennmt.net/CTranslate2/) with the following command:

```shell
ct2-transformers-converter --model mosaicml/mpt-7b-instruct --output_dir mosaicml/mpt-7b-instruct-ct2 --copy_files tokenizer.json tokenizer_config.json special_tokens_map.json generation_config.json --quantization int8 --force --low_cpu_mem_usage --trust_remote_code
```
If you want to perform the quantization yourself, install the following dependencies first:

```shell
pip install -qU ctranslate2 transformers[torch] accelerate einops
```

- **Shared by:** Lim Chee Kin
- **License:** Apache 2.0
## How to Get Started with the Model

Use the code below to get started with the model:

```python
import ctranslate2
import transformers

generator = ctranslate2.Generator("limcheekin/mpt-7b-instruct-ct2")
tokenizer = transformers.AutoTokenizer.from_pretrained("limcheekin/mpt-7b-instruct-ct2")

prompt = "Long long time ago, "
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))

results = generator.generate_batch([tokens], max_length=256, sampling_topk=10)

text = tokenizer.decode(results[0].sequences_ids[0])
```
The code above is adapted from the [CTranslate2 Transformers guide](https://opennmt.net/CTranslate2/guides/transformers.html#mpt).

The key method in the code above is `generate_batch`; you can find [its supported parameters here](https://opennmt.net/CTranslate2/python/ctranslate2.Generator.html#ctranslate2.Generator.generate_batch).
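MPT-7B-Instruct was fine-tuned on dolly-style instruction prompts, so wrapping your instruction in that template (shown on the upstream mosaicml/mpt-7b-instruct model card) generally yields better completions than passing a bare string. A minimal sketch of such a helper — the `format_prompt` name is our own illustration, not part of any library:

```python
# Dolly-style template used by mosaicml/mpt-7b-instruct (per the upstream
# model card). The helper function below is an illustrative convenience,
# not part of ctranslate2 or transformers.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n{instruction}\n### Response:\n"
)

def format_prompt(instruction: str) -> str:
    """Wrap a plain instruction in the MPT-7B-Instruct prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)

prompt = format_prompt("Write a haiku about quantization.")
```

You would then pass this formatted `prompt` to `tokenizer.encode` in place of the bare string in the example above.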
config.json ADDED
@@ -0,0 +1,6 @@
{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "layer_norm_epsilon": null,
  "unk_token": "<|endoftext|>"
}
generation_config.json ADDED
@@ -0,0 +1,5 @@
{
  "_from_model_config": true,
  "transformers_version": "4.28.1",
  "use_cache": false
}
model.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:828b7681bee76e8c65d5fe8df3282e666a3750803c854f3f6d3576c5b3395875
size 6655048262
special_tokens_map.json ADDED
@@ -0,0 +1,5 @@
{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "unk_token": "<|endoftext|>"
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,9 @@
{
  "add_prefix_space": false,
  "bos_token": "<|endoftext|>",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|endoftext|>",
  "model_max_length": 2048,
  "tokenizer_class": "GPTNeoXTokenizer",
  "unk_token": "<|endoftext|>"
}
vocabulary.txt ADDED
The diff for this file is too large to render. See raw diff