Danivilanova committed on
Commit
12d0eab
1 Parent(s): 3544572

Update README.md

  1. README.md +57 -0
README.md CHANGED
@@ -17,6 +17,63 @@ tags:
  inference: false
  ---
 
+ # MosaicML's MPT-30B-Instruct 8-bit
+
+ These are 8-bit model files in .safetensors format for [MosaicML's MPT-30B-Instruct](https://huggingface.co/mosaicml/mpt-30b-instruct).
+
+ ## How to convert
+
+ ```python
+ import time
+
+ import torch
+ import transformers
+
+ # Load the model
+ name = 'mosaicml/mpt-30b-instruct'
+
+ config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
+ config.attn_config['attn_impl'] = 'triton'  # use the triton-based FlashAttention implementation
+ config.init_device = 'cuda:0'  # initialize directly on the GPU for speed
+
+ start_time = time.time()
+ model = transformers.AutoModelForCausalLM.from_pretrained(
+     name,
+     config=config,
+     torch_dtype=torch.bfloat16,  # Load model weights in bfloat16
+     trust_remote_code=True,
+     load_in_8bit=True,  # 8-bit loading requires the bitsandbytes package
+ )
+ print(f'Loaded in {time.time() - start_time:.1f} s')
+
+ # Drop the non-tensor "weight_format" entries, which safetensors cannot store
+ def filter_dict(dictionary):
+     return {key: value for key, value in dictionary.items() if "weight_format" not in key}
+
+ new_state_dict = filter_dict(model.state_dict())
+
+ # Save the 8-bit model
+ model.save_pretrained('mpt-30b-instruct-8bits', state_dict=new_state_dict, safe_serialization=True)
+ ```
+
+ ## How to use
+
+ ```python
+ import transformers
+
+ # Load the 8-bit model from the directory written by the conversion step
+ model = transformers.AutoModelForCausalLM.from_pretrained(
+     'mpt-30b-instruct-8bits',
+     trust_remote_code=True,
+ )
+ ```
+
+ ## Prompt template
+
+ ```
+ Below is an instruction that describes a task. Write a response that appropriately completes the request.
+
+ ### Instruction
+ {prompt}
+
+ ### Response
+ ```
+
  # MPT-30B-Instruct
 
  MPT-30B-Instruct is a model for short-form instruction following.
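
The prompt template added in this commit can be filled in programmatically. The following is a minimal sketch and not part of the README: `PROMPT_TEMPLATE` and `format_prompt` are hypothetical names, and the trailing newline after `### Response` is an assumption, since the template does not show what follows that line.

```python
# Hypothetical helper; the names below do not come from the README.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "\n"
    "### Instruction\n"
    "{prompt}\n"
    "\n"
    "### Response\n"  # assumption: the model continues on the next line
)


def format_prompt(instruction: str) -> str:
    """Insert a user instruction into the README's prompt template."""
    # Surrounding whitespace is stripped so the section layout stays intact.
    return PROMPT_TEMPLATE.format(prompt=instruction.strip())


if __name__ == "__main__":
    print(format_prompt("Summarize the plot of Hamlet in two sentences."))
```

The resulting string would then be tokenized and passed to the model's `generate` method; that step is left out here because it depends on the tokenizer and hardware setup.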