cloudyu committed
Commit 8b3e68a
1 Parent(s): f9db19e

Create README.md

Files changed (1): README.md (+72 -0)
README.md ADDED

---
license: cc-by-nc-4.0
---

# Mixtral MoE 2x34B

A mixture-of-experts (MoE) model built from the following models:

* [NurtureAI/neural-chat-7b-v3-16k](https://huggingface.co/NurtureAI/neural-chat-7b-v3-16k)
* [mncai/mistral-7b-dpo-v6](https://huggingface.co/mncai/mistral-7b-dpo-v6)
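
To sanity-check the MoE layout before downloading the full weights, you can inspect the repository's config. A minimal sketch, assuming the repo exposes the standard Mixtral-style fields `num_local_experts` and `num_experts_per_tok` (an assumption, not something this card states):

```python
from transformers import AutoConfig

model_path = "cloudyu/Mixtral_34Bx2_MoE_60B"

# Fetches only config.json, not the weights.
config = AutoConfig.from_pretrained(model_path)

# Mixtral-style configs report the expert count and routing top-k;
# these attribute names are an assumption about this repo's config.
print(getattr(config, "num_local_experts", "n/a"))    # expected: 2
print(getattr(config, "num_experts_per_tok", "n/a"))  # experts routed per token
```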

* Metrics:
  * Average: 73.43
  * ARC: 71.25
  * HellaSwag: 87.45

GPU example (loads the weights in 4-bit, so it needs a CUDA GPU plus the `bitsandbytes` package):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "cloudyu/Mixtral_34Bx2_MoE_60B"

tokenizer = AutoTokenizer.from_pretrained(model_path, use_default_system_prompt=False)
# load_in_4bit=True quantizes the weights on the fly so the model fits in GPU memory.
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float32, device_map='auto', local_files_only=False, load_in_4bit=True
)
print(model)

# Simple REPL: an empty prompt exits the loop.
prompt = input("please input prompt: ")
while len(prompt) > 0:
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
    generation_output = model.generate(
        input_ids=input_ids, max_new_tokens=500, repetition_penalty=1.2
    )
    print(tokenizer.decode(generation_output[0]))
    prompt = input("please input prompt: ")
```
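
Note: newer `transformers` releases prefer an explicit `BitsAndBytesConfig` over the bare `load_in_4bit=True` flag. A minimal sketch of the equivalent load, assuming a recent `transformers`/`bitsandbytes` install:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Equivalent 4-bit load via the explicit quantization config.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # run matmuls in fp16 for speed
)
model = AutoModelForCausalLM.from_pretrained(
    "cloudyu/Mixtral_34Bx2_MoE_60B",
    quantization_config=bnb_config,
    device_map="auto",
)
```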

CPU example (full-precision float32 weights held in system RAM, a very large footprint for a model of this size):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "cloudyu/Mixtral_34Bx2_MoE_60B"

tokenizer = AutoTokenizer.from_pretrained(model_path, use_default_system_prompt=False)
# device_map='cpu' keeps everything in system RAM; no quantization is applied.
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float32, device_map='cpu', local_files_only=False
)
print(model)

# Simple REPL: an empty prompt exits the loop.
prompt = input("please input prompt: ")
while len(prompt) > 0:
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    generation_output = model.generate(
        input_ids=input_ids, max_new_tokens=500, repetition_penalty=1.2
    )
    print(tokenizer.decode(generation_output[0]))
    prompt = input("please input prompt: ")
```
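
For interactive use, output can be streamed token by token instead of printed all at once. A minimal sketch using `transformers`' `TextStreamer`, reusing the `model` and `tokenizer` from either example above:

```python
from transformers import TextStreamer

# Prints tokens to stdout as they are generated rather than
# waiting for the full sequence to finish.
streamer = TextStreamer(tokenizer, skip_prompt=True)

prompt = "please write a story beginning with: Once upon a time"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
model.generate(
    input_ids=input_ids, max_new_tokens=500, repetition_penalty=1.2, streamer=streamer
)
```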