mmnga committed
Commit f88f8f9
1 Parent(s): 051ea46

Update README.md

Files changed (1)
  1. README.md +10 -5
README.md CHANGED
@@ -12,14 +12,20 @@ inference: false
 This model is an experimental model created by merging the experts of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
 
 # How we merged experts
-We simply take the average of each pair of experts' weights.
-The same goes for `gate.weight`.
-**Unfortunately, this model hallucinates heavily. See the extraction version: [mmnga/Mixtral-Extraction-4x7B-Instruct-v0.1](https://huggingface.co/mmnga/Mixtral-Extraction-4x7B-Instruct-v0.1)**
+Changed to merging with slerp (spherical linear interpolation).
+[Discussion](https://huggingface.co/mmnga/Mixtral-Fusion-4x7B-Instruct-v0.1/discussions/2)
+
+[Old merge version](https://huggingface.co/mmnga/Mixtral-Fusion-4x7B-Instruct-v0.1/tree/v0.1.0)
+~~We simply take the average of each pair of experts' weights.~~
+~~The same goes for `gate.weight`.~~
 
 # How To Convert
 Use Colab with a CPU high-memory runtime.
 [convert_mixtral_8x7b_to_4x7b.ipynb](https://huggingface.co/mmnga/Mixtral-Fusion-4x7B-Instruct-v0.1/blob/main/notebook/convert_mixtral_8x7b_to_4x7b.ipynb)
 
+# Other Models
+[mmnga/Mixtral-Extraction-4x7B-Instruct-v0.1](https://huggingface.co/mmnga/Mixtral-Extraction-4x7B-Instruct-v0.1)
+
 # Usage
 ~~~python
 pip install git+https://github.com/huggingface/transformers --upgrade
@@ -35,11 +41,10 @@ model_name_or_path = "mmnga/Mixtral-Fusion-4x7B-Instruct-v0.1"
 tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
 model = MixtralForCausalLM.from_pretrained(model_name_or_path, load_in_8bit=True)
 
-text = "Tell me what's for dinner tonight. "
+text = "[INST] What was John Holt's vision on education? [/INST] "
 inputs = tokenizer(text, return_tensors="pt")
 
 outputs = model.generate(**inputs, max_new_tokens=128)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
 ~~~
-
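Since the commit describes the new method in one line ("merging with slerp"), here is a minimal sketch of what a slerp merge of two expert weight tensors can look like. It is an illustration under assumptions, not the notebook's actual code: the `slerp` helper, flattening each tensor to a single vector, and the fallback to plain averaging for near-parallel weights are all choices made here for clarity.

~~~python
import torch

def slerp(w_a: torch.Tensor, w_b: torch.Tensor, t: float = 0.5, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (illustrative)."""
    a, b = w_a.flatten().float(), w_b.flatten().float()
    # Angle between the two weight vectors.
    cos_theta = torch.clamp(torch.dot(a, b) / (a.norm() * b.norm() + eps), -1.0, 1.0)
    theta = torch.acos(cos_theta)
    if theta.abs() < 1e-4:
        # Nearly parallel vectors: plain averaging is numerically safer.
        merged = (1 - t) * a + t * b
    else:
        sin_theta = torch.sin(theta)
        merged = (torch.sin((1 - t) * theta) / sin_theta) * a + (torch.sin(t * theta) / sin_theta) * b
    return merged.reshape(w_a.shape).to(w_a.dtype)

# Midpoint merge (t=0.5) of two same-shaped expert weight matrices:
w0, w1 = torch.randn(128, 64), torch.randn(128, 64)
merged = slerp(w0, w1, t=0.5)
~~~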
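For a rough picture of the conversion step itself (merge every two experts into one, and handle the router the same way), the sketch below pairs eight experts into four and shrinks the gate to match, reusing the `slerp` helper above. Every name and shape here is a hypothetical stand-in; the linked convert_mixtral_8x7b_to_4x7b.ipynb notebook is the authoritative procedure.

~~~python
import torch

def fuse_moe_layer(expert_weights: list[torch.Tensor], gate_weight: torch.Tensor):
    """Pairwise-merge 8 experts into 4 and shrink the router to match.

    expert_weights: 8 tensors, one weight matrix per expert (illustrative).
    gate_weight:    [8, hidden_size] router projection, one row per expert.
    Reuses the slerp() helper sketched above.
    """
    assert len(expert_weights) == 8 and gate_weight.shape[0] == 8
    fused_experts = [
        slerp(expert_weights[2 * i], expert_weights[2 * i + 1], t=0.5)
        for i in range(4)
    ]
    # Merge the matching pairs of gate rows so routing still lines up.
    fused_gate = torch.stack(
        [slerp(gate_weight[2 * i], gate_weight[2 * i + 1], t=0.5) for i in range(4)]
    )
    # The model config must also change: num_local_experts 8 -> 4.
    return fused_experts, fused_gate
~~~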