MoEMoEKKung committed on
Commit
342e3ff
1 Parent(s): 14850a7

Update README.md

  1. README.md +2 -1
README.md CHANGED
@@ -7,7 +7,8 @@ license: cc-by-nc-sa-4.0
 # Frankenstein-MoE
 
 ### Method
-To initialize the gate projection weight of the MoE layer, the H6 trainset was sampled and used. We sampled 400 and selected the final 30 with low PPL.
+To initialize the gate projection weight of the MoE layer, the H6 trainset was sampled and used. We sampled 400 and selected the final 30 with low PPL.
+
 trufulqa used gpt4 to generate data.
 
 ### Evals