JosephusCheung committed on
Commit
755d331
1 Parent(s): f2cbd96

Update README.md

  1. README.md +3 -1
README.md CHANGED
@@ -8,4 +8,6 @@ There are 8 completely different expert models based on Qwen-7B / CausalLM, six
 
 The initialization of the gate is based on the hidden state of the few-shot prompt input from each expert model and undergoes simple alignment training.
 
-Prompt format: ChatML
+Prompt format: ChatML
+
+A simple verification found that the expert model occasionally had routing errors, resulting in suboptimal results and required further fine-tuning.
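
For readers unfamiliar with the idea, here is a minimal sketch of how a gate could be initialized from per-expert hidden states and then spot-checked for routing errors. The README does not say how the gate is actually built, so everything here is an assumption for illustration: the helper names (`init_gate`, `route`, `routing_error_rate`), mean pooling, dot-product scoring, and the toy 3-dimensional vectors standing in for real hidden states.

```python
import math

def mean_vector(vectors):
    """Average a list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def init_gate(hidden_states_per_expert):
    """Hypothetical init: one gate row per expert, set to the normalized
    mean hidden state of that expert's few-shot prompts."""
    return [normalize(mean_vector(states)) for states in hidden_states_per_expert]

def route(gate, hidden, top_k=2):
    """Return the indices of the top_k experts by gate logit (dot product)."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in gate]
    ranked = sorted(range(len(gate)), key=lambda i: logits[i], reverse=True)
    return ranked[:top_k]

def routing_error_rate(gate, labeled_states):
    """Fraction of (hidden, expected_expert) pairs whose top-1 route differs."""
    wrong = sum(
        1 for hidden, expected in labeled_states
        if route(gate, hidden, top_k=1)[0] != expected
    )
    return wrong / len(labeled_states)

# Toy stand-ins for few-shot hidden states of three experts (assumed data).
toy_states = [
    [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]],
    [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]],
    [[0.0, 0.1, 1.0], [0.1, 0.0, 0.9]],
]
gate = init_gate(toy_states)

# Held-out inputs paired with the expert we would expect to handle them.
checks = [([0.8, 0.2, 0.0], 0), ([0.1, 0.7, 0.2], 1), ([0.5, 0.6, 0.0], 0)]
error_rate = routing_error_rate(gate, checks)
```

In this toy setup the third check lands on the wrong expert, which is the kind of occasional misrouting that a simple verification pass would surface and that further fine-tuning of the gate would aim to reduce.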