Amu committed on
Commit 0123ce0
1 parent: 5466830

Update README.md

Files changed (1):
  1. README.md +2 -0
README.md CHANGED
@@ -119,6 +119,8 @@ This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/m
 I think SPIN can be used not only on an SFT model but also on a pretrained model.
 Therefore, I applied SPIN to the pretrained model microsoft/phi-2 and obtained a higher score than the original pretrained model. You can check the [open llm leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
 
+However, the ultrachat_200k dataset is an alignment dataset for SFT models; I think an alignment dataset suited to pretrained models should be used instead.
+
 **I think the best paradigm for training a conversational Large Language Model (LLM) is:
 pretrain -> dpo(spin) -> sft -> dpo(spin)**
 
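The staged paradigm above can be sketched as a simple ordered pipeline. This is an illustrative outline only, not a training script: `run_stage`, `train`, and the stage tags are hypothetical placeholders (a real run would use actual pretraining and alignment trainers such as TRL's `SFTTrainer`/`DPOTrainer`); the sketch only demonstrates the proposed ordering of the four stages.

```python
# Hypothetical sketch of the proposed paradigm:
#   pretrain -> dpo(spin) -> sft -> dpo(spin)
# Stage names and run_stage are illustrative placeholders, not a real API.

PARADIGM = ["pretrain", "dpo(spin)", "sft", "dpo(spin)"]

def run_stage(model_state: str, stage: str) -> str:
    """Pretend to run one training stage, tagging the model state with it."""
    return f"{model_state}->{stage}"

def train(base: str = "base") -> str:
    """Apply every stage of the paradigm in order to a base model state."""
    state = base
    for stage in PARADIGM:
        state = run_stage(state, stage)
    return state

print(train())  # base->pretrain->dpo(spin)->sft->dpo(spin)
```

The point of the sketch is simply that SPIN-style DPO appears twice: once directly after pretraining (on a pretrained-model-appropriate alignment dataset, as argued above) and once after SFT.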