hyxmmm committed on
Commit
9fc5366
•
1 Parent(s): 8133219

Update README.md

Files changed (1)
  1. README.md +8 -1
README.md CHANGED
@@ -17,6 +17,13 @@ language:
 
 Infinity-Instruct-3M-0613-Llama3-70B is an open-source supervised instruction-tuned model, trained without reinforcement learning from human feedback (RLHF). It is finetuned only on [Infinity-Instruct-3M and Infinity-Instruct-0613](https://huggingface.co/datasets/BAAI/Infinity-Instruct) and shows favorable results on AlpacaEval 2.0 compared to GPT4-0613.
 
+ ## **News**
+ - 🔥🔥🔥[2024/06/28] We release the model weights of [InfInstruct-Llama3-70B 0613](https://huggingface.co/BAAI/Infinity-Instruct-3M-0613-Llama3-70B). It shows favorable results on AlpacaEval 2.0 compared to GPT4-0613 without RLHF.
+
+ - 🔥🔥🔥[2024/06/21] We release the model weights of [InfInstruct-Mistral-7B 0613](https://huggingface.co/BAAI/Infinity-Instruct-3M-0613-Mistral-7B). It shows favorable results on AlpacaEval 2.0 compared to Mixtral 8x7B v0.1, Gemini Pro, and GPT-3.5 without RLHF.
+
+ - 🔥🔥🔥[2024/06/13] We share the intermediate result of our data construction process (corresponding to [InfInstruct-3M](https://huggingface.co/datasets/BAAI/Infinity-Instruct) in the table below). Our ongoing efforts focus on risk assessment and data generation. The finalized version with 10 million instructions is scheduled for release in late June.
+
 ## **Training Details**
 
  <p align="center">
@@ -53,7 +60,7 @@ Thanks to [FlagScale](https://github.com/FlagOpen/FlagScale), we could concatena
 
 *denotes the model is finetuned without reinforcement learning from human feedback (RLHF).
 
- We evaluate Infinity-Instruct-3M-0613-Llama3-70B on the two most popular instructions following benchmarks. Mt-Bench is a set of challenging multi-turn questions including code, math and routine dialogue. AlpacaEval2.0 is based on AlpacaFarm evaluation set. Both of these two benchmarks use GPT-4 to judge the model answer. AlpacaEval2.0 displays a high agreement rate with human-annotated benchmark, Chatbot Arena. The result shows that InfInstruct-3M-0613-Llama3-70B achieved 31.2 in AlpacaEval2.0, which is higher than the 30.4 of GPT4-0613 Turbo although it does not yet use RLHF.
+ We evaluate Infinity-Instruct-3M-0613-Llama3-70B on the two most popular instruction-following benchmarks. MT-Bench is a set of challenging multi-turn questions covering code, math, and routine dialogue. AlpacaEval 2.0 is based on the AlpacaFarm evaluation set. Both benchmarks use GPT-4 to judge model answers, and AlpacaEval 2.0 shows a high agreement rate with the human-annotated Chatbot Arena benchmark. InfInstruct-3M-0613-Llama3-70B achieves 31.2 on AlpacaEval 2.0, higher than the 30.4 of GPT4-0613, even though it uses no RLHF.
 
 ## **How to use**
 
 
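The hunk above shows only the **How to use** heading, so the following is a minimal generation sketch rather than the model card's own example. It assumes the checkpoint loads with stock Hugging Face Transformers and ships the standard Llama-3 chat template; the prompt text and decoding settings are illustrative.

```python
# Minimal usage sketch; assumes a standard Llama-3-style chat checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BAAI/Infinity-Instruct-3M-0613-Llama3-70B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~140 GB of weights in bf16 for a 70B model
    device_map="auto",           # shard layers across all visible GPUs
)

# Format a single user turn with the tokenizer's built-in chat template.
messages = [{"role": "user", "content": "Explain supervised instruction tuning in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding; drop the prompt tokens before decoding the reply.
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

With `device_map="auto"`, Accelerate spreads the bf16 weights across whatever GPUs are available; on smaller machines a quantized load (for example, 4-bit via `bitsandbytes`) would be needed instead.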