xwinxu committed on
Commit
31efc9a
1 Parent(s): 8d0fec9

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -17,12 +17,13 @@ metrics:
  - accuracy
  ---
 
-
  This repo contains the model and tokenizer checkpoints for:
  - model family [<b>mistralai/Mistral-7B-Instruct-v0.2</b>](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
  - optimized with the loss [<b>KTO</b>](https://twitter.com/winniethexu/status/1732839295365554643)
  - aligned using the [snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset](https://huggingface.co/datasets/snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset)
- - via 3 iterations of KTO on one epoch of each training partition.
+ - via 3 iterations of KTO on one epoch of each training partition, each previous iteration's model serving as the reference for the subsequent.
+
+ **[03/06/2024]**: We are #2 on the (verified) [Alpaca Eval 2.0 Leaderboard](https://tatsu-lab.github.io/alpaca_eval/) scoring **33.23**!
 
  To prompt this model, ensure that the format is consistent with that of TuluV2.
  For example, a prompt should be formatted as follows, where `<|user|>` corresponds to the human's role and `<|assistant|>` corresponds to the LLM's role.