ContextualAI/Contextual_KTO_Mistral_PairRM


xwinxu committed on Mar 7

Commit 31efc9a • 1 Parent(s): 8d0fec9

Update README.md

Files changed (1)

README.md +3 -2


@@ -17,12 +17,13 @@ metrics:
 - accuracy
 ---
 This repo contains the model and tokenizer checkpoints for:
 - model family [<b>mistralai/Mistral-7B-Instruct-v0.2</b>](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
 - optimized with the loss [<b>KTO</b>](https://twitter.com/winniethexu/status/1732839295365554643)
 - aligned using the [snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset](https://huggingface.co/datasets/snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset)
-- via 3 iterations of KTO on one epoch of each training partition.
+- via 3 iterations of KTO on one epoch of each training partition, with each previous iteration's model serving as the reference for the subsequent one.
+**[03/06/2024]**: We are #2 on the (verified) [Alpaca Eval 2.0 Leaderboard](https://tatsu-lab.github.io/alpaca_eval/) scoring **33.23**!
 To prompt this model, ensure that the format is consistent with that of TuluV2.
 For example, a prompt should be formatted as follows, where `<|user|>` corresponds to the human's role and `<|assistant|>` corresponds to the LLM's role.