Update README.md
README.md
@@ -24,6 +24,14 @@ Note that this model has a non-commercial license as we used the Command R and C
We are currently working on developing a commercially usable model, so stay tuned for that!

+# Model list
+
+We have ORPO-trained the following models using different proportions of the [lightblue/mitsu](https://huggingface.co/datasets/lightblue/mitsu) dataset (an illustrative training sketch follows the list):
+* Trained on the top/bottom responses of all prompts in the dataset: [lightblue/suzume-llama-3-8B-multilingual-orpo-borda-full](https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-full)
+* Trained on the top/bottom responses of the 75% of prompts with the most consistently ranked responses: [lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top75](https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top75)
+* Trained on the top/bottom responses of the 50% of prompts with the most consistently ranked responses: [lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half](https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half)
+* Trained on the top/bottom responses of the 25% of prompts with the most consistently ranked responses: [lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top25](https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top25)
+
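The top/bottom pair construction maps directly onto standard preference-tuning tooling. As a minimal sketch of this kind of training, using TRL's `ORPOTrainer` (not necessarily the exact training stack used here; the base checkpoint, hyperparameters, and toy preference pairs below are placeholders):

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

# Placeholder base checkpoint; the listed models are Llama 3 8B based.
base_model = "lightblue/suzume-llama-3-8B-multilingual"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Toy stand-in for the real data: for each prompt kept after the
# ranking-consistency filter, the top-ranked response becomes "chosen"
# and the bottom-ranked response becomes "rejected".
pairs = Dataset.from_dict({
    "prompt": ["What is the capital of France?"],
    "chosen": ["The capital of France is Paris."],
    "rejected": ["France's capital is Berlin."],
})

config = ORPOConfig(
    output_dir="suzume-orpo",  # placeholder
    beta=0.1,                  # ORPO's odds-ratio weight; placeholder value
    max_length=1024,
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=pairs,
    tokenizer=tokenizer,  # named `processing_class` in newer TRL releases
)
trainer.train()
```

In the real runs, the chosen/rejected pairs come from the ranked responses in lightblue/mitsu rather than the toy rows above.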
# Model results

We compare the MT-Bench scores across 6 languages for our 4 ORPO-trained models, as well as some baselines: