wandb/zephyr-orpo-7b-v0.2

tcapelle committed on Apr 21, 2024

Commit ba89a59 • 1 Parent(s): 8ebe28b

Update README.md

Files changed (1)

README.md +20 -0

@@ -28,4 +28,24 @@ We trained using the [alignment handbook recipe](https://github.com/huggingface/
 Visit the [W&B workspace here](https://wandb.ai/llm_surgery/mistral_zephyr_orpo_v0.2?nw=nwusercapecape)
 ## Trained on a single H100 for 2 hours!

 Visit the [W&B workspace here](https://wandb.ai/llm_surgery/mistral_zephyr_orpo_v0.2?nw=nwusercapecape)
+## Results:
+- MT bench
+```
+########## First turn ##########
+                            score
+model               turn
+zephyr-orpo-7b-v0.2 1     7.44375
+########## Second turn ##########
+                          score
+model               turn
+zephyr-orpo-7b-v0.2 2     6.875
+########## Average ##########
+                        score
+model
+zephyr-orpo-7b-v0.2  7.159375
+```
 ## Trained on a single H100 for 2 hours!