Commit 4941dda
Parent(s): a954f23

Update README.md
README.md (CHANGED)
@@ -23,11 +23,11 @@ model-index:
 
 
 # Model Card for Notux 8x7B-v1
-This model is a preference-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/argilla/ultrafeedback-binarized-preferences-cleaned) dataset using DPO (Direct Preference Optimization)
+This model is a preference-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/argilla/ultrafeedback-binarized-preferences-cleaned) dataset using DPO (Direct Preference Optimization).
 
-
+As of Dec 26th 2023, it outperforms `Mixtral-8x7B-Instruct-v0.1` and is the top ranked MoE (Mixture of Experts) model on the Hugging Face Open LLM Leaderboard.
 
-
+This is part of the Notus family of models and experiments, where the Argilla team investigates data-first and preference tuning methods like dDPO (distilled DPO). This model is the result of our first experiment at tuning a MoE model that has already been fine-tuned with DPO (i.e., Mixtral-8x7B-Instruct-v0.1).
 
 ## Model Details
 
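The paragraph added in this commit describes the model as a DPO (Direct Preference Optimization) preference-tuned version of `mistralai/Mixtral-8x7B-Instruct-v0.1` on the `argilla/ultrafeedback-binarized-preferences-cleaned` dataset. The commit itself contains no training code, so the following is only a minimal, hypothetical sketch of such a run with TRL's `DPOTrainer` (using the pre-0.8 constructor, where `beta`, `tokenizer`, and the length limits are passed directly); the hyperparameters, the `beta` value, the output path, and the assumption that the dataset exposes plain-string `prompt`/`chosen`/`rejected` columns are all placeholders, not the published Notux recipe.

```python
# Hypothetical DPO sketch, NOT the actual Notux 8x7B-v1 training code.
# Assumes trl<0.8 (DPOTrainer takes beta/tokenizer/max_length directly) and that the
# dataset already exposes plain-string "prompt", "chosen", and "rejected" columns.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base_model = "mistralai/Mixtral-8x7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Mixtral's tokenizer ships without a pad token

# Illustrative only: an 8x7B MoE needs multi-GPU sharding or (Q)LoRA to train in practice.
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)

dataset = load_dataset("argilla/ultrafeedback-binarized-preferences-cleaned", split="train")

training_args = TrainingArguments(
    output_dir="notux-dpo-sketch",   # placeholder path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-7,
    num_train_epochs=1,
    bf16=True,
    logging_steps=10,
    remove_unused_columns=False,     # keep the preference columns for DPOTrainer's collator
)

trainer = DPOTrainer(
    model,
    ref_model=None,      # TRL builds a frozen reference copy of the policy when None
    beta=0.1,            # placeholder DPO temperature, not the published value
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    max_length=2048,
    max_prompt_length=1024,
)
trainer.train()
```

Leaving `ref_model=None` is the simplest setup when no separate reference checkpoint is kept: TRL freezes a copy of the starting policy and computes the DPO loss against it.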