Commit 4941dda
Parent(s): a954f23

Update README.md
README.md (CHANGED)
@@ -23,11 +23,11 @@ model-index:
 
 
 # Model Card for Notux 8x7B-v1
-This model is a preference-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/argilla/ultrafeedback-binarized-preferences-cleaned) dataset using DPO (Direct Preference Optimization)
+This model is a preference-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/argilla/ultrafeedback-binarized-preferences-cleaned) dataset using DPO (Direct Preference Optimization).
 
-
+As of Dec 26th 2023, it outperforms `Mixtral-8x7B-Instruct-v0.1` and is the top ranked MoE (Mixture of Experts) model on the Hugging Face Open LLM Leaderboard.
 
-
+This is part of the Notus family of models and experiments, where the Argilla team investigates data-first and preference tuning methods like dDPO (distilled DPO). This model is the result of our first experiment at tuning a MoE model that has already been fine-tuned with DPO (i.e., Mixtral-8x7B-Instruct-v0.1).
 
 ## Model Details
 
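The paragraph added in this commit describes the model as a DPO (Direct Preference Optimization) preference-tuned version of `mistralai/Mixtral-8x7B-Instruct-v0.1` on the `argilla/ultrafeedback-binarized-preferences-cleaned` dataset. The commit itself contains no training code, so the following is only a minimal, hypothetical sketch of such a run with TRL's `DPOTrainer` (using the pre-0.8 constructor, where `beta`, `tokenizer`, and the length limits are passed directly); the hyperparameters, the `beta` value, the output path, and the assumption that the dataset exposes plain-string `prompt`/`chosen`/`rejected` columns are all placeholders, not the published Notux recipe.

```python
# Hypothetical DPO sketch, NOT the actual Notux 8x7B-v1 training code.
# Assumes trl<0.8 (DPOTrainer takes beta/tokenizer/max_length directly) and that the
# dataset already exposes plain-string "prompt", "chosen", and "rejected" columns.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base_model = "mistralai/Mixtral-8x7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Mixtral's tokenizer ships without a pad token

# Illustrative only: an 8x7B MoE needs multi-GPU sharding or (Q)LoRA to train in practice.
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)

dataset = load_dataset("argilla/ultrafeedback-binarized-preferences-cleaned", split="train")

training_args = TrainingArguments(
    output_dir="notux-dpo-sketch",   # placeholder path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-7,
    num_train_epochs=1,
    bf16=True,
    logging_steps=10,
    remove_unused_columns=False,     # keep the preference columns for DPOTrainer's collator
)

trainer = DPOTrainer(
    model,
    ref_model=None,      # TRL builds a frozen reference copy of the policy when None
    beta=0.1,            # placeholder DPO temperature, not the published value
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    max_length=2048,
    max_prompt_length=1024,
)
trainer.train()
```

Leaving `ref_model=None` is the simplest setup when no separate reference checkpoint is kept: TRL freezes a copy of the starting policy and computes the DPO loss against it.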