Commit 085b914
Parent(s): 822d483

Fix wrong `argilla/ultrafeedback-binarized-preferences-cleaned` links

README.md
```diff
@@ -23,7 +23,8 @@ model-index:
 
 
 # Model Card for Notux 8x7B-v1
-This model is a preference-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/argilla/ultrafeedback-binarized-preferences-cleaned) dataset using DPO (Direct Preference Optimization).
+
+This model is a preference-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned) dataset using DPO (Direct Preference Optimization).
 
 As of Dec 26th 2023, it outperforms `Mixtral-8x7B-Instruct-v0.1` and is the top ranked MoE (Mixture of Experts) model on the [Hugging Face Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
 
@@ -53,7 +54,7 @@ We used a VM with 8 x H100 40GB hosted in runpod.io for 1 epoch (~10hr)
 
 ### Training Data
 
-We used a new iteration of the Argilla UltraFeedback preferences dataset named [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/argilla/ultrafeedback-binarized-preferences-cleaned).
+We used a new iteration of the Argilla UltraFeedback preferences dataset named [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned).
 
 
 ## Training procedure
```
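The broken links come down to a Hub URL convention: model repos live at `huggingface.co/<owner>/<name>`, while dataset repos live under `huggingface.co/datasets/<owner>/<name>`. A minimal sketch of hypothetical helpers (not part of `huggingface_hub`; the names are ours) that build the two URL forms:

```python
# Hypothetical helpers illustrating the Hub URL convention behind this fix.
# They are not part of any Hugging Face library.

HUB = "https://huggingface.co"

def model_url(repo_id: str) -> str:
    # Model repos sit at the root of the Hub namespace.
    return f"{HUB}/{repo_id}"

def dataset_url(repo_id: str) -> str:
    # Dataset repos live under the /datasets/ prefix; omitting it (the bug
    # fixed in this commit) produces a link to a non-existent model page.
    return f"{HUB}/datasets/{repo_id}"

print(dataset_url("argilla/ultrafeedback-binarized-preferences-cleaned"))
# -> https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned
```

Note that `datasets.load_dataset()` takes the bare repo id (`"argilla/ultrafeedback-binarized-preferences-cleaned"`) without the `/datasets/` prefix, which is exactly why the web URL is easy to get wrong when writing a model card by hand.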