ArianAskari/NeuralHermes-2.5-Mistral-7B


ArianAskari committed on Jan 31, 2024

Commit 1c42183 · verified · 1 Parent(s): 40d42d6

Update README.md

Files changed (1)

README.md +8 -0


@@ -1,3 +1,11 @@
 ---
 license: mit
 language:

+A variation of NeuralHermes 2.5 - Mistral 7B
+This is a variation of NeuralHermes, which is the teknium/OpenHermes-2.5-Mistral-7B model further fine-tuned with Direct Preference Optimization (DPO) on the mlabonne/chatml_dpo_pairs dataset. It surpasses the original model on most benchmarks (see results).
+It is directly inspired by the RLHF process that the authors of Intel/neural-chat-7b-v3-1 describe for improving performance. I used the same dataset and reformatted it to apply the ChatML template.
+The code to train this model is available on Google Colab and GitHub. Training took about an hour on an A100 GPU.
 ---
 license: mit
 language: