Files changed (1)
  1. README.md +131 -0
README.md ADDED
@@ -0,0 +1,131 @@
+ ---
+ license: cc-by-nc-4.0
+ language:
+ - en
+ tags:
+ - merge
+ - lazymergekit
+ - gguf
+ - rlhf
+ - dpo
+ ---
+
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/TI7C8F2gk43gmI9U2L0uk.jpeg)
+
+ # 👑 AlphaMonarch-7B
+
+ **tl;dr: AlphaMonarch-7B is a new DPO merge that retains all the reasoning abilities of the very best merges and significantly improves its conversational abilities. Kind of the best of both worlds in a 7B model. 🎉**
+
+ AlphaMonarch-7B is a DPO fine-tune of [mlabonne/NeuralMonarch-7B](https://huggingface.co/mlabonne/NeuralMonarch-7B/) using the [argilla/OpenHermes2.5-dpo-binarized-alpha](https://huggingface.co/datasets/argilla/OpenHermes2.5-dpo-binarized-alpha) preference dataset.
+
+ It is based on a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
+ * [mlabonne/OmniTruthyBeagle-7B-v0](https://huggingface.co/mlabonne/OmniTruthyBeagle-7B-v0)
+ * [mlabonne/NeuBeagle-7B](https://huggingface.co/mlabonne/NeuBeagle-7B)
+ * [mlabonne/NeuralOmniBeagle-7B](https://huggingface.co/mlabonne/NeuralOmniBeagle-7B)
+
+ Special thanks to [Jon Durbin](https://huggingface.co/jondurbin), [Intel](https://huggingface.co/Intel), [Argilla](https://huggingface.co/argilla), and [Teknium](https://huggingface.co/teknium) for the preference datasets.
+
+ **Try the demo**: https://huggingface.co/spaces/mlabonne/AlphaMonarch-7B-GGUF-Chat
+
+ ## 🔍 Applications
+
+ This model has a context window of 8k tokens. I recommend using it with the Mistral Instruct chat template (works perfectly with LM Studio).
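
As a rough illustration of what a Mistral Instruct style prompt looks like, the sketch below builds one by hand. The `build_mistral_prompt` helper is hypothetical, and the exact whitespace and special tokens are assumptions based on the commonly documented `[INST] ... [/INST]` layout. In practice, the chat template shipped with the model's tokenizer is the authoritative source.

```python
from typing import Dict, List

# Assumed special tokens for Mistral-style tokenizers (not taken from this README).
BOS, EOS = "<s>", "</s>"


def build_mistral_prompt(messages: List[Dict[str, str]]) -> str:
    """Hypothetical helper: render alternating user/assistant turns as one prompt string."""
    prompt = BOS
    for i, message in enumerate(messages):
        expected_role = "user" if i % 2 == 0 else "assistant"
        if message["role"] != expected_role:
            raise ValueError("roles must alternate user/assistant, starting with user")
        if message["role"] == "user":
            # User turns are wrapped in instruction markers.
            prompt += f"[INST] {message['content']} [/INST]"
        else:
            # Assistant turns follow the closing marker and end with EOS.
            prompt += f" {message['content']}{EOS}"
    return prompt


messages = [{"role": "user", "content": "What is a large language model?"}]
print(build_mistral_prompt(messages))
```

Building the string by hand is only useful for inspecting the format; for real inference, letting the tokenizer apply its own template avoids subtle spacing mismatches.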
+
+ It is one of the very best 7B models in terms of instruction following and reasoning abilities and can be used for conversations, RP, and storytelling. Note that it tends to write in a quite formal and sophisticated style, but this can be changed by modifying the prompt.
+
+ ## ⚡ Quantized models
+
+ * **GGUF**: https://huggingface.co/mlabonne/AlphaMonarch-7B-GGUF
+
+ ## 🏆 Evaluation
+
+ ### Nous
+
+ AlphaMonarch-7B is the best-performing 7B model on Nous' benchmark suite by average score (evaluation performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoeval)). See the entire leaderboard [here](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard).
+
+ | Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
+ |---|---:|---:|---:|---:|---:|
+ | [**AlphaMonarch-7B**](https://huggingface.co/mlabonne/AlphaMonarch-7B) [📄](https://gist.github.com/mlabonne/1d33c86824b3a11d2308e36db1ba41c1) | **62.74** | **45.37** | **77.01** | **78.39** | **50.2** |
+ | [NeuralMonarch-7B](https://huggingface.co/mlabonne/NeuralMonarch-7B) [📄](https://gist.github.com/mlabonne/64050c96c6aa261a8f5b403190c8dee4) | 62.73 | 45.31 | 76.99 | 78.35 | 50.28 |
+ | [Monarch-7B](https://huggingface.co/mlabonne/Monarch-7B) [📄](https://gist.github.com/mlabonne/0b8d057c5ece41e0290580a108c7a093) | 62.68 | 45.48 | 77.07 | 78.04 | 50.14 |
+ | [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) [📄](https://gist.github.com/mlabonne/88b21dd9698ffed75d6163ebdc2f6cc8) | 52.42 | 42.75 | 72.99 | 52.99 | 40.94 |
+ | [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B) [📄](https://gist.github.com/mlabonne/14687f1eb3425b166db511f31f8e66f6) | 53.51 | 43.67 | 73.24 | 55.37 | 41.76 |
+ | [mlabonne/NeuralBeagle14-7B](https://huggingface.co/mlabonne/NeuralBeagle14-7B) [📄](https://gist.github.com/mlabonne/ad0c665bbe581c8420136c3b52b3c15c) | 60.25 | 46.06 | 76.77 | 70.32 | 47.86 |
+ | [mlabonne/NeuralOmniBeagle-7B](https://huggingface.co/mlabonne/NeuralOmniBeagle-7B) [📄](https://gist.github.com/mlabonne/0e49d591787185fa5ae92ca5d9d4a1fd) | 62.3 | 45.85 | 77.26 | 76.06 | 50.03 |
+ | [eren23/dpo-binarized-NeuralTrix-7B](https://huggingface.co/eren23/dpo-binarized-NeuralTrix-7B) [📄](https://gist.github.com/CultriX-Github/dbdde67ead233df0c7c56f1b091f728c) | 62.5 | 44.57 | 76.34 | 79.81 | 49.27 |
+ | [CultriX/NeuralTrix-7B-dpo](https://huggingface.co/CultriX/NeuralTrix-7B-dpo) [📄](https://gist.github.com/CultriX-Github/df0502599867d4043b45d9dafb5976e8) | 62.5 | 44.61 | 76.33 | 79.8 | 49.24 |
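
The Average column appears to be the unweighted mean of the four benchmark scores. A quick check on the top three rows, using the numbers from the table (the `scores` dictionary and variable names are only for illustration):

```python
# Per-benchmark scores copied from the table: AGIEval, GPT4All, TruthfulQA, Bigbench.
scores = {
    "AlphaMonarch-7B": (45.37, 77.01, 78.39, 50.2),
    "NeuralMonarch-7B": (45.31, 76.99, 78.35, 50.28),
    "Monarch-7B": (45.48, 77.07, 78.04, 50.14),
}

# Unweighted mean of the four benchmarks, rounded to two decimals like the table.
averages = {name: round(sum(values) / len(values), 2) for name, values in scores.items()}

for name, average in averages.items():
    print(f"{name}: {average}")
```

The gap between the top three models is at most 0.06 points on this average, so the ranking among them should be read with that in mind.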
+
+ ### EQ-bench
+
+ AlphaMonarch-7B also outperforms 70B and 120B parameter models on [EQ-bench](https://eqbench.com/), created by [Samuel J. Paech](https://twitter.com/sam_paech), who kindly ran the evaluations.
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/dnCFxieqLiAC3Ll6CfdZW.png)
+
+ ### MT-Bench
+
+ ```
+ ########## First turn ##########
+                           score
+ model             turn
+ gpt-4             1     8.95625
+ OmniBeagle-7B     1     8.31250
+ AlphaMonarch-7B   1     8.23750
+ claude-v1         1     8.15000
+ NeuralMonarch-7B  1     8.09375
+ gpt-3.5-turbo     1     8.07500
+ claude-instant-v1 1     7.80000
+
+ ########## Second turn ##########
+                            score
+ model             turn
+ gpt-4             2     9.025000
+ claude-instant-v1 2     8.012658
+ OmniBeagle-7B     2     7.837500
+ gpt-3.5-turbo     2     7.812500
+ claude-v1         2     7.650000
+ AlphaMonarch-7B   2     7.618750
+ NeuralMonarch-7B  2     7.375000
+
+ ########## Average ##########
+                      score
+ model
+ gpt-4             8.990625
+ OmniBeagle-7B     8.075000
+ gpt-3.5-turbo     7.943750
+ AlphaMonarch-7B   7.928125
+ claude-instant-v1 7.905660
+ claude-v1         7.900000
+ NeuralMonarch-7B  7.734375
+ NeuralBeagle14-7B 7.628125
+ ```
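
For most models, the reported average equals the mean of the two turn scores. claude-instant-v1 is the exception: its 7.905660 is consistent with pooling individual judgments when one second-turn judgment is missing. The sketch below checks this arithmetic. The counts of 80 and 79 judgments are assumptions inferred from MT-Bench's 80 questions and from the numbers themselves, not something stated in the output above, and `pooled_average` is a hypothetical helper.

```python
def pooled_average(turn1: float, turn2: float, n1: int = 80, n2: int = 80) -> float:
    """Average over all individual judgments, given per-turn means and counts."""
    return (turn1 * n1 + turn2 * n2) / (n1 + n2)


# Equal counts: the pooled average is simply the mean of the two turn scores.
alpha = pooled_average(8.23750, 7.618750)

# Assumed 79 second-turn judgments for claude-instant-v1 (one missing).
claude_instant = pooled_average(7.80000, 8.012658, n2=79)

# Plain mean of the two turn scores, for comparison.
claude_instant_simple = (7.80000 + 8.012658) / 2

print(round(alpha, 6))
print(round(claude_instant, 6))
print(round(claude_instant_simple, 6))
```

Under these assumptions, the pooled value matches the reported claude-instant-v1 average to within rounding, while the plain mean of the two turns does not.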
+
+ ### Open LLM Leaderboard
+
+ AlphaMonarch-7B is one of the best-performing non-merge 7B models on the Open LLM Leaderboard:
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/njHxX_ERQaBssHqp17fMy.png)
+
+ ## 💻 Usage
+
+ ```python
+ # Notebook cell: the leading "!" runs pip as a shell command (Colab/Jupyter).
+ !pip install -qU transformers accelerate
+
+ from transformers import AutoTokenizer
+ import transformers
+ import torch
+
+ model = "mlabonne/AlphaMonarch-7B"
+ messages = [{"role": "user", "content": "What is a large language model?"}]
+
+ # Format the conversation with the chat template stored in the tokenizer.
+ tokenizer = AutoTokenizer.from_pretrained(model)
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+
+ # Load the model in half precision and let accelerate place it on the available devices.
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model,
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+
+ # Sample up to 256 new tokens; the returned text includes the prompt.
+ outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+ print(outputs[0]["generated_text"])
+ ```