blackmount8 committed
Commit fd866ea • 1 Parent(s): c368e60

Initial commit.
README.md CHANGED
@@ -1,3 +1,164 @@
  ---
- license: mit
+ language:
+ - en
+ tags:
+ - llama-2
+ - self-instruct
+ - distillation
+ - synthetic instruction
+ license:
+ - mit
  ---
+
+ # blackmount8/Nous-Hermes-Llama2-13b-int8_float16
+
+ Int8_float16 version of [NousResearch/Nous-Hermes-Llama2-13b](https://huggingface.co/NousResearch/Nous-Hermes-Llama2-13b), quantized using CTranslate2.
+
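+ For reference, a quantized build like this one can typically be produced with CTranslate2's Transformers converter; the snippet below is a minimal sketch (the output directory name is illustrative):
+
+ ```
+ from ctranslate2.converters import TransformersConverter
+
+ # Convert the original checkpoint to CTranslate2 format, quantizing
+ # weights to int8 with float16 activations.
+ converter = TransformersConverter("NousResearch/Nous-Hermes-Llama2-13b")
+ converter.convert("Nous-Hermes-Llama2-13b-int8_float16", quantization="int8_float16")
+ ```
+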
+ # Model Card: Nous-Hermes-Llama2-13b
+
+ Compute provided by our project sponsor Redmond AI, thank you! Follow RedmondAI on Twitter @RedmondAI.
+
+ ## Model Description
+
+ Nous-Hermes-Llama2-13b is a state-of-the-art language model fine-tuned on over 300,000 instructions. It was fine-tuned by Nous Research, with Teknium and Emozilla leading the fine-tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors.
+
+ This Hermes model uses the exact same dataset as Hermes on Llama-1. This ensures consistency between the old Hermes and the new, for anyone who wants a model as close to the original Hermes as possible, just more capable.
+
+ This model stands out for its long responses, lower hallucination rate, and absence of OpenAI censorship mechanisms. The fine-tuning was performed with a 4096 sequence length on an 8x A100 80GB DGX machine.
+
+ ## Example Outputs
+ ![Example4](https://huggingface.co/NousResearch/Nous-Hermes-Llama2-13b/resolve/main/example5.png "Example 4")
+ ![Example1](https://huggingface.co/NousResearch/Nous-Hermes-Llama2-13b/resolve/main/Example1.png "Example 1")
+ ![Example2](https://huggingface.co/NousResearch/Nous-Hermes-Llama2-13b/resolve/main/example2.png "Example 2")
+ ![Example3](https://huggingface.co/NousResearch/Nous-Hermes-Llama2-13b/resolve/main/example3.png "Example 3")
+
+ ## Model Training
+
+ The model was trained almost entirely on synthetic GPT-4 outputs. Curating high-quality GPT-4 datasets enables incredibly high quality in knowledge, task completion, and style.
+
+ This includes data from diverse sources such as GPTeacher (the general, roleplay v1&2, and code-instruct datasets), Nous Instruct & PDACTL (unpublished), and several others, detailed further below.
+
+ ## Collaborators
+ The model fine-tuning and the datasets were a collaboration of efforts and resources between Teknium, Karan4D, Emozilla, Huemin Art, and Redmond AI.
+
+ Special mention goes to @winglian for assisting with some of the training issues.
+
+ A huge shoutout and acknowledgement is deserved for all the dataset creators who generously share their datasets openly.
+
+ Among the contributors of datasets:
+ - GPTeacher, made available by Teknium
+ - Wizard LM, by nlpxucan
+ - Nous Research Instruct Dataset, provided by Karan4D and Huemin Art
+ - GPT4-LLM and Unnatural Instructions, provided by Microsoft
+ - Airoboros dataset, by jondurbin
+ - Camel-AI's domain expert datasets, from Camel-AI
+ - CodeAlpaca dataset, by Sahil 2801
+
+ If anyone was left out, please open a thread in the community tab.
+
+ ## Prompt Format
+
+ The model follows the Alpaca prompt format:
+ ```
+ ### Instruction:
+ <prompt>
+
+ ### Response:
+ <leave a newline blank for model to respond>
+
+ ```
+
+ or
+
+ ```
+ ### Instruction:
+ <prompt>
+
+ ### Input:
+ <additional context>
+
+ ### Response:
+ <leave a newline blank for model to respond>
+
+ ```
+
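+ As a minimal illustration of assembling these layouts in code (the helper below is our own, not part of the model):
+
+ ```
+ def build_prompt(instruction, context=None):
+     # Alpaca-style prompt; the trailing newline leaves a blank line
+     # for the model to respond on.
+     if context is None:
+         return f"### Instruction:\n{instruction}\n\n### Response:\n"
+     return f"### Instruction:\n{instruction}\n\n### Input:\n{context}\n\n### Response:\n"
+ ```
+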
+ ## Benchmark Results
+ AGI-Eval
+ ```
+ |             Task             |Version| Metric |Value |   |Stderr|
+ |------------------------------|------:|--------|-----:|---|-----:|
+ |agieval_aqua_rat              |      0|acc     |0.2362|±  |0.0267|
+ |                              |       |acc_norm|0.2480|±  |0.0272|
+ |agieval_logiqa_en             |      0|acc     |0.3425|±  |0.0186|
+ |                              |       |acc_norm|0.3472|±  |0.0187|
+ |agieval_lsat_ar               |      0|acc     |0.2522|±  |0.0287|
+ |                              |       |acc_norm|0.2087|±  |0.0269|
+ |agieval_lsat_lr               |      0|acc     |0.3510|±  |0.0212|
+ |                              |       |acc_norm|0.3627|±  |0.0213|
+ |agieval_lsat_rc               |      0|acc     |0.4647|±  |0.0305|
+ |                              |       |acc_norm|0.4424|±  |0.0303|
+ |agieval_sat_en                |      0|acc     |0.6602|±  |0.0331|
+ |                              |       |acc_norm|0.6165|±  |0.0340|
+ |agieval_sat_en_without_passage|      0|acc     |0.4320|±  |0.0346|
+ |                              |       |acc_norm|0.4272|±  |0.0345|
+ |agieval_sat_math              |      0|acc     |0.2909|±  |0.0307|
+ |                              |       |acc_norm|0.2727|±  |0.0301|
+ ```
+ GPT-4All Benchmark Set
+ ```
+ |    Task     |Version| Metric |Value |   |Stderr|
+ |-------------|------:|--------|-----:|---|-----:|
+ |arc_challenge|      0|acc     |0.5102|±  |0.0146|
+ |             |       |acc_norm|0.5213|±  |0.0146|
+ |arc_easy     |      0|acc     |0.7959|±  |0.0083|
+ |             |       |acc_norm|0.7567|±  |0.0088|
+ |boolq        |      1|acc     |0.8394|±  |0.0064|
+ |hellaswag    |      0|acc     |0.6164|±  |0.0049|
+ |             |       |acc_norm|0.8009|±  |0.0040|
+ |openbookqa   |      0|acc     |0.3580|±  |0.0215|
+ |             |       |acc_norm|0.4620|±  |0.0223|
+ |piqa         |      0|acc     |0.7992|±  |0.0093|
+ |             |       |acc_norm|0.8069|±  |0.0092|
+ |winogrande   |      0|acc     |0.7127|±  |0.0127|
+ ```
+ BigBench Reasoning Test
+ ```
+ |                      Task                      |Version|       Metric        |Value |   |Stderr|
+ |------------------------------------------------|------:|---------------------|-----:|---|-----:|
+ |bigbench_causal_judgement                       |      0|multiple_choice_grade|0.5526|±  |0.0362|
+ |bigbench_date_understanding                     |      0|multiple_choice_grade|0.7344|±  |0.0230|
+ |bigbench_disambiguation_qa                      |      0|multiple_choice_grade|0.2636|±  |0.0275|
+ |bigbench_geometric_shapes                       |      0|multiple_choice_grade|0.0195|±  |0.0073|
+ |                                                |       |exact_str_match      |0.0000|±  |0.0000|
+ |bigbench_logical_deduction_five_objects         |      0|multiple_choice_grade|0.2760|±  |0.0200|
+ |bigbench_logical_deduction_seven_objects        |      0|multiple_choice_grade|0.2100|±  |0.0154|
+ |bigbench_logical_deduction_three_objects        |      0|multiple_choice_grade|0.4400|±  |0.0287|
+ |bigbench_movie_recommendation                   |      0|multiple_choice_grade|0.2440|±  |0.0192|
+ |bigbench_navigate                               |      0|multiple_choice_grade|0.4950|±  |0.0158|
+ |bigbench_reasoning_about_colored_objects        |      0|multiple_choice_grade|0.5570|±  |0.0111|
+ |bigbench_ruin_names                             |      0|multiple_choice_grade|0.3728|±  |0.0229|
+ |bigbench_salient_translation_error_detection    |      0|multiple_choice_grade|0.1854|±  |0.0123|
+ |bigbench_snarks                                 |      0|multiple_choice_grade|0.6298|±  |0.0360|
+ |bigbench_sports_understanding                   |      0|multiple_choice_grade|0.6156|±  |0.0155|
+ |bigbench_temporal_sequences                     |      0|multiple_choice_grade|0.3140|±  |0.0147|
+ |bigbench_tracking_shuffled_objects_five_objects |      0|multiple_choice_grade|0.2032|±  |0.0114|
+ |bigbench_tracking_shuffled_objects_seven_objects|      0|multiple_choice_grade|0.1406|±  |0.0083|
+ |bigbench_tracking_shuffled_objects_three_objects|      0|multiple_choice_grade|0.4400|±  |0.0287|
+ ```
+
+ These are the highest benchmark scores Hermes has achieved on every metric:
+ - GPT4All benchmark average is now 70.0, up from 68.8 in Hermes-Llama1
+ - 0.3657 on BigBench, up from 0.328 on Hermes-Llama1
+ - 0.372 on AGIEval, up from 0.354 on Hermes-Llama1
+
+ These benchmarks currently place us at #1 on ARC-c, ARC-e, Hellaswag, and OpenBookQA, and at 2nd place on Winogrande, compared against GPT4All's benchmarking list, supplanting Hermes 1 for the top position.
+
+ ## Resources for Applied Use Cases
+ For an example of a back-and-forth chatbot using Hugging Face Transformers and Discord, check out: https://github.com/teknium1/alpaca-discord
+ For an example of a roleplaying Discord chatbot, check out: https://github.com/teknium1/alpaca-roleplay-discordbot
+
+ ## Future Plans
+ We plan to continue iterating on both more high-quality data and new data-filtering techniques to eliminate lower-quality data going forward.
+
+ ## Model Usage
+ The model is available for download on Hugging Face. It is suitable for a wide range of language tasks, from generating creative text to understanding and following complex instructions.
+
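+ A minimal sketch of running this CTranslate2 build with the `ctranslate2` and `transformers` Python packages (the prompt and sampling settings are illustrative):
+
+ ```
+ import ctranslate2
+ import transformers
+ from huggingface_hub import snapshot_download
+
+ # Fetch the CTranslate2 files; ctranslate2.Generator expects a local directory.
+ model_dir = snapshot_download("blackmount8/Nous-Hermes-Llama2-13b-int8_float16")
+
+ # int8_float16 targets CUDA devices; fall back to compute_type="int8" on CPU.
+ generator = ctranslate2.Generator(model_dir, device="cuda", compute_type="int8_float16")
+ tokenizer = transformers.AutoTokenizer.from_pretrained(model_dir)
+
+ prompt = "### Instruction:\nWrite a haiku about autumn.\n\n### Response:\n"
+ tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))
+
+ results = generator.generate_batch(
+     [tokens], max_length=256, sampling_topk=40, sampling_temperature=0.7
+ )
+ print(tokenizer.decode(results[0].sequences_ids[0]))
+ ```
+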
+ [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
added_tokens.json ADDED
@@ -0,0 +1,34 @@
+ {
+   "<pad>": 32000,
+   "氖": 32010,
+   "氟": 32009,
+   "氦": 32002,
+   "氧": 32008,
+   "氫": 32001,
+   "氬": 32018,
+   "氮": 32007,
+   "氯": 32017,
+   "燐": 32015,
+   "硅": 32014,
+   "硫": 32016,
+   "硼": 32005,
+   "碳": 32006,
+   "釩": 32023,
+   "鈉": 32011,
+   "鈣": 32020,
+   "鈦": 32022,
+   "鈧": 32021,
+   "鈷": 32027,
+   "鈹": 32004,
+   "鉀": 32019,
+   "鉻": 32024,
+   "銅": 32029,
+   "鋁": 32013,
+   "鋅": 32030,
+   "鋰": 32003,
+   "錳": 32025,
+   "鎂": 32012,
+   "鎳": 32028,
+   "鎵": 32031,
+   "鐵": 32026
+ }
config.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "bos_token": "<s>",
+   "eos_token": "</s>",
+   "layer_norm_epsilon": 1e-05,
+   "unk_token": "<unk>"
+ }
model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:71f235c057232fd8968bd500de8788a0ed8b123a23035bad91249c3f51b4d596
+ size 13026245342
special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "<unk>",
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+ size 499723
tokenizer_config.json ADDED
@@ -0,0 +1,34 @@
+ {
+   "add_bos_token": true,
+   "add_eos_token": false,
+   "bos_token": {
+     "__type": "AddedToken",
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "clean_up_tokenization_spaces": false,
+   "eos_token": {
+     "__type": "AddedToken",
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "legacy": false,
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": null,
+   "sp_model_kwargs": {},
+   "tokenizer_class": "LlamaTokenizer",
+   "unk_token": {
+     "__type": "AddedToken",
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
vocabulary.json ADDED
The diff for this file is too large to render. See raw diff