Commit 6cb000c
1 Parent(s): 3b3d192
LoneStriker committed: Upload folder using huggingface_hub
README.md ADDED
@@ -0,0 +1,175 @@
+ ---
+ datasets:
+ - Open-Orca/SlimOrca-Dedup
+ - teknium/openhermes
+ - meta-math/MetaMathQA
+ - migtissera/Synthia-v1.3
+ - THUDM/AgentInstruct
+ - LeoLM/German_Songs
+ - LeoLM/German_Poems
+ - LeoLM/OpenSchnabeltier
+ - bjoernp/ultrachat_de
+ language:
+ - en
+ library_name: transformers
+ pipeline_tag: text-generation
+ license: llama2
+ model_creator: DiscoResearch
+ model_type: llama
+ tags:
+ - goliath
+ - deutsch
+ - llama2
+ - discoresearch
+ ---
+
+
+ <img src="imgs/disco_goliath.jpeg" width="600">
+
+ # DiscoLM 120b (Alpha)
+
+ **DiscoLM 120b (Alpha)** is an experimental 120b model based on [Alpindale's Goliath 120b](https://huggingface.co/alpindale/goliath-120b), a merge of different Llama2-70b models, further finetuned on a dataset of some of the most popular open-source instruction sets.
+ Disco 120b is a [DiscoResearch](https://huggingface.co/DiscoResearch) project and was trained by [Björn Plüster](https://huggingface.co/bjoernp).
+
+ Many thanks to [LAION](https://laion.ai) and [HessianAI](https://hessian.ai/) for scientific supervision, coordination, and the compute resources provided for this project on HessianAI's supercomputer 42!
+
+ <img src="https://hessian.ai/wp-content/themes/hessianai/img/hessian-ai-logo.svg" width="120">
+ <img src="https://avatars.githubusercontent.com/u/92627801?s=200&v=4" width="120">
+
+ ## Table of Contents
+
+ 1. [Download](#download)
+ 2. [Benchmarks](#benchmarks)
+ 3. [Prompt Format](#prompt-format)
+ 4. [Dataset](#dataset)
+ 5. [Acknowledgements](#acknowledgements)
+ 6. [Contact](#contact)
+ 7. [About DiscoResearch](#about-discoresearch)
+ 8. [Disclaimer](#disclaimer)
+
+ ## Download
+
+ | Huggingface | GPTQ | GGUF | AWQ | *Base Model* |
+ |-------|-------|-------|-------|-------|
+ | [Link](https://huggingface.co/DiscoResearch/DiscoLM-120b) | [Link](https://huggingface.co/TheBloke/DiscoLM-120b-GPTQ) | [Link](https://huggingface.co/TheBloke/DiscoLM-120b-GGUF) | [Link](https://huggingface.co/TheBloke/DiscoLM-120b-AWQ) | [Goliath 120b](https://huggingface.co/alpindale/goliath-120b) |
+
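+ To fetch the full-precision weights from the table above, a minimal sketch with `huggingface_hub` (the `local_dir` path is just an example):
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Downloads all shards of the unquantized model (roughly 240 GB in float16).
+ snapshot_download(repo_id="DiscoResearch/DiscoLM-120b", local_dir="DiscoLM-120b")
+ ```
+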
+ ## Benchmarks
+
+ ### Hugging Face Leaderboard
+
+ This model is still an early alpha, and we can't guarantee that there isn't any contamination.
+ However, its average of **73.198** would earn the #2 spot on the HF leaderboard at the time of writing, and would be the highest score yet for a model larger than 70b.
+
+ | Metric | Value |
+ |-----------------------|-------|
+ | ARC (25-shot) | 69.54 |
+ | HellaSwag (10-shot) | 86.49 |
+ | MMLU (5-shot) | 70.32 |
+ | TruthfulQA (0-shot) | 61.42 |
+ | Winogrande (5-shot) | 83.03 |
+ | GSM8k (5-shot) | 68.39 |
+ | **Avg.** | **73.198** |
+
+ We use the [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, using the same version as the Hugging Face LLM Leaderboard.
+
75
+ ### FastEval
76
+
77
+ | Metric | Value |
78
+ |-----------------------|-------|
79
+ | GSM8K | 81.2 |
80
+ | Math | 22.3 |
81
+ | BBH | 72.9 |
82
+ | MMLU | 67.9 |
83
+ | **Avg.** | **53.3** |
84
+
85
+ This places DiscoLM 120b firmly ahead of gpt-3.5-turbo-0613 as seen on the screenshot of the current (sadly no longer maintained) FastEval CoT leaderboard:
86
+ ![FastEval Leaderboard](imgs/cot_leaderboard.png)
87
+
+ ### MTBench
+
+ ```json
+ {
+     "first_turn": 8.45,
+     "second_turn": 7.45,
+     "categories": {
+         "writing": 9.4,
+         "roleplay": 8.65,
+         "reasoning": 6.85,
+         "math": 5.55,
+         "coding": 4.95,
+         "extraction": 9.15,
+         "stem": 9.225,
+         "humanities": 9.825
+     },
+     "average": 7.95
+ }
+ ```
+ Screenshot of the current FastEval MT Bench leaderboard:
+ ![FastEval Leaderboard](imgs/mtbench_leaderboard.png)
+
+ ## Prompt Format
+
+ This model follows the ChatML format:
+
+ ```
+ <|im_start|>system
+ You are DiscoLM, a helpful assistant.
+ <|im_end|>
+ <|im_start|>user
+ Please tell me possible reasons to call a research collective "Disco Research"<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ This formatting is also available via a pre-defined Transformers chat template, which means that lists of messages can be formatted for you with the `apply_chat_template()` method:
+
+ ```python
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("DiscoResearch/DiscoLM-120b")
+
+ chat = [
+   {"role": "system", "content": "You are DiscoLM, a helpful assistant."},
+   {"role": "user", "content": "Please tell me possible reasons to call a research collective Disco Research"}
+ ]
+ # Returns the conversation rendered as a single ChatML-formatted string
+ tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
+ ```
+
+ If you use `tokenize=True` and `return_tensors="pt"` instead, you will get a tokenized and formatted conversation ready to pass to `model.generate()`; see the sketch below.
+
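+ A minimal end-to-end generation sketch (the sampling settings are illustrative, and loading assumes enough GPU memory to shard a 120b model across your devices via `device_map="auto"`):
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "DiscoResearch/DiscoLM-120b"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id, torch_dtype=torch.float16, device_map="auto"
+ )
+
+ chat = [
+     {"role": "system", "content": "You are DiscoLM, a helpful assistant."},
+     {"role": "user", "content": "Please tell me possible reasons to call a research collective Disco Research"},
+ ]
+ input_ids = tokenizer.apply_chat_template(
+     chat, tokenize=True, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+
+ # Sampling settings here are illustrative, not tuned recommendations.
+ output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
+ print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
+ ```
+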
+ ## Dataset
+
+ The dataset curation for DiscoLM 120b followed a "brute force"/"PoC" approach, as one goal was to see whether a 120b model can "absorb" more instruction data than a 70b model.
+
+ The following datasets were used for training DiscoLM 120b:
+
+ * [SlimOrca-Dedup](https://huggingface.co/datasets/Open-Orca/SlimOrca-Dedup)
+ * [OpenSchnabeltier](https://huggingface.co/datasets/LeoLM/OpenSchnabeltier), translated to German from [OpenPlatypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus)
+ * [OpenHermes](https://huggingface.co/datasets/teknium/openhermes)
+ * [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA)
+ * [UltraChat DE](https://huggingface.co/datasets/bjoernp/ultrachat_de), translated to German from [UltraChat](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k)
+ * [Synthia v.1.3](https://huggingface.co/datasets/migtissera/Synthia-v1.3)
+ * [German_Songs](https://huggingface.co/datasets/LeoLM/German_Songs)
+ * [German_Poems](https://huggingface.co/datasets/LeoLM/German_Poems)
+ * Capybara Dataset by [Nous Research](https://huggingface.co/NousResearch/)
+ * Vezora/Tested-188k-Python (no longer available? Version changed to [Vezora/Tested-22k-Python-Alpaca](https://huggingface.co/datasets/Vezora/Tested-22k-Python-Alpaca))
+
+ Many thanks to all dataset providers/curators!
+
+ ## Contact
+
+ The best way to reach us is on our [Discord](https://discord.gg/S8W8B5nz3v).
+
+ ## About DiscoResearch
+
+ DiscoResearch is an aspiring open research community. Disco should be a place where researchers from many communities can come together to combine their expertise and create innovative and groundbreaking LLMs. Come join our Discord, share your opinions and ideas, and advance open LLM research with us!
+
+ ## Acknowledgements
+
+ Disco 120b is a [DiscoResearch](https://huggingface.co/DiscoResearch) project and was trained by [Björn Plüster](https://huggingface.co/bjoernp). [Jan Harries](https://huggingface.co/jphme) helped with technical advice, logistics and the model card, and [AutoMeta](https://huggingface.co/Alignment-Lab-AI) also provided helpful technical advice.
+ The model was trained with compute provided by [HessianAI](https://hessian.ai/) in collaboration with [LAION](https://laion.ai); many thanks in particular to [Patrick Schramowski](https://huggingface.co/PSaiml) for his support.
+
+ We are standing on the shoulders of giants; many thanks, in no particular order, to [LAION](https://laion.ai) and especially to [Christoph Schuhmann](https://laion.ai) who got us all connected,
+ [alpindale](https://huggingface.co/alpindale) for Goliath 120b (with important contributions by [Charles Goddard](https://huggingface.co/chargoddard) and [Undi95](https://huggingface.co/Undi95)), [TheBloke](https://huggingface.co/TheBloke) for providing quantized versions, [winglian](https://huggingface.co/winglian) for Axolotl, which was used to train the model, and for the SlimOrca dataset, and [garage-bAInd](https://huggingface.co/garage-bAInd), [Teknium](https://huggingface.co/teknium), [Migel Tissera](https://huggingface.co/migtissera), and [MetaMath](https://huggingface.co/meta-math) for their great datasets (please contact us if we forgot to mention you here!).
+
+ [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
+
+ ## Disclaimer
+
+ The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model.
+ This model should only be used for research purposes. The original Llama2 license and all restrictions of datasets used to train this model apply.
added_tokens.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "</s>": 2,
+   "<s>": 1,
+   "<unk>": 0,
+   "<|im_end|>": 32000,
+   "<|im_start|>": 32001
+ }
config.json ADDED
@@ -0,0 +1,27 @@
+ {
+   "_name_or_path": "alpindale/goliath-120b",
+   "architectures": [
+     "LlamaForCausalLM"
+   ],
+   "attention_bias": false,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "hidden_act": "silu",
+   "hidden_size": 8192,
+   "initializer_range": 0.02,
+   "intermediate_size": 28672,
+   "max_position_embeddings": 4096,
+   "model_type": "llama",
+   "num_attention_heads": 64,
+   "num_hidden_layers": 137,
+   "num_key_value_heads": 8,
+   "pretraining_tp": 1,
+   "rms_norm_eps": 1e-05,
+   "rope_scaling": null,
+   "rope_theta": 10000.0,
+   "tie_word_embeddings": false,
+   "torch_dtype": "float16",
+   "transformers_version": "4.34.0",
+   "use_cache": true,
+   "vocab_size": 32032
+ }
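
As a rough sanity check on the "120b" name, the architecture described by this config (137 layers, hidden size 8192, grouped-query attention with 8 KV heads) works out to roughly 118B parameters. A back-of-the-envelope sketch, ignoring the small RMSNorm weights:

```python
# Values taken from config.json above
vocab, hidden, inter, layers = 32032, 8192, 28672, 137
heads, kv_heads = 64, 8
head_dim = hidden // heads            # 128
kv_dim = kv_heads * head_dim          # 1024 (grouped-query attention)

attn = 2 * hidden * hidden + 2 * hidden * kv_dim  # q/o plus k/v projections
mlp = 3 * hidden * inter                          # gate, up and down projections
embed = 2 * vocab * hidden                        # embeddings + untied lm_head

total = layers * (attn + mlp) + embed
print(f"{total / 1e9:.1f}B parameters")           # ~117.7B
```
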
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "transformers_version": "4.34.0"
+ }
huggingface-metadata.txt ADDED
@@ -0,0 +1,53 @@
+ url: https://huggingface.co/DiscoResearch/DiscoLM-120b
+ branch: main
+ download date: 2023-12-09 08:59:58
+ sha256sum:
+ 5d7745d5d27b6aab7603eded5ac8d04beb6400a0e8ee9ca37d359bf5f63c7870 model-00001-of-00024.safetensors
+ bf15b9a67c0e27c51f719a77e30db43c6ba3e68be883a5af97d64191cc52c2e5 model-00002-of-00024.safetensors
+ 286cf28355ae3d7671fab2350519d704f99ca1b801a41c9b7076b4ea52a7bee3 model-00003-of-00024.safetensors
+ a57b5561b72d64dba85aeff007137e1aa06c6472e01633c7b5698787c98bb874 model-00004-of-00024.safetensors
+ 736016f98195f0e7f1065d262b933051f17dcd18cef386d98c75eaf8c53290b0 model-00005-of-00024.safetensors
+ 602dd5973bed10c3aa0bcde64ba7434ed8a560e031ec2882edbe18a8b1072576 model-00006-of-00024.safetensors
+ 62cdd7479739c3ff2211e1bd3bc0253e17b03af99933f7eef5bec1c323e6919b model-00007-of-00024.safetensors
+ 3facffec8a7494996e07b3ae494520d29a931c5643b2250a985c27f826fe0dd8 model-00008-of-00024.safetensors
+ f40347240a2571e4ad0ff6148132da65a4bb6b41d0d3496b8c52d9246c35a19d model-00009-of-00024.safetensors
+ 4312012b91255578969d385b556e54a91fe6388a99279b1267490ae5574900f2 model-00010-of-00024.safetensors
+ b62777dabfd6c706f316f252fc2d21ef652a611de22b0ff0ae64d64e79072c51 model-00011-of-00024.safetensors
+ 3666ada228d31ed1761453ae03f49baf548039e0617ffc64ceb9d1a62f8b94a8 model-00012-of-00024.safetensors
+ c71beee6dab377ecaa9786c8c49140203a7b6adfdcaee4899e70385cec6be22b model-00013-of-00024.safetensors
+ 27d983d3fad6739b87ed70b21a0125df9570a6bc4767453191414b43e92d3745 model-00014-of-00024.safetensors
+ ad22dc328097c648400ce97d3e59be57b9014be22fd9c85f7768faf8d86736e8 model-00015-of-00024.safetensors
+ 3fee1d394a49fe68ce653e951461fa77927cdb8410a1c899e2191f85c5e5fa98 model-00016-of-00024.safetensors
+ 53e0fc80e3034a3930a03c5ff192be7a5a5a0ccef8134294244a9e8d2e4d79cb model-00017-of-00024.safetensors
+ 988ecf1c9e8e2727c832ab47fe2e31350fae43d8852e8cf298f111d2de1de7da model-00018-of-00024.safetensors
+ c1b94c1abb1bbeab4f19e74dda158212fe2be879ec3473478de050c35bb449b5 model-00019-of-00024.safetensors
+ 9f0612ba92fb8f8f32d9ab8343ad32d0eac0fd75fb2cb46b51c9772a02c89969 model-00020-of-00024.safetensors
+ 4a15d7d8d0c0c72bf9a78f443d28d8040808a474d98525d0237258dc694ac6cd model-00021-of-00024.safetensors
+ 004d697bc844e652690a35561d9d8065ebcf2e16edb98c7f55895459e7c46720 model-00022-of-00024.safetensors
+ 6bf94f3a7d8d835a862f4e6e2c773cf58a02871a9ea6dda83d5aa2d17631fc3e model-00023-of-00024.safetensors
+ 64cc9b8beb71f2523aa67ccb15ea76fbac243814c6f684cd1161c86c52478cd8 model-00024-of-00024.safetensors
+ 703e6a21f5f80340c340e61e5cd8528e1ab376b36a7f1605d5c01c087886bc77 pytorch_model-00001-of-00024.bin
+ 0b9aad1976a13a8ea067990910aa053a931309f22125e97120d5f18016f6f5d1 pytorch_model-00002-of-00024.bin
+ 4843f6af6e7b11443f4ce8386e7e90183146785e6e5f486d765b508a52dedbbb pytorch_model-00003-of-00024.bin
+ bbed432e03f0a3e7bbbfcc06d80fb776247b30aa19214868c9462f3c9328b510 pytorch_model-00004-of-00024.bin
+ 1b2cf32d7b276b39815c1d9468fdc7d6e9a3e9dba07dcfa915f0418fc6e575c8 pytorch_model-00005-of-00024.bin
+ cc4b1a8780a0fc894c39398e3c161bd0b38f07d5fed483bf63c89abd1299c5c0 pytorch_model-00006-of-00024.bin
+ 110dfb6eb79136d2c16993a367b7ee0eafff4a320567a782947c55ba2b4f9606 pytorch_model-00007-of-00024.bin
+ 0888f80ce1d7404395956f21e03552aa2daf7e7cdea8e9eb69b59c16471ad424 pytorch_model-00008-of-00024.bin
+ 5128e084c9f91ca5d8ecfd42da6ea35682d01f8a97fc0072596bac770b2c0a1c pytorch_model-00009-of-00024.bin
+ f35a06fca3b79e2797b05539b0400ab8697b88f51f0b34fceb5536101319a0e8 pytorch_model-00010-of-00024.bin
+ 6aa66f9cd7ae407deac3fea37e18bdb81da6cd915df56cdb127b350341db0548 pytorch_model-00011-of-00024.bin
+ 9ab5aaf07f7bc0920dca5f27614f1d2248cf6d4cb33556b387cfbbbaf587da11 pytorch_model-00012-of-00024.bin
+ 97617b7f47ec43934b15f7f3fbd288bf2c8184bda232b902c3528405dedea818 pytorch_model-00013-of-00024.bin
+ 68f64eeb00dda4392039f2efcd6b555ee49ed4fd7b5f731163fb4579a3f1b850 pytorch_model-00014-of-00024.bin
+ b19bfb818160cb56dffb7434a4ed9c783123775ede9e276bfc88e8615f87042a pytorch_model-00015-of-00024.bin
+ c0df2ab8bc5f809636c1c53c3e9acd26d64601b145cc61f809f590f139158469 pytorch_model-00016-of-00024.bin
+ d18f0ae5507c17837ed0c20257548a26332185743ee9ab0a87a6ac75f8609dd3 pytorch_model-00017-of-00024.bin
+ 0f14867fefdefc4173822977fb585168e15be49deb0047db40bafa386e55d3c6 pytorch_model-00018-of-00024.bin
+ bd7b90be4c3613e22ed1f0a0505dba610020aa135f956a838817f89aaedb142b pytorch_model-00019-of-00024.bin
+ 7543e3e5972532ebfb2131f1248c8ab818cd4159863425fa67b2df48b4c98ec8 pytorch_model-00020-of-00024.bin
+ fa0e4d0af437d853965e64017f7fac8eb52f075d07f1ac04e3973f60026d58ed pytorch_model-00021-of-00024.bin
+ 83064cfa842547bb784d7d6178fd049311c4aff42feb0ada891c7585a0decda8 pytorch_model-00022-of-00024.bin
+ a967860faaed0db1e6edb41c4cb82ee25012e26f178aafc1a8145c3278d24762 pytorch_model-00023-of-00024.bin
+ 58d4ce94410f315c86f15f7439ed4d68e5341f1ff136f502edc974244a43ec53 pytorch_model-00024-of-00024.bin
+ 9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347 tokenizer.model
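
These checksums can be used to verify downloaded shards before loading them. A small sketch (the file path is just an example):

```python
import hashlib

def sha256sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 without loading it all into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            digest.update(block)
    return digest.hexdigest()

# Compare against the corresponding line in huggingface-metadata.txt above
print(sha256sum("model-00001-of-00024.safetensors"))
```
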
model.safetensors.index.json ADDED
The diff for this file is too large to render.
 
output-00001-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7c9d41bcef98c72a170c1ea23e36343bd02b7cdac225439c47bd9e1ebbd05ecd
+ size 8581026016
output-00002-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9c8252b83c4df3344181258554fdda664890fd2cc07431ab03cf272433971ec1
+ size 8588188344
output-00003-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f464a8ee589aa2a8df0a593276e71dbc5f3a967b5438276c1212d418569de818
+ size 8555819944
output-00004-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e99c529d224aece834029cdbf54c9ed15254cc815f907f7946c380001ee55cc5
+ size 8552624024
output-00005-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:79bb80b80c29de8a014b33a7e6cf0aa39157660da8a5fe5e587b973f3efb6d9e
+ size 8503964344
output-00006-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3593c2de2bd4feab1812392c65cf71cc9fecfe4ecf11ff11664a171cc3ddbe9f
+ size 1952914560
pytorch_model.bin.index.json ADDED
The diff for this file is too large to render.
 
special_tokens_map.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "additional_special_tokens": [
+     "<unk>",
+     "<s>",
+     "</s>"
+   ],
+   "bos_token": "<s>",
+   "eos_token": "<|im_end|>",
+   "pad_token": "</s>",
+   "unk_token": "<unk>"
+ }
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+ size 499723
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
+ {
+   "add_bos_token": true,
+   "add_eos_token": false,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "32000": {
+       "content": "<|im_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "32001": {
+       "content": "<|im_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     }
+   },
+   "additional_special_tokens": [
+     "<unk>",
+     "<s>",
+     "</s>"
+   ],
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|im_end|>",
+   "legacy": false,
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "</s>",
+   "sp_model_kwargs": {},
+   "spaces_between_special_tokens": false,
+   "tokenizer_class": "LlamaTokenizer",
+   "tokenizer_file": null,
+   "trust_remote_code": true,
+   "unk_token": "<unk>",
+   "use_default_system_prompt": true,
+   "use_fast": false
+ }
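
Taken together with added_tokens.json and special_tokens_map.json above, this config registers the ChatML markers as dedicated tokens (ids 32000/32001) and sets `<|im_end|>` as the tokenizer's EOS token. A quick check (assumes the tokenizer files are available locally or on the Hub under this repo id):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("DiscoResearch/DiscoLM-120b")
print(tokenizer.eos_token)                               # <|im_end|>
print(tokenizer.convert_tokens_to_ids("<|im_end|>"))     # 32000
print(tokenizer.convert_tokens_to_ids("<|im_start|>"))   # 32001
```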