LoneStriker committed
Commit 491a492
1 parent: 044dc8f

Upload folder using huggingface_hub
README.md ADDED
---
base_model:
- openbmb/Eurux-8x22b-nca
- alpindale/WizardLM-2-8x22B
- fireworks-ai/mixtral-8x22b-instruct-oh
- migtissera/Tess-2.0-Mixtral-8x22B
library_name: transformers
tags:
- mergekit
- merge
---
# WizardLM-2-8x22B-Beige

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, using [alpindale/WizardLM-2-8x22B](https://huggingface.co/alpindale/WizardLM-2-8x22B) as a base.

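For intuition, Model Stock (per the paper linked above) averages the fine-tuned checkpoints and then interpolates between that average and the base weights, with an interpolation ratio derived from the angle between the fine-tuned "task vectors". A simplified numpy sketch for a single weight tensor — illustrative only, not mergekit's actual implementation:

```python
import numpy as np

def model_stock_layer(base, finetuned, eps=1e-8):
    """Illustrative Model Stock merge for one weight tensor.

    Averages the fine-tuned weights, then interpolates between that average
    and the base weights. The ratio t comes from the mean pairwise cosine
    similarity of the fine-tuned deltas (task vectors), following the
    Model Stock paper's formula t = k*cos / (1 + (k-1)*cos) for k models.
    """
    deltas = [w - base for w in finetuned]        # task vectors
    flat = [d.ravel() for d in deltas]
    sims = []
    for i in range(len(flat)):
        for j in range(i + 1, len(flat)):
            sims.append(flat[i] @ flat[j] /
                        (np.linalg.norm(flat[i]) * np.linalg.norm(flat[j]) + eps))
    cos_theta = float(np.mean(sims))
    k = len(finetuned)
    t = k * cos_theta / (1 + (k - 1) * cos_theta)
    avg = sum(finetuned) / k                      # plain average of checkpoints
    return t * avg + (1 - t) * base               # pull toward base as angle grows
```

When the fine-tuned models agree (cosine near 1), t approaches 1 and the result is just their average; when their deltas are near-orthogonal, t shrinks and the merge stays close to the base weights.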
### Models Merged

The following models were included in the merge:
* [openbmb/Eurux-8x22b-nca](https://huggingface.co/openbmb/Eurux-8x22b-nca)
* [fireworks-ai/mixtral-8x22b-instruct-oh](https://huggingface.co/fireworks-ai/mixtral-8x22b-instruct-oh)
* [migtissera/Tess-2.0-Mixtral-8x22B](https://huggingface.co/migtissera/Tess-2.0-Mixtral-8x22B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
- model: alpindale/WizardLM-2-8x22B
- model: openbmb/Eurux-8x22b-nca
- model: migtissera/Tess-2.0-Mixtral-8x22B
- model: fireworks-ai/mixtral-8x22b-instruct-oh
base_model: alpindale/WizardLM-2-8x22B
merge_method: model_stock
dtype: bfloat16
```
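As a quick sanity check, the configuration can be parsed and validated programmatically — this is just an illustrative check with PyYAML, not part of mergekit:

```python
import yaml  # PyYAML — assumed available

# Same configuration as shown above, inlined for a self-contained check.
CONFIG = """\
models:
- model: alpindale/WizardLM-2-8x22B
- model: openbmb/Eurux-8x22b-nca
- model: migtissera/Tess-2.0-Mixtral-8x22B
- model: fireworks-ai/mixtral-8x22b-instruct-oh
base_model: alpindale/WizardLM-2-8x22B
merge_method: model_stock
dtype: bfloat16
"""

cfg = yaml.safe_load(CONFIG)
names = [m["model"] for m in cfg["models"]]
assert cfg["merge_method"] == "model_stock"
assert cfg["base_model"] in names   # the base is also listed in the merge pool
assert len(names) == 4
```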
This model is an attempt to tone down the purple prose of WizardLM-2-8x22B. Here's an example:

Prompt:
```
USER: A dead cat is placed into a box along with a nuclear isotope, a vial of poison and a radiation detector. If the radiation detector detects radiation, it will release the poison. The box is opened one day later. What is the probability of the cat being alive?
Repeat the question before answering it.
```

WizardLM-2-8x22B-Beige:
```
ASSISTANT: A dead cat is placed into a box along with a nuclear isotope, a vial of poison, and a radiation detector. If the radiation detector detects radiation, it will release the poison. The box is opened one day later. What is the probability of the cat being alive?
The probability of the cat being alive is zero. Since the cat is dead when placed into the box, it will remain dead regardless of the radiation detector or the poison being released or not.
```

WizardLM-2-8x22B:
```
The scenario you've described is a thought experiment that is often referred to as "Schrödinger's cat," a famous paradox in quantum mechanics. The experiment was devised by physicist Erwin Schrödinger in 1935. Repeating the question: If a dead cat is placed into a box along with a nuclear isotope, a vial of poison, and a radiation detector that will release the poison upon detecting radiation, what is the probability of the cat being alive when the box is opened one day later?

In the original paradox, the cat is considered to be simultaneously alive and dead until the box is opened and the cat is observed. This is a way to illustrate the concept of superposition in quantum mechanics, where a quantum system is said to exist in all possible states simultaneously until it is measured.

However, in the scenario you've described, the cat is explicitly stated to be dead when placed in the box. Therefore, regardless of the quantum state of the isotope or the function of the radiation detector and poison, the cat is not alive to begin with. The probability of the cat being alive when the box is opened is zero because it was dead at the outset.

In the classic Schrödinger's cat thought experiment, if the cat were alive when placed in the box, quantum mechanics would suggest that there is a non-zero probability that the cat could be either alive or dead when the box is opened, depending on whether the isotope has decayed and triggered the release of the poison. But in your version, with the cat being definitively dead initially, there is no probability of it being alive later—it remains dead.
```
config.json ADDED
```json
{
  "_name_or_path": "alpindale/WizardLM-2-8x22B",
  "architectures": [
    "MixtralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 6144,
  "initializer_range": 0.02,
  "intermediate_size": 16384,
  "max_position_embeddings": 65536,
  "model_type": "mixtral",
  "num_attention_heads": 48,
  "num_experts_per_tok": 2,
  "num_hidden_layers": 56,
  "num_key_value_heads": 8,
  "num_local_experts": 8,
  "output_router_logits": false,
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000,
  "router_aux_loss_coef": 0.001,
  "router_jitter_noise": 0.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.41.2",
  "use_cache": false,
  "vocab_size": 32000,
  "quantization_config": {
    "quant_method": "exl2",
    "version": "0.1.5",
    "bits": 3.0,
    "head_bits": 6,
    "calibration": {
      "rows": 100,
      "length": 2048,
      "dataset": "(default)"
    }
  }
}
```
mergekit_config.yml ADDED
```yaml
models:
- model: alpindale/WizardLM-2-8x22B
- model: openbmb/Eurux-8x22b-nca
- model: migtissera/Tess-2.0-Mixtral-8x22B
- model: fireworks-ai/mixtral-8x22b-instruct-oh
base_model: alpindale/WizardLM-2-8x22B
merge_method: model_stock
dtype: bfloat16
```
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
output-00001-of-00007.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:d1141985917f2d0ee0122bf02aab694be01beb5e31b741188071ae23bdf033b2
size 8579261352
output-00002-of-00007.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:65632fd1ec2ad592d25d8a9a363d04bb62d5c28ce0cad1d1d0aad56eb7c78916
size 8578378816
output-00003-of-00007.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:50ae394f6797b4b916cc977d6b9816d0e9d1435866ca7e1673b15729f647ba56
size 8583491000
output-00004-of-00007.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:6233dc49c1526653a0076dca718910300475f4b332b5e4dd4a4dcaf444e88fe7
size 8579855560
output-00005-of-00007.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:c82e98df2edfe768c3b6c07abcea4683f6677f89d605e5d9f00177378557ba33
size 8589054440
output-00006-of-00007.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:b0b7b499777d3cfa999e72ed823bee218c5474fced051db15b6e1fd25d6685f0
size 8573362328
output-00007-of-00007.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:0968a33cc8ee6892c5268352fbde52e0f6e9c227b5f94ef1dd20fa0eaf73e9ac
size 1650569456
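Summing the seven shard sizes from the LFS pointers above gives the on-disk footprint of this 3.0 bpw exl2 quant:

```python
# Sum the sizes of the seven output-*.safetensors shards listed above.
shard_sizes = [
    8_579_261_352,
    8_578_378_816,
    8_583_491_000,
    8_579_855_560,
    8_589_054_440,
    8_573_362_328,
    1_650_569_456,
]
total = sum(shard_sizes)
print(total, f"{total / 2**30:.2f} GiB")  # 53133972952 bytes, ~49.5 GiB
```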
special_tokens_map.json ADDED
```json
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
```
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.model ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:dadfd56d766715c61d2ef780a525ab43b8e6da4de6865bda3d95fdef5e134055
size 493443
tokenizer_config.json ADDED
```json
{
  "add_bos_token": true,
  "add_eos_token": false,
  "add_prefix_space": true,
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [],
  "bos_token": "<s>",
  "chat_template": "{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{{ messages[0]['content'].strip() }}{% else %}{% set loop_messages = messages %}{{ 'A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user\\'s questions.' }}{% endif %}{% for message in loop_messages %}{% if loop.index0 == 0 %}{% if message['role'] == 'system' or message['role'] == 'user' %}{{ ' USER: ' + message['content'].strip() }}{% else %}{{ ' ASSISTANT: ' + message['content'].strip() + eos_token }}{% endif %}{% else %}{% if message['role'] == 'system' or message['role'] == 'user' %}{{ '\nUSER: ' + message['content'].strip() }}{% else %}{{ ' ASSISTANT: ' + message['content'].strip() + eos_token }}{% endif %}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ ' ASSISTANT:' }}{% endif %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "legacy": true,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "<unk>",
  "padding_side": "right",
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": "<unk>",
  "use_default_system_prompt": true
}
```
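The `chat_template` above is a Vicuna-style format. In practice you would call `tokenizer.apply_chat_template`, but the prompt it produces can be re-implemented in plain Python for illustration (`render_vicuna` is a hypothetical helper name, not part of any library):

```python
def render_vicuna(messages, eos="</s>", add_generation_prompt=True):
    """Render messages the way the Jinja chat_template above does.

    messages: list of {"role": ..., "content": ...} dicts. A leading system
    message replaces the default preamble; otherwise the template's built-in
    preamble is used.
    """
    if messages and messages[0]["role"] == "system":
        out = messages[0]["content"].strip()
        loop = messages[1:]
    else:
        out = ("A chat between a curious user and an artificial intelligence "
               "assistant. The assistant gives helpful, detailed, and polite "
               "answers to the user's questions.")
        loop = messages
    for i, m in enumerate(loop):
        if m["role"] in ("system", "user"):
            # First turn uses a leading space, later turns a newline.
            sep = " USER: " if i == 0 else "\nUSER: "
            out += sep + m["content"].strip()
        else:
            out += " ASSISTANT: " + m["content"].strip() + eos
    if add_generation_prompt:
        out += " ASSISTANT:"
    return out
```

For example, a single user message "Hi" renders as the default preamble followed by ` USER: Hi ASSISTANT:`, which is the string the model is expected to complete.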