jtatman committed on
Commit
f611c0d
1 Parent(s): 575a07a

Upload folder using huggingface_hub

README.md CHANGED
@@ -8,13 +8,13 @@ tags:
  - lazymergekit
  - Locutusque/TinyMistral-248M-v2.5-Instruct
  - Locutusque/TinyMistral-248M-v2.5-Instruct
+ - Locutusque/TinyMistral-248M-v2.5-Instruct
  - Locutusque/TinyMistral-248M-v2-Instruct
- - Locutusque/TinyMistral-248M-Instruct
  base_model:
  - Locutusque/TinyMistral-248M-v2.5-Instruct
  - Locutusque/TinyMistral-248M-v2.5-Instruct
+ - Locutusque/TinyMistral-248M-v2.5-Instruct
  - Locutusque/TinyMistral-248M-v2-Instruct
- - Locutusque/TinyMistral-248M-Instruct
  ---

  # TinyMistral-248m-v2.5-4x-Moe
@@ -22,21 +22,19 @@ base_model:
  TinyMistral-248m-v2.5-4x-Moe is a Mixure of Experts (MoE) made with the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
  * [Locutusque/TinyMistral-248M-v2.5-Instruct](https://huggingface.co/Locutusque/TinyMistral-248M-v2.5-Instruct)
  * [Locutusque/TinyMistral-248M-v2.5-Instruct](https://huggingface.co/Locutusque/TinyMistral-248M-v2.5-Instruct)
+ * [Locutusque/TinyMistral-248M-v2.5-Instruct](https://huggingface.co/Locutusque/TinyMistral-248M-v2.5-Instruct)
  * [Locutusque/TinyMistral-248M-v2-Instruct](https://huggingface.co/Locutusque/TinyMistral-248M-v2-Instruct)
- * [Locutusque/TinyMistral-248M-Instruct](https://huggingface.co/Locutusque/TinyMistral-248M-Instruct)

  ## 🧩 Configuration

  ```yaml
- base_model: Locutusque/TinyMistral-248M-v2.5
+ base_model: Locutusque/TinyMistral-248M-v2.5-Instruct
  experts:
  - source_model: Locutusque/TinyMistral-248M-v2.5-Instruct
  positive_prompts:
+ - "Write me a Python program that calculates the factorial of n."
  - "Help me debug this code."
- - "Optimize this C# script."
- - "Implement this feature using JavaScript."
- - "Convert this HTML structure into a more efficient design."
- - "Assist me with writing a program that"
+ - "Optimize this C++ program."
  negative_prompts:
  - "How do you"
  - "Explain the concept of"
@@ -47,26 +45,27 @@ experts:
  - "Summarize"
  - "Make a recommendation on"
  - "Answer this question"
+ - "Craft me a list of some nice places to visit around the world. "
+ - "Write me a story"
+ - "Write me an essay"
+ - "How do I incorporate visual elements into my writing?"
  - source_model: Locutusque/TinyMistral-248M-v2.5-Instruct
  positive_prompts:
- - "How do you"
- - "Explain the concept of"
- - "Give an overview of"
- - "Compare and contrast between"
- - "Provide information about"
- - "Help me understand"
- - "Summarize"
- - "Make a recommendation on"
- - "Answer this question"
+ - "What is the product of 2 x 5 x 18?"
+ - "How do I guess the value of x for the function f(x) = x^4 - 2x^2 - 1?"
  negative_prompts:
  - "Help me debug this code."
  - "Optimize this C# script."
  - "Implement this feature using JavaScript."
  - "Convert this HTML structure into a more efficient design."
  - "Assist me with writing a program that"
- - source_model: Locutusque/TinyMistral-248M-v2-Instruct
- positive_prompts:
+ - "Craft me a list of some nice places to visit around the world. "
+ - "Write me a story"
+ - "Write me an essay"
  - "How do I incorporate visual elements into my writing?"
+ - source_model: Locutusque/TinyMistral-248M-v2.5-Instruct
+ positive_prompts:
+ - "How do I incorporate fewer visual elements into my art but retain impact?"
  negative_prompts:
  - "Help me debug this code."
  - "Optimize this C# script."
@@ -82,11 +81,16 @@ experts:
  - "Summarize"
  - "Make a recommendation on"
  - "Answer this question"
- - source_model: Locutusque/TinyMistral-248M-Instruct
+ - "Craft me a list of some nice places to visit around the world. "
+ - "Write me a story"
+ - "Write me an essay"
+ - source_model: Locutusque/TinyMistral-248M-v2-Instruct
  positive_prompts:
  - "Craft me a list of some nice places to visit around the world. "
  - "Write me a story"
  - "Write me an essay"
+ - "Create a fantasy story about"
+ - "Tell me about the wild fjords."
  negative_prompts:
  - "Help me debug this code."
  - "Optimize this C# script."
@@ -102,7 +106,7 @@ experts:
  - "Summarize"
  - "Make a recommendation on"
  - "Answer this question"
-
+ - "How do I incorporate visual elements into my writing?"
  gate_mode: hidden
  ```

config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "Locutusque/TinyMistral-248M-v2.5",
+ "_name_or_path": "Locutusque/TinyMistral-248M-v2.5-Instruct",
  "architectures": [
  "MixtralForCausalLM"
  ],
@@ -23,7 +23,7 @@
  "router_aux_loss_coef": 0.001,
  "sliding_window": null,
  "tie_word_embeddings": false,
- "torch_dtype": "float16",
+ "torch_dtype": "float32",
  "transformers_version": "4.37.2",
  "use_cache": true,
  "vocab_size": 32005
mergekit_moe_config.yml CHANGED
@@ -1,13 +1,11 @@

- base_model: Locutusque/TinyMistral-248M-v2.5
+ base_model: Locutusque/TinyMistral-248M-v2.5-Instruct
  experts:
  - source_model: Locutusque/TinyMistral-248M-v2.5-Instruct
  positive_prompts:
+ - "Write me a Python program that calculates the factorial of n."
  - "Help me debug this code."
- - "Optimize this C# script."
- - "Implement this feature using JavaScript."
- - "Convert this HTML structure into a more efficient design."
- - "Assist me with writing a program that"
+ - "Optimize this C++ program."
  negative_prompts:
  - "How do you"
  - "Explain the concept of"
@@ -18,26 +16,27 @@ experts:
  - "Summarize"
  - "Make a recommendation on"
  - "Answer this question"
+ - "Craft me a list of some nice places to visit around the world. "
+ - "Write me a story"
+ - "Write me an essay"
+ - "How do I incorporate visual elements into my writing?"
  - source_model: Locutusque/TinyMistral-248M-v2.5-Instruct
  positive_prompts:
- - "How do you"
- - "Explain the concept of"
- - "Give an overview of"
- - "Compare and contrast between"
- - "Provide information about"
- - "Help me understand"
- - "Summarize"
- - "Make a recommendation on"
- - "Answer this question"
+ - "What is the product of 2 x 5 x 18?"
+ - "How do I guess the value of x for the function f(x) = x^4 - 2x^2 - 1?"
  negative_prompts:
  - "Help me debug this code."
  - "Optimize this C# script."
  - "Implement this feature using JavaScript."
  - "Convert this HTML structure into a more efficient design."
  - "Assist me with writing a program that"
- - source_model: Locutusque/TinyMistral-248M-v2-Instruct
- positive_prompts:
+ - "Craft me a list of some nice places to visit around the world. "
+ - "Write me a story"
+ - "Write me an essay"
  - "How do I incorporate visual elements into my writing?"
+ - source_model: Locutusque/TinyMistral-248M-v2.5-Instruct
+ positive_prompts:
+ - "How do I incorporate fewer visual elements into my art but retain impact?"
  negative_prompts:
  - "Help me debug this code."
  - "Optimize this C# script."
@@ -53,11 +52,16 @@ experts:
  - "Summarize"
  - "Make a recommendation on"
  - "Answer this question"
- - source_model: Locutusque/TinyMistral-248M-Instruct
+ - "Craft me a list of some nice places to visit around the world. "
+ - "Write me a story"
+ - "Write me an essay"
+ - source_model: Locutusque/TinyMistral-248M-v2-Instruct
  positive_prompts:
  - "Craft me a list of some nice places to visit around the world. "
  - "Write me a story"
  - "Write me an essay"
+ - "Create a fantasy story about"
+ - "Tell me about the wild fjords."
  negative_prompts:
  - "Help me debug this code."
  - "Optimize this C# script."
@@ -73,5 +77,5 @@ experts:
  - "Summarize"
  - "Make a recommendation on"
  - "Answer this question"
-
+ - "How do I incorporate visual elements into my writing?"
  gate_mode: hidden
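
Because this commit moves several prompts between the positive and negative lists, a quick check that no expert lists the same prompt on both sides may be useful. The following is a minimal sketch, assuming the YAML has already been parsed into plain Python dictionaries; the helper `overlapping_prompts` is hypothetical and not part of mergekit, and the sample data is an abbreviated copy of two experts from the new config.

```python
from typing import Dict, List

# One expert entry: prompt lists keyed by "positive_prompts" / "negative_prompts".
Expert = Dict[str, List[str]]


def overlapping_prompts(expert: Expert) -> List[str]:
    """Return prompts that appear in both the positive and negative lists."""
    positives = set(expert.get("positive_prompts", []))
    return [p for p in expert.get("negative_prompts", []) if p in positives]


# Abbreviated sample of the new config (not the full prompt lists).
experts: List[Expert] = [
    {
        "positive_prompts": [
            "Write me a Python program that calculates the factorial of n.",
            "Help me debug this code.",
            "Optimize this C++ program.",
        ],
        "negative_prompts": ["How do you", "Write me a story"],
    },
    {
        "positive_prompts": ["Write me a story", "Create a fantasy story about"],
        "negative_prompts": ["Help me debug this code.", "Summarize"],
    },
]

for index, expert in enumerate(experts):
    clashes = overlapping_prompts(expert)
    print(f"expert {index}: {len(clashes)} overlapping prompt(s)")
```

An empty result for every expert means each prompt pulls the router in only one direction for that expert.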
model-00001-of-00001.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:09a6143c521ff0c20f1df03d4a92db7e6a97a71e25ad40f9de4092d3c2fe2a48
- size 1402144424
+ oid sha256:514a8f6bc5248309422ad425e94929e1521d765dd30f65d4cf8dbc841edc28e7
+ size 2804260704
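
The checkpoint roughly doubles in size, which is consistent with the `torch_dtype` change from `float16` to `float32` in `config.json`. A minimal sketch of that arithmetic follows; the byte sizes are copied from the LFS pointer, while the mapping of each size to a dtype and the helper `approx_param_count` are assumptions made for illustration (the safetensors header is ignored, so the counts are approximate).

```python
# Sizes in bytes, copied from the Git LFS pointer in this commit.
OLD_SIZE = 1_402_144_424  # assumed to be the float16 checkpoint
NEW_SIZE = 2_804_260_704  # assumed to be the float32 checkpoint

BYTES_PER_PARAM = {"float16": 2, "float32": 4}


def approx_param_count(size_bytes: int, dtype: str) -> int:
    """Rough parameter count that ignores the file header (an approximation)."""
    return size_bytes // BYTES_PER_PARAM[dtype]


ratio = NEW_SIZE / OLD_SIZE
old_params = approx_param_count(OLD_SIZE, "float16")
new_params = approx_param_count(NEW_SIZE, "float32")

print(f"size ratio: {ratio:.4f}")
print(f"approx. parameters (old, float16): {old_params:,}")
print(f"approx. parameters (new, float32): {new_params:,}")
```

Both estimates land near 701 million parameters, so the size change looks like a precision change rather than a change in architecture.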
tokenizer_config.json CHANGED
@@ -74,7 +74,10 @@
  "legacy": true,
  "max_length": 1536,
  "model_max_length": 1000000000000000019884624838656,
+ "pad_to_multiple_of": null,
  "pad_token": "<|bos|>",
+ "pad_token_type_id": 0,
+ "padding_side": "left",
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "stride": 0,