appvoid committed on
Commit
cb96f87
1 Parent(s): 0f2ddb3

Update README.md

Files changed (1)
  1. README.md +29 -37
README.md CHANGED
@@ -1,48 +1,40 @@
  ---
- base_model:
- - h2oai/h2o-danube3-500m-base
- - appvoid/arco-put-6
- library_name: transformers
- tags:
- - mergekit
- - merge
-
  ---
- # arco-put-9

- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

- ## Merge Details
- ### Merge Method

- This model was merged using the SLERP merge method.

- ### Models Merged

- The following models were included in the merge:
- * [h2oai/h2o-danube3-500m-base](https://huggingface.co/h2oai/h2o-danube3-500m-base)
- * [appvoid/arco-put-6](https://huggingface.co/appvoid/arco-put-6)

- ### Configuration

- The following YAML configuration was used to produce this model:

- ```yaml
- slices:
- - sources:
-   - model: h2oai/h2o-danube3-500m-base
-     layer_range: [0, 16]
-   - model: appvoid/arco-put-6
-     layer_range: [0, 16]
- merge_method: slerp
- base_model: appvoid/arco-put-6
- parameters:
-   t:
-   - filter: self_attn
-     value: [0, 0.5, 0.3, 0.7, 1]
-   - filter: mlp
-     value: [1, 0.5, 0.7, 0.3, 0]
-   - value: 0.5
- dtype: float16

- ```
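
The removed README names SLERP as the merge method and gives `t` as lists of five values per filter. As a rough illustration only (this is not mergekit's code), the sketch below shows spherical linear interpolation on plain Python lists, plus one possible reading of a `t` list as a gradient spread linearly over the 16 layers. The helper names `slerp` and `gradient_value`, the linear spreading rule, and the convention that `t = 0` returns the first vector are assumptions made for this example.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors.

    t = 0 returns v0 and t = 1 returns v1 (assumed convention).
    Falls back to plain linear interpolation when either vector is
    near zero or the two vectors are nearly parallel.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Cosine of the angle between the two vectors, clamped for acos.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def gradient_value(anchors, layer_idx, num_layers):
    """Hypothetical helper: spread a list of anchor values evenly over
    the layer range and interpolate linearly between neighbours."""
    if len(anchors) == 1 or num_layers == 1:
        return float(anchors[0])
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = min(int(pos), len(anchors) - 2)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[lo + 1]


# Anchor list taken from the self_attn filter in the removed config.
self_attn_t = [0, 0.5, 0.3, 0.7, 1]
for layer in (0, 5, 10, 15):
    t = gradient_value(self_attn_t, layer, num_layers=16)
    print(layer, round(t, 3), slerp(t, [1.0, 0.0], [0.0, 1.0]))
```

Under these assumptions the first layer's attention weights come entirely from one model and the last layer's from the other, while the `mlp` list runs in the opposite direction.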
 
 
  ---
+ license: apache-2.0
  ---


+ <style>
+ img{
+ user-select: none;
+ transition: all 0.2s ease;
+ border-radius: .5rem;
+ }
+ img:hover{
+ transform: rotate(2deg);
+ filter: invert(100%);
+ }
+ @import url('https://fonts.googleapis.com/css2?family=Vollkorn:ital,wght@0,400..900;1,400..900&display=swap');
+ </style>

+ <div style="background-color: transparent; border-radius: .5rem; padding: 2rem; font-family: monospace; font-size: .85rem; text-align: justify;">
+
+ ![palmer-004](https://huggingface.co/appvoid/palmer-004-original/resolve/main/palmer-004.jpeg)

+ **September Update** - this is the official model used in dot, keep in mind, none of these models use specific prompts, you might need to fine-tune them to use them as chatbots.

+ #### benchmarks

+ zero-shot evaluations performed on current sota ~0.5b models against the best language model below 2b parameters.

+ | Parameters | Model | MMLU | ARC-C | HellaSwag | PIQA | Winogrande | Average |
+ | -----------|--------------------------------|-------|-------|-----------|--------|------------|---------|
+ | 0.5b | qwen2 |**0.4413**| 0.2892| 0.4905 | 0.6931 | 0.5699 | 0.4968 |
+ | 0.6b | mobilellm | - | 0.3580| 0.5590 | 0.7230 | 0.5860 | - |
+ | 0.5b | danube3 | 0.2481| 0.3618| 0.6046 | 0.7378 | 0.6101 | 0.5125 |
+ | 0.5b | palmer |0.2617|**0.3729**|**0.6288**|**0.7437**| **0.6227** |**0.5260**|
+ | 1.7b | smollm |0.2765|0.4626| 0.6574 | 0.7606 | 0.6093 | 0.5533 |

+ #### supporters

+ <a href="https://ko-fi.com/appvoid" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 34px !important; margin-top: -4px;width: 128px !important; filter: contrast(2) grayscale(100%) brightness(100%);" ></a>
+ </div>
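
The benchmark table added in this commit does not say how its Average column is computed. The values are consistent with an unweighted mean of the five task scores, which the short check below reproduces for the four rows that report every task. The scores are copied from the table; the names `scores`, `reported`, and `mean_score` are made up for this sketch, and the unweighted-mean reading is an inference rather than something the README states.

```python
# Per-task scores in table order: MMLU, ARC-C, HellaSwag, PIQA, Winogrande.
scores = {
    "qwen2":   [0.4413, 0.2892, 0.4905, 0.6931, 0.5699],
    "danube3": [0.2481, 0.3618, 0.6046, 0.7378, 0.6101],
    "palmer":  [0.2617, 0.3729, 0.6288, 0.7437, 0.6227],
    "smollm":  [0.2765, 0.4626, 0.6574, 0.7606, 0.6093],
}

# Average column as reported in the table.
reported = {
    "qwen2": 0.4968,
    "danube3": 0.5125,
    "palmer": 0.5260,
    "smollm": 0.5533,
}


def mean_score(values):
    """Unweighted arithmetic mean of a list of task scores."""
    return sum(values) / len(values)


for model, values in scores.items():
    avg = mean_score(values)
    print(f"{model:8s} computed {avg:.4f}  reported {reported[model]:.4f}")
```

The mobilellm row is left out because it has no MMLU score, which is presumably why its Average cell is empty in the table.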