Heng666 committed on
Commit
7b42a3a
1 Parent(s): afe4a04

Create README.md

Files changed (1)
  1. README.md +57 -0
README.md ADDED
@@ -0,0 +1,57 @@
+ ---
+ license: other
+ language:
+ - en
+ library_name: transformers
+ pipeline_tag: text-generation
+ tags:
+ - causal-lm
+ - text-generation-inference
+ - merge
+ ---
+
+ # FOR EXPERIMENT
+
+ ## Description
+
+ [**stabilityai/stablelm-zephyr-3b**](https://huggingface.co/stabilityai/stablelm-zephyr-3b) and [**StableMed-3b**](https://huggingface.co/cxllin/StableMed-3b), merged with a new, experimental implementation of "dare ties" via mergekit. See:
+
+ > [Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch](https://github.com/yule-BUAA/MergeLM)
+
+ > https://github.com/cg123/mergekit/tree/dare
+
+
+ ## Usage
+
+ `StableLM Zephyr 3B` uses the following instruction format:
+ ```
+ <|user|>
+ List 3 synonyms for the word "tiny"<|endoftext|>
+ <|assistant|>
+ 1. Dwarf
+ 2. Little
+ 3. Petite<|endoftext|>
+ ```
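
The instruction format above can be assembled by hand when no chat template is available. The following is a minimal sketch, not part of the original README: the helper name `build_prompt` is hypothetical, and the exact newline placement is an assumption read off the example above.

```python
def build_prompt(user_message: str) -> str:
    """Wrap one user turn in the <|user|> / <|assistant|> layout.

    Assumption: each role tag sits on its own line and the user turn
    ends with <|endoftext|>, as in the README example.
    """
    return f"<|user|>\n{user_message}<|endoftext|>\n<|assistant|>\n"


# The string ends right after the assistant tag, so the model's
# generation continues from there.
prompt = build_prompt('List 3 synonyms for the word "tiny"')
print(prompt)
```

If the tokenizer shipped with this merge defines a chat template, `tokenizer.apply_chat_template` in transformers is likely the more robust route. Whether this repository's tokenizer does so is not stated here.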
+
+ ***
+ ## Testing Notes
+
+ Merged in mergekit with the following config, and the tokenizer from chargoddard's Yi-Llama:
+
+ ```
+ models:
+   - model: stabilityai/stablelm-zephyr-3b
+     # no parameters necessary for base model
+   - model: cxllin/StableMed-3b
+     parameters:
+       weight: 0.08
+       density: 0.5
+ merge_method: dare_ties
+ base_model: stabilityai/stablelm-zephyr-3b
+ parameters:
+   int8_mask: true
+ dtype: bfloat16
+ ```
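
To give a feel for what `density` and `weight` control in the config above, here is a toy sketch of the drop-and-rescale idea from the DARE paper, applied to plain Python lists. It is an illustration under stated assumptions, not mergekit's implementation: the function name `dare_merge_toy` is hypothetical, the TIES sign-election step is omitted (with a single fine-tuned model there is nothing to reconcile), and how mergekit itself applies or normalizes `weight` is not modeled.

```python
import random


def dare_merge_toy(base, finetuned, density, weight, seed=0):
    """Toy DARE-style merge on flat lists of floats (illustrative only).

    For each parameter: take the delta from the base model, keep it with
    probability `density` (otherwise drop it to zero), rescale the kept
    deltas by 1 / density, then add `weight` times the result to the base.
    """
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    merged = []
    for b, f in zip(base, finetuned):
        delta = f - b
        kept = rng.random() < density
        sparse_delta = delta / density if kept else 0.0
        merged.append(b + weight * sparse_delta)
    return merged


# Same density and weight as the config above, on made-up values.
base = [0.10, -0.20, 0.30, 0.40]
finetuned = [0.15, -0.10, 0.30, 0.60]
print(dare_merge_toy(base, finetuned, density=0.5, weight=0.08))
```

With `density: 0.5`, roughly half of the fine-tuned deltas are dropped and the survivors are doubled, so the expected delta is unchanged. The small `weight: 0.08` then keeps the merged model close to the base.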
+
+ ## Model Details
+ - License: [StabilityAI Non-Commercial Research Community License](https://huggingface.co/stabilityai/stablelm-zephyr-3b/raw/main/LICENSE)