Undi95 committed on
Commit
cf19925
1 Parent(s): 9f50118

Create README.md

Files changed (1)
  1. README.md +59 -0
README.md ADDED
@@ -0,0 +1,59 @@
+ ---
+ license: apache-2.0
+ tags:
+ - mistral
+ - pretrained
+ ---
+
+ This is Mistral, but in 11B.
+
+ I took layers from the original Mistral-7B and duplicated some of them. This is the first Frankenstein method that I found "acceptable" for expanding Mistral.
+
+ The first 8 layers of the model seem to be very important; I think having duplicates of those layers in the model confuses it.
+ ```
+ UPDATE: Forced mergekit to output bfloat16 files. The result should be the same, but since the base model is bfloat16, the merge needed to stay in that format.
+ Even though the earlier files were labeled bfloat16, they were actually float16.
+ ```
+ <!-- description start -->
+ ## Description
+
+ This repo contains fp16 files of Mistral-11B-v0.1.
+
+ <!-- description end -->
+ <!-- description start -->
+ ## Model used
+
+ - [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1/)
+
+ <!-- description end -->
+ <!-- prompt-template start -->
+ ## Prompt template: Alpaca
+
+ ```
+ Below is an instruction that describes a task. Write a response that appropriately completes the request.
+
+ ### Instruction:
+ {prompt}
+
+ ### Response:
+
+ ```
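As a minimal sketch of how the template above might be filled in before text is sent to the model, the helper below substitutes a user instruction for `{prompt}`. The name `format_alpaca` is hypothetical; it is not part of this README or of any library.

```python
# Alpaca template as given in the README; {prompt} is the only placeholder.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "{prompt}\n\n"
    "### Response:\n"
)


def format_alpaca(prompt: str) -> str:
    """Fill the Alpaca template with a single instruction (hypothetical helper)."""
    # str.replace is used instead of str.format so that any braces inside
    # the user's instruction are left untouched.
    return ALPACA_TEMPLATE.replace("{prompt}", prompt.strip())


if __name__ == "__main__":
    print(format_alpaca("Explain what a passthrough merge does."))
```

The model's completion would then be whatever it generates after the `### Response:` line.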
+
+ ## The secret sauce
+
+ ```
+ slices:
+   - sources:
+     - model: mistralai/Mistral-7B-v0.1
+       layer_range: [0, 24]
+   - sources:
+     - model: mistralai/Mistral-7B-v0.1
+       layer_range: [8, 32]
+ merge_method: passthrough
+ dtype: float16
+ ```
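To see why these two slices land at roughly 11B parameters, the sketch below counts the stacked layers and estimates the parameter total. The architecture constants are my understanding of the published Mistral-7B-v0.1 configuration and are assumptions here, since the README does not state them. The code also assumes each `layer_range` is half-open (start included, end excluded). All helper names are hypothetical.

```python
# Assumed Mistral-7B-v0.1 architecture constants (not stated in this README).
HIDDEN = 4096
INTERMEDIATE = 14336
N_HEADS = 32
N_KV_HEADS = 8
HEAD_DIM = HIDDEN // N_HEADS  # 128
VOCAB = 32000

# Slices from the config above, treated as half-open ranges [start, end).
SLICES = [(0, 24), (8, 32)]


def params_per_layer() -> int:
    """Parameters in one decoder layer under the assumed constants."""
    kv_dim = N_KV_HEADS * HEAD_DIM
    attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim  # q and o, then k and v
    mlp = 3 * HIDDEN * INTERMEDIATE  # gate, up and down projections
    norms = 2 * HIDDEN  # two RMSNorm weight vectors
    return attn + mlp + norms


def total_params(n_layers: int) -> int:
    """Parameters in a model with n_layers decoder layers."""
    embeddings = 2 * VOCAB * HIDDEN  # input embeddings plus untied output head
    final_norm = HIDDEN
    return n_layers * params_per_layer() + embeddings + final_norm


def merged_layers(slices):
    """Source-layer indices in the order they are stacked in the merged model."""
    return [i for start, end in slices for i in range(start, end)]


layers = merged_layers(SLICES)
duplicated = sorted({i for i in layers if layers.count(i) > 1})

if __name__ == "__main__":
    print(len(layers))                                # 48
    print(duplicated[0], duplicated[-1])              # 8 23
    print(f"{total_params(32) / 1e9:.2f}B")           # 7.24B
    print(f"{total_params(len(layers)) / 1e9:.2f}B")  # 10.73B
```

Under these assumptions, layers 8 to 23 appear twice while layers 0 to 7 and 24 to 31 appear once, which matches the remark earlier in the README about not duplicating the first 8 layers.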
+
+
+ Special thanks to Sushi.
+
+ If you want to support me, you can do so [here](https://ko-fi.com/undiai).