Kquant03 committed on
Commit
0198161
1 Parent(s): b7b0e79

Update README.md

Files changed (1)
  1. README.md +55 -113
README.md CHANGED
@@ -1,125 +1,67 @@
  ---
- base_model:
- - abacaj/phi-2-super
- tags:
- - mergekit
- - merge
-
  ---
- # Teldrassil
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the passthrough merge method.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [abacaj/phi-2-super](https://huggingface.co/abacaj/phi-2-super)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- dtype: float16
- merge_method: passthrough
- slices:
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [0,2]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [1,3]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [2,4]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [3,5]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [4,6]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [5,7]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [6,8]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [7,9]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [8,10]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [9,11]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [10,12]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [11,13]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [12,14]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [13,15]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [14,16]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [15,17]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [16,18]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [17,19]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [18,20]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [19,21]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [20,22]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [21,23]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [22,24]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [23,25]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [24,26]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [25,27]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [26,28]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [27,29]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [28,30]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [29,31]
- - sources:
-   - model: abacaj/phi-2-super
-     layer_range: [30,32]
  ```
  ---
+ license: mit
+ language:
+ - en
+ thumbnail: "https://cdn-uploads.huggingface.co/production/uploads/6589d7e6586088fd2784a12c/TqnMpteVAyfiiNHx4lVkU.png"
  ---
+ # You are welcome here, traveler.
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6589d7e6586088fd2784a12c/TqnMpteVAyfiiNHx4lVkU.png)
+
+ ### Named after the method used to create it: interleaving the layers of its predecessor to make it far larger, giving it much more potential.
+
+ [Elothir](https://wowpedia.fandom.com/wiki/Elothir) was an ancient treant, and I couldn't think of a better name for a model that was created using the passthrough method.
+
+ The passthrough method differs significantly from the previous ones. By concatenating layers from different LLMs, it can produce models with an exotic number of parameters (e.g., 9B from two 7B-parameter models). The community often refers to these models as "frankenmerges" or "Frankenstein models."

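The layer arithmetic behind that claim is easy to check. A quick sketch (the helper function is illustrative, not part of the mergekit API): phi-2 has 32 transformer layers, and the merge config stacks every overlapping two-layer window [0,2], [1,3], ..., [30,32], so the result ends up with 62 layers, roughly double the original depth:

```python
# Sketch: layer count after a passthrough merge that concatenates every
# sliding window of layers (helper name is illustrative, not mergekit API).
def interleaved_layer_count(n_layers: int, window: int, stride: int = 1) -> int:
    n_windows = (n_layers - window) // stride + 1  # number of sliding windows
    return n_windows * window                      # each window contributes `window` layers

# 32-layer phi-2, two-layer windows [0,2] ... [30,32]: 31 windows * 2 = 62 layers
print(interleaved_layer_count(32, 2))  # 62
```

With a stride equal to the window size the windows no longer overlap and the layer count stays at 32, which is why the overlap is what makes the merged model "far larger."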
+ Many thanks to [Abacaj](https://huggingface.co/abacaj) for providing the [fine-tuned weights](https://huggingface.co/abacaj/phi-2-super) used in the creation of this base model. You can find the full merge script [here](https://huggingface.co/Replete-AI/Phi-Elothir/blob/main/mergekit_config.yml). Thanks also to [KatyTheCutie](https://huggingface.co/KatyTheCutie) for helping me figure out how to make the model as big as I possibly could.
+
+ ## This idea was brought to me by [The Face of Goonery](https://huggingface.co/The-Face-Of-Goonery), also known as Caleb Morgan. I have him to thank if fine-tuning this model turns out to be a success.
+ # How to run inference:
+
+ ```python
+ import transformers
+ import torch
+
+ if __name__ == "__main__":
+     model_name = "abacaj/phi-2-super"
+     tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
+
+     model = (
+         transformers.AutoModelForCausalLM.from_pretrained(
+             model_name,
+         )
+         .to("cuda:0")
+         .eval()
+     )
+
+     messages = [
+         {"role": "user", "content": "Hello, who are you?"}
+     ]
+     inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
+     input_ids_cutoff = inputs.size(dim=1)
+
+     with torch.no_grad():
+         generated_ids = model.generate(
+             input_ids=inputs,
+             use_cache=True,
+             max_new_tokens=512,
+             temperature=0.2,
+             top_p=0.95,
+             do_sample=True,
+             eos_token_id=tokenizer.eos_token_id,
+             pad_token_id=tokenizer.pad_token_id,
+         )
+
+     completion = tokenizer.decode(
+         generated_ids[0][input_ids_cutoff:],
+         skip_special_tokens=True,
+     )
+
+     print(completion)
  ```
+
+ # Chat template
+
+ The model uses the same chat template as found in Mistral instruct models:
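The template itself is not reproduced in the card. For reference, Mistral-instruct style templates wrap each user turn in `[INST]` tags; a minimal illustrative sketch follows (in practice, prefer `tokenizer.apply_chat_template`, which uses the template actually shipped with the model's tokenizer):

```python
# Illustrative only: the general shape of a Mistral-instruct style prompt.
# The authoritative template is the one bundled with the model's tokenizer config.
def mistral_style_prompt(user_message: str) -> str:
    return f"<s>[INST] {user_message} [/INST]"

print(mistral_style_prompt("Hello, who are you?"))
# <s>[INST] Hello, who are you? [/INST]
```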