Update README.md
README.md CHANGED
````diff
@@ -1,54 +1,17 @@
 ---
 base_model: []
 library_name: transformers
-tags:
-- mergekit
-- merge
-
 ---
-
-
-This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
-## Merge Details
-### Merge Method
-
-This model was merged using the passthrough merge method.
-
-### Models Merged
-
-The following models were included in the merge:
-* /media/kquant/SSD/Model-2
-* /media/kquant/SSD/Model-1
-
-### Configuration
-
-The following YAML configuration was used to produce this model:
-
-```yaml
-dtype: float16
-merge_method: passthrough
-slices:
-- sources:
-  - layer_range: [0, 8]
-    model: /media/kquant/SSD/Model-1
-- sources:
-  - layer_range: [4, 12]
-    model: /media/kquant/SSD/Model-2
-- sources:
-  - layer_range: [8, 16]
-    model: /media/kquant/SSD/Model-1
-- sources:
-  - layer_range: [12, 20]
-    model: /media/kquant/SSD/Model-2
-- sources:
-  - layer_range: [16, 24]
-    model: /media/kquant/SSD/Model-1
-- sources:
-  - layer_range: [20, 28]
-    model: /media/kquant/SSD/Model-2
-- sources:
-  - layer_range: [24, 32]
-    model: /media/kquant/SSD/Model-1
-```
-
+Llama-3-13B-Instruct
+
+Thank you to Meta for the weights for Meta-Llama-3-8B-Instruct.
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png)
+
+This is an upscaling of the Meta-Llama-3-8B-Instruct model using techniques created for Mistral-Evolved-11b-v0.1. The model has been upscaled from 8B to 13B parameters without any continued pretraining or fine-tuning.
+
+From testing, the model seems to function perfectly at fp16, but has some issues at 4-bit quantization.
+
+The model that was used to create this one is linked below:
+
+https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
````
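The removed mergekit configuration shows how the upscaling works: overlapping 8-layer windows are copied from the 32-layer source model and stacked in order, so the overlap duplicates layers and deepens the network. A minimal sketch of how that slice schedule expands into the merged model's layer list (an illustration under the assumption that slices are simply concatenated, not mergekit's actual code):

```python
# The (model, layer_range) slices from the passthrough config above.
slices = [
    ("Model-1", 0, 8),
    ("Model-2", 4, 12),
    ("Model-1", 8, 16),
    ("Model-2", 12, 20),
    ("Model-1", 16, 24),
    ("Model-2", 20, 28),
    ("Model-1", 24, 32),
]

# Each slice copies a contiguous run of layers from its source model; the
# copies are concatenated in order, so overlapping ranges duplicate layers.
layer_plan = [(model, i) for model, start, end in slices for i in range(start, end)]

print(len(layer_plan))  # 56 layers in the merged stack, up from 32 in Llama-3-8B
```

The 4-layer overlap between consecutive windows is what grows the depth from 32 to 56 layers (7 slices of 8 layers each) without training any new weights.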