rombodawg committed on
Commit
5960e84
1 Parent(s): 9ef25ca

Update README.md

Files changed (1)
  1. README.md +7 -44
README.md CHANGED
@@ -1,54 +1,17 @@
  ---
  base_model: []
  library_name: transformers
- tags:
- - mergekit
- - merge
-
  ---
- # Llama-3-BIG-Instruct
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the passthrough merge method.
-
- ### Models Merged
-
- The following models were included in the merge:
- * /media/kquant/SSD/Model-2
- * /media/kquant/SSD/Model-1
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- dtype: float16
- merge_method: passthrough
- slices:
- - sources:
-   - layer_range: [0, 8]
-     model: /media/kquant/SSD/Model-1
- - sources:
-   - layer_range: [4, 12]
-     model: /media/kquant/SSD/Model-2
- - sources:
-   - layer_range: [8, 16]
-     model: /media/kquant/SSD/Model-1
- - sources:
-   - layer_range: [12, 20]
-     model: /media/kquant/SSD/Model-2
- - sources:
-   - layer_range: [16, 24]
-     model: /media/kquant/SSD/Model-1
- - sources:
-   - layer_range: [20, 28]
-     model: /media/kquant/SSD/Model-2
- - sources:
-   - layer_range: [24, 32]
-     model: /media/kquant/SSD/Model-1
-
- ```
+ Llama-3-13B-Instruct
+
+ Thank you to Meta for the weights of Meta-Llama-3-8B-Instruct
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png)
+
+ This is an upscaling of the Meta-Llama-3-8B-Instruct AI model, using techniques created for Mistral-Evolved-11b-v0.1. The model has been upscaled from 8B parameters to 13B parameters without any continuous pretraining or fine-tuning.
+
+ From testing, the model seems to function perfectly at fp16, but has some issues at 4-bit quantization.
+
+ The model used to create this one is linked below:
+
+ https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
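
The mergekit passthrough configuration removed in this commit interleaves overlapping layer ranges from two copies of the base model. A minimal sketch of the resulting layer layout, assuming (as the slice ranges suggest) that Model-1 and Model-2 are both the same 32-layer Llama-3-8B-Instruct checkpoint:

```python
# Layer interleaving implied by the removed passthrough config
# (assumption: both source models are 32-layer Llama-3-8B-Instruct).
slices = [(0, 8), (4, 12), (8, 16), (12, 20), (16, 24), (20, 28), (24, 32)]

# Passthrough copies each slice's layers verbatim, so overlapping
# ranges duplicate layers rather than averaging them.
layers = [i for start, end in slices for i in range(start, end)]

print(len(layers))       # 56 layers in the upscaled model, up from 32
print(len(set(layers)))  # still only 32 distinct source layers
```

At roughly 200M+ parameters per Llama-3-8B transformer layer plus the shared embeddings, 56 layers lands in the ~13B range, consistent with the model's name.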