cstr committed
Commit c00fcd8
1 Parent(s): 6302746

Upload folder using huggingface_hub

Files changed (2):
  1. README.md +21 -9
  2. model-1.safetensors +1 -1
README.md CHANGED
@@ -4,9 +4,11 @@ tags:
 - mergekit
 - lazymergekit
 - abhishek/autotrain-llama3-8b-open-hermes-sft
+- cognitivecomputations/dolphin-2.9-llama3-8b
 - DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental
 base_model:
 - abhishek/autotrain-llama3-8b-open-hermes-sft
+- cognitivecomputations/dolphin-2.9-llama3-8b
 - DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental
 ---
 
@@ -14,24 +16,34 @@ base_model:
 
 llama3-discolm-orpo-t2 is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
 * [abhishek/autotrain-llama3-8b-open-hermes-sft](https://huggingface.co/abhishek/autotrain-llama3-8b-open-hermes-sft)
+* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
 * [DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental](https://huggingface.co/DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental)
 
 ## 🧩 Configuration
 
 ```yaml
 models:
-  - layer_range: [0, 40]
-    model: abhishek/autotrain-llama3-8b-open-hermes-sft
+  - model: abhishek/autotrain-llama3-8b-open-hermes-sft
     parameters:
-      weight: 0.2
-  - layer_range: [0, 40]
-    model: DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental
+      density: 0.5
+      weight: 0.4
+  - model: cognitivecomputations/dolphin-2.9-llama3-8b
     parameters:
-      weight: 0.8
-merge_method: task_arithmetic
-base_model: abhishek/autotrain-llama3-8b-open-hermes-sft
+      density: 0.5
+      weight: 0.3
+  - model: DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental
+    parameters:
+      density: 0.6
+      weight: [0, 0.3, 0.7, 1]
+      # - filter: mlp
+      #   value: 0.5
+      # - value: 0.3
+merge_method: ties
+base_model: mlabonne/OrpoLlama-3-8B
+parameters:
+  normalize: true
+  int8_mask: true
 dtype: bfloat16
-random_seed: 0
 ```
 
 ## 💻 Usage
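The commit switches the merge from `task_arithmetic` to the `ties` method, where each model contributes a delta from the base model that is first sparsified by its `density` and then added back scaled by its `weight`. As a rough intuition aid only, the core idea can be sketched in plain Python. This is a hypothetical toy (`ties_merge_sketch` is not part of mergekit) that omits TIES's sign-election step and treats parameters as flat lists:

```python
def ties_merge_sketch(base, models, weights, densities, normalize=True):
    """Toy sketch of a TIES-style merge over flat parameter lists.

    For each model: take its delta from the base, keep only the
    top-`density` fraction of entries by magnitude (zeroing the rest),
    then add the sparse deltas back with (optionally normalized)
    weights. The real TIES method also elects a sign per parameter
    before summing, which is omitted here for brevity.
    """
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    merged = list(base)
    for model, w, d in zip(models, weights, densities):
        delta = [m - b for m, b in zip(model, base)]
        k = max(1, round(d * len(delta)))            # entries to keep
        cutoff = sorted(abs(x) for x in delta)[-k]   # magnitude threshold
        for i, x in enumerate(delta):
            if abs(x) >= cutoff:                     # sparsify the delta
                merged[i] += w * x
    return merged
```

With `normalize: true`, as in the config above, the weights are rescaled to sum to 1 before the deltas are combined.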
model-1.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8f684ee24fe9636ae436d8fe061adc1805d2a1bdca54b2ea4dc8588e7f542d21
+oid sha256:1d4176e7db405441843d00a3105c3a50e180c584b0676a7542a4987c1aeb7c9d
 size 1979781432
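What is versioned here is a Git LFS pointer file, not the weights themselves: the actual safetensors blob lives in LFS storage and is identified by the `oid` sha256 and `size` fields, and only the `oid` changed in this commit. A minimal sketch of checking a downloaded file against such a pointer (the helper names are illustrative, not part of git-lfs):

```python
import hashlib
import re

def parse_lfs_pointer(text):
    """Parse a spec/v1 git-lfs pointer into {'oid': ..., 'size': ...}."""
    oid = re.search(r"oid sha256:([0-9a-f]{64})", text).group(1)
    size = int(re.search(r"size (\d+)", text).group(1))
    return {"oid": oid, "size": size}

def file_matches_pointer(path, pointer):
    """True if the file at `path` has the pointer's size and sha256."""
    h = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
            h.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and h.hexdigest() == pointer["oid"]
```

Comparing the size first is a cheap pre-check; the sha256 digest is what actually identifies the blob.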