mllm-dev committed on
Commit
761111c
1 Parent(s): ad47923

Upload folder using huggingface_hub

README.md CHANGED
@@ -2,9 +2,9 @@
 base_model:
 - mllm-dev/gpt2_f_experiment_0_1000
 - mllm-dev/gpt2_f_experiment_4_1000
-- mllm-dev/gpt2_f_experiment_3_1000
-- mllm-dev/gpt2_f_experiment_2_1000
 - mllm-dev/gpt2_f_experiment_1_1000
+- mllm-dev/gpt2_f_experiment_2_1000
+- mllm-dev/gpt2_f_experiment_3_1000
 library_name: transformers
 tags:
 - mergekit
@@ -18,15 +18,15 @@ This is a merge of pre-trained language models created using [mergekit](https://
 ## Merge Details
 ### Merge Method
 
-This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [mllm-dev/gpt2_f_experiment_4_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_4_1000) as a base.
+This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [mllm-dev/gpt2_f_experiment_0_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_0_1000) as a base.
 
 ### Models Merged
 
 The following models were included in the merge:
-* [mllm-dev/gpt2_f_experiment_0_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_0_1000)
-* [mllm-dev/gpt2_f_experiment_3_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_3_1000)
-* [mllm-dev/gpt2_f_experiment_2_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_2_1000)
+* [mllm-dev/gpt2_f_experiment_4_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_4_1000)
 * [mllm-dev/gpt2_f_experiment_1_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_1_1000)
+* [mllm-dev/gpt2_f_experiment_2_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_2_1000)
+* [mllm-dev/gpt2_f_experiment_3_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_3_1000)
 
 ### Configuration
 
@@ -35,7 +35,7 @@ The following YAML configuration was used to produce this model:
 ```yaml
 base_model:
 model:
-path: mllm-dev/gpt2_f_experiment_4_1000
+path: mllm-dev/gpt2_f_experiment_0_1000
 dtype: float16
 merge_method: dare_ties
 parameters:
@@ -45,7 +45,7 @@ slices:
 - layer_range: [0, 12]
 model:
 model:
-path: mllm-dev/gpt2_f_experiment_4_1000
+path: mllm-dev/gpt2_f_experiment_0_1000
 - layer_range: [0, 12]
 model:
 model:
@@ -70,7 +70,7 @@ slices:
 - layer_range: [0, 12]
 model:
 model:
-path: mllm-dev/gpt2_f_experiment_0_1000
+path: mllm-dev/gpt2_f_experiment_4_1000
 parameters:
 density: 0.8
 weight: 0.2
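
The card documents a `dare_ties` merge of the `gpt2_f_experiment_*_1000` sequence-classification checkpoints, with this commit swapping the base from `gpt2_f_experiment_4_1000` to `gpt2_f_experiment_0_1000`. A minimal sketch of loading the merged checkpoint with transformers follows; the repo id is a placeholder, since the commit does not name the target repository.

```python
# Sketch only: the repo id below is a placeholder, not the actual repository.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "mllm-dev/merged-gpt2-dare-ties"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id, torch_dtype=torch.float16)
model.eval()

# Classify a single sequence with the merged GPT2ForSequenceClassification head.
inputs = tokenizer("example text to classify", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```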
config.json CHANGED
@@ -1,5 +1,5 @@
 {
-"_name_or_path": "mllm-dev/gpt2_f_experiment_4_1000",
+"_name_or_path": "mllm-dev/gpt2_f_experiment_0_1000",
 "activation_function": "gelu_new",
 "architectures": [
 "GPT2ForSequenceClassification"
mergekit_config.yml CHANGED
@@ -1,6 +1,6 @@
 base_model:
 model:
-path: mllm-dev/gpt2_f_experiment_4_1000
+path: mllm-dev/gpt2_f_experiment_0_1000
 dtype: float16
 merge_method: dare_ties
 parameters:
@@ -10,7 +10,7 @@ slices:
 - layer_range: [0, 12]
 model:
 model:
-path: mllm-dev/gpt2_f_experiment_4_1000
+path: mllm-dev/gpt2_f_experiment_0_1000
 - layer_range: [0, 12]
 model:
 model:
@@ -35,7 +35,7 @@ slices:
 - layer_range: [0, 12]
 model:
 model:
-path: mllm-dev/gpt2_f_experiment_0_1000
+path: mllm-dev/gpt2_f_experiment_4_1000
 parameters:
 density: 0.8
 weight: 0.2
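
To reproduce a merge from a config like this one, mergekit's `mergekit-yaml` entry point takes the YAML file and an output directory. A hedged sketch, assuming mergekit is installed and the config is saved locally as `mergekit_config.yml` (paths are placeholders):

```python
# Sketch: run mergekit's CLI on the config shown above.
import subprocess

subprocess.run(
    ["mergekit-yaml", "mergekit_config.yml", "./merged-model"],
    check=True,
)
```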
model-00001-of-00001.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cf13e744899befbea89e28fbd70be6753a3cec1298ade623b82a8ae04077295e
+oid sha256:3d420ef69a28358b3734eac69f3840aa3558517eec8921307dad490d29ddf218
 size 248902264
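
The weights file is tracked with Git LFS, so the diff only touches the pointer: the size stays at 248,902,264 bytes while the SHA-256 changes. A small sketch for verifying a locally downloaded copy against the new pointer (the local path is a placeholder):

```python
# Sketch: check a local download against the LFS pointer above.
import hashlib
import os

path = "model-00001-of-00001.safetensors"  # placeholder local path
expected_oid = "3d420ef69a28358b3734eac69f3840aa3558517eec8921307dad490d29ddf218"
expected_size = 248902264

sha = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        sha.update(chunk)

assert os.path.getsize(path) == expected_size, "size mismatch"
assert sha.hexdigest() == expected_oid, "hash mismatch"
print("file matches the LFS pointer")
```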