mllm-dev committed
Commit 7da5612
Parent: 9083620

Upload folder using huggingface_hub

README.md CHANGED
@@ -1,10 +1,10 @@
 ---
 base_model:
-- mllm-dev/gpt2_f_experiment_0_1000
-- mllm-dev/gpt2_f_experiment_3_1000
-- mllm-dev/gpt2_f_experiment_2_1000
 - mllm-dev/gpt2_f_experiment_1_1000
 - mllm-dev/gpt2_f_experiment_4_1000
+- mllm-dev/gpt2_f_experiment_3_1000
+- mllm-dev/gpt2_f_experiment_2_1000
+- mllm-dev/gpt2_f_experiment_0_1000
 library_name: transformers
 tags:
 - mergekit
@@ -18,15 +18,15 @@ This is a merge of pre-trained language models created using [mergekit](https://
 ## Merge Details
 ### Merge Method
 
-This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [mllm-dev/gpt2_f_experiment_0_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_0_1000) as a base.
+This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [mllm-dev/gpt2_f_experiment_1_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_1_1000) as a base.
 
 ### Models Merged
 
 The following models were included in the merge:
+* [mllm-dev/gpt2_f_experiment_4_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_4_1000)
 * [mllm-dev/gpt2_f_experiment_3_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_3_1000)
 * [mllm-dev/gpt2_f_experiment_2_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_2_1000)
-* [mllm-dev/gpt2_f_experiment_1_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_1_1000)
-* [mllm-dev/gpt2_f_experiment_4_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_4_1000)
+* [mllm-dev/gpt2_f_experiment_0_1000](https://huggingface.co/mllm-dev/gpt2_f_experiment_0_1000)
 
 ### Configuration
 
@@ -35,7 +35,7 @@ The following YAML configuration was used to produce this model:
 ```yaml
 base_model:
   model:
-    path: mllm-dev/gpt2_f_experiment_0_1000
+    path: mllm-dev/gpt2_f_experiment_1_1000
 dtype: float16
 merge_method: dare_ties
 parameters:
@@ -45,11 +45,11 @@ slices:
   - layer_range: [0, 12]
     model:
      model:
-        path: mllm-dev/gpt2_f_experiment_0_1000
+        path: mllm-dev/gpt2_f_experiment_1_1000
   - layer_range: [0, 12]
     model:
      model:
-        path: mllm-dev/gpt2_f_experiment_1_1000
+        path: mllm-dev/gpt2_f_experiment_0_1000
     parameters:
       density: 0.8
       weight: 0.2
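
A config like the one above is normally fed to mergekit's `mergekit-yaml` command-line entry point. The snippet below is a minimal sketch of that step, assuming mergekit is installed and the repository's `mergekit_config.yml` (included later in this commit) is available locally; the output directory name is a placeholder, not part of this commit.

```python
# Minimal sketch: re-running the DARE-TIES merge described by the config above.
# Assumes `pip install mergekit`; "./merged-gpt2" is a placeholder output path.
import subprocess

subprocess.run(
    ["mergekit-yaml", "mergekit_config.yml", "./merged-gpt2"],
    check=True,  # raise if the merge fails
)
```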
config.json CHANGED
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "mllm-dev/gpt2_f_experiment_0_1000",
+  "_name_or_path": "mllm-dev/gpt2_f_experiment_1_1000",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2ForSequenceClassification"
mergekit_config.yml CHANGED
@@ -1,6 +1,6 @@
 base_model:
   model:
-    path: mllm-dev/gpt2_f_experiment_0_1000
+    path: mllm-dev/gpt2_f_experiment_1_1000
 dtype: float16
 merge_method: dare_ties
 parameters:
@@ -10,11 +10,11 @@ slices:
   - layer_range: [0, 12]
     model:
      model:
-        path: mllm-dev/gpt2_f_experiment_0_1000
+        path: mllm-dev/gpt2_f_experiment_1_1000
   - layer_range: [0, 12]
     model:
      model:
-        path: mllm-dev/gpt2_f_experiment_1_1000
+        path: mllm-dev/gpt2_f_experiment_0_1000
     parameters:
       density: 0.8
       weight: 0.2
model-00001-of-00001.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:751d37329ac0cb3875dd22a5ef5cb1886c7238a90378cff2918ae05491ae9db0
+oid sha256:f0c4e976b3588d122d65ca3d4732e087356109eaa0e662609b3738547512252b
 size 248902264
tokenizer.json CHANGED
@@ -1,6 +1,11 @@
 {
   "version": "1.0",
-  "truncation": null,
+  "truncation": {
+    "direction": "Right",
+    "max_length": 1024,
+    "strategy": "LongestFirst",
+    "stride": 0
+  },
   "padding": null,
   "added_tokens": [
     {
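
The new `truncation` block turns on right-side truncation at 1024 tokens directly in the fast-tokenizer file. A sketch of inspecting it with the `tokenizers` library, assuming a local copy of this tokenizer.json and a recent tokenizers release that exposes the `truncation` property:

```python
# Sketch: the truncation settings added above are readable from the file itself.
# "tokenizer.json" here means a local copy of the file changed in this commit.
from tokenizers import Tokenizer

tok = Tokenizer.from_file("tokenizer.json")
print(tok.truncation)
# Expected to mirror the diff above, roughly:
# {"direction": "Right", "max_length": 1024, "strategy": "LongestFirst", "stride": 0}
```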
tokenizer_config.json CHANGED
@@ -13,8 +13,12 @@
   "bos_token": "<|endoftext|>",
   "clean_up_tokenization_spaces": true,
   "eos_token": "<|endoftext|>",
+  "max_length": 1024,
   "model_max_length": 1024,
   "pad_token": "<|endoftext|>",
+  "stride": 0,
   "tokenizer_class": "GPT2Tokenizer",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
   "unk_token": "<|endoftext|>"
 }
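
Together with the tokenizer.json change above, the net effect of these keys is that inputs are cut from the right at 1024 tokens whenever truncation is requested. A minimal sketch, with the local path again a placeholder:

```python
# Sketch of the truncation behaviour implied by the new settings; "./merged-gpt2"
# is a placeholder for a local clone of this repository.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./merged-gpt2")

long_text = "hello " * 5000  # far longer than the 1024-token limit
ids = tokenizer(long_text, truncation=True)["input_ids"]
print(len(ids))  # 1024 -- truncated from the right (truncation_side = "right")
```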