FatCat87 committed
Commit 3774fd0 · verified · 1 parent: 6dd03a5

End of training

Files changed (2)
  1. README.md +19 -18
  2. adapter_model.bin +1 -1
README.md CHANGED
@@ -1,12 +1,11 @@
 ---
-license: llama3
 library_name: peft
 tags:
 - axolotl
 - generated_from_trainer
-base_model: scb10x/llama-3-typhoon-v1.5-8b-instruct
+base_model: Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B
 model-index:
-- name: 0c862fee-2042-414b-98c3-2b6c8e57613b
+- name: 97f7fd69-882c-4376-9bac-c02431181f3a
   results: []
 ---
 
@@ -19,14 +18,14 @@ should probably proofread and complete it, then remove this comment. -->
 axolotl version: `0.4.1`
 ```yaml
 adapter: lora
-base_model: scb10x/llama-3-typhoon-v1.5-8b-instruct
+base_model: Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B
 bf16: auto
 datasets:
 - data_files:
-  - 1cdad3506d86664d_train_data.json
+  - 78e65bb45152e40a_train_data.json
   ds_type: json
   format: custom
-  path: 1cdad3506d86664d_train_data.json
+  path: 78e65bb45152e40a_train_data.json
   type:
     field: null
     field_input: input
@@ -51,7 +50,7 @@ fsdp_config: null
 gradient_accumulation_steps: 4
 gradient_checkpointing: true
 group_by_length: false
-hub_model_id: FatCat87/0c862fee-2042-414b-98c3-2b6c8e57613b
+hub_model_id: FatCat87/97f7fd69-882c-4376-9bac-c02431181f3a
 learning_rate: 0.0002
 load_in_4bit: false
 load_in_8bit: true
@@ -73,7 +72,8 @@ sample_packing: true
 saves_per_epoch: 1
 seed: 701
 sequence_len: 4096
-special_tokens: null
+special_tokens:
+  pad_token: <|eot_id|>
 strict: false
 tf32: false
 tokenizer_type: AutoTokenizer
@@ -82,9 +82,9 @@ val_set_size: 0.1
 wandb_entity: fatcat87-taopanda
 wandb_log_model: null
 wandb_mode: online
-wandb_name: 0c862fee-2042-414b-98c3-2b6c8e57613b
+wandb_name: 97f7fd69-882c-4376-9bac-c02431181f3a
 wandb_project: subnet56
-wandb_runid: 0c862fee-2042-414b-98c3-2b6c8e57613b
+wandb_runid: 97f7fd69-882c-4376-9bac-c02431181f3a
 wandb_watch: null
 warmup_ratio: 0.05
 weight_decay: 0.0
@@ -94,12 +94,12 @@ xformers_attention: null
 
 </details><br>
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/2fhkuh2w)
-# 0c862fee-2042-414b-98c3-2b6c8e57613b
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/y55y4sdv)
+# 97f7fd69-882c-4376-9bac-c02431181f3a
 
-This model is a fine-tuned version of [scb10x/llama-3-typhoon-v1.5-8b-instruct](https://huggingface.co/scb10x/llama-3-typhoon-v1.5-8b-instruct) on the None dataset.
+This model is a fine-tuned version of [Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B](https://huggingface.co/Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.2348
+- Loss: 1.4332
 
 ## Model description
 
@@ -129,16 +129,17 @@ The following hyperparameters were used during training:
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 7
 - num_epochs: 1
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 3.2967        | 0.1026 | 1    | 3.1530          |
-| 2.7988        | 0.3077 | 3    | 2.5441          |
-| 2.3077        | 0.6154 | 6    | 2.3045          |
-| 2.2536        | 0.9231 | 9    | 2.2348          |
+| 2.0611        | 0.0067 | 1    | 2.2397          |
+| 1.5555        | 0.2550 | 38   | 1.5148          |
+| 1.4785        | 0.5101 | 76   | 1.4563          |
+| 1.4806        | 0.7651 | 114  | 1.4332          |
 
 
 ### Framework versions
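
The updated card describes a LoRA adapter trained on Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B with the base loaded in 8-bit and the pad token pinned to `<|eot_id|>`. The added `lr_scheduler_warmup_steps: 7` is consistent with the config's `warmup_ratio: 0.05`: step 114 falls at epoch 0.7651, so one epoch is roughly 149 optimizer steps, and 0.05 × 149 ≈ 7. Below is a minimal inference sketch, assuming the published adapter repo matches the `hub_model_id` above and using a bf16 base load in place of the 8-bit load used during training:

```python
# Minimal sketch: load the base model and apply this commit's LoRA adapter.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B"
adapter_id = "FatCat87/97f7fd69-882c-4376-9bac-c02431181f3a"  # hub_model_id from the config

tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = "<|eot_id|>"  # mirrors special_tokens.pad_token in the config

base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_id)

inputs = tokenizer("Hello,", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For closer parity with training (`load_in_8bit: true`), the base could instead be loaded in 8-bit, e.g. via `quantization_config=BitsAndBytesConfig(load_in_8bit=True)` on a recent transformers/bitsandbytes install.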
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:03639821080911dde8f21be1edad879c180e8dee6c1e684b0211264efe2b2c6c
+oid sha256:3c82fabfc0a3e4575ef2c84a58656de1323d57b11085c5eba4dee5daedfd63e5
 size 335706186
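
Both sides of the `adapter_model.bin` change are Git LFS pointers: the size stays 335706186 bytes and only the sha256 OID changes. A short sketch for fetching the file at exactly this commit and checking it against the new pointer; the repo id is assumed from `hub_model_id` in the config, and the short revision `3774fd0` is this commit's hash:

```python
# Sketch: download adapter_model.bin at this commit and verify it against the
# new LFS pointer (expected oid/size copied from the pointer above).
import hashlib
import os

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="FatCat87/97f7fd69-882c-4376-9bac-c02431181f3a",  # assumed from hub_model_id
    filename="adapter_model.bin",
    revision="3774fd0",  # this commit
)

expected_oid = "3c82fabfc0a3e4575ef2c84a58656de1323d57b11085c5eba4dee5daedfd63e5"
expected_size = 335706186

digest = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)

assert os.path.getsize(path) == expected_size, "size mismatch"
assert digest.hexdigest() == expected_oid, "sha256 mismatch"
print("adapter_model.bin matches the new LFS pointer")
```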