FatCat87 committed
Commit 3774fd0 · verified · 1 parent: 6dd03a5

End of training

Files changed (2)
  1. README.md +19 -18
  2. adapter_model.bin +1 -1
README.md CHANGED
@@ -1,12 +1,11 @@
 ---
-license: llama3
 library_name: peft
 tags:
 - axolotl
 - generated_from_trainer
-base_model: scb10x/llama-3-typhoon-v1.5-8b-instruct
+base_model: Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B
 model-index:
-- name: 0c862fee-2042-414b-98c3-2b6c8e57613b
+- name: 97f7fd69-882c-4376-9bac-c02431181f3a
   results: []
 ---
 
@@ -19,14 +18,14 @@ should probably proofread and complete it, then remove this comment. -->
 axolotl version: `0.4.1`
 ```yaml
 adapter: lora
-base_model: scb10x/llama-3-typhoon-v1.5-8b-instruct
+base_model: Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B
 bf16: auto
 datasets:
 - data_files:
-  - 1cdad3506d86664d_train_data.json
+  - 78e65bb45152e40a_train_data.json
   ds_type: json
   format: custom
-  path: 1cdad3506d86664d_train_data.json
+  path: 78e65bb45152e40a_train_data.json
   type:
     field: null
     field_input: input
@@ -51,7 +50,7 @@ fsdp_config: null
 gradient_accumulation_steps: 4
 gradient_checkpointing: true
 group_by_length: false
-hub_model_id: FatCat87/0c862fee-2042-414b-98c3-2b6c8e57613b
+hub_model_id: FatCat87/97f7fd69-882c-4376-9bac-c02431181f3a
 learning_rate: 0.0002
 load_in_4bit: false
 load_in_8bit: true
@@ -73,7 +72,8 @@ sample_packing: true
 saves_per_epoch: 1
 seed: 701
 sequence_len: 4096
-special_tokens: null
+special_tokens:
+  pad_token: <|eot_id|>
 strict: false
 tf32: false
 tokenizer_type: AutoTokenizer
@@ -82,9 +82,9 @@ val_set_size: 0.1
 wandb_entity: fatcat87-taopanda
 wandb_log_model: null
 wandb_mode: online
-wandb_name: 0c862fee-2042-414b-98c3-2b6c8e57613b
+wandb_name: 97f7fd69-882c-4376-9bac-c02431181f3a
 wandb_project: subnet56
-wandb_runid: 0c862fee-2042-414b-98c3-2b6c8e57613b
+wandb_runid: 97f7fd69-882c-4376-9bac-c02431181f3a
 wandb_watch: null
 warmup_ratio: 0.05
 weight_decay: 0.0
@@ -94,12 +94,12 @@ xformers_attention: null
 
 </details><br>
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/2fhkuh2w)
-# 0c862fee-2042-414b-98c3-2b6c8e57613b
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/y55y4sdv)
+# 97f7fd69-882c-4376-9bac-c02431181f3a
 
-This model is a fine-tuned version of [scb10x/llama-3-typhoon-v1.5-8b-instruct](https://huggingface.co/scb10x/llama-3-typhoon-v1.5-8b-instruct) on the None dataset.
+This model is a fine-tuned version of [Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B](https://huggingface.co/Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.2348
+- Loss: 1.4332
 
 ## Model description
 
@@ -129,16 +129,17 @@ The following hyperparameters were used during training:
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 7
 - num_epochs: 1
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 3.2967        | 0.1026 | 1    | 3.1530          |
-| 2.7988        | 0.3077 | 3    | 2.5441          |
-| 2.3077        | 0.6154 | 6    | 2.3045          |
-| 2.2536        | 0.9231 | 9    | 2.2348          |
+| 2.0611        | 0.0067 | 1    | 2.2397          |
+| 1.5555        | 0.2550 | 38   | 1.5148          |
+| 1.4785        | 0.5101 | 76   | 1.4563          |
+| 1.4806        | 0.7651 | 114  | 1.4332          |
 
 
 ### Framework versions
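
The updated card describes a LoRA adapter trained on Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B with the base loaded in 8-bit and the pad token pinned to `<|eot_id|>`. The added `lr_scheduler_warmup_steps: 7` is consistent with the config's `warmup_ratio: 0.05`: step 114 falls at epoch 0.7651, so one epoch is roughly 149 optimizer steps, and 0.05 × 149 ≈ 7. Below is a minimal inference sketch, assuming the published adapter repo matches the `hub_model_id` above and using a bf16 base load in place of the 8-bit load used during training:

```python
# Minimal sketch: load the base model and apply this commit's LoRA adapter.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Casual-Autopsy/L3-Umbral-Mind-RP-v3.0-8B"
adapter_id = "FatCat87/97f7fd69-882c-4376-9bac-c02431181f3a"  # hub_model_id from the config

tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = "<|eot_id|>"  # mirrors special_tokens.pad_token in the config

base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_id)

inputs = tokenizer("Hello,", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For closer parity with training (`load_in_8bit: true`), the base could instead be loaded in 8-bit, e.g. via `quantization_config=BitsAndBytesConfig(load_in_8bit=True)` on a recent transformers/bitsandbytes install.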
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:03639821080911dde8f21be1edad879c180e8dee6c1e684b0211264efe2b2c6c
+oid sha256:3c82fabfc0a3e4575ef2c84a58656de1323d57b11085c5eba4dee5daedfd63e5
 size 335706186
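
Both sides of the `adapter_model.bin` change are Git LFS pointers: the size stays 335706186 bytes and only the sha256 OID changes. A short sketch for fetching the file at exactly this commit and checking it against the new pointer; the repo id is assumed from `hub_model_id` in the config, and the short revision `3774fd0` is this commit's hash:

```python
# Sketch: download adapter_model.bin at this commit and verify it against the
# new LFS pointer (expected oid/size copied from the pointer above).
import hashlib
import os

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="FatCat87/97f7fd69-882c-4376-9bac-c02431181f3a",  # assumed from hub_model_id
    filename="adapter_model.bin",
    revision="3774fd0",  # this commit
)

expected_oid = "3c82fabfc0a3e4575ef2c84a58656de1323d57b11085c5eba4dee5daedfd63e5"
expected_size = 335706186

digest = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)

assert os.path.getsize(path) == expected_size, "size mismatch"
assert digest.hexdigest() == expected_oid, "sha256 mismatch"
print("adapter_model.bin matches the new LFS pointer")
```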