Tuch/results_1

Generated from Trainer


Tuch committed on Jul 17, 2024

Commit 7ad17e7 · verified · 1 Parent(s): e91c1c2

Model save

Files changed (2)

README.md +10 -20
adapter_model.safetensors +1 -1

README.md CHANGED

@@ -16,9 +16,14 @@ should probably proofread and complete it, then remove this comment. -->
 # results_1
-This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.5411
 ## Model description
@@ -38,33 +43,18 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 1e-05
-- train_batch_size: 3
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 12
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 12
-### Training results
-| Training Loss | Epoch   | Step | Validation Loss |
-|:-------------:|:-------:|:----:|:---------------:|
-| 0.7711        | 1.1844  | 53   | 0.6969          |
-| 0.614         | 2.3687  | 106  | 0.6113          |
-| 0.5994        | 3.5531  | 159  | 0.5945          |
-| 0.5602        | 4.7374  | 212  | 0.5745          |
-| 0.5807        | 5.9218  | 265  | 0.5630          |
-| 0.5484        | 7.1061  | 318  | 0.5532          |
-| 0.543         | 8.2905  | 371  | 0.5472          |
-| 0.4977        | 9.4749  | 424  | 0.5436          |
-| 0.5072        | 10.6592 | 477  | 0.5411          |
 ### Framework versions
-- PEFT 0.10.0
 - Transformers 4.42.3
 - Pytorch 2.3.1+cu121
 - Datasets 2.18.0

 # results_1
+This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- eval_loss: 1.3737
+- eval_runtime: 77.6899
+- eval_samples_per_second: 5.779
+- eval_steps_per_second: 0.734
+- epoch: 2.4053
+- step: 270
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 1e-05
+- train_batch_size: 4
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
+- total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 12
 ### Framework versions
+- PEFT 0.11.1
 - Transformers 4.42.3
 - Pytorch 2.3.1+cu121
 - Datasets 2.18.0
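
The figures in the updated README can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not part of the commit: it assumes a single training device (the README does not state the device count), and all variable names are illustrative. It recomputes the effective training batch size from `train_batch_size` and `gradient_accumulation_steps`, then estimates the evaluation set size from the reported throughput figures.

```python
import math

# Values copied from the updated README.
train_batch_size = 4
gradient_accumulation_steps = 4
eval_batch_size = 8
eval_runtime = 77.6899            # seconds
eval_samples_per_second = 5.779
eval_steps_per_second = 0.734

# Assumption: one device, so the effective batch is
# per-device batch size x gradient accumulation steps.
num_devices = 1
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)

# Throughput x runtime gives approximate counts (the inputs are rounded).
approx_eval_samples = round(eval_samples_per_second * eval_runtime)
approx_eval_steps = round(eval_steps_per_second * eval_runtime)

# Steps implied by the sample estimate and the eval batch size.
expected_eval_steps = math.ceil(approx_eval_samples / eval_batch_size)

print("total_train_batch_size:", total_train_batch_size)
print("approx eval samples:", approx_eval_samples)
print("approx eval steps:", approx_eval_steps, "expected:", expected_eval_steps)
```

Under the single-device assumption the computed effective batch size agrees with the reported `total_train_batch_size: 16`, and the two evaluation step estimates agree with each other.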

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:caa2a38e26fbcedb79d02dfa52e64db5a2faab9e02d09d8637aed1b8d49e388a
 size 125889008

 version https://git-lfs.github.com/spec/v1
+oid sha256:d2d33e80f0005f1074da3326773743efdfcd82e44609befee1b774c786c2106f
 size 125889008
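
The `adapter_model.safetensors` entry is a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. Only the `oid` changed in this commit; the size is identical. As a minimal sketch, the pointer can be parsed and a downloaded file checked against it as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and written for this illustration; they are not part of any library.

```python
import hashlib
import os

def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

def matches_pointer(path: str, pointer: dict) -> bool:
    """Check a local file's size and SHA-256 against a parsed pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)
    return sha.hexdigest() == pointer["digest"]

# New-side pointer from this commit.
new_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:d2d33e80f0005f1074da3326773743efdfcd82e44609befee1b774c786c2106f\n"
    "size 125889008\n"
)
print(new_pointer["hash_algo"], new_pointer["size"])
```

Calling `matches_pointer("adapter_model.safetensors", new_pointer)` on a locally downloaded copy (the path here is an assumption) would confirm that the file corresponds to this commit rather than its parent.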