llm/llama38binstruct-summary-100s

Files changed (4)

README.md +7 -15
adapter_model.safetensors +2 -2
runs/Jun19_04-27-12_0113f146e29c/events.out.tfevents.1718771246.0113f146e29c.1122.6 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -20,7 +20,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.2478
 ## Model description
@@ -39,7 +39,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0001
 - train_batch_size: 2
 - eval_batch_size: 8
 - seed: 42
@@ -48,24 +48,16 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
 - lr_scheduler_warmup_steps: 10
-- training_steps: 300
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.0063        | 10.0  | 25   | 2.9544          |
-| 0.0033        | 20.0  | 50   | 3.1133          |
-| 0.0057        | 30.0  | 75   | 2.5821          |
-| 0.0032        | 40.0  | 100  | 2.9857          |
-| 0.0021        | 50.0  | 125  | 3.1502          |
-| 0.0019        | 60.0  | 150  | 3.0546          |
-| 0.0026        | 70.0  | 175  | 2.7894          |
-| 0.0045        | 80.0  | 200  | 2.6616          |
-| 0.0014        | 90.0  | 225  | 3.1916          |
-| 0.0009        | 100.0 | 250  | 3.2146          |
-| 0.0007        | 110.0 | 275  | 3.2346          |
-| 0.0006        | 120.0 | 300  | 3.2478          |
 ### Framework versions

 This model is a fine-tuned version of [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct) on the generator dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.8113
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 1e-05
 - train_batch_size: 2
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
 - lr_scheduler_warmup_steps: 10
+- training_steps: 100
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 2.3176        | 10.0  | 25   | 2.8113          |
+| 2.3111        | 20.0  | 50   | 2.8113          |
+| 2.3098        | 30.0  | 75   | 2.8113          |
+| 2.3188        | 40.0  | 100  | 2.8113          |
 ### Framework versions
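
Reviewer note: in the updated table the validation loss is 2.8113 at every logged step, while the training loss stays near 2.31. The diff does not state a cause. The following is a minimal sketch of how that observation can be checked mechanically, using only the four rows copied from the updated table. The helper names (`parse_rows`, `validation_is_flat`, `steps_per_epoch`) are hypothetical and not part of any library.

```python
# Rows copied verbatim from the updated "Training results" table in this diff.
NEW_ROWS = """
| 2.3176        | 10.0  | 25   | 2.8113          |
| 2.3111        | 20.0  | 50   | 2.8113          |
| 2.3098        | 30.0  | 75   | 2.8113          |
| 2.3188        | 40.0  | 100  | 2.8113          |
"""


def parse_rows(table: str) -> list[dict]:
    """Turn markdown table rows into dicts of numbers (hypothetical helper)."""
    rows = []
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss = cells
        rows.append(
            {
                "train_loss": float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val_loss),
            }
        )
    return rows


def validation_is_flat(rows: list[dict], tol: float = 1e-9) -> bool:
    """True when every logged validation loss is the same within tol."""
    losses = [r["val_loss"] for r in rows]
    return max(losses) - min(losses) <= tol


def steps_per_epoch(rows: list[dict]) -> float:
    """Optimizer steps per epoch implied by the last logged row."""
    last = rows[-1]
    return last["step"] / last["epoch"]


rows = parse_rows(NEW_ROWS)
print("validation loss flat:", validation_is_flat(rows))
print("steps per epoch:", steps_per_epoch(rows))
```

A flat validation loss across all checkpoints is unusual and worth investigating before relying on this run, but the table alone cannot tell whether the cause lies in training, evaluation, or logging.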

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ef728538a619ab3e4f2aa024958451bbd939b654cde566a8a811535264e05487
-size 167832240

 version https://git-lfs.github.com/spec/v1
+oid sha256:e44ce263e6fd885f50d82ca515b9325375b43ee36ededb75acf161ce88bc2e41
+size 48
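
Reviewer note: a Git LFS pointer records the byte size of the object it stands for, so this change takes `adapter_model.safetensors` from 167832240 bytes to 48 bytes. The sketch below parses the two pointers shown above and flags the drop. The function name `parse_lfs_pointer` and the `MIN_EXPECTED_BYTES` threshold are assumptions made for illustration, not part of Git LFS or any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is the object's byte count
    return fields


# Pointer contents copied from the removed and added sides of this diff.
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ef728538a619ab3e4f2aa024958451bbd939b654cde566a8a811535264e05487
size 167832240
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e44ce263e6fd885f50d82ca515b9325375b43ee36ededb75acf161ce88bc2e41
size 48
"""

# Assumed threshold: an adapter file smaller than this is treated as suspect.
MIN_EXPECTED_BYTES = 1024

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
suspicious = new["size"] < MIN_EXPECTED_BYTES

print(f"old size: {old['size']} bytes, new size: {new['size']} bytes")
print("new adapter file looks suspiciously small:", suspicious)
```

A 48-byte file is too small to hold adapter weights of the previous size, so it is worth confirming that the adapter was saved correctly before loading this revision.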

runs/Jun19_04-27-12_0113f146e29c/events.out.tfevents.1718771246.0113f146e29c.1122.6 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5329dace34d485bd18ce3c6d5f1c91551e37130c2869f081938d5b56cd7d5c0a
+size 9236

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:da7ab063a3e2ac331c4fbea43e632b2ee7ef1029442c59327726294ac03a99a1
 size 5432

 version https://git-lfs.github.com/spec/v1
+oid sha256:a3e39e9be223e4e51725d5a54334094cd4d30b30442d71c05ce7347ff488a3f1
 size 5432