Model save
README.md
CHANGED
@@ -20,7 +20,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.6615
 
 ## Model description
 
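The hunk above updates the adapter's evaluation loss. For readers of the card, a minimal loading sketch may help; it assumes a PEFT adapter (consistent with the PEFT 0.11.1 pin further down), and the adapter repo id is a placeholder because the diff does not name this repository.

```python
# Minimal sketch: apply this card's PEFT adapter to the base model.
# "your-username/this-adapter-repo" is a placeholder; substitute the
# actual adapter repository id, which the diff does not show.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, "your-username/this-adapter-repo")
model.eval()
```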
@@ -40,11 +40,11 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size:
+- train_batch_size: 2
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size:
+- total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
 - lr_scheduler_warmup_ratio: 0.03
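The batch-size values filled in by this hunk are mutually consistent: total_train_batch_size = train_batch_size × gradient_accumulation_steps = 2 × 4 = 8 on a single device. A hedged sketch of the same configuration as `transformers` `TrainingArguments` follows; the `output_dir` is a placeholder and the epoch count is not shown in this hunk, so it is omitted.

```python
# Sketch: the hyperparameters above expressed as TrainingArguments.
# output_dir is a placeholder; num_train_epochs is not shown in the diff.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llama3-8b-instruct-ft",  # placeholder name
    learning_rate=2e-4,
    per_device_train_batch_size=2,       # "train_batch_size: 2"
    per_device_eval_batch_size=8,        # "eval_batch_size: 8"
    gradient_accumulation_steps=4,       # effective batch: 2 * 4 = 8
    seed=42,
    lr_scheduler_type="constant",
    warmup_ratio=0.03,                   # mirrored from the card as listed
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```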
@@ -55,24 +55,14 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-|
-| 0.
-| 0.6395 | 0.4938 | 30 | 0.6430 |
-| 0.6148 | 0.6584 | 40 | 0.6401 |
-| 0.6257 | 0.8230 | 50 | 0.6329 |
-| 0.5999 | 0.9877 | 60 | 0.6305 |
-| 0.4955 | 1.1523 | 70 | 0.6351 |
-| 0.4613 | 1.3169 | 80 | 0.6439 |
-| 0.473 | 1.4815 | 90 | 0.6487 |
-| 0.4793 | 1.6461 | 100 | 0.6423 |
-| 0.4812 | 1.8107 | 110 | 0.6449 |
-| 0.4892 | 1.9753 | 120 | 0.6391 |
+| 0.6769 | 0.6723 | 20 | 0.6754 |
+| 0.5815 | 1.3445 | 40 | 0.6615 |
 
 
 ### Framework versions
 
 - PEFT 0.11.1
-- Transformers 4.41.
-- Pytorch 2.
+- Transformers 4.41.1
+- Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1
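The framework pins this hunk settles on can be verified at runtime. A small check, assuming the packages are importable under the names listed in the card:

```python
# Quick check that a local environment matches the "Framework versions" above.
expected = {
    "peft": "0.11.1",
    "transformers": "4.41.1",
    "torch": "2.3.0+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}
for name, want in expected.items():
    have = __import__(name).__version__
    status = "OK" if have == want else "MISMATCH"
    print(f"{name}: card pins {want}, found {have} ({status})")
```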