Shakhovak/Mistral-7B-Instruct-v0.2-absa-restaurants

Generated from Trainer

Shakhovak committed on Apr 24

Commit e11a93b • 1 Parent(s): dc42ddc

End of training

Files changed (3)

README.md +13 -16
adapter_model.bin +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0395
 ## Model description
@@ -34,7 +34,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0003
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
@@ -43,26 +43,23 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- training_steps: 550
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.3557        | 0.36  | 40   | 0.0287          |
-| 0.0312        | 0.72  | 80   | 0.0241          |
-| 0.0235        | 1.08  | 120  | 0.0290          |
-| 0.0167        | 1.44  | 160  | 0.0240          |
-| 0.0163        | 1.8   | 200  | 0.0288          |
-| 0.0195        | 2.16  | 240  | 0.0280          |
-| 0.0094        | 2.52  | 280  | 0.0281          |
-| 0.0077        | 2.88  | 320  | 0.0292          |
-| 0.0047        | 3.24  | 360  | 0.0341          |
-| 0.003         | 3.6   | 400  | 0.0331          |
-| 0.003         | 3.96  | 440  | 0.0348          |
-| 0.0016        | 4.32  | 480  | 0.0408          |
-| 0.0009        | 4.68  | 520  | 0.0395          |
 ### Framework versions

 This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0261
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 3e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
+- training_steps: 400
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 1.1448        | 0.36  | 40   | 0.1779          |
+| 0.0645        | 0.72  | 80   | 0.0358          |
+| 0.0289        | 1.08  | 120  | 0.0301          |
+| 0.0243        | 1.44  | 160  | 0.0282          |
+| 0.0229        | 1.8   | 200  | 0.0263          |
+| 0.0197        | 2.16  | 240  | 0.0260          |
+| 0.0165        | 2.52  | 280  | 0.0258          |
+| 0.0163        | 2.88  | 320  | 0.0255          |
+| 0.014         | 3.24  | 360  | 0.0262          |
+| 0.0123        | 3.6   | 400  | 0.0261          |
 ### Framework versions
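
The hyperparameter hunk above lowers `learning_rate` from 0.0003 to 3e-05 and `training_steps` from 550 to 400, while keeping `lr_scheduler_type: linear` and `lr_scheduler_warmup_steps: 2`. As a rough sketch of what that implies, the snippet below assumes the common rule of a linear ramp from 0 to the peak rate over the warmup steps, followed by a linear decay to 0 at the last step. The function name `linear_lr` is hypothetical, and the exact scheduler code used by the Trainer is not shown in this commit.

```python
def linear_lr(step: int, peak_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay.

    Assumption: ramp 0 -> peak_lr over `warmup_steps`, then decay
    linearly so that the rate reaches 0 at `total_steps`.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values copied from the removed (-) and added (+) README lines.
OLD_RUN = dict(peak_lr=0.0003, warmup_steps=2, total_steps=550)
NEW_RUN = dict(peak_lr=3e-05, warmup_steps=2, total_steps=400)

if __name__ == "__main__":
    for step in (0, 2, 200, 400):
        print(step, linear_lr(step, **OLD_RUN), linear_lr(step, **NEW_RUN))
```

Under this assumption the new run never exceeds one tenth of the old run's peak rate, and it reaches zero at step 400 instead of step 550.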

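The two "Training results" tables in the README.md diff can be compared directly. The sketch below copies the (step, validation loss) pairs from the removed and added rows and reports, for each run, the step with the lowest validation loss and the loss at the last logged step. The names `OLD_RUN`, `NEW_RUN`, and `summarize` are illustrative only.

```python
# (step, validation_loss) pairs copied from the removed (-) table rows.
OLD_RUN = [
    (40, 0.0287), (80, 0.0241), (120, 0.0290), (160, 0.0240),
    (200, 0.0288), (240, 0.0280), (280, 0.0281), (320, 0.0292),
    (360, 0.0341), (400, 0.0331), (440, 0.0348), (480, 0.0408),
    (520, 0.0395),
]

# (step, validation_loss) pairs copied from the added (+) table rows.
NEW_RUN = [
    (40, 0.1779), (80, 0.0358), (120, 0.0301), (160, 0.0282),
    (200, 0.0263), (240, 0.0260), (280, 0.0258), (320, 0.0255),
    (360, 0.0262), (400, 0.0261),
]


def summarize(rows):
    """Return the best (lowest-loss) step and the final logged step of a run."""
    best_step, best_loss = min(rows, key=lambda row: row[1])
    last_step, last_loss = rows[-1]
    return {
        "best_step": best_step,
        "best_loss": best_loss,
        "last_step": last_step,
        "last_loss": last_loss,
    }


if __name__ == "__main__":
    print("old:", summarize(OLD_RUN))
    print("new:", summarize(NEW_RUN))
```

In the removed run, validation loss bottoms out at 0.0240 (step 160) and climbs to 0.0395 by step 520 while training loss keeps falling, which is consistent with overfitting. The added run ends at 0.0261, close to its best value of 0.0255 at step 320.
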
adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a04b8c726294798206cc30072f5c44bd5893ddc9c8b4f02af7c3526ec9c0ef96
 size 218196746

 version https://git-lfs.github.com/spec/v1
+oid sha256:b5cd2658e0e7ba6f153322fd68774ae0a038151a513b533c8c5c5262e2ea146d
 size 218196746

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:40e0f1c5bdd3d65cf06cc06df1518bebe77c69aa65d78467bcd7320c46fbab81
 size 4984

 version https://git-lfs.github.com/spec/v1
+oid sha256:f387d87d263c4265b24a4fda94152600a997b452545a50195ad69fe6a0688fbe
 size 4984
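
The `adapter_model.bin` and `training_args.bin` diffs above show Git LFS pointer files rather than binary contents: each pointer carries a `version` line, an `oid` (hash algorithm and digest), and a `size` in bytes. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling, and the pointer texts are copied from the `adapter_model.bin` diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by one space.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


OLD_ADAPTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a04b8c726294798206cc30072f5c44bd5893ddc9c8b4f02af7c3526ec9c0ef96
size 218196746
"""

NEW_ADAPTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b5cd2658e0e7ba6f153322fd68774ae0a038151a513b533c8c5c5262e2ea146d
size 218196746
"""

if __name__ == "__main__":
    old, new = parse_lfs_pointer(OLD_ADAPTER), parse_lfs_pointer(NEW_ADAPTER)
    print("same size:", old["size"] == new["size"])
    print("same contents:", old["digest"] == new["digest"])
```

The size is unchanged at 218196746 bytes while the digest differs, so the adapter weights were replaced by a file of identical length.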