jschmock/kaelte

@@ -15,8 +15,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [svalabs/gbert-large-zeroshot-nli](https://huggingface.co/svalabs/gbert-large-zeroshot-nli) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0022
-- F1: 1.0
 ## Model description
@@ -36,11 +36,11 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 10
@@ -49,16 +49,16 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | F1     |
 |:-------------:|:-----:|:----:|:---------------:|:------:|
-| No log        | 0.94  | 11   | 0.6783          | 0.7786 |
-| No log        | 1.94  | 22   | 0.1718          | 0.9272 |
-| No log        | 2.94  | 33   | 0.0769          | 0.9887 |
-| No log        | 3.94  | 44   | 0.0686          | 0.9887 |
-| No log        | 4.94  | 55   | 0.0227          | 0.9887 |
-| No log        | 5.94  | 66   | 0.0075          | 1.0    |
-| No log        | 6.94  | 77   | 0.0099          | 1.0    |
-| No log        | 7.94  | 88   | 0.0025          | 1.0    |
-| No log        | 8.94  | 99   | 0.0023          | 1.0    |
-| No log        | 9.94  | 110  | 0.0022          | 1.0    |
 ### Framework versions

 This model is a fine-tuned version of [svalabs/gbert-large-zeroshot-nli](https://huggingface.co/svalabs/gbert-large-zeroshot-nli) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1126
+- F1: 0.9887
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+- train_batch_size: 4
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
+- total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 10
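
The relationship between `train_batch_size`, `gradient_accumulation_steps`, and `total_train_batch_size` in the list above can be checked with a few lines of arithmetic. This is a minimal sketch, not the training script: the helper name `effective_batch_size` is hypothetical, and the default of one device is an assumption (the card does not state the device count, although both reported totals are consistent with it).

```python
def effective_batch_size(train_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples that contribute to one optimizer update.

    num_devices=1 is an assumption; the model card does not report it.
    """
    return train_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the two versions of the hyperparameter list.
old_total = effective_batch_size(train_batch_size=16, gradient_accumulation_steps=4)
new_total = effective_batch_size(train_batch_size=4, gradient_accumulation_steps=4)

print(old_total)  # 64, matches the removed total_train_batch_size line
print(new_total)  # 16, matches the added total_train_batch_size line
```

With the accumulation steps unchanged, the smaller per-step batch means each optimizer update sees a quarter as many samples as before.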
 | Training Loss | Epoch | Step | Validation Loss | F1     |
 |:-------------:|:-----:|:----:|:---------------:|:------:|
+| No log        | 0.98  | 46   | 0.3264          | 0.9074 |
+| No log        | 1.98  | 92   | 0.1266          | 0.9590 |
+| No log        | 2.98  | 138  | 0.0603          | 0.9887 |
+| No log        | 3.98  | 184  | 0.1000          | 0.9887 |
+| No log        | 4.98  | 230  | 0.1075          | 0.9887 |
+| No log        | 5.98  | 276  | 0.1091          | 0.9887 |
+| No log        | 6.98  | 322  | 0.1109          | 0.9887 |
+| No log        | 7.98  | 368  | 0.1119          | 0.9887 |
+| No log        | 8.98  | 414  | 0.1124          | 0.9887 |
+| No log        | 9.98  | 460  | 0.1126          | 0.9887 |
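
The updated results table can be read programmatically to see where validation loss is lowest and how many optimizer steps make up one epoch. The sketch below only re-reads the numbers in the added rows; the names `RESULTS`, `best_by_validation_loss`, and `steps_per_epoch` are hypothetical, and nothing here reflects how the checkpoint for the reported metrics was actually chosen.

```python
# (step, validation_loss, f1) copied from the added rows of the results table.
RESULTS = [
    (46, 0.3264, 0.9074),
    (92, 0.1266, 0.9590),
    (138, 0.0603, 0.9887),
    (184, 0.1000, 0.9887),
    (230, 0.1075, 0.9887),
    (276, 0.1091, 0.9887),
    (322, 0.1109, 0.9887),
    (368, 0.1119, 0.9887),
    (414, 0.1124, 0.9887),
    (460, 0.1126, 0.9887),
]


def best_by_validation_loss(rows):
    """Return the (step, validation_loss, f1) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def steps_per_epoch(rows, num_epochs):
    """Optimizer steps per epoch, derived from the final logged step."""
    final_step = rows[-1][0]
    return final_step // num_epochs


print(best_by_validation_loss(RESULTS))          # (138, 0.0603, 0.9887)
print(steps_per_epoch(RESULTS, num_epochs=10))   # 46
```

In these rows the validation loss is lowest at step 138 and rises afterwards while F1 stays at 0.9887; the reported final loss of 0.1126 corresponds to the last row, step 460.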
 ### Framework versions