nejox/distilbert-base-cased-distilled-squad-coffee20230108

Question Answering

Generated from Trainer

Inference Endpoints

nejox committed on Jan 8, 2023

Commit 5229ef2 • 1 Parent(s): 73dec53

update model card README.md

Files changed (1)

README.md +18 -18

@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [distilbert-base-cased-distilled-squad](https://huggingface.co/distilbert-base-cased-distilled-squad) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.1937
 ## Model description
@@ -34,8 +34,8 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 16
-- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -46,21 +46,21 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log        | 1.0   | 2    | 2.5987          |
-| No log        | 2.0   | 4    | 2.5874          |
-| No log        | 3.0   | 6    | 2.5674          |
-| No log        | 4.0   | 8    | 2.5434          |
-| No log        | 5.0   | 10   | 2.5217          |
-| No log        | 6.0   | 12   | 2.4956          |
-| No log        | 7.0   | 14   | 2.4719          |
-| No log        | 8.0   | 16   | 2.4341          |
-| No log        | 9.0   | 18   | 2.3780          |
-| No log        | 10.0  | 20   | 2.3192          |
-| No log        | 11.0  | 22   | 2.2701          |
-| No log        | 12.0  | 24   | 2.2405          |
-| No log        | 13.0  | 26   | 2.2175          |
-| No log        | 14.0  | 28   | 2.2044          |
-| No log        | 15.0  | 30   | 2.1937          |
 ### Framework versions

 This model is a fine-tuned version of [distilbert-base-cased-distilled-squad](https://huggingface.co/distilbert-base-cased-distilled-squad) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.7601
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| No log        | 1.0   | 3    | 2.3917          |
+| No log        | 2.0   | 6    | 2.3782          |
+| No log        | 3.0   | 9    | 2.3533          |
+| No log        | 4.0   | 12   | 2.3143          |
+| No log        | 5.0   | 15   | 2.2670          |
+| No log        | 6.0   | 18   | 2.2112          |
+| No log        | 7.0   | 21   | 2.1549          |
+| No log        | 8.0   | 24   | 2.0915          |
+| No log        | 9.0   | 27   | 2.0377          |
+| No log        | 10.0  | 30   | 1.9545          |
+| No log        | 11.0  | 33   | 1.8787          |
+| No log        | 12.0  | 36   | 1.8299          |
+| No log        | 13.0  | 39   | 1.7897          |
+| No log        | 14.0  | 42   | 1.7662          |
+| No log        | 15.0  | 45   | 1.7601          |
 ### Framework versions
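
The two results tables in this diff differ in steps per epoch (2 with batch size 16, 3 with batch size 8) and in final validation loss (2.1937 versus 1.7601). The sketch below is a rough consistency check on those numbers, not something taken from the model card. It assumes a single device, no gradient accumulation, and that the last partial batch is kept, so that steps per epoch equals `ceil(N / batch_size)`; the model card states none of these. The names `old`, `new`, and `compatible_train_sizes` are hypothetical.

```python
import math

# Values copied from the removed (old) and added (new) lines of the diff.
old = {"batch_size": 16, "steps_per_epoch": 2, "final_val_loss": 2.1937}
new = {"batch_size": 8, "steps_per_epoch": 3, "final_val_loss": 1.7601}


def compatible_train_sizes(runs, upper=1000):
    """Return training-set sizes N with ceil(N / batch_size) == steps_per_epoch
    for every run.

    Assumption (not stated in the model card): one device, no gradient
    accumulation, and the last partial batch is not dropped.
    """
    return [
        n
        for n in range(1, upper + 1)
        if all(math.ceil(n / r["batch_size"]) == r["steps_per_epoch"] for r in runs)
    ]


sizes = compatible_train_sizes([old, new])

# Relative reduction in the final (epoch 15) validation loss between the two runs.
relative_drop = (old["final_val_loss"] - new["final_val_loss"]) / old["final_val_loss"]

print(f"compatible training-set sizes: {sizes[0]}..{sizes[-1]}")
print(f"relative drop in final validation loss: {relative_drop:.1%}")
```

Under those assumptions, both step counts are consistent only with a training set of 17 to 24 examples, which would suggest a very small fine-tuning set. The lower loss after the change coincides with more optimizer steps (45 instead of 30) at the same learning rate, so the diff alone does not show whether the smaller batch size itself is responsible.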