lukasdrg committed on
Commit
978b535
1 Parent(s): 700efea

End of training

Files changed (4)
  1. README.md +15 -6
  2. pytorch_model.bin +1 -1
  3. tokenizer.json +16 -2
  4. training_args.bin +1 -1
README.md CHANGED
@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the None dataset.
 It achieves the following results on the evaluation set:
- - Loss: 3.8141
+ - Loss: 20.5838
 
 ## Model description
 
@@ -36,10 +36,10 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
 - train_batch_size: 2
- - eval_batch_size: 16
+ - eval_batch_size: 4
 - seed: 42
- - gradient_accumulation_steps: 64
- - total_train_batch_size: 128
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1500
@@ -49,8 +49,17 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
- | 1.8515 | 1.42 | 2 | 4.1959 |
- | 1.9816 | 2.84 | 4 | 3.8141 |
+ | 21.7989 | 0.36 | 4 | 22.5208 |
+ | 20.1605 | 0.71 | 8 | 22.5098 |
+ | 21.2348 | 1.07 | 12 | 22.6825 |
+ | 21.615 | 1.42 | 16 | 22.2299 |
+ | 20.4123 | 1.78 | 20 | 22.0543 |
+ | 20.8174 | 2.13 | 24 | 22.1491 |
+ | 20.7756 | 2.49 | 28 | 21.9382 |
+ | 19.8654 | 2.84 | 32 | 21.6735 |
+ | 20.6743 | 3.2 | 36 | 21.3893 |
+ | 20.3151 | 3.56 | 40 | 21.1435 |
+ | 19.6614 | 3.91 | 44 | 20.5838 |
 
 
 ### Framework versions
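For orientation, here is a minimal sketch of how the hyperparameters listed in the updated README would typically be expressed with `transformers.TrainingArguments`. Only the values named in the README come from this commit; the output directory and the omission of `num_train_epochs` are assumptions.

```python
# Sketch only: mirrors the hyperparameters in the updated README.
# "longformer-base-4096-finetuned" is an assumed output_dir, not taken from this repo.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="longformer-base-4096-finetuned",
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,  # effective train batch size: 2 * 4 = 8
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=1500,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default optimizer.
)
```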
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:776994b40ac37027e65ee158ca40c781444730396f82ac9339f3f0f344301f77
+ oid sha256:33d7a6ab97ae94c2ab33ae3d3ffdaf065ba7e61d85b6eb52a45f99be35e9f77b
 size 594941266
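The weight file is a Git LFS pointer, so only its sha256 oid changed while the size stayed identical. As a small sketch (assuming the real binary has been pulled from LFS rather than just the pointer), the new oid can be checked locally like this:

```python
# Sketch: verify a downloaded LFS object against the pointer's sha256 oid.
import hashlib

EXPECTED_OID = "33d7a6ab97ae94c2ab33ae3d3ffdaf065ba7e61d85b6eb52a45f99be35e9f77b"

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

assert sha256_of("pytorch_model.bin") == EXPECTED_OID, "checksum mismatch"
```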
tokenizer.json CHANGED
@@ -1,7 +1,21 @@
 {
 "version": "1.0",
- "truncation": null,
- "padding": null,
+ "truncation": {
+   "direction": "Right",
+   "max_length": 1028,
+   "strategy": "LongestFirst",
+   "stride": 0
+ },
+ "padding": {
+   "strategy": {
+     "Fixed": 1028
+   },
+   "direction": "Right",
+   "pad_to_multiple_of": null,
+   "pad_id": 1,
+   "pad_type_id": 0,
+   "pad_token": "<pad>"
+ },
 "added_tokens": [
 {
 "id": 0,
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:afb0ffc0dc18e0133082b740d841c21a8318df481825164d2f9a7d761d17ae78
+ oid sha256:49aba76b9e4e90c8c3669d6397a822e705d137da816c65797e3b29b6eeb24a83
 size 4536
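training_args.bin is typically the serialized `TrainingArguments` object that `Trainer` writes next to the checkpoint, which is why it changes whenever the run configuration does. A sketch for inspecting it (the `weights_only=False` flag only matters on newer PyTorch releases, and unpickling needs a compatible transformers version installed):

```python
# Sketch: load and inspect the saved TrainingArguments.
import torch

args = torch.load("training_args.bin", weights_only=False)
print(args.learning_rate, args.per_device_eval_batch_size, args.gradient_accumulation_steps)
```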