End of training

Files changed (9)

README.md +7 -17
added_tokens.json +0 -0
config.json +1 -1
model.safetensors +3 -0
runs/Nov03_17-27-05_ec4c1cbe8b31/events.out.tfevents.1699032431.ec4c1cbe8b31.220.0 +3 -0
special_tokens_map.json +1 -1
tokenizer.json +0 -0
tokenizer_config.json +0 -0
training_args.bin +2 -2

README.md CHANGED

@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.6416
+- Loss: 4.5439
 ## Model description
@@ -36,10 +36,10 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
 - train_batch_size: 2
-- eval_batch_size: 4
+- eval_batch_size: 16
 - seed: 42
-- gradient_accumulation_steps: 56
-- total_train_batch_size: 112
+- gradient_accumulation_steps: 64
+- total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1500
@@ -49,23 +49,13 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 4.3605        | 0.31  | 50   | 3.7983          |
-| 3.9502        | 0.62  | 100  | 3.4157          |
-| 3.4719        | 0.93  | 150  | 3.2378          |
-| 3.3937        | 1.24  | 200  | 3.0982          |
-| 3.3123        | 1.56  | 250  | 3.0227          |
-| 3.3164        | 1.87  | 300  | 2.9341          |
-| 2.9789        | 2.18  | 350  | 2.8408          |
-| 2.9593        | 2.49  | 400  | 2.8201          |
-| 2.8724        | 2.8   | 450  | 2.7561          |
-| 2.8753        | 3.11  | 500  | 2.7083          |
-| 2.8017        | 3.42  | 550  | 2.6593          |
-| 2.7496        | 3.73  | 600  | 2.6416          |
+| 2.5237        | 1.42  | 2    | 4.9660          |
+| 2.6047        | 2.84  | 4    | 4.5439          |
 ### Framework versions
-- Transformers 4.34.1
+- Transformers 4.35.0
 - Pytorch 2.1.0+cu118
 - Datasets 2.14.6
 - Tokenizers 0.14.1
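
The `total_train_batch_size` values in the README.md diff follow from the per-device batch size and the gradient accumulation steps. A minimal sketch of that arithmetic is below; the function name is illustrative, and the single-device default is an assumption (the model card does not state the device count, but 2 × 56 = 112 and 2 × 64 = 128 match the reported totals with one device).

```python
def total_train_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of examples contributing to one optimizer step.

    num_devices=1 is an assumption; the model card does not report it.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the removed and added lines of the README.md diff.
old_total = total_train_batch_size(per_device_batch_size=2, gradient_accumulation_steps=56)
new_total = total_train_batch_size(per_device_batch_size=2, gradient_accumulation_steps=64)

print(old_total, new_total)
```

With a larger effective batch, each reported step covers more examples, which is one reason step counts from the two runs are not directly comparable.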

added_tokens.json CHANGED

The diff for this file is too large to render.

config.json CHANGED

@@ -37,7 +37,7 @@
   "pad_token_id": 1,
   "sep_token_id": 2,
   "torch_dtype": "float32",
-  "transformers_version": "4.34.1",
+  "transformers_version": "4.35.0",
   "type_vocab_size": 1,
   "vocab_size": 60474
 }

model.safetensors ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2aa52e2f2845f15d778d84ff5f991f3752048bae656b8f9bff0f115157ed0975
+size 626282192
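
The three added lines are a Git LFS pointer, not the weights themselves: the repository stores the spec version, the SHA-256 object ID, and the byte size, while the actual file lives in LFS storage. As a rough illustration, a pointer like this can be parsed as follows; `parse_lfs_pointer` is a hypothetical helper written for this example, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from the model.safetensors diff.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:2aa52e2f2845f15d778d84ff5f991f3752048bae656b8f9bff0f115157ed0975
size 626282192
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```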

runs/Nov03_17-27-05_ec4c1cbe8b31/events.out.tfevents.1699032431.ec4c1cbe8b31.220.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0718df9a92d32f3780076c44cc1510c0d9217cfedd4292288f0e6a5c1d50ee07
+size 5600

special_tokens_map.json CHANGED

@@ -9,7 +9,7 @@
     "rstrip": false,
     "single_word": false
   },
-  "pad_token": "</s>",
+  "pad_token": "<pad>",
   "sep_token": "</s>",
   "unk_token": "<unk>"
 }
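
This change matters because config.json lists distinct IDs for padding and separation (`pad_token_id` 1, `sep_token_id` 2). While both `pad_token` and `sep_token` were the same string `</s>`, they could not resolve to two different IDs. The sketch below shows one way such a mismatch could be detected. The vocabulary mapping is hypothetical, since the tokenizer.json diff is not rendered here, and `find_special_token_mismatches` is an illustrative helper rather than a Transformers API.

```python
def find_special_token_mismatches(special_tokens, config, vocab):
    """Return (name, token, expected_id, actual_id) for each special token whose
    vocabulary ID differs from the ID declared in the model config."""
    mismatches = []
    for name in ("pad", "sep"):
        token = special_tokens[f"{name}_token"]
        expected_id = config[f"{name}_token_id"]
        actual_id = vocab.get(token)
        if actual_id != expected_id:
            mismatches.append((name, token, expected_id, actual_id))
    return mismatches


# IDs from the config.json diff.
config = {"pad_token_id": 1, "sep_token_id": 2}

# Hypothetical vocabulary entries, assumed for illustration only.
hypothetical_vocab = {"<pad>": 1, "</s>": 2}

old_map = {"pad_token": "</s>", "sep_token": "</s>"}
new_map = {"pad_token": "<pad>", "sep_token": "</s>"}

old_mismatches = find_special_token_mismatches(old_map, config, hypothetical_vocab)
new_mismatches = find_special_token_mismatches(new_map, config, hypothetical_vocab)

print(old_mismatches)
print(new_mismatches)
```

Under the assumed vocabulary, the old map flags the pad token and the new map passes; with the real tokenizer the IDs would need to be read from tokenizer.json.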

tokenizer.json CHANGED

The diff for this file is too large to render.

tokenizer_config.json CHANGED

The diff for this file is too large to render.

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b0f86460df581f0cf4d866cd5242d81490b367c4cdb1e6f5e92bea790fa94125
-size 4536
+oid sha256:063e274ba203b8e66b8dcadbf0e2b51a6300f8a5ae9a4f3c887b57632113f4e3
+size 4600