darrenfishell/t5-small-samsum-ft-experiment_2

@@ -22,7 +22,7 @@ model-index:
     metrics:
     - name: Rouge1
       type: rouge
-      value: 0.4496
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -32,12 +32,12 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on the samsum dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.5427
-- Rouge1: 0.4496
-- Rouge2: 0.2191
-- Rougel: 0.3787
-- Rougelsum: 0.3788
-- Gen Len: 16.8863
 ## Model description
@@ -56,9 +56,9 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.001
-- train_batch_size: 16
-- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -69,9 +69,9 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
-| 0.6546        | 1.0   | 921  | 0.5639          | 0.4347 | 0.2026 | 0.3638 | 0.3641    | 17.077  |
-| 0.5514        | 2.0   | 1842 | 0.5464          | 0.4414 | 0.2121 | 0.3735 | 0.3735    | 16.8105 |
-| 0.4867        | 3.0   | 2763 | 0.5427          | 0.4496 | 0.2191 | 0.3787 | 0.3788    | 16.8863 |
 ### Framework versions

     metrics:
     - name: Rouge1
       type: rouge
+      value: 0.0982
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on the samsum dataset.
 It achieves the following results on the evaluation set:
+- Loss: 8.2125
+- Rouge1: 0.0982
+- Rouge2: 0.0087
+- Rougel: 0.0982
+- Rougelsum: 0.0972
+- Gen Len: 19.0
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 1e-05
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
+| No log        | 1.0   | 1    | 8.2709          | 0.0982 | 0.0087 | 0.0982 | 0.0972    | 19.0    |
+| No log        | 2.0   | 2    | 8.2709          | 0.0982 | 0.0087 | 0.0982 | 0.0972    | 19.0    |
+| No log        | 3.0   | 3    | 8.2125          | 0.0982 | 0.0087 | 0.0982 | 0.0972    | 19.0    |
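
The Step column in the two results tables can be sanity-checked against the batch sizes with a little arithmetic. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these settings are visible in this diff. The training-split size of 14,732 is the commonly cited size of the samsum train split and is likewise an assumption, not something this diff states.

```python
import math

# Assumption: size of the samsum training split (not stated in the diff).
ASSUMED_SAMSUM_TRAIN_SIZE = 14732


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps per epoch, keeping the last partial batch."""
    return math.ceil(num_examples / batch_size)


def example_count_range(steps: int, batch_size: int) -> tuple[int, int]:
    """Smallest and largest dataset size that yields `steps` steps per epoch."""
    return ((steps - 1) * batch_size + 1, steps * batch_size)


# Old run: train_batch_size 16, 921 steps per epoch, 2763 steps after 3 epochs.
old_steps = steps_per_epoch(ASSUMED_SAMSUM_TRAIN_SIZE, 16)
print(old_steps, old_steps * 3)

# New run: train_batch_size 8 and only 1 step per epoch.
print(example_count_range(1, 8))
```

Under these assumptions the old run's 921 steps per epoch match a pass over the full training split, while the new run's single step per epoch would correspond to at most 8 training examples. That would be consistent with the "No log" training loss and the unchanged ROUGE scores across epochs, though the diff itself does not say how the training data was selected.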
 ### Framework versions

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6e8035478314ba52484b522e80c3561c8ad2f6e4f8dafd4308df32c873d40b80
 size 242041896

 version https://git-lfs.github.com/spec/v1
+oid sha256:7729ea41f89b067af4f76ce84bd87cfcd2ee1a2fe06460ef9721fa3657428e46
 size 242041896

tokenizer.json CHANGED

@@ -1,19 +1,7 @@
 {
   "version": "1.0",
-  "truncation": {
-    "direction": "Right",
-    "max_length": 128,
-    "strategy": "LongestFirst",
-    "stride": 0
-  },
-  "padding": {
-    "strategy": "BatchLongest",
-    "direction": "Right",
-    "pad_to_multiple_of": null,
-    "pad_id": 0,
-    "pad_type_id": 0,
-    "pad_token": "<pad>"
-  },
   "added_tokens": [
     {
       "id": 0,

 {
   "version": "1.0",
+  "truncation": null,
+  "padding": null,
   "added_tokens": [
     {
       "id": 0,

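The removed `truncation` and `padding` blocks describe defaults that were saved into the tokenizer file: cut each sequence to 128 tokens from the right, then pad every sequence on the right with `pad_id` 0 up to the longest sequence in the batch. With both set to `null`, the file no longer carries those defaults, so a caller would need to request truncation or padding explicitly. The following is a plain-Python illustration of that behavior for single sequences, not the `tokenizers` library implementation; the function name `truncate_and_pad` is hypothetical.

```python
def truncate_and_pad(batch, max_length=128, pad_id=0):
    """Mimic right truncation to max_length, then BatchLongest right padding.

    Illustrative only: works on lists of token ids and ignores special
    tokens, sequence pairs, and stride.
    """
    # Right truncation: keep the first max_length ids of each sequence.
    truncated = [ids[:max_length] for ids in batch]
    # BatchLongest: pad up to the longest sequence after truncation.
    longest = max((len(ids) for ids in truncated), default=0)
    return [ids + [pad_id] * (longest - len(ids)) for ids in truncated]


# Small demonstration with max_length=2 so the effect is visible.
print(truncate_and_pad([[5, 6, 7], [8]], max_length=2))
```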
training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d7808f354d86b88bbd30c88270101dc85d549508ce5b6881a8e4d8796c5ddd98
-size 5240

 version https://git-lfs.github.com/spec/v1
+oid sha256:18011ce513982a71ecbb17e0a85a7cf23c5ebb82a01571dfbdf52a5e20236775
+size 5304
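
The `model.safetensors` and `training_args.bin` entries above are Git LFS pointer files rather than the binaries themselves: each records a spec version, a SHA-256 object id, and a byte size. A minimal sketch of reading two such pointers and comparing them is shown below; the helper name `parse_lfs_pointer` is hypothetical, and the pointer text is copied from the `training_args.bin` diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:d7808f354d86b88bbd30c88270101dc85d549508ce5b6881a8e4d8796c5ddd98
size 5240
""")
new = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:18011ce513982a71ecbb17e0a85a7cf23c5ebb82a01571dfbdf52a5e20236775
size 5304
""")

# A changed digest means the stored object changed; sizes may or may not differ.
print(old["digest"] != new["digest"], new["size"] - old["size"])
```

For `model.safetensors` the size is 242041896 bytes on both sides while the digest differs, which is what one would expect when weights change but the architecture and dtype do not.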