End of training
README.md CHANGED
@@ -15,8 +15,6 @@ should probably proofread and complete it, then remove this comment. -->
 # t5-large-finetuned-lora
 
 This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: nan
 
 ## Model description
 
@@ -37,21 +35,17 @@
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
 - train_batch_size: 1
-- eval_batch_size:
+- eval_batch_size: 16
 - seed: 42
+- gradient_accumulation_steps: 32
+- total_train_batch_size: 32
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-
+- num_epochs: 50
 - mixed_precision_training: Native AMP
 
 ### Training results
 
-| Training Loss | Epoch  | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 0.0           | 1.0    | 295  | nan             |
-| 0.0           | 2.0    | 590  | nan             |
-| 0.0           | 3.0    | 885  | nan             |
-| 0.0           | 3.3898 | 1000 | nan             |
 
 
 ### Framework versions
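The diff drops the earlier results block (validation loss was `nan` at every checkpoint) and fills out the hyperparameter list. The new list maps onto a `transformers.TrainingArguments` object roughly as below; a minimal sketch, where the `output_dir` name is an assumption and every other value comes from the list above. Note that `total_train_batch_size: 32` is derived rather than set directly: with `train_batch_size: 1` and `gradient_accumulation_steps: 32`, the effective batch is 1 × 32 = 32.

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed in the updated README.
args = TrainingArguments(
    output_dir="t5-large-finetuned-lora",  # assumed name, not in the diff
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=32,        # 1 x 32 = total batch of 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,                             # "Native AMP" mixed precision
)
```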
adapter_config.json CHANGED
@@ -1,6 +1,9 @@
 {
   "alpha_pattern": {},
-  "auto_mapping":
+  "auto_mapping": {
+    "base_model_class": "T5ForConditionalGeneration",
+    "parent_library": "transformers.models.t5.modeling_t5"
+  },
   "base_model_name_or_path": "google/flan-t5-large",
   "bias": "none",
   "fan_in_fan_out": false,
@@ -22,10 +25,10 @@
   "target_modules": [
     "q",
     "o",
-    "
-    "
+    "k",
+    "v"
   ],
-  "task_type":
+  "task_type": null,
   "use_dora": false,
   "use_rslora": false
 }
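This change does two things: LoRA now targets all four T5 attention projections (`q`, `k`, `v`, `o`, the query/key/value/output linear layers in `T5Attention`), and `task_type` is `null`, which is why PEFT records the wrapped class under `auto_mapping` so the adapter can be reloaded without a task wrapper. A hedged sketch of a matching `peft.LoraConfig`; `r` and `lora_alpha` are placeholders, since the hunks above do not show them:

```python
from peft import LoraConfig, get_peft_model
from transformers import T5ForConditionalGeneration

base = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large")

config = LoraConfig(
    r=8,                                  # assumed rank, not shown in the diff
    lora_alpha=16,                        # assumed scaling, not shown either
    target_modules=["q", "o", "k", "v"],  # all four T5 attention projections
    bias="none",
    task_type=None,  # no task wrapper; PEFT stores auto_mapping instead
)
model = get_peft_model(base, config)
model.print_trainable_parameters()
```

With no `task_type`, the `auto_mapping` block (`base_model_class` / `parent_library`) is what lets `peft.AutoPeftModel.from_pretrained` recover the right base class later.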
runs/Nov23_13-11-26_ab82850555d3/events.out.tfevents.1732367488.ab82850555d3.387.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dee2de4dc2fe4082411103d886c74f9f4835d1ec57070b19e2d969baffe60aed
+size 5209

runs/Nov23_13-11-55_ab82850555d3/events.out.tfevents.1732367519.ab82850555d3.387.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d1ef41435e48a59b250012bd61eee21c2d92acaf7d6d39726dc7e04c558f0ec3
+size 17278
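The two `events.out.tfevents.*` files are TensorBoard run logs for this training (stored through Git LFS, hence the pointer contents). They can be read outside TensorBoard with the event-accumulator API; a small sketch, where the scalar tag name is an assumption since it depends on what the `Trainer` logged:

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Point at the run directory holding one of the event files added above.
ea = EventAccumulator("runs/Nov23_13-11-55_ab82850555d3")
ea.Reload()

print(ea.Tags()["scalars"])            # lists whatever scalar tags were logged
for event in ea.Scalars("train/loss"): # tag name assumed, check the list above
    print(event.step, event.value)
```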
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:c71a8bd68dc8b7808a874fb824404acecb320e8d7c9de0fb194365b6f0888348
 size 5304
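An LFS pointer carries only three fields: the spec `version`, the `oid` (a SHA-256 digest of the real blob), and its `size` in bytes, so this hunk just swaps which serialized `TrainingArguments` blob the pointer resolves to. A sketch of checking a downloaded file against its pointer:

```python
import hashlib
import os

def matches_lfs_pointer(path: str, expected_oid: str, expected_size: int) -> bool:
    """Compare a local file against the oid/size fields of an LFS pointer."""
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_oid

print(matches_lfs_pointer(
    "training_args.bin",
    "c71a8bd68dc8b7808a874fb824404acecb320e8d7c9de0fb194365b6f0888348",
    5304,
))
```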