kornellewy committed on
Commit 440eb11 · verified · 1 Parent(s): 80cbddc

End of training

README.md CHANGED
@@ -15,8 +15,6 @@ should probably proofread and complete it, then remove this comment. -->
  # t5-large-finetuned-lora
 
  This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: nan
 
  ## Model description
 
@@ -37,21 +35,17 @@ More information needed
  The following hyperparameters were used during training:
  - learning_rate: 0.0001
  - train_batch_size: 1
- - eval_batch_size: 1
+ - eval_batch_size: 16
  - seed: 42
+ - gradient_accumulation_steps: 32
+ - total_train_batch_size: 32
  - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
- - training_steps: 1000
+ - num_epochs: 50
  - mixed_precision_training: Native AMP
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:----:|:---------------:|
- | 0.0 | 1.0 | 295 | nan |
- | 0.0 | 2.0 | 590 | nan |
- | 0.0 | 3.0 | 885 | nan |
- | 0.0 | 3.3898 | 1000 | nan |
 
 
  ### Framework versions
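For orientation, the updated hyperparameters map naturally onto transformers' `Seq2SeqTrainingArguments`. The following is a minimal sketch reconstructed from the README values above, not code taken from this repo; the `output_dir` is a hypothetical placeholder:

```python
# Sketch reconstructed from the README hyperparameters; not code from this repo.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-large-finetuned-lora",  # hypothetical placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=32,  # total_train_batch_size = 1 * 32 = 32
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,  # "Native AMP" mixed precision
)
```

The gradient_accumulation_steps addition also accounts for the new total_train_batch_size line: with a per-device batch of 1, accumulating gradients over 32 steps yields an effective batch of 32.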
adapter_config.json CHANGED
@@ -1,6 +1,9 @@
  {
    "alpha_pattern": {},
-   "auto_mapping": null,
+   "auto_mapping": {
+     "base_model_class": "T5ForConditionalGeneration",
+     "parent_library": "transformers.models.t5.modeling_t5"
+   },
    "base_model_name_or_path": "google/flan-t5-large",
    "bias": "none",
    "fan_in_fan_out": false,
@@ -22,10 +25,10 @@
    "target_modules": [
      "q",
      "o",
-     "v",
-     "k"
+     "k",
+     "v"
    ],
-   "task_type": "SEQ_2_SEQ_LM",
+   "task_type": null,
    "use_dora": false,
    "use_rslora": false
  }
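The reordered target modules and the null task_type are the substantive changes here. Below is a minimal peft sketch consistent with the updated adapter_config.json; `r` and `lora_alpha` are not visible in this hunk, so those values are illustrative placeholders:

```python
# Sketch consistent with the updated adapter_config.json; r and lora_alpha
# are placeholders (not shown in this diff hunk).
from peft import LoraConfig, get_peft_model
from transformers import T5ForConditionalGeneration

base = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large")
config = LoraConfig(
    r=8,                                  # placeholder rank
    lora_alpha=16,                        # placeholder scaling
    bias="none",
    target_modules=["q", "o", "k", "v"],  # T5 attention projections
    task_type=None,                       # null, as in the new config
)
model = get_peft_model(base, config)
```

With task_type left unset, peft wraps the model as a generic PeftModel and records the base model class and its module path on save, which is presumably where the new auto_mapping block comes from.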
runs/Nov23_13-11-26_ab82850555d3/events.out.tfevents.1732367488.ab82850555d3.387.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dee2de4dc2fe4082411103d886c74f9f4835d1ec57070b19e2d969baffe60aed
+ size 5209
runs/Nov23_13-11-55_ab82850555d3/events.out.tfevents.1732367519.ab82850555d3.387.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d1ef41435e48a59b250012bd61eee21c2d92acaf7d6d39726dc7e04c558f0ec3
+ size 17278
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a91dfa8ac0cc0f6bbe9e012432269dc557ad2471b78795c12cc37cef5982d787
+ oid sha256:c71a8bd68dc8b7808a874fb824404acecb320e8d7c9de0fb194365b6f0888348
  size 5304