tyzhu committed
Commit c21f053
1 Parent(s): fc8b17f

Model save

Files changed (1): README.md +30 -28
README.md CHANGED
@@ -6,6 +6,7 @@ metrics:
 model-index:
 - name: lmind_nq_train6000_eval6489_v1_recite_qa_v3__home_aiops_zhuty_lm_indexer_data_tyzhu_lmin
   results: []
+library_name: peft
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -15,8 +16,8 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model was trained from scratch on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3984
-- Accuracy: 0.8068
+- Loss: 0.4657
+- Accuracy: 0.7995
 
 ## Model description
 
@@ -36,12 +37,12 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
-- train_batch_size: 2
+- train_batch_size: 1
 - eval_batch_size: 2
 - seed: 42
 - distributed_type: multi-GPU
 - num_devices: 4
-- gradient_accumulation_steps: 4
+- gradient_accumulation_steps: 8
 - total_train_batch_size: 32
 - total_eval_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
@@ -53,31 +54,32 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step  | Validation Loss | Accuracy |
 |:-------------:|:-----:|:-----:|:---------------:|:--------:|
-| 0.2952        | 1.0   | 529   | 0.3721          | 0.7925   |
-| 0.2481        | 2.0   | 1058  | 0.3231          | 0.8003   |
-| 0.1935        | 3.0   | 1587  | 0.2963          | 0.8044   |
-| 0.1593        | 4.0   | 2116  | 0.2872          | 0.8062   |
-| 0.1405        | 5.0   | 2645  | 0.2908          | 0.8067   |
-| 0.1235        | 6.0   | 3174  | 0.2929          | 0.8072   |
-| 0.1154        | 7.0   | 3703  | 0.3109          | 0.8071   |
-| 0.106         | 8.0   | 4232  | 0.3179          | 0.8069   |
-| 0.0997        | 9.0   | 4761  | 0.3339          | 0.8071   |
-| 0.095         | 10.0  | 5290  | 0.3424          | 0.8067   |
-| 0.0922        | 11.0  | 5819  | 0.3516          | 0.8066   |
-| 0.089         | 12.0  | 6348  | 0.3720          | 0.8063   |
-| 0.0862        | 13.0  | 6877  | 0.3740          | 0.8065   |
-| 0.0862        | 14.0  | 7406  | 0.3681          | 0.8070   |
-| 0.0852        | 15.0  | 7935  | 0.3771          | 0.8067   |
-| 0.0849        | 16.0  | 8464  | 0.3814          | 0.8066   |
-| 0.083         | 17.0  | 8993  | 0.3799          | 0.8065   |
-| 0.0838        | 18.0  | 9522  | 0.3887          | 0.8068   |
-| 0.0834        | 19.0  | 10051 | 0.3909          | 0.8067   |
-| 0.0818        | 20.0  | 10580 | 0.3984          | 0.8068   |
+| 0.4253        | 1.0   | 529   | 0.5042          | 0.7770   |
+| 0.3444        | 2.0   | 1058  | 0.4371          | 0.7875   |
+| 0.2679        | 3.0   | 1587  | 0.3925          | 0.7946   |
+| 0.2195        | 4.0   | 2116  | 0.3709          | 0.7977   |
+| 0.1889        | 5.0   | 2645  | 0.3616          | 0.7998   |
+| 0.1724        | 6.0   | 3174  | 0.3608          | 0.8002   |
+| 0.1573        | 7.0   | 3703  | 0.3646          | 0.8006   |
+| 0.144         | 8.0   | 4232  | 0.3774          | 0.8000   |
+| 0.1353        | 9.0   | 4761  | 0.3889          | 0.8000   |
+| 0.1281        | 10.0  | 5290  | 0.3975          | 0.8000   |
+| 0.124         | 11.0  | 5819  | 0.4108          | 0.7998   |
+| 0.1169        | 12.0  | 6348  | 0.4183          | 0.8001   |
+| 0.1128        | 13.0  | 6877  | 0.4249          | 0.7997   |
+| 0.1108        | 14.0  | 7406  | 0.4259          | 0.8004   |
+| 0.1078        | 15.0  | 7935  | 0.4435          | 0.7994   |
+| 0.1065        | 16.0  | 8464  | 0.4421          | 0.7999   |
+| 0.104         | 17.0  | 8993  | 0.4450          | 0.7998   |
+| 0.103         | 18.0  | 9522  | 0.4554          | 0.7995   |
+| 0.1033        | 19.0  | 10051 | 0.4556          | 0.7997   |
+| 0.1041        | 20.0  | 10580 | 0.4657          | 0.7995   |
 
 
 ### Framework versions
 
-- Transformers 4.34.0
-- Pytorch 2.1.0+cu121
-- Datasets 2.18.0
-- Tokenizers 0.14.1
+- PEFT 0.5.0
+- Transformers 4.40.2
+- Pytorch 2.3.0
+- Datasets 2.19.1
+- Tokenizers 0.19.1
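
Note on the batch-geometry change in this commit: the per-device train_batch_size drops from 2 to 1 while gradient_accumulation_steps doubles from 4 to 8, so the effective total_train_batch_size stays at 32 across the 4 GPUs, matching what both versions of the card report. A minimal sketch of that arithmetic (the helper function is illustrative, not from the training code):

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum: int) -> int:
    """Effective train batch size = per-device batch x device count x accumulation steps."""
    return per_device * num_devices * grad_accum

# Before this commit: train_batch_size=2, num_devices=4, gradient_accumulation_steps=4
before = effective_batch_size(per_device=2, num_devices=4, grad_accum=4)
# After this commit: train_batch_size=1, num_devices=4, gradient_accumulation_steps=8
after = effective_batch_size(per_device=1, num_devices=4, grad_accum=8)

assert before == after == 32  # total_train_batch_size reported in both cards
print(before, after)  # 32 32
```

Because the effective batch size and step counts per epoch (529 steps/epoch) are unchanged, the metric differences between the two runs are attributable to the other changes (PEFT adapter training and updated framework versions) rather than to optimization batch size.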