tyzhu committed on
Commit
e472186
1 Parent(s): 5767912

Model save

Files changed (1)
  1. README.md +32 -30
README.md CHANGED
@@ -6,6 +6,7 @@ metrics:
 model-index:
 - name: lmind_nq_train6000_eval6489_v1_reciteonly_qa_v3__home_aiops_zhuty_lm_indexer_data_tyzhu_
   results: []
+library_name: peft
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -15,8 +16,8 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model was trained from scratch on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Accuracy: 0.7597
-- Loss: 0.8055
+- Loss: 1.0955
+- Accuracy: 0.7196
 
 ## Model description
 
@@ -36,12 +37,12 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
-- train_batch_size: 2
+- train_batch_size: 1
 - eval_batch_size: 2
 - seed: 42
 - distributed_type: multi-GPU
 - num_devices: 4
-- gradient_accumulation_steps: 4
+- gradient_accumulation_steps: 8
 - total_train_batch_size: 32
 - total_eval_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
@@ -51,33 +52,34 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-| Training Loss | Epoch | Step | Accuracy | Validation Loss |
-|:-------------:|:-----:|:----:|:--------:|:---------------:|
-| 0.5411 | 1.0 | 187 | 0.7904 | 0.3938 |
-| 0.362 | 2.0 | 375 | 0.7918 | 0.3804 |
-| 0.3047 | 3.0 | 562 | 0.7891 | 0.3934 |
-| 0.2469 | 4.0 | 750 | 0.7846 | 0.4226 |
-| 0.2022 | 5.0 | 937 | 0.7803 | 0.4661 |
-| 0.1681 | 6.0 | 1125 | 0.7761 | 0.5123 |
-| 0.1404 | 7.0 | 1312 | 0.7721 | 0.5731 |
-| 0.1197 | 8.0 | 1500 | 0.7701 | 0.6075 |
-| 0.1 | 9.0 | 1687 | 0.7688 | 0.6317 |
-| 0.089 | 10.0 | 1875 | 0.7664 | 0.6718 |
-| 0.0837 | 11.0 | 2062 | 0.7653 | 0.6922 |
-| 0.0788 | 12.0 | 2250 | 0.7632 | 0.7254 |
-| 0.0761 | 13.0 | 2437 | 0.7629 | 0.7256 |
-| 0.0749 | 14.0 | 2625 | 0.7621 | 0.7534 |
-| 0.0741 | 15.0 | 2812 | 0.7620 | 0.7529 |
-| 0.0726 | 16.0 | 3000 | 0.7611 | 0.7678 |
-| 0.0687 | 17.0 | 3187 | 0.7610 | 0.7728 |
-| 0.0682 | 18.0 | 3375 | 0.7603 | 0.7807 |
-| 0.0682 | 19.0 | 3562 | 0.7610 | 0.7872 |
-| 0.0682 | 19.95 | 3740 | 0.7597 | 0.8055 |
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-------:|:----:|:---------------:|:--------:|
+| 0.7054 | 0.9973 | 187 | 0.5535 | 0.7686 |
+| 0.4975 | 2.0 | 375 | 0.5416 | 0.7693 |
+| 0.422 | 2.9973 | 562 | 0.5611 | 0.7645 |
+| 0.3527 | 4.0 | 750 | 0.6100 | 0.7573 |
+| 0.2941 | 4.9973 | 937 | 0.6599 | 0.7522 |
+| 0.2518 | 6.0 | 1125 | 0.7200 | 0.7458 |
+| 0.2138 | 6.9973 | 1312 | 0.7651 | 0.7421 |
+| 0.1824 | 8.0 | 1500 | 0.8280 | 0.7379 |
+| 0.1481 | 8.9973 | 1687 | 0.8700 | 0.7355 |
+| 0.1298 | 10.0 | 1875 | 0.9146 | 0.7329 |
+| 0.1167 | 10.9973 | 2062 | 0.9337 | 0.7309 |
+| 0.1094 | 12.0 | 2250 | 0.9733 | 0.7281 |
+| 0.1052 | 12.9973 | 2437 | 0.9980 | 0.7266 |
+| 0.1007 | 14.0 | 2625 | 1.0022 | 0.7256 |
+| 0.0971 | 14.9973 | 2812 | 1.0422 | 0.7234 |
+| 0.0954 | 16.0 | 3000 | 1.0441 | 0.7236 |
+| 0.0888 | 16.9973 | 3187 | 1.0574 | 0.7223 |
+| 0.0879 | 18.0 | 3375 | 1.0728 | 0.7216 |
+| 0.0879 | 18.9973 | 3562 | 1.0768 | 0.7200 |
+| 0.0883 | 19.9467 | 3740 | 1.0955 | 0.7196 |
 
 
 ### Framework versions
 
-- Transformers 4.34.0
-- Pytorch 2.1.0+cu121
-- Datasets 2.18.0
-- Tokenizers 0.14.1
+- PEFT 0.5.0
+- Transformers 4.40.2
+- Pytorch 2.3.0
+- Datasets 2.19.1
+- Tokenizers 0.19.1
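
The updated card marks this checkpoint as a PEFT adapter (`library_name: peft`, PEFT 0.5.0) and halves the per-device batch size while doubling gradient accumulation, so the effective batch size is unchanged: 1 × 8 × 4 devices = 32. Below is a minimal sketch of how such an adapter could be loaded, assuming the repo contains a PEFT `adapter_config.json`; the repo id is a placeholder, since the full model name is truncated in the card.

```python
# Minimal sketch: load a PEFT adapter on top of its recorded base model.
# The repo id below is a hypothetical placeholder, not the exact card name.
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

adapter_id = "tyzhu/<full-adapter-repo-name>"  # placeholder

# The adapter config records which base model the adapter was trained on.
peft_config = PeftConfig.from_pretrained(adapter_id)
base = AutoModelForCausalLM.from_pretrained(peft_config.base_model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(peft_config.base_model_name_or_path)

# Attach the adapter weights to the base model for inference.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```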