NibiruTwin committed on
Commit 7d2e768 · verified · 1 Parent(s): cfff9c3

Update README.md

Files changed (1)
  1. README.md +46 -6
README.md CHANGED
@@ -69,19 +69,59 @@ https://huggingface.co/NibiruTwin/llm-jp-3-13b-c_it
  We used the dataset named above.
 
  Even in an A100 environment, running DPO either ran out of GPU memory or hit a limit on the number of epochs, so with
 
  ```
- use_dataset = train_dataset.select(range(100))
-
- ```
 
- we narrowed the dataset down to its first 100 examples, and with
 
  ```
- num_train_epochs = 3,
 
  ```
- we trained.
 
  As a result, the answers to the ōgiri prompts came out as follows.
 
 
  We used the dataset named above.
 
  Even in an A100 environment, running DPO either ran out of GPU memory or hit a limit on the number of epochs, so with
+ num_train_epochs = 3,
 
  ```
+ we trained.
 
 
  ```
+ ==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
+    \\   /|    Num examples = 100 | Num Epochs = 3
+ O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
+ \        /    Total batch size = 8 | Total steps = 36
+  "-____-"     Number of trainable parameters = 125,173,760
+ [36/36 01:54, Epoch 2/3]
+ Step  Training Loss  rewards/chosen  rewards/rejected  rewards/accuracies  rewards/margins  logps/rejected  logps/chosen  logits/rejected  logits/chosen
+ 1   0.000100  3.348554  -6.467402  1.000000  9.815956  -169.229355  -175.684265  0.611595  0.875623
+ 2   0.000100  2.975397  -6.360792  1.000000  9.336189  -154.576660  -196.990601  0.632885  0.986017
+ 3   0.000100  4.033119  -4.941322  1.000000  8.974442  -127.297821  -175.932999  0.575199  1.004188
+ 4   0.000200  3.079573  -5.701199  1.000000  8.780772  -139.620758  -173.067078  -0.431688  0.508375
+ 5   0.000200  3.642621  -5.261364  1.000000  8.903986  -121.130615  -171.747650  0.855356  0.792505
+ 6   0.000300  3.081389  -5.276991  1.000000  8.358380  -131.040268  -180.695892  1.087221  1.099403
+ 7   0.000400  4.341475  -4.463219  1.000000  8.804693  -115.383461  -138.774704  -0.299891  0.583815
+ 8   0.001200  2.155223  -5.431589  1.000000  7.586812  -133.833633  -142.437195  0.526511  0.799039
+ 9   0.000500  2.844069  -4.996197  1.000000  7.840266  -150.136200  -176.394653  0.631835  0.720139
+ 10  0.001800  3.158137  -3.853688  1.000000  7.011826  -108.524597  -141.101532  0.738414  0.989277
+ 11  0.001300  3.399171  -3.538917  1.000000  6.938087  -78.884750  -146.172821  0.182737  0.624548
+ 12  0.003200  2.742315  -4.005011  1.000000  6.747325  -95.816872  -137.083588  -0.016122  0.276348
+ 13  0.000300  2.219271  -6.403042  1.000000  8.622313  -166.142792  -178.155762  0.178710  0.761686
+ 14  0.000300  2.699187  -6.124379  1.000000  8.823566  -131.572098  -140.794952  0.002769  0.447839
+ 15  0.000500  4.734462  -4.763044  1.000000  9.497507  -111.550545  -181.736755  1.172358  1.424275
+ 16  0.000100  3.982580  -5.619477  1.000000  9.602057  -129.902466  -197.776779  0.071804  0.588056
+ 17  0.000100  4.331498  -6.222175  1.000000  10.553673  -165.588058  -164.591766  1.094546  0.889692
+ 18  0.000100  4.991781  -4.481319  1.000000  9.473101  -91.877243  -150.005219  -0.047461  0.751593
+ 19  0.000200  3.501364  -6.373612  1.000000  9.874977  -140.134232  -170.552658  0.189669  0.601683
+ 20  0.000200  3.605657  -5.074142  1.000000  8.679799  -117.568741  -153.246170  -0.309885  0.501098
+ 21  0.000200  3.203712  -6.348371  1.000000  9.552082  -156.897690  -186.776581  1.007442  1.271394
+ 22  0.000200  3.929119  -5.364758  1.000000  9.293877  -112.621918  -93.523651  -0.448841  0.339729
+ 23  0.000200  4.845633  -4.518156  1.000000  9.363789  -96.659676  -147.822693  -0.233996  0.634066
+ 24  0.000200  3.045211  -6.681721  1.000000  9.726932  -144.385818  -143.605927  0.440999  0.664074
+ 25  0.000200  2.850045  -6.654698  1.000000  9.504744  -190.274475  -173.323746  1.471910  0.817418
+ 26  0.000100  3.326446  -6.145639  1.000000  9.472086  -139.847061  -194.209137  1.316685  1.462867
+ 27  0.000100  3.676937  -6.083375  1.000000  9.760312  -129.533386  -160.582367  -0.027238  0.892122
+ 28  0.000100  4.144113  -5.807096  1.000000  9.951208  -145.089432  -207.662384  1.121619  1.289729
+ 29  0.000100  3.373916  -6.345547  1.000000  9.719462  -183.863174  -159.793610  0.931905  0.620221
+ 30  0.000200  3.944859  -5.516920  1.000000  9.461779  -128.763199  -124.433006  0.235643  0.282284
+ 31  0.000200  3.264518  -6.059732  1.000000  9.324250  -132.112762  -134.414032  0.035966  0.557026
+ 32  0.000100  3.494095  -6.447097  1.000000  9.941193  -130.592957  -129.188766  -0.017757  0.388424
+ 33  0.000200  3.858253  -5.999524  1.000000  9.857778  -144.017731  -133.201035  0.412320  0.247053
+ 34  0.000200  4.195073  -5.941508  1.000000  10.136581  -144.858795  -162.841766  -0.020949  0.564567
+ 35  0.000100  5.392914  -4.364581  1.000000  9.757494  -97.780762  -143.621002  0.270843  0.839165
+ 36  0.000100  2.788383  -7.393952  1.000000  10.182335  -154.236618  -184.300690  0.392709  0.757870
+ TrainOutput(global_step=36, training_loss=0.00038064550871139445, metrics={'train_runtime': 118.2651, 'train_samples_per_second': 2.537, 'train_steps_per_second': 0.304, 'total_flos': 0.0, 'train_loss': 0.00038064550871139445, 'epoch': 2.88})
 
  ```
 
  As a result, the answers to the ōgiri prompts came out as follows.
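
For reference, below is a minimal sketch of a DPO run consistent with the settings in this README: the 100-example subset and num_train_epochs = 3, plus the per-device batch size 2 and gradient accumulation 4 reported in the Unsloth banner. The banner shows the actual run used Unsloth; this sketch uses plain trl instead, and the base-model id, dataset name, and output_dir are illustrative assumptions, not values from this commit.

```python
# Hypothetical reconstruction, not the exact script behind this commit.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "llm-jp/llm-jp-3-13b"  # assumed base model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Preference data with "prompt"/"chosen"/"rejected" columns (assumed name).
train_dataset = load_dataset("your-org/preference-dataset", split="train")

# Narrow the data to the first 100 examples, as in the README,
# so that DPO fits into A100 GPU memory.
use_dataset = train_dataset.select(range(100))

config = DPOConfig(
    output_dir="dpo-output",        # assumed
    num_train_epochs=3,             # from the README
    per_device_train_batch_size=2,  # "Batch size per device = 2" in the log
    gradient_accumulation_steps=4,  # "Gradient Accumulation steps = 4" in the log
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=use_dataset,
    processing_class=tokenizer,  # recent trl versions take the tokenizer here
)
result = trainer.train()  # returns a TrainOutput like the one shown above
print(result)
```

With 100 examples and an effective batch size of 8 (2 × 4), an epoch is 12 full optimizer steps, which matches the log's 36 total steps over 3 epochs.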