NibiruTwin committed
Commit 41b673b (1 parent: 7d2e768)

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -81,6 +81,9 @@ num_train_epochs = 3,
  O^O/ \_/ \ Batch size per device = 2 | Gradient Accumulation steps = 4
  \ / Total batch size = 8 | Total steps = 36
  "-____-" Number of trainable parameters = 125,173,760
+
+
+ ```python:
  [36/36 01:54, Epoch 2/3]
  Step Training Loss rewards / chosen rewards / rejected rewards / accuracies rewards / margins logps / rejected logps / chosen logits / rejected logits / chosen
  1 0.000100 3.348554 -6.467402 1.000000 9.815956 -169.229355 -175.684265 0.611595 0.875623
@@ -120,7 +123,6 @@ Step Training Loss rewards / chosen rewards / rejected rewards / accuracies rewa
  35 0.000100 5.392914 -4.364581 1.000000 9.757494 -97.780762 -143.621002 0.270843 0.839165
  36 0.000100 2.788383 -7.393952 1.000000 10.182335 -154.236618 -184.300690 0.392709 0.757870
  TrainOutput(global_step=36, training_loss=0.00038064550871139445, metrics={'train_runtime': 118.2651, 'train_samples_per_second': 2.537, 'train_steps_per_second': 0.304, 'total_flos': 0.0, 'train_loss': 0.00038064550871139445, 'epoch': 2.88})
-
  ```
  
  As a result, the model's answers to the ōgiri (comedy prompt) questions were as follows.
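
The figures in the training log above are consistent with one another. The sketch below recomputes a few of them as a sanity check. It assumes a single training device (the log does not state the device count) and that `rewards / margins` is the chosen reward minus the rejected reward. The variable and function names are illustrative and do not come from the README.

```python
# Values copied from the training log shown in the diff.
per_device_batch = 2      # "Batch size per device = 2"
grad_accum_steps = 4      # "Gradient Accumulation steps = 4"
total_steps = 36          # "Total steps = 36"
num_epochs = 3            # "num_train_epochs = 3"

# Assumption: one device, so the effective batch is batch size x accumulation steps.
num_devices = 1
effective_batch = per_device_batch * grad_accum_steps * num_devices

# Optimizer steps per epoch implied by the totals above.
steps_per_epoch = total_steps // num_epochs


def reward_margin(chosen: float, rejected: float) -> float:
    """Margin between the chosen and rejected rewards (assumed definition)."""
    return chosen - rejected


# (chosen, rejected, logged margin) for steps 1, 35 and 36.
logged_rows = [
    (3.348554, -6.467402, 9.815956),
    (5.392914, -4.364581, 9.757494),
    (2.788383, -7.393952, 10.182335),
]

# Largest gap between the recomputed and the logged margin.
max_margin_error = max(
    abs(reward_margin(chosen, rejected) - margin)
    for chosen, rejected, margin in logged_rows
)

print(effective_batch, steps_per_epoch, max_margin_error)
```

The recomputed margins agree with the logged ones to within the rounding of the printed values, and the effective batch size matches the `Total batch size = 8` line.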