---
license: mit
---

| Step | Training Loss |
|------|---------------|
| 1 | 2.888900 |
| 2 | 2.837100 |
| 3 | 2.635100 |
| 4 | 2.492900 |
| 5 | 2.323500 |
| 6 | 2.192500 |
| 7 | 2.019600 |
| 8 | 1.975600 |
| 9 | 1.957800 |
| 10 | 1.857500 |
| 11 | 1.869300 |
| 12 | 1.649600 |
| 13 | 1.460900 |
| 14 | 1.401600 |
| 15 | 1.352100 |
| 16 | 1.311700 |
| 17 | 1.302700 |
| 18 | 1.079800 |
| 19 | 1.008900 |
| 20 | 0.965100 |

```
TrainOutput(global_step=20, training_loss=1.8291159331798554, metrics={'train_runtime': 327.2521, 'train_samples_per_second': 7.823, 'train_steps_per_second': 0.061, 'total_flos': 4549261115523072.0, 'train_loss': 1.8291159331798554, 'epoch': 3.48})
```
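
As a sanity check on these numbers, the reported `training_loss` looks like the plain average of the 20 per-step losses in the table. The sketch below recomputes that average, assuming one loss was logged per optimizer step and that the table values are rounded. The names `step_losses`, `reported_loss`, and `mean_loss` are illustrative only and do not come from the training script.

```python
# Per-step training losses, copied from the table above.
step_losses = [
    2.8889, 2.8371, 2.6351, 2.4929, 2.3235,
    2.1925, 2.0196, 1.9756, 1.9578, 1.8575,
    1.8693, 1.6496, 1.4609, 1.4016, 1.3521,
    1.3117, 1.3027, 1.0798, 1.0089, 0.9651,
]

# Values copied from the TrainOutput record.
reported_loss = 1.8291159331798554
global_step = 20
train_runtime = 327.2521  # seconds

# Average of the logged losses; should be close to reported_loss
# (not identical, because the table values are rounded).
mean_loss = sum(step_losses) / len(step_losses)

# Steps per second implied by the step count and runtime.
steps_per_second = global_step / train_runtime

print(f"mean of logged losses: {mean_loss:.4f}")
print(f"reported training_loss: {reported_loss:.4f}")
print(f"implied steps/second: {steps_per_second:.3f}")
```

If the two loss values disagree by more than rounding error in another run, a likely cause is that losses were logged less often than every step, in which case the table would no longer cover every step.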