astrollama-3-8b-chat_aic / train_results.json
First model version
642bd85
232 Bytes
{
    "epoch": 3.0,
    "total_flos": 247329718272000.0,
    "train_loss": 0.7211369104044778,
    "train_runtime": 32331.6664,
    "train_samples": 25188,
    "train_samples_per_second": 2.337,
    "train_steps_per_second": 0.049
}
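
The reported throughput can be cross-checked against the other fields. The sketch below is not part of the repository. It assumes that `train_runtime` is in seconds and that `train_samples_per_second` was computed as `train_samples * epoch / train_runtime`, which matches the figures above but is not stated in the file. The helper name `derived_metrics` is hypothetical, and the step count is only approximate because `train_steps_per_second` is rounded to three decimals.

```python
import json

# Values copied verbatim from train_results.json above.
RESULTS_JSON = """
{
    "epoch": 3.0,
    "total_flos": 247329718272000.0,
    "train_loss": 0.7211369104044778,
    "train_runtime": 32331.6664,
    "train_samples": 25188,
    "train_samples_per_second": 2.337,
    "train_steps_per_second": 0.049
}
"""


def derived_metrics(results):
    """Recompute a few quantities from the logged fields.

    Assumption (not stated in the file): train_runtime is in seconds and
    throughput counts every sample once per epoch.
    """
    samples_seen = results["train_samples"] * results["epoch"]
    runtime = results["train_runtime"]
    return {
        "samples_seen": samples_seen,
        "samples_per_second": samples_seen / runtime,
        "runtime_hours": runtime / 3600.0,
        # Approximate: the logged steps/second is rounded to 3 decimals.
        "approx_steps": results["train_steps_per_second"] * runtime,
    }


results = json.loads(RESULTS_JSON)
metrics = derived_metrics(results)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the recomputed samples per second rounds to the logged 2.337, and the run lasted just under nine hours.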