logical-reasoning / results /mgtv-glm-4-9b_p1_full_metrics.csv
completed Llama3-8b analysis
f74c0d2
epoch,model,accuracy,precision,recall,f1
0,THUDM/glm-4-9b-chat-1m,0.581,0.7030063507392827,0.581,0.6169152550060966
1,THUDM/glm-4-9b-chat-1m_checkpoint-175,0.465,0.4626983424145981,0.47806716929403703,0.45282323637311034
2,THUDM/glm-4-9b-chat-1m_checkpoint-350,0.579,0.677205249618572,0.579,0.607768567422897
3,THUDM/glm-4-9b-chat-1m_checkpoint-525,0.6053333333333333,0.7220227211816735,0.6053333333333333,0.6379065537566594
4,THUDM/glm-4-9b-chat-1m_checkpoint-700,0.593,0.7202870096146446,0.593,0.631178867641037
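
The rows above can be compared programmatically. The following is a minimal sketch, not part of the original repository: it embeds the same five rows in a string, parses them with Python's standard `csv` module, and picks the row with the highest value for a chosen metric. The names `METRICS_CSV`, `load_rows`, and `best_by` are hypothetical helpers introduced only for illustration.

```python
import csv
import io

# Rows copied verbatim from the metrics file above.
# METRICS_CSV is an illustrative name, not something defined in the repository.
METRICS_CSV = """\
epoch,model,accuracy,precision,recall,f1
0,THUDM/glm-4-9b-chat-1m,0.581,0.7030063507392827,0.581,0.6169152550060966
1,THUDM/glm-4-9b-chat-1m_checkpoint-175,0.465,0.4626983424145981,0.47806716929403703,0.45282323637311034
2,THUDM/glm-4-9b-chat-1m_checkpoint-350,0.579,0.677205249618572,0.579,0.607768567422897
3,THUDM/glm-4-9b-chat-1m_checkpoint-525,0.6053333333333333,0.7220227211816735,0.6053333333333333,0.6379065537566594
4,THUDM/glm-4-9b-chat-1m_checkpoint-700,0.593,0.7202870096146446,0.593,0.631178867641037
"""

METRIC_COLUMNS = ("accuracy", "precision", "recall", "f1")


def load_rows(text):
    """Parse the CSV text into dicts with numeric metric values."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        parsed = {"epoch": int(row["epoch"]), "model": row["model"]}
        for column in METRIC_COLUMNS:
            parsed[column] = float(row[column])
        rows.append(parsed)
    return rows


def best_by(rows, metric):
    """Return the row with the highest value for the given metric column."""
    return max(rows, key=lambda r: r[metric])


rows = load_rows(METRICS_CSV)
best = best_by(rows, "f1")
print(best["epoch"], best["model"], round(best["f1"], 4))
# prints: 3 THUDM/glm-4-9b-chat-1m_checkpoint-525 0.6379
```

On these rows, the same checkpoint (epoch 3) also has the highest accuracy, precision, and recall, so the choice of metric does not change the selection here.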