epoch,model,accuracy,precision,recall,f1
0,THUDM/glm-4-9b-chat-1m,0.395,0.667648445898918,0.395,0.4583896948511759
1,THUDM/glm-4-9b-chat-1m_checkpoint-175,0.5946666666666667,0.7056249544340455,0.5946666666666667,0.631524021431916
2,THUDM/glm-4-9b-chat-1m_checkpoint-350,0.549,0.7006542472262678,0.549,0.595639891731556
3,THUDM/glm-4-9b-chat-1m_checkpoint-525,0.5986666666666667,0.7150511133873729,0.5986666666666667,0.6253567774596996
4,THUDM/glm-4-9b-chat-1m_checkpoint-700,0.5843333333333334,0.730089967300272,0.5843333333333334,0.6195784049291421