epoch,model,accuracy,precision,recall,f1
0,internlm/internlm2_5-7b-chat-1m,0.766,0.7479690198649127,0.7875257025359835,0.7649220492304646
1,internlm/internlm2_5-7b-chat-1m_checkpoint-175,0.812,0.8122861942516547,0.812,0.8102342544894316
2,internlm/internlm2_5-7b-chat-1m_checkpoint-350,0.7653333333333333,0.8068892149662973,0.7653333333333333,0.7799982606366916
3,internlm/internlm2_5-7b-chat-1m_checkpoint-525,0.7476666666666667,0.8120325497709814,0.7476666666666667,0.7731222076608317
4,internlm/internlm2_5-7b-chat-1m_checkpoint-700,0.717,0.8046420022590015,0.717,0.7510339687376877