The baseline_example folder provides a simple baseline implementation along with the evaluation logic for reference.
Methodology: chatglm3_6B scores each question-answer pair independently (pointwise evaluation) on a 5-level scale.
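As a rough illustration of pointwise 5-level scoring, the sketch below builds one prompt per question-answer pair and extracts a score from the model's reply. The prompt wording and the names `PROMPT_TEMPLATE`, `build_prompt`, and `parse_score` are assumptions made for illustration and are not taken from the baseline code; the call to chatglm3_6B itself is omitted.

```python
import re
from typing import Optional

# Hypothetical prompt template; the baseline's actual wording may differ.
PROMPT_TEMPLATE = (
    "Rate the quality of the answer to the question on a scale of 1 to 5, "
    "where 1 is worst and 5 is best. Reply with the number only.\n"
    "Question: {question}\n"
    "Answer: {answer}\n"
    "Score:"
)


def build_prompt(question: str, answer: str) -> str:
    """Fill the template for a single question-answer pair (pointwise)."""
    return PROMPT_TEMPLATE.format(question=question, answer=answer)


def parse_score(reply: str) -> Optional[int]:
    """Return the first standalone digit 1-5 in the reply, or None if absent."""
    match = re.search(r"(?<!\d)[1-5](?!\d)", reply)
    return int(match.group()) if match else None
```

Returning `None` for replies without a usable score lets the caller decide whether to retry, skip, or assign a default, rather than silently guessing a level.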
**baseline3.py** stores the model's evaluation results in output/baseline1_chatglm3_6B.txt.
**eval.py** computes the evaluation metrics by comparing the model's evaluation results against the human annotation results.
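The exact metrics are not listed here. As a hedged sketch, two common ways to compare model scores with human scores are exact-match accuracy and Pearson correlation. The function names and the choice of metrics below are assumptions and may not match what eval.py actually computes.

```python
from math import sqrt
from typing import Sequence


def accuracy(pred: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of items where the model score equals the human score."""
    if len(pred) != len(gold) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == g for p, g in zip(pred, gold)) / len(pred)


def pearson(pred: Sequence[float], gold: Sequence[float]) -> float:
    """Pearson correlation between model scores and human scores."""
    if len(pred) != len(gold) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_p = sum(pred) / len(pred)
    mean_g = sum(gold) / len(gold)
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_g = sum((g - mean_g) ** 2 for g in gold)
    if var_p == 0 or var_g == 0:
        # Correlation is undefined for constant scores; 0.0 is a convention here.
        return 0.0
    return cov / sqrt(var_p * var_g)
```

Accuracy rewards only exact agreement on the 5-level scale, while correlation also credits scores that track the human ranking without matching it exactly, so reporting both gives a fuller picture.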
The human annotation results are temporarily hidden due to testing requirements.