Accuracy results seem to be wrong
#2 by ayazdan - opened
Thanks for the model. The accuracy I measure is 79.42, slightly lower than the reported result. I am using the following script to evaluate:
python3 transformer-sparsity/examples/pytorch/text-classification/run_glue.py \
--model_name_or_path ${ckpt} \
--task_name "rte" \
--do_eval \
--max_seq_length 512 \
--per_device_eval_batch_size 32 \
--evaluation_strategy steps \
--logging_steps ${eval_steps} \
--logging_strategy steps \
--overwrite_output_dir \
--output_dir ${ckpt_path} 2>&1 | tee ~/${ckpt_path}/finetune_run_$(date +"%Y_%m_%d_%I_%M_%p").log
AutoEvaluator, provided by Hugging Face, gives 0.791, which means that your results are actually slightly better.
The results that I posted were auto-generated by Trainer. I am assuming the discrepancy can be attributed to the per_device_eval_batch_size argument: by default, Trainer would discard examples that cannot fill a full batch.
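A minimal sketch of how that could happen: if the evaluation loop drops the final partial batch, the examples in that batch never enter the accuracy denominator, so the metric shifts. The numbers below are illustrative only (not the actual model's predictions); the dataset size of 277 matches the GLUE RTE validation set, which does not divide evenly by a batch size of 32.

```python
# Sketch: dropping the last partial batch changes measured accuracy.
# The correctness flags here are made up for illustration.

def accuracy(correct_flags):
    return sum(correct_flags) / len(correct_flags)

def evaluated_flags(flags, batch_size, drop_last):
    # Split into batches; optionally discard a trailing partial batch,
    # mimicking a dataloader with drop_last=True.
    batches = [flags[i:i + batch_size] for i in range(0, len(flags), batch_size)]
    if drop_last and len(batches[-1]) < batch_size:
        batches = batches[:-1]
    return [f for b in batches for f in b]

# 277 examples with batch size 32: 8 full batches cover 256 examples,
# leaving a partial batch of 21 that drop_last would discard.
flags = [1] * 219 + [0] * 58  # 219/277 correct, roughly 0.79

full = accuracy(evaluated_flags(flags, 32, drop_last=False))
trunc = accuracy(evaluated_flags(flags, 32, drop_last=True))
print(round(full, 4), round(trunc, 4))  # the two runs disagree
```

With all 277 examples the accuracy is 219/277 ≈ 0.7906; after the last 21 examples are dropped it becomes 219/256 ≈ 0.8555, so the same predictions yield two different scores depending on batching.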
JeremiahZ changed discussion status to closed