Original issue here: https://github.com/huggingface/transformers/issues/17885
Hello! I originally posted this on the forums, but there doesn't seem to be much foot traffic there, so I'm hoping to get more visibility here.
I'm trying to replicate the RoBERTa-base GLUE results reported in the model card. The numbers there appear to have been copied directly from the paper. Has anyone actually managed to match them with run_glue.py? If so, what trainer configuration was used?
If I follow the original fine-tuning configs from fairseq, I am unable to match the reported numbers for RTE, CoLA, STS-B, and MRPC.
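For concreteness, here is the kind of invocation I've been trying for RTE, translating the fairseq hyperparameters (learning rate 2e-5, batch size 16, ~6% warmup) into run_glue.py flags. Note this is my own translation of the fairseq config, not an official recipe, and the epoch count is an approximation of fairseq's fixed update budget:

```shell
# Approximate translation of the fairseq RTE config to run_glue.py.
# Hyperparameter values are taken from the fairseq GLUE fine-tuning docs;
# --num_train_epochs is an estimate, since fairseq specifies total updates.
python run_glue.py \
  --model_name_or_path roberta-base \
  --task_name rte \
  --do_train \
  --do_eval \
  --max_seq_length 512 \
  --per_device_train_batch_size 16 \
  --learning_rate 2e-5 \
  --warmup_ratio 0.06 \
  --weight_decay 0.1 \
  --num_train_epochs 10 \
  --seed 42 \
  --output_dir ./rte-roberta-base
```

Even with sweeps over seeds and learning rates around these values, my RTE/CoLA/STS-B/MRPC numbers come out below the model card.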
Any pointers would be much appreciated, thanks!