This is bert-base-uncased fine-tuned on the RTE dataset via knowledge distillation, using a fine-tuned bert-large-uncased as the teacher model, with torchdistill on Google Colab.
The training configuration (including hyperparameters) is available here.
I submitted prediction files to the GLUE leaderboard; the overall GLUE score was 78.9.
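For reference, a minimal inference sketch with the `transformers` library is shown below. The model id `"bert-base-uncased-rte"` and the label order are placeholder assumptions, not confirmed by this card; check the actual Hub repository name and its `config.json` before use.

```python
# Minimal inference sketch for the distilled RTE model.
# Assumptions: the model id and the label order below are placeholders --
# verify them against the actual Hub repository and its config.json.

RTE_LABELS = ["entailment", "not_entailment"]  # assumed GLUE RTE label order

def label_from_id(label_id: int) -> str:
    """Map an argmax label id to its RTE string label."""
    return RTE_LABELS[label_id]

def classify(premise: str, hypothesis: str,
             model_id: str = "bert-base-uncased-rte"):  # hypothetical id
    """Classify a premise/hypothesis pair with the fine-tuned model."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    # RTE is a sentence-pair task: tokenize premise and hypothesis together.
    inputs = tokenizer(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return label_from_id(int(logits.argmax(dim=-1)))
```

`classify` is only defined, not called, so the sketch itself does not download any weights; calling it fetches the checkpoint from the Hub on first use.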
