
bert-large-uncased fine-tuned on the SST-2 dataset using torchdistill and Google Colab.
The hyperparameters follow Hugging Face's example and/or the BERT paper, and the training configuration (including hyperparameters) is available here.
I submitted prediction files to the GLUE leaderboard, and the overall GLUE score was 80.2.
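
As a minimal sketch of how this checkpoint might be used for inference, the snippet below loads it through the generic `transformers` auto classes and classifies one sentence. Several things here are assumptions rather than part of this card: that the checkpoint loads with `AutoModelForSequenceClassification`, that its outputs follow the GLUE SST-2 label convention (0 = negative, 1 = positive), and the helper names `softmax`, `describe`, and `classify`, which are hypothetical.

```python
import math

# Repository name as shown on this model page.
MODEL_ID = "yoshitomo-matsubara/bert-large-uncased-sst2"

# Assumption: the checkpoint follows the GLUE SST-2 label convention.
SST2_LABELS = {0: "negative", 1: "positive"}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return SST2_LABELS[idx], probs[idx]


def classify(text, model_id=MODEL_ID):
    """Download the checkpoint (needs network access) and classify one sentence."""
    # Imported lazily so the helpers above work without torch/transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return describe(logits)
```

Calling `classify("a charming and often affecting journey")` would return a `(label, probability)` pair; if the label mapping looks inverted, checking `model.config.id2label` on the loaded model is a reasonable first step.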

Yoshitomo Matsubara: "torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP" at EMNLP 2023 Workshop for Natural Language Processing Open Source Software (NLP-OSS)

[Paper] [OpenReview] [Preprint]

@inproceedings{matsubara2023torchdistill,
  title={{torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP}},
  author={Matsubara, Yoshitomo},
  booktitle={Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)},
  publisher={Empirical Methods in Natural Language Processing},
  pages={153--164},
  year={2023}
}

Model size: 335M params (Safetensors; tensor types: I64, F32)