---
language: en
tags:
- bert
- sst2
- glue
- torchdistill
license: apache-2.0
datasets:
- sst2
metrics:
- accuracy
---

`bert-large-uncased` fine-tuned on the SST-2 dataset using [***torchdistill***](https://github.com/yoshitomo-matsubara/torchdistill) and [Google Colab](https://colab.research.google.com/github/yoshitomo-matsubara/torchdistill/blob/master/demo/glue_finetuning_and_submission.ipynb).  
The hyperparameters are the same as those in Hugging Face's example and/or the BERT paper, and the training configuration (including hyperparameters) is available [here](https://github.com/yoshitomo-matsubara/torchdistill/blob/main/configs/sample/glue/sst2/ce/bert_large_uncased.yaml).  
I submitted prediction files to [the GLUE leaderboard](https://gluebenchmark.com/leaderboard), and the overall GLUE score was **80.2**.
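
As a rough usage sketch (not part of the original training setup), the checkpoint can probably be loaded with the `transformers` auto classes like any other sequence classification model. The repository ID below is a hypothetical placeholder, since this card does not state it, and the label order (0 = negative, 1 = positive) is assumed from the GLUE SST-2 convention rather than read from the model config. The helper names `logits_to_labels`, `accuracy`, and `classify` are illustrative only.

```python
from typing import List, Sequence

# Assumption: GLUE SST-2 label convention (0 = negative, 1 = positive).
SST2_LABELS = ("negative", "positive")

# Hypothetical placeholder; replace with this model's actual Hub repository ID.
MODEL_ID = "your-namespace/bert-large-uncased-sst2"


def logits_to_labels(logits: Sequence[Sequence[float]]) -> List[str]:
    """Map each row of two-class logits to the label with the larger score."""
    return [SST2_LABELS[max(range(len(row)), key=row.__getitem__)] for row in logits]


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that match the references (the metric listed in this card)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def classify(sentences: List[str], model_id: str = MODEL_ID) -> List[str]:
    """Download the checkpoint and classify sentences (requires torch and transformers)."""
    # Imported lazily so the helpers above work without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits_to_labels(logits.tolist())
```

Calling `classify(["a charming and often affecting journey"])` downloads the weights on first use, so it only works once `MODEL_ID` points at the real repository. The two pure-Python helpers can be used on their own to score saved logits against SST-2 dev-set labels.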

Yoshitomo Matsubara: **"torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP"** at *EMNLP 2023 Workshop for Natural Language Processing Open Source Software (NLP-OSS)*

[[OpenReview](https://openreview.net/forum?id=A5Axeeu1Bo)] [[Preprint](https://arxiv.org/abs/2310.17644)]  
```bibtex
@article{matsubara2023torchdistill,
  title={{torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP}},
  author={Matsubara, Yoshitomo},
  journal={arXiv preprint arXiv:2310.17644},
  year={2023}
}
```