|
# LIMIT-BERT |
|
|
|
Code and model for the *EMNLP 2020 Findings* paper: |
|
|
|
[LIMIT-BERT: Linguistic Informed Multi-task BERT](https://arxiv.org/abs/1910.14296)
|
|
|
## Contents |
|
|
|
1. [Requirements](#Requirements) |
|
2. [Training](#Training) |
|
|
|
## Requirements |
|
|
|
* Python 3.6 or higher. |
|
* Cython 0.25.2 or any compatible version. |
|
* [PyTorch](http://pytorch.org/) 1.0.0+. |
|
* [EVALB](http://nlp.cs.nyu.edu/evalb/). Before starting, run `make` inside the `EVALB/` directory to compile an `evalb` executable. This will be called from Python for evaluation. |
|
* [pytorch-transformers](https://github.com/huggingface/pytorch-transformers) 1.0.0+ or any compatible version.
|
|
|
#### Pre-trained Models (PyTorch) |
|
The following pre-trained model is available for download from Google Drive:
|
* [`LIMIT-BERT`](https://drive.google.com/open?id=1fm0cK2A91iLG3lCpwowCCQSALnWS2X4i): |
|
PyTorch version, trained with the same settings as BERT-Large-WWM; load the model with [pytorch-transformers](https://github.com/huggingface/pytorch-transformers) (see the loading sketch below).
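
If you use the Google Drive checkpoint rather than the Hugging Face hub, a minimal loading sketch with pytorch-transformers might look like the following. The local directory name `./limitbert/` is an assumption about where the downloaded archive is unpacked; it should contain the config, vocab, and weight files.

```
# Loading sketch with pytorch-transformers; `./limitbert/` is a placeholder
# for the directory the downloaded checkpoint was unpacked into.
from pytorch_transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("./limitbert/")
model = BertModel.from_pretrained("./limitbert/")
model.eval()
```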
|
|
|
## How to use |
|
|
|
```
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("cooelf/limitbert")
model = AutoModel.from_pretrained("cooelf/limitbert")
```
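
Once loaded, the encoder can be used like any standard BERT model. Below is a minimal sanity-check sketch; the example sentence and the inspection of `last_hidden_state` are illustrative, not part of the original instructions.

```
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("cooelf/limitbert")
model = AutoModel.from_pretrained("cooelf/limitbert")
model.eval()

# Encode a single sentence and run it through the encoder.
inputs = tokenizer("LIMIT-BERT is trained on multiple linguistic tasks.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Token-level representations: (batch_size, sequence_length, hidden_size)
print(outputs.last_hidden_state.shape)
```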
|
|
|
Please see our original repo for the training scripts: https://github.com/cooelf/LIMIT-BERT
|
|
|
## Training |
|
|
|
To train LIMIT-BERT, simply run: |
|
```
sh run_limitbert.sh
```
|
### Evaluation Instructions |
|
|
|
To test, first set the model path, then run:

```
sh test_bert.sh
```
|
|
|
## Citation |
|
|
|
```
@article{zhou2019limit,
  title={{LIMIT-BERT}: Linguistic informed multi-task {BERT}},
  author={Zhou, Junru and Zhang, Zhuosheng and Zhao, Hai},
  journal={arXiv preprint arXiv:1910.14296},
  year={2019}
}
```