---
license: mit
---

## NER-PMR-large
NER-PMR-large is initialized from [PMR-large](https://huggingface.co/DAMO-NLP-SG/PMR-large) and further fine-tuned on the training sets of four NER datasets: [CoNLL](https://huggingface.co/datasets/conll2003), [WNUT17](https://huggingface.co/datasets/wnut_17), [ACE2004](https://paperswithcode.com/sota/nested-named-entity-recognition-on-ace-2004), and [ACE2005](https://paperswithcode.com/sota/nested-named-entity-recognition-on-ace-2005).

The model's performance on the test sets is:

| Model | CoNLL | WNUT17 | ACE2004 | ACE2005 |
|--|--|--|--|--|
| RoBERTa-large (single-task model) | 92.8 | 57.1 | 86.3 | 87.0 |
| PMR-large (single-task model) | 93.6 | 60.8 | 87.5 | 87.4 |
| NER-PMR-large (multi-task model) | 92.9 | 54.7 | 87.8 | 88.4 |

Note that the RoBERTa-large and PMR-large results come from single-task fine-tuning, i.e. a separate model is fine-tuned for each dataset, whereas NER-PMR-large is a single model fine-tuned on all four datasets jointly.
Because it is fine-tuned on multiple datasets, we believe that NER-PMR-large generalizes better to other NER tasks than PMR-large and RoBERTa-large.
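
To make the comparison concrete, the per-dataset differences can be computed directly from the table. The sketch below only restates the reported scores; the `scores` dictionary and the `delta` helper are illustrative names, not part of the PMR code base.

```python
# Test-set scores copied from the table above.
scores = {
    "RoBERTa-large": {"CoNLL": 92.8, "WNUT17": 57.1, "ACE2004": 86.3, "ACE2005": 87.0},
    "PMR-large": {"CoNLL": 93.6, "WNUT17": 60.8, "ACE2004": 87.5, "ACE2005": 87.4},
    "NER-PMR-large": {"CoNLL": 92.9, "WNUT17": 54.7, "ACE2004": 87.8, "ACE2005": 88.4},
}


def delta(model_a: str, model_b: str) -> dict:
    """Return score(model_a) - score(model_b) per dataset, rounded to one decimal."""
    return {
        dataset: round(scores[model_a][dataset] - scores[model_b][dataset], 1)
        for dataset in scores[model_a]
    }


# Compare the multi-task model against each single-task baseline.
for baseline in ("PMR-large", "RoBERTa-large"):
    print(f"NER-PMR-large vs {baseline}:", delta("NER-PMR-large", baseline))
```

On these numbers, the multi-task model is ahead of single-task PMR-large on ACE2004 and ACE2005 and behind it on CoNLL and WNUT17, so the in-domain scores alone do not settle which model is preferable for a given dataset.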


### How to use
You can use the code in [this repo](https://github.com/DAMO-NLP-SG/PMR/NER) for both training and inference.
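
If you want a local copy of the checkpoint to pass to those scripts, one possible approach is to fetch it with `huggingface_hub`. This is a minimal sketch, not the repo's documented workflow: the repository id `DAMO-NLP-SG/NER-PMR-large` is assumed from the model name, and `download_ner_pmr` is a hypothetical helper. How the downloaded directory is passed to the training or inference scripts depends on the repo's own arguments.

```python
def download_ner_pmr(repo_id: str = "DAMO-NLP-SG/NER-PMR-large") -> str:
    """Download all files of the model repository and return the local directory.

    The default repo_id is an assumption based on the model name; adjust it
    if the checkpoint is hosted under a different id.
    """
    # Imported lazily so the helper can be defined without the dependency
    # installed (pip install huggingface_hub).
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


# Example (requires network access):
# local_dir = download_ner_pmr()
# print(local_dir)
```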


### BibTeX entry and citation info
```bibtex
@article{xu2022clozing,
  title={From Clozing to Comprehending: Retrofitting Pre-trained Language Model to Pre-trained Machine Reader},
  author={Xu, Weiwen and Li, Xin and Zhang, Wenxuan and Zhou, Meng and Bing, Lidong and Lam, Wai and Si, Luo},
  journal={arXiv preprint arXiv:2212.04755},
  year={2022}
}
```