---
annotations_creators:
- machine-generated
language_creators:
- machine-generated
widget:
- text: George Washington went to Washington.
- text: What is the seventh tallest mountain in North America?
tags:
- named-entity-recognition
- sequence-tagger-model
datasets:
- Babelscape/cner
language:
- en
pretty_name: cner-model
source_datasets:
- original
task_categories:
- structure-prediction
task_ids:
- named-entity-recognition
---

# CNER: Concept and Named Entity Recognition

This is the model card for the NAACL 2024 paper [CNER: Concept and Named Entity Recognition](https://aclanthology.org/2024.naacl-long.461/).

We fine-tuned a language model (DeBERTa-v3-base) for 1 epoch on our [CNER dataset](https://huggingface.co/datasets/Babelscape/cner) using the default hyperparameters, optimizer and architecture of Hugging Face; the results of this model may therefore differ from those reported in the paper.

The resulting CNER model jointly identifies and classifies concepts and named entities with fine-grained tags.

**If you use the model, please reference this work in your paper**:

```bibtex
@inproceedings{martinelli-etal-2024-cner,
    title = "{CNER}: Concept and Named Entity Recognition",
    author = "Martinelli, Giuliano and
      Molfese, Francesco and
      Tedeschi, Simone and
      Fern{\'a}ndez-Castro, Alberte and
      Navigli, Roberto",
    editor = "Duh, Kevin and
      Gomez, Helena and
      Bethard, Steven",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
    month = jun,
    year = "2024",
    address = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.naacl-long.461",
    pages = "8329--8344",
}
```

The original repository for the paper can be found at [https://github.com/Babelscape/cner](https://github.com/Babelscape/cner).

## How to use

You can use this model with the Transformers NER *pipeline*.

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Load the fine-tuned CNER tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("Babelscape/cner-model")
model = AutoModelForTokenClassification.from_pretrained("Babelscape/cner-model")

# aggregation_strategy="simple" merges word pieces into whole spans;
# it replaces the deprecated grouped_entities=True
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

example = "What is the seventh tallest mountain in North America?"
ner_results = nlp(example)
print(ner_results)
```

## Classes

[Figure: CNER classes]

## Licensing Information

Contents of this repository are restricted to non-commercial research purposes only, under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright of the dataset contents and models belongs to the original copyright holders. `microsoft/deberta-v3-base` is released under the [MIT license](https://choosealicense.com/licenses/mit/).
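
## Post-processing the pipeline output

The usage example in "How to use" prints a list of aggregated spans, each a dictionary with the keys `entity_group`, `score`, `word`, `start` and `end`. The sketch below shows one possible way to group those spans by label. The helper `group_spans_by_label`, its `min_score` parameter, and the sample data are hypothetical and not part of the model or the paper; in particular, `LABEL_A` and `LABEL_B` are placeholders, not tags from the CNER tag set, and the scores are made up.

```python
from collections import defaultdict


def group_spans_by_label(ner_results, text, min_score=0.0):
    """Group aggregated NER spans by label (hypothetical helper).

    Spans scoring below `min_score` are dropped. The surface form is
    sliced from `text` using the character offsets of each span.
    """
    grouped = defaultdict(list)
    for span in ner_results:
        if span["score"] < min_score:
            continue
        grouped[span["entity_group"]].append(text[span["start"]:span["end"]])
    return dict(grouped)


def make_span(text, surface, label, score):
    """Build a span dict in the shape the aggregated pipeline returns."""
    start = text.index(surface)
    return {
        "entity_group": label,
        "score": score,
        "word": surface,
        "start": start,
        "end": start + len(surface),
    }


example = "What is the seventh tallest mountain in North America?"

# Hand-written stand-in for `nlp(example)`: placeholder labels, made-up scores
sample_results = [
    make_span(example, "mountain", "LABEL_A", 0.97),
    make_span(example, "North America", "LABEL_B", 0.99),
]

print(group_spans_by_label(sample_results, example))
```

With a real model, `sample_results` would be replaced by the `ner_results` list returned by the pipeline, and `min_score` could be raised to discard low-confidence spans.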