EmergentMethods/gliner_large_news-v2.1

thorntwig committed on May 20

Commit 343d7d6 • 1 Parent(s): 0aadb39

Update README.md

Files changed (1)

README.md +2 -2

@@ -7,7 +7,7 @@ datasets:
 This model is a fine-tune of [GLiNER](https://huggingface.co/urchade/gliner_large-v2.1) aimed at improving accuracy across a broad range of topics, especially with respect to long-context news entity extraction. As shown in the table below, these fine-tunes improved upon the base GLiNER model zero-shot accuracy by up to 7.5% across 18 benchmark datasets.
-![results table](assets/zero-shot_20_table.png)
+![results table](assets/zero-shot_18_table.png)
 The underlying dataset, [AskNews-NER-v0](https://huggingface.co/datasets/EmergentMethods/AskNews-NER-v0) was engineered with the objective of diversifying global perspectives by enforcing country/language/topic/temporal diversity. All data used to fine-tune this model was synthetically generated. WizardLM 13B v2.0 was used for translation/summarization of open-web news articles, while Llama3 70b instruct was used for entity extraction. Both the diversification and fine-tuning methods are presented in a [pre-print submitted to NeurIps2024](https://linktoarxiv.org).
@@ -68,7 +68,7 @@ Topics:
 - **Funded by:** [Emergent Methods](https://www.emergentmethods.ai/)
 - **Shared by:** [Emergent Methods](https://www.emergentmethods.ai/)
 - **Model type:** microsoft/deberta
-- **Language(s) (NLP):** English (en), and translated French, Spanish, German, Swedish, Italian, Arabic, Chinese, Norwegian, Danish, Dutch, Russian, Ukranian
+- **Language(s) (NLP):** English (en) (English texts and translations from Spanish (es), Portuguese (pt), German (de), Russian (ru), French (fr), Arabic (ar), Italian (it), Ukrainian (uk), Norwegian (no), Swedish (sv), Danish (da)).
 - **License:** Apache 2.0
 - **Finetuned from model:** [GLiNER](https://huggingface.co/urchade/gliner_large-v2.1)