Update README.md
README.md
CHANGED
@@ -7,7 +7,7 @@ datasets:

 This model is a fine-tune of [GLiNER](https://huggingface.co/urchade/gliner_large-v2.1) aimed at improving accuracy across a broad range of topics, especially with respect to long-context news entity extraction. As shown in the table below, these fine-tunes improved upon the base GLiNER model's zero-shot accuracy by up to 7.5% across 18 benchmark datasets.

-![results table](assets/zero-
+![results table](assets/zero-shot_18_table.png)

 The underlying dataset, [AskNews-NER-v0](https://huggingface.co/datasets/EmergentMethods/AskNews-NER-v0), was engineered with the objective of diversifying global perspectives by enforcing country/language/topic/temporal diversity. All data used to fine-tune this model was synthetically generated. WizardLM 13B v2.0 was used for translation/summarization of open-web news articles, while Llama3 70b instruct was used for entity extraction. Both the diversification and fine-tuning methods are presented in a [pre-print submitted to NeurIPS 2024](https://linktoarxiv.org).

@@ -68,7 +68,7 @@ Topics:
 - **Funded by:** [Emergent Methods](https://www.emergentmethods.ai/)
 - **Shared by:** [Emergent Methods](https://www.emergentmethods.ai/)
 - **Model type:** microsoft/deberta
-- **Language(s) (NLP):** English (en)
+- **Language(s) (NLP):** English (en) (English texts and translations from Spanish (es), Portuguese (pt), German (de), Russian (ru), French (fr), Arabic (ar), Italian (it), Ukrainian (uk), Norwegian (no), Swedish (sv), Danish (da)).
 - **License:** Apache 2.0
 - **Finetuned from model:** [GLiNER](https://huggingface.co/urchade/gliner_large-v2.1)

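
For context on what the README describes, here is a minimal sketch of zero-shot entity extraction with the `gliner` Python package. It rests on several assumptions: the fine-tuned model's repository ID does not appear in this diff, so the base checkpoint `urchade/gliner_large-v2.1` linked in the README stands in for it; the helper names `extract_entities` and `group_by_label` are hypothetical; and the sample predictions are hand-written to show the output shape, not real model output.

```python
from collections import defaultdict


def extract_entities(text, labels, model_id="urchade/gliner_large-v2.1", threshold=0.5):
    """Run zero-shot NER with a GLiNER checkpoint (assumes `pip install gliner`).

    model_id defaults to the base model linked in the README; swap in the
    fine-tuned checkpoint's ID once you know it.
    """
    # Deferred import so group_by_label works without the package installed.
    from gliner import GLiNER

    model = GLiNER.from_pretrained(model_id)
    # Each prediction is a dict with "text", "label", "start", "end", "score".
    return model.predict_entities(text, labels, threshold=threshold)


def group_by_label(entities, min_score=0.0):
    """Group entity strings by label, skipping low-score spans and repeats."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score and ent["text"] not in grouped[ent["label"]]:
            grouped[ent["label"]].append(ent["text"])
    return dict(grouped)


# Hand-written sample in the shape of GLiNER predictions (illustrative only).
sample = [
    {"text": "Emergent Methods", "label": "organization", "start": 0, "end": 16, "score": 0.93},
    {"text": "Oslo", "label": "location", "start": 30, "end": 34, "score": 0.81},
    {"text": "Emergent Methods", "label": "organization", "start": 50, "end": 66, "score": 0.90},
    {"text": "news", "label": "organization", "start": 70, "end": 74, "score": 0.12},
]
print(group_by_label(sample, min_score=0.5))
```

Calling `extract_entities(article_text, ["person", "organization", "location"])` downloads the checkpoint on first use. The label set is free-form, so labels suited to news text can be passed without retraining.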