---
license: mit
datasets:
- Wojood
tags:
- Named Entity Recognition
- Arabic NER
- Nested NER
language:
- ar
metrics:
- f1
- precision
- recall
pipeline_tag: token-classification
---

## Wojood - Nested/Flat Arabic NER Models
Wojood is a corpus for Arabic nested Named Entity Recognition (NER). Nested entities occur when one entity mention is embedded inside another entity mention. The corpus contains 550K tokens of Modern Standard Arabic (MSA) and dialectal text. This repo contains the source code to train Wojood nested NER models.
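
To make the nested/flat distinction concrete, here is a minimal sketch that represents entity mentions as token spans and finds the pairs in which one mention sits inside another. The `Span` type, the `nested_pairs` helper, the example sentence, and the tag names are all illustrative assumptions; they are not the Wojood annotation format or part of this repo's code.

```python
from typing import List, NamedTuple, Tuple


class Span(NamedTuple):
    start: int   # index of the first token (inclusive)
    end: int     # index after the last token (exclusive)
    label: str   # entity tag (illustrative, not the Wojood tag set)


def nested_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return (outer, inner) pairs where inner lies strictly inside outer."""
    pairs = []
    for outer in spans:
        for inner in spans:
            contained = outer.start <= inner.start and inner.end <= outer.end
            shorter = (inner.end - inner.start) < (outer.end - outer.start)
            if contained and shorter:
                pairs.append((outer, inner))
    return pairs


# Hypothetical sentence: an organization name that contains a place name.
tokens = ["Birzeit", "University", "opened", "today"]
spans = [Span(0, 2, "ORG"), Span(0, 1, "GPE")]

pairs = nested_pairs(spans)
for outer, inner in pairs:
    outer_text = " ".join(tokens[outer.start:outer.end])
    inner_text = " ".join(tokens[inner.start:inner.end])
    print(f"{inner.label} '{inner_text}' is nested in {outer.label} '{outer_text}'")
```

A flat scheme would keep only one of these two mentions (typically the outer one), whereas a nested scheme keeps both.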

### Online Demo
You can try our model using the demo link below:

https://sina.birzeit.edu/wojood/

Related links:
* Paper (arXiv): https://arxiv.org/abs/2205.09651
* AraBERTv2 model on Hugging Face: https://huggingface.co/aubmindlab/bert-base-arabertv2/tree/main

### Models
* Nested NER (main branch), with a micro-F1 score of 0.909551
* Flat NER (flat branch), with a micro-F1 score of 0.883847
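
For readers unfamiliar with the metric, the sketch below shows the standard definition of micro-averaged precision, recall, and F1: true positives, false positives, and false negatives are summed over all entity types before the scores are computed. The counts and tag names are made up for illustration, and the exact evaluation script behind the scores above is not shown here, so treat this as the textbook formula and not as the authors' code.

```python
from typing import Dict, Tuple


def micro_prf(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Compute micro precision, recall, and F1 from per-type (tp, fp, fn) counts."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts per entity type: (true positives, false positives, false negatives)
example_counts = {
    "PERS": (8, 2, 1),
    "ORG": (4, 1, 3),
}

precision, recall, f1 = micro_prf(example_counts)
print(f"precision={precision:.4f} recall={recall:.4f} micro-F1={f1:.4f}")
```

Because the counts are pooled, frequent entity types weigh more heavily in the micro-F1 score than rare ones.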

### Google Colab Notebooks
You can train and evaluate the models using our Google Colab notebooks:
* Train flat NER: https://gist.github.com/mohammedkhalilia/72c3261734d7715094089bdf4de74b4a
* Evaluate your data using the flat NER model: https://gist.github.com/mohammedkhalilia/c807eb1ccb15416b187c32a362001665
* Train nested NER: https://gist.github.com/mohammedkhalilia/a4d83d4e43682d1efcdf299d41beb3da
* Evaluate your data using the nested NER model: https://gist.github.com/mohammedkhalilia/9134510aa2684464f57de7934c97138b