Update README.md
README.md
CHANGED
@@ -8,16 +8,6 @@ widget:
|
|
8 |
- text: "George Washington est allé à Washington"
|
9 |
---
|
10 |
|
11 |
-
**People Involved**
|
12 |
-
|
13 |
-
* LABRAK Yanis (1)
|
14 |
-
* DUFOUR Richard (2)
|
15 |
-
|
16 |
-
**Affiliations**
|
17 |
-
|
18 |
-
1. LIA, Avignon University, Avignon, France.
|
19 |
-
2. LS2N, Nantes University, Nantes, France.
|
20 |
-
|
21 |
# POET: A French Extended Part-of-Speech Tagger
|
22 |
|
23 |
- Corpus: [UD_FRENCH_TREEBANKS](https://universaldependencies.org/treebanks/fr_gsd/index.html)
|
@@ -26,6 +16,16 @@ widget:
|
|
26 |
- Additional: [LSTM-CRF](https://arxiv.org/abs/1011.4088)
|
27 |
- Number of epochs: 115
|
28 |
|
29 |
## Demo: How to use in Flair
|
30 |
|
31 |
Requires [Flair](https://pypi.org/project/flair/): ```pip install flair```
|
@@ -58,13 +58,15 @@ Originally, the corpora consists of 400,399 words (16,341 sentences) and had 17
|
|
58 |
|
59 |
We based our tags on the level of detail provided by the [LIA_TAGG](http://pageperso.lif.univ-mrs.fr/frederic.bechet/download.html) statistical POS tagger written by [Frédéric Béchet](http://pageperso.lif.univ-mrs.fr/frederic.bechet/index-english.html) in 2001.
|
60 |
|
61 |
## Original Tags
|
62 |
|
63 |
```plain
|
64 |
PRON VERB SCONJ ADP CCONJ DET NOUN ADJ AUX ADV PUNCT PROPN NUM SYM PART X INTJ
|
65 |
```
|
66 |
|
67 |
-
## New
|
68 |
|
69 |
| Abbreviation | Description | Examples |
|
70 |
|:--------:|:--------:|:--------:|
|
@@ -75,55 +77,55 @@ PRON VERB SCONJ ADP CCONJ DET NOUN ADJ AUX ADV PUNCT PROPN NUM SYM PART X INTJ
|
|
75 |
| COCO | Coordinating Conjunction | et |
|
76 |
| PART | Demonstrative particle | -t |
|
77 |
| PRON | Pronoun | qui ce quoi |
|
78 |
-
| PDEMMS | Singular Masculine
|
79 |
-
| PDEMMP |
|
80 |
-
| PDEMFS | Singular Feminine
|
81 |
-
| PDEMFP |
|
82 |
-
| PINDMS | Singular Masculine
|
83 |
-
| PINDMP |
|
84 |
-
| PINDFS | Singular Feminine
|
85 |
-
| PINDFP |
|
86 |
-
| PROPN | Proper noun |
|
87 |
-
| XFAMIL | Last name |
|
88 |
-
| NUM | Numerical
|
89 |
-
| DINTMS | Masculine Numerical
|
90 |
-
| DINTFS | Feminine Numerical
|
91 |
-
| PPOBJMS |
|
92 |
-
| PPOBJMP |
|
93 |
-
| PPOBJFS |
|
94 |
-
| PPOBJFP |
|
95 |
-
| PPER1S | Personal Pronoun First
|
96 |
-
| PPER2S | Personal Pronoun Second
|
97 |
-
| PPER3MS | Personal Pronoun Third
|
98 |
-
| PPER3MP | Personal Pronoun Third
|
99 |
-
| PPER3FS | Personal Pronoun Third
|
100 |
-
| PPER3FP | Personal Pronoun Third
|
101 |
-
| PREFS | Reflexive
|
102 |
-
| PREF | Reflexive
|
103 |
-
| PREFP | Reflexive
|
104 |
| VERB | Verb | obtient |
|
105 |
-
| VPPMS |
|
106 |
-
| VPPMP |
|
107 |
-
| VPPFS |
|
108 |
-
| VPPFP |
|
109 |
| DET | Determiner | les l' |
|
110 |
-
| DETMS | Singular Masculine
|
111 |
-
| DETFS | Singular Feminine
|
112 |
| ADJ | Adjective | capable sérieux |
|
113 |
-
| ADJMS | Singular Masculine
|
114 |
-
| ADJMP |
|
115 |
-
| ADJFS | Singular Feminine
|
116 |
-
| ADJFP |
|
117 |
| NOUN | Noun | temps |
|
118 |
-
| NMS | Singular Masculine
|
119 |
-
| NMP |
|
120 |
-
| NFS | Singular Feminine
|
121 |
-
| NFP |
|
122 |
| PREL | Relative Pronoun | qui dont |
|
123 |
-
| PRELMS | Singular Masculine
|
124 |
-
| PRELMP |
|
125 |
-
| PRELFS | Singular Feminine
|
126 |
-
| PRELFP |
|
127 |
| INTJ | Interjection | merci bref |
|
128 |
| CHIF | Numbers | 1979 10 |
|
129 |
| SYM | Symbol | € % |
|
@@ -134,6 +136,8 @@ PRON VERB SCONJ ADP CCONJ DET NOUN ADJ AUX ADV PUNCT PROPN NUM SYM PART X INTJ
|
|
134 |
|
135 |
## Evaluation results
|
136 |
|
137 |
```plain
|
138 |
Results:
|
139 |
- F-score (micro): 0.952
|
@@ -245,3 +249,7 @@ Flair Embeddings:
|
|
245 |
year = {2018}
|
246 |
}
|
247 |
```
|
|
8 |
- text: "George Washington est allé à Washington"
|
9 |
---
|
10 |
|
11 |
# POET: A French Extended Part-of-Speech Tagger
|
12 |
|
13 |
- Corpus: [UD_FRENCH_TREEBANKS](https://universaldependencies.org/treebanks/fr_gsd/index.html)
|
|
|
16 |
- Additional: [LSTM-CRF](https://arxiv.org/abs/1011.4088)
|
17 |
- Number of epochs: 115
|
18 |
|
19 |
+
**People Involved**
|
20 |
+
|
21 |
+
* [LABRAK Yanis](https://www.linkedin.com/in/yanis-labrak-8a7412145/) (1)
|
22 |
+
* [DUFOUR Richard](https://cv.archives-ouvertes.fr/richard-dufour) (2)
|
23 |
+
|
24 |
+
**Affiliations**
|
25 |
+
|
26 |
+
1. [LIA, NLP team](https://lia.univ-avignon.fr/), Avignon University, Avignon, France.
|
27 |
+
2. [LS2N, TALN team](https://www.ls2n.fr/equipe/taln/), Nantes University, Nantes, France.
|
28 |
+
|
29 |
## Demo: How to use in Flair
|
30 |
|
31 |
Requires [Flair](https://pypi.org/project/flair/): ```pip install flair```
|
|
|
58 |
|
59 |
We based our tags on the level of detail provided by the [LIA_TAGG](http://pageperso.lif.univ-mrs.fr/frederic.bechet/download.html) statistical POS tagger written by [Frédéric Béchet](http://pageperso.lif.univ-mrs.fr/frederic.bechet/index-english.html) in 2001.
|
60 |
|
61 |
+
The corpus used for this model is available on [GitHub](https://github.com/qanastek/UD_FRENCH_GSD_PLUS) in the [CoNLL-U format](https://universaldependencies.org/format.html).
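As a rough illustration of what reading CoNLL-U data involves, the sketch below parses tab-separated CoNLL-U text and yields `(FORM, UPOS)` pairs for each sentence. The names `read_conllu_tags` and `_row` are hypothetical, and the sample annotation is hand-written for this example rather than taken from the corpus. Whether the extended tags are stored in the UPOS column of the linked corpus is an assumption.

```python
from typing import Iterator, List, Tuple


def read_conllu_tags(text: str) -> Iterator[List[Tuple[str, str]]]:
    """Yield each sentence as a list of (FORM, UPOS) pairs."""
    sentence: List[Tuple[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as "# text = ..."
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # CoNLL-U token lines have exactly 10 fields
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        # Assumption: the tag of interest sits in column 4 (UPOS).
        sentence.append((cols[1], cols[3]))
    if sentence:
        yield sentence


def _row(idx: int, form: str, upos: str) -> str:
    # Hypothetical helper: builds a token line, leaving other fields as "_".
    return "\t".join([str(idx), form, "_", upos, "_", "_", "_", "_", "_", "_"])


# Hand-written sample using original UD tags; not copied from the corpus.
SAMPLE = "\n".join([
    "# text = George Washington est allé à Washington",
    _row(1, "George", "PROPN"),
    _row(2, "Washington", "PROPN"),
    _row(3, "est", "AUX"),
    _row(4, "allé", "VERB"),
    _row(5, "à", "ADP"),
    _row(6, "Washington", "PROPN"),
    "",
])

for sent in read_conllu_tags(SAMPLE):
    print(sent)
```

For real work, a dedicated CoNLL-U library is likely a better choice than this hand-rolled reader, which ignores most of the format's fields.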
|
62 |
+
|
63 |
## Original Tags
|
64 |
|
65 |
```plain
|
66 |
PRON VERB SCONJ ADP CCONJ DET NOUN ADJ AUX ADV PUNCT PROPN NUM SYM PART X INTJ
|
67 |
```
|
68 |
|
69 |
+
## New additional POS tags
|
70 |
|
71 |
| Abbreviation | Description | Examples |
|
72 |
|:--------:|:--------:|:--------:|
|
|
|
77 |
| COCO | Coordinating Conjunction | et |
|
78 |
| PART | Demonstrative particle | -t |
|
79 |
| PRON | Pronoun | qui ce quoi |
|
80 |
+
| PDEMMS | Demonstrative Pronoun - Singular Masculine | ce |
|
81 |
+
| PDEMMP | Demonstrative Pronoun - Plural Masculine | ceux |
|
82 |
+
| PDEMFS | Demonstrative Pronoun - Singular Feminine | cette |
|
83 |
+
| PDEMFP | Demonstrative Pronoun - Plural Feminine | celles |
|
84 |
+
| PINDMS | Indefinite Pronoun - Singular Masculine | tout |
|
85 |
+
| PINDMP | Indefinite Pronoun - Plural Masculine | autres |
|
86 |
+
| PINDFS | Indefinite Pronoun - Singular Feminine | chacune |
|
87 |
+
| PINDFP | Indefinite Pronoun - Plural Feminine | certaines |
|
88 |
+
| PROPN | Proper noun | Houston |
|
89 |
+
| XFAMIL | Last name | Levy |
|
90 |
+
| NUM | Numerical Adjective | trentaine vingtaine |
|
91 |
+
| DINTMS | Masculine Numerical Adjective | un |
|
92 |
+
| DINTFS | Feminine Numerical Adjective | une |
|
93 |
+
| PPOBJMS | Object Complement Pronoun - Singular Masculine | le lui |
|
94 |
+
| PPOBJMP | Object Complement Pronoun - Plural Masculine | eux y |
|
95 |
+
| PPOBJFS | Object Complement Pronoun - Singular Feminine | moi la |
|
96 |
+
| PPOBJFP | Object Complement Pronoun - Plural Feminine | en y |
|
97 |
+
| PPER1S | Personal Pronoun First-Person - Singular | je |
|
98 |
+
| PPER2S | Personal Pronoun Second-Person - Singular | tu |
|
99 |
+
| PPER3MS | Personal Pronoun Third-Person - Singular Masculine | il |
|
100 |
+
| PPER3MP | Personal Pronoun Third-Person - Plural Masculine | ils |
|
101 |
+
| PPER3FS | Personal Pronoun Third-Person - Singular Feminine | elle |
|
102 |
+
| PPER3FP | Personal Pronoun Third-Person - Plural Feminine | elles |
|
103 |
+
| PREFS | Reflexive Pronoun First-Person - Singular | me m' |
|
104 |
+
| PREF | Reflexive Pronoun Third-Person - Singular | se s' |
|
105 |
+
| PREFP | Reflexive Pronoun First / Second-Person - Plural | nous vous |
|
106 |
| VERB | Verb | obtient |
|
107 |
+
| VPPMS | Past Participle - Singular Masculine | formulé |
|
108 |
+
| VPPMP | Past Participle - Plural Masculine | classés |
|
109 |
+
| VPPFS | Past Participle - Singular Feminine | appelée |
|
110 |
+
| VPPFP | Past Participle - Plural Feminine | sanctionnées |
|
111 |
| DET | Determiner | les l' |
|
112 |
+
| DETMS | Determiner - Singular Masculine | les |
|
113 |
+
| DETFS | Determiner - Singular Feminine | la |
|
114 |
| ADJ | Adjective | capable sérieux |
|
115 |
+
| ADJMS | Adjective - Singular Masculine | grand important |
|
116 |
+
| ADJMP | Adjective - Plural Masculine | grands petits |
|
117 |
+
| ADJFS | Adjective - Singular Feminine | française petite |
|
118 |
+
| ADJFP | Adjective - Plural Feminine | légères petites |
|
119 |
| NOUN | Noun | temps |
|
120 |
+
| NMS | Noun - Singular Masculine | drapeau |
|
121 |
+
| NMP | Noun - Plural Masculine | journalistes |
|
122 |
+
| NFS | Noun - Singular Feminine | tête |
|
123 |
+
| NFP | Noun - Plural Feminine | ondes |
|
124 |
| PREL | Relative Pronoun | qui dont |
|
125 |
+
| PRELMS | Relative Pronoun - Singular Masculine | lequel |
|
126 |
+
| PRELMP | Relative Pronoun - Plural Masculine | lesquels |
|
127 |
+
| PRELFS | Relative Pronoun - Singular Feminine | laquelle |
|
128 |
+
| PRELFP | Relative Pronoun - Plural Feminine | lesquelles |
|
129 |
| INTJ | Interjection | merci bref |
|
130 |
| CHIF | Numbers | 1979 10 |
|
131 |
| SYM | Symbol | € % |
|
|
|
136 |
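Many of the extended tags in the table above follow a visible pattern: a base category followed by a gender letter (`M`/`F`) and a number letter (`S`/`P`). The sketch below splits such tags into their parts. It is only an illustration of the naming scheme; the function `split_tag` and the list of bases are hypothetical and cover only the tags whose description spells out both gender and number. Tags such as `PREFS` or `PPER1S`, whose trailing letters mean something else, are deliberately returned unchanged.

```python
from typing import NamedTuple, Optional

# Bases whose table entries state both gender and number explicitly.
GENDERED_BASES = ("PDEM", "PIND", "PPOBJ", "PPER3", "PREL", "VPP", "DET", "ADJ", "N")
GENDERS = {"M": "Masculine", "F": "Feminine"}
NUMBERS = {"S": "Singular", "P": "Plural"}


class TagParts(NamedTuple):
    base: str
    gender: Optional[str]
    number: Optional[str]


def split_tag(tag: str) -> TagParts:
    """Split e.g. 'ADJFS' into ('ADJ', 'Feminine', 'Singular')."""
    if len(tag) > 2:
        base, gender, number = tag[:-2], tag[-2], tag[-1]
        if base in GENDERED_BASES and gender in GENDERS and number in NUMBERS:
            return TagParts(base, GENDERS[gender], NUMBERS[number])
    # Anything else (NUM, PROPN, PREFS, PPER1S, ...) is kept whole.
    return TagParts(tag, None, None)


for tag in ("ADJFS", "NMP", "PPER3FP", "PREFS", "NUM"):
    print(tag, split_tag(tag))
```

Checking the base against an explicit list, rather than matching on the suffix alone, avoids misreading `PREFS` (reflexive, first-person singular) as a feminine singular form.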
|
137 |
## Evaluation results
|
138 |
|
139 |
+
The test corpus used for this evaluation is available on [GitHub](https://github.com/qanastek/UD_FRENCH_GSD_PLUS/blob/main/UD_FRENCH_GSD_PLUS/fr_gsd-ud-plus-test.conllu).
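For readers unfamiliar with the micro-averaged F-score reported in the results, the sketch below shows the textbook computation for a tagging task. The function name `micro_f1` and the toy tag sequences are hypothetical, and this is not necessarily how the evaluation behind the reported numbers was implemented. When every token has exactly one gold tag and one predicted tag, the micro F-score equals token accuracy.

```python
from collections import Counter
from typing import Sequence


def micro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Micro-averaged F1 over per-token tags (textbook definition)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted tag was wrong for this token
            fn[g] += 1  # gold tag was missed for this token
    # Micro-averaging pools the counts over all tags before dividing.
    tp_sum, fp_sum, fn_sum = sum(tp.values()), sum(fp.values()), sum(fn.values())
    precision = tp_sum / (tp_sum + fp_sum) if tp_sum + fp_sum else 0.0
    recall = tp_sum / (tp_sum + fn_sum) if tp_sum + fn_sum else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: three of four tags are correct.
gold = ["DET", "NFS", "VERB", "ADJFS"]
pred = ["DET", "NFS", "VERB", "ADJMS"]
print(micro_f1(gold, pred))
```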
|
140 |
+
|
141 |
```plain
|
142 |
Results:
|
143 |
- F-score (micro): 0.952
|
|
|
249 |
year = {2018}
|
250 |
}
|
251 |
```
|
252 |
+
|
253 |
+
## Acknowledgment
|
254 |
+
|
255 |
+
This work was financially supported by [Zenidoc](https://zenidoc.fr/).
|