qanastek committed on
Commit
a6d0d0d
1 Parent(s): c224469
Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -61,7 +61,9 @@ Washington <PROPN>
61
 
62
  `UD_FRENCH_GSD_PLUS` is a part-of-speech tagging corpus based on [UD_French-GSD](https://universaldependencies.org/treebanks/fr_gsd/index.html), which was originally created in 2015 and is based on the [universal dependency treebank v2.0](https://github.com/ryanmcd/uni-dep-tb).
63
 
64
- Originally, the corpus consisted of 400,399 words (16,341 sentences) and had 17 different classes. After applying our tag augmentation, we obtain 60 different classes, which add morphological information such as gender, number, mood, person, tense, or verb form.
65
 
66
  ## Original Tags
67
 
@@ -152,6 +154,17 @@ UD_French-GSD corpora:
152
  }
153
  ```
154
 
155
  Flair Embeddings:
156
 
157
  ```latex
 
61
 
62
  `UD_FRENCH_GSD_PLUS` is a part-of-speech tagging corpus based on [UD_French-GSD](https://universaldependencies.org/treebanks/fr_gsd/index.html), which was originally created in 2015 and is based on the [universal dependency treebank v2.0](https://github.com/ryanmcd/uni-dep-tb).
63
 
64
+ Originally, the corpus consisted of 400,399 words (16,341 sentences) and had 17 different classes. After applying our tag augmentation, we obtain 60 different classes, which add morphological information such as gender, number, mood, person, tense, or verb form, taken from the CoNLL-U fields of the original corpus.
65
+
66
+ We based our tags on the level of detail provided by the [LIA_TAGG](http://pageperso.lif.univ-mrs.fr/frederic.bechet/download.html) statistical POS tagger, written by [Frédéric Béchet](http://pageperso.lif.univ-mrs.fr/frederic.bechet/index-english.html) in 2001.
67
 
68
  ## Original Tags
69
 
 
154
  }
155
  ```
156
 
157
+ LIA_TAGG:
158
+
159
+ ```latex
160
+ @techreport{LIA_TAGG,
161
+ author = {Frédéric Béchet},
162
+ title = {LIA_TAGG: a statistical POS tagger + syntactic bracketer},
163
+ institution = {Aix-Marseille University \& CNRS},
164
+ year = {2001}
165
+ }
166
+ ```
167
+
168
  Flair Embeddings:
169
 
170
  ```latex