---
tags:
- flair
- token-classification
- sequence-tagger-model
language: en
datasets:
- conll2000
widget:
- text: "The happy man has been eating at the diner"
---
## English Chunking in Flair (fast model)

This is the fast phrase chunking model for English that ships with [Flair](https://github.com/flairNLP/flair/).

F1-Score: **96.22** (CoNLL-2000)

Predicts 10 tags:

| **tag** | **meaning** |
|---------|-------------|
| ADJP | adjectival |
| ADVP | adverbial |
| CONJP | conjunction |
| INTJ | interjection |
| LST | list marker |
| NP | noun phrase |
| PP | prepositional |
| PRT | particle |
| SBAR | subordinate clause |
| VP | verb phrase |

Based on [Flair embeddings](https://www.aclweb.org/anthology/C18-1139/) and LSTM-CRF.

---
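### Background: How token tags form chunks

The tags in the table name whole phrases, while a sequence tagger assigns one label per token. The sketch below shows one common way to bridge the two: grouping BIO-style token tags (the scheme used in the CoNLL-2000 data) into labeled chunks. It is an illustration only. `bio_to_chunks` is a hypothetical helper, the tags are written by hand to mirror the demo sentence rather than produced by the model, and Flair does its own decoding internally, possibly with a different encoding.

```python
from typing import List, Optional, Tuple


def bio_to_chunks(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group per-token BIO tags into (phrase text, chunk label) pairs."""
    chunks: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        # A chunk continues only on an I- tag whose label matches the open chunk.
        if prefix == "I" and label == current_label:
            current_tokens.append(token)
            continue
        # Anything else closes the open chunk, if there is one.
        if current_tokens:
            chunks.append((" ".join(current_tokens), current_label))
        if prefix in ("B", "I"):
            # Start a new chunk (a stray I- tag is treated like B-).
            current_tokens, current_label = [token], label
        else:
            # "O": the token lies outside any chunk.
            current_tokens, current_label = [], None
    if current_tokens:
        chunks.append((" ".join(current_tokens), current_label))
    return chunks


# Hand-written tags for the demo sentence (an assumption, not model output).
tokens = "The happy man has been eating at the diner".split()
tags = ["B-NP", "I-NP", "I-NP", "B-VP", "I-VP", "I-VP", "B-PP", "B-NP", "I-NP"]
chunks = bio_to_chunks(tokens, tags)
print(chunks)
```

---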
### Demo: How to use in Flair

Requires: **[Flair](https://github.com/flairNLP/flair/)** (`pip install flair`)

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger
tagger = SequenceTagger.load("flair/chunk-english-fast")

# make example sentence
sentence = Sentence("The happy man has been eating at the diner")

# predict chunk tags
tagger.predict(sentence)

# print sentence
print(sentence)

# print predicted chunk spans
print('The following chunk tags are found:')
# iterate over the predicted chunks (stored under the 'np' label type) and print
for chunk in sentence.get_spans('np'):
    print(chunk)
```
This yields the following output:
```
Span [1,2,3]: "The happy man"   [− Labels: NP (0.9958)]
Span [4,5,6]: "has been eating"   [− Labels: VP (0.8759)]
Span [7]: "at"   [− Labels: PP (1.0)]
Span [8,9]: "the diner"   [− Labels: NP (0.9991)]
```
So, the spans "*The happy man*" and "*the diner*" are labeled as **noun phrases** (NP), "*has been eating*" is labeled as a **verb phrase** (VP), and "*at*" is labeled as a **prepositional phrase** (PP) in the sentence "*The happy man has been eating at the diner*".

---
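### Evaluation: How a span-level F1-score is computed

To give a feel for what an F1-score over chunks measures, the sketch below computes precision, recall, and F1 over exact-match spans, which is the usual convention for CoNLL-2000 chunking. This card does not state how the reported score was computed, so treat the convention as an assumption. `span_f1` is a hypothetical helper, the gold spans reuse the token positions from the demo output, and the "predicted" set is invented with one deliberate boundary error.

```python
from typing import Set, Tuple

# A span is (first token index, last token index, chunk label).
Span = Tuple[int, int, str]


def span_f1(gold: Set[Span], predicted: Set[Span]) -> float:
    """Exact-match span F1: a prediction counts only if boundaries and label agree."""
    correct = len(gold & predicted)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Gold spans follow the demo output; the prediction misses "the" in "the diner".
gold = {(1, 3, "NP"), (4, 6, "VP"), (7, 7, "PP"), (8, 9, "NP")}
predicted = {(1, 3, "NP"), (4, 6, "VP"), (7, 7, "PP"), (9, 9, "NP")}

score = span_f1(gold, predicted)
print(f"{100 * score:.2f}")  # 3 of 4 spans match on both sides, so this prints 75.00
```

---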
### Training: Script to train this model

The following Flair script was used to train this model:

```python
from flair.data import Corpus
from flair.datasets import CONLL_2000
from flair.embeddings import StackedEmbeddings, FlairEmbeddings

# 1. get the corpus
corpus: Corpus = CONLL_2000()

# 2. what tag do we want to predict?
tag_type = 'np'

# 3. make the tag dictionary from the corpus
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)

# 4. initialize each embedding we use
embedding_types = [

    # contextual string embeddings, forward
    FlairEmbeddings('news-forward-fast'),

    # contextual string embeddings, backward
    FlairEmbeddings('news-backward-fast'),
]

# embedding stack consists of forward and backward Flair embeddings
embeddings = StackedEmbeddings(embeddings=embedding_types)

# 5. initialize sequence tagger
from flair.models import SequenceTagger

tagger = SequenceTagger(hidden_size=256,
                        embeddings=embeddings,
                        tag_dictionary=tag_dictionary,
                        tag_type=tag_type)

# 6. initialize trainer
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# 7. run training
trainer.train('resources/taggers/chunk-english-fast',
              train_with_dev=True,
              max_epochs=150)
```
---

### Cite

Please cite the following paper when using this model.

```
@inproceedings{akbik2018coling,
  title     = {Contextual String Embeddings for Sequence Labeling},
  author    = {Akbik, Alan and Blythe, Duncan and Vollgraf, Roland},
  booktitle = {{COLING} 2018, 27th International Conference on Computational Linguistics},
  pages     = {1638--1649},
  year      = {2018}
}
```
---

### Issues?

The Flair issue tracker is available [here](https://github.com/flairNLP/flair/issues/).