Commit dfbfe88 (parent bd60203) by alanakbik: Update README.md

Files changed (1): README.md (+57 −0)

README.md CHANGED
yields the following output:

```
Span [1,2]: "George Washington" [− Labels: PER (0.9968)]
Span [5]: "Washington" [− Labels: LOC (0.9994)]
```
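The span output above comes from decoding the tagger's per-token BIO tags (`B-PER`, `I-PER`, `O`, ...) into labeled spans. A minimal, self-contained sketch of that decoding step (toy tags, not Flair's actual implementation):

```python
def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into labeled spans with 1-based token indices."""
    spans = []
    for i, (token, tag) in enumerate(zip(tokens, tags), start=1):
        if tag.startswith('B-'):
            # 'B-' always opens a new span
            spans.append({'start': i, 'end': i, 'text': token, 'label': tag[2:]})
        elif (tag.startswith('I-') and spans
              and spans[-1]['end'] == i - 1
              and spans[-1]['label'] == tag[2:]):
            # 'I-' extends the directly preceding span of the same label
            spans[-1]['end'] = i
            spans[-1]['text'] += ' ' + token
        # an 'O' tag (or a mismatched 'I-') simply closes any open span
    return spans


tokens = ['George', 'Washington', 'went', 'to', 'Washington']
tags = ['B-PER', 'I-PER', 'O', 'O', 'B-LOC']
for span in decode_bio(tokens, tags):
    index = (f"[{span['start']},{span['end']}]"
             if span['start'] != span['end'] else f"[{span['start']}]")
    print(f'Span {index}: "{span["text"]}" ({span["label"]})')
```

Running this prints `Span [1,2]: "George Washington" (PER)` and `Span [5]: "Washington" (LOC)`, mirroring the structure of the output above (minus the confidence scores, which come from the model).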

### Script to train this model

The following Flair script was used to train this model:

```python
from flair import set_seed
from flair.data import Corpus
from flair.datasets import CONLL_03
from flair.embeddings import TokenEmbeddings, WordEmbeddings, StackedEmbeddings, FlairEmbeddings
from typing import List


# 1. get the corpus
corpus: Corpus = CONLL_03()

# 2. what tag do we want to predict?
tag_type = 'ner'

# 3. make the tag dictionary from the corpus
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)

# 4. initialize embeddings
embedding_types: List[TokenEmbeddings] = [

    # GloVe embeddings
    WordEmbeddings('glove'),

    # contextual string embeddings, forward
    FlairEmbeddings('news-forward'),

    # contextual string embeddings, backward
    FlairEmbeddings('news-backward'),
]

# embedding stack consists of Flair and GloVe embeddings
embeddings = StackedEmbeddings(embeddings=embedding_types)

# 5. initialize sequence tagger
from flair.models import SequenceTagger

tagger: SequenceTagger = SequenceTagger(hidden_size=256,
                                        embeddings=embeddings,
                                        tag_dictionary=tag_dictionary,
                                        tag_type=tag_type)

# 6. initialize trainer
from flair.trainers import ModelTrainer

trainer: ModelTrainer = ModelTrainer(tagger, corpus)

# 7. run training
trainer.train('resources/taggers/ner-english',
              train_with_dev=True,
              max_epochs=150)
```
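For context, the `StackedEmbeddings` used in step 4 simply concatenates the vectors each embedding produces for a token into one longer vector, which is then fed to the tagger. A toy, pure-Python sketch of that idea (the vectors below are made up; this is not Flair's implementation):

```python
def stack_embeddings(vectors):
    """Concatenate per-embedder vectors into one token representation."""
    stacked = []
    for vec in vectors:
        stacked.extend(vec)
    return stacked


# made-up dimensions: a 4-dim "GloVe" vector and two 3-dim "Flair" vectors
glove = [0.1, 0.2, 0.3, 0.4]
flair_forward = [0.5, 0.6, 0.7]
flair_backward = [0.8, 0.9, 1.0]

token_vector = stack_embeddings([glove, flair_forward, flair_backward])
print(len(token_vector))  # 10: the dimensionalities simply add up
```

In the real script the same thing happens per token with the GloVe vector and the forward/backward Flair states, so the tagger sees one combined vector per token.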