Update README.md
README.md
CHANGED
@@ -158,16 +158,20 @@ Secondly, we could use it as an analytical tool, to study how temporal variation
 
 ## Limitations
 
-
-The ERWT series were trained for evaluation purposes, and carry some critical limitations.
+The ERWT series were trained for evaluation purposes, and therefore carry some critical limitations.
 
 ### Training Data
 
 Many of the limitations are a direct result of the data. ERWT models are trained on a rather small subsample of nineteenth-century British newspapers, and its predictions have to be understood in this context (remember, Her Majesty?). Moreover, the corpus has a strong Metropolitan and liberal bias (see section on Data Description for more information).
 
+### Training Routine
+
+We created this model as part of a wider experiment, which attempted to establish best practices for training models with metadata. An overview of all the models is available on our [GitHub](https://github.com/Living-with-machines/ERWT/) page.
 
+To reduce training time, we based our experiments on a random subsample of the HMD corpus, consisting of half a billion tokens.
+Furthermore, we only trained the models for one epoch, which implies .
 
-We
+We were mainly interested in the relative performance of the different ERWT models and .
 
 ## Data Description
 