Livingwithmachines/erwt-year

Kaspar committed on Nov 18, 2022

Commit a2502c1 • 1 Parent(s): e40f0e6

Update README.md

Files changed (1)

README.md +10 -3


@@ -151,16 +151,23 @@ ERWT clearly learned a lot about history of German unification by ploughing thro
 Again, we have to ask: Who cares? Wikipedia can tell us pretty much the same. More importantly, don't we already have timestamps for newspaper data.
-In both cases, our answers would be "yes, but...". ERWT's time-stamping powers has little instrumental use and won't make us rich (but donations are welcome of course 🤑) we nonetheless believe date prediction has value for research purposes. We can use ERWT for "fictitious" prediction, i.e. as a diagnostic tool.
+In both cases, our answers would be "yes, but...". ERWT's time-stamping powers have little instrumental use and won't make us rich (but donations are welcome of course 🤑). Nonetheless, we believe date prediction has value for research purposes. We can use ERWT for "fictitious" prediction, i.e. as a diagnostic tool.
 Firstly, we used date prediction for evaluation purposes, to measure which training routine produces models
 Secondly, we could use it as an analytical tool, to study how temporal variation **within** text documents and further scrutinise which features drive the time prediction (it goes without saying that the same applies to other metadata fields, but example predicting political orientation).
 ## Limitations
-The ERWT series were trained for evaluation purposes, and cary critical limitations. First of all, as explained in more detail below, this model is trained on a rather small subsample of British newspapers, with a strong Metropolitan and liberal bias.
-Secondly, we only trained for one epoch, which suggests. For the evaluation purposes we were interested in the relative performance of our models.
+The ERWT series were trained for evaluation purposes, and carry some critical limitations.
+### Training Data
+Many of the limitations are a direct result of the data. ERWT models are trained on a rather small subsample of nineteenth-century British newspapers, and its predictions have to be understood in this context (remember, Her Majesty?). Moreover, the corpus has a strong Metropolitan and liberal bias (see section on Data Description for more information).
+We only trained for one epoch, which suggests. For the evaluation purposes we were interested in the relative performance of our models.
 ## Data Description