Kaspar committed on
Commit
e40f0e6
1 Parent(s): 66285cc

Update README.md

Files changed (1)
  1. README.md +16 -8
README.md CHANGED
@@ -125,9 +125,21 @@ Reusing to code above, we can time-stamp documents by masking the year. For exam
125
 
126
  ```python
127
  mask_filler("[MASK] [DATE] The Schleswig war is a matter of great concern.")
128
  ```
129
 
130
- Return 1864, which makes sense as this was indeed the year of Prussian troops (with some help of their Austrian friends) crossed the border into Schleswig, then part of the Kingdom of Denmark.
 
131
 
132
  A few years later, in 1870, Prussia aimed artillery southwards and invaded France.
133
 
@@ -141,16 +153,12 @@ Again, we have to ask: Who cares? Wikipedia can tell us pretty much the same. Mo
141
 
142
  In both cases, our answers would be "yes, but...". Although ERWT's time-stamping powers have little instrumental use and won't make us rich (but donations are welcome of course 🤑), we nonetheless believe date prediction has value for research purposes. We can use ERWT for "fictitious" prediction, i.e. as a diagnostic tool.
143
 
144
- Firstly, masking the temporal information,
145
-
146
- Secondly,
147
-
148
 
149
  ## Limitations
150
 
151
- ### The models
152
-
153
- ERWT models were trained for evaluation purposes, and cary critical limitations. First of all, as explained in more detail below, this model is trained on a rather small subsample of British newspapers, with a strong Metropolitan and liberal bias.
154
 
155
  Secondly, we only trained for one epoch, which suggests. For the evaluation purposes we were interested in the relative performance of our models.
156
 
 
125
 
126
  ```python
127
  mask_filler("[MASK] [DATE] The Schleswig war is a matter of great concern.")
128
+
129
+ ```
130
+
131
+ Outputs as most likely filler:
132
+
133
+ ```python
134
+ {'score': 0.48822104930877686,
135
+ 'token': 6717,
136
+ 'token_str': '1864',
137
+ 'sequence': '1864 the schleswig war is a matter of great concern.'}
138
+
139
  ```
140
 
141
+
142
+ The prediction "1864" makes sense, as this was indeed the year Prussian troops (with some help from their Austrian friends) crossed the border into Schleswig, then part of the Kingdom of Denmark.
143
 
144
  A few years later, in 1870, Prussia aimed artillery southwards and invaded France.
145
 
 
153
 
154
  In both cases, our answers would be "yes, but...". Although ERWT's time-stamping powers have little instrumental use and won't make us rich (but donations are welcome of course 🤑), we nonetheless believe date prediction has value for research purposes. We can use ERWT for "fictitious" prediction, i.e. as a diagnostic tool.
155
 
156
+ Firstly, we used date prediction for evaluation purposes, to measure which training routine produces models
157
+ Secondly, we could use it as an analytical tool, to study temporal variation **within** text documents and to scrutinise which features drive the time prediction (it goes without saying that the same applies to other metadata fields, for example predicting political orientation).
 
 
158
 
159
  ## Limitations
160
 
161
+ The ERWT series was trained for evaluation purposes and carries critical limitations. First of all, as explained in more detail below, this model is trained on a rather small subsample of British newspapers, with a strong Metropolitan and liberal bias.
 
 
162
 
163
  Secondly, we only trained for one epoch, which suggests. For the evaluation purposes we were interested in the relative performance of our models.
164