Kaspar committed on
Commit
54c9f3c
1 Parent(s): 9c7b1be

Update README.md

  1. README.md +16 -4
README.md CHANGED
@@ -81,7 +81,8 @@ mask_filler = pipeline("fill-mask",
81
  mask_filler(f"1820 [DATE] We received a letter from [MASK] Majesty.")
82
  ```
83
 
84
- Returns as most likely prediction
 
85
  ```python
86
  {'score': 0.8527863025665283,
87
  'token': 2010,
@@ -95,11 +96,22 @@ However, if we change the date at the start of the sentence to 1850:
95
  mask_filler(f"1850 [DATE] We received a letter from [MASK] Majesty.")
96
  ```
97
 
98
- Will put more probability mass on the token 'her`.
 
99
 
100
- You can try this for yourself in Example sentences at the top right.
101
 
102
- But why is this interesting? Firstly
103
 
104
  ### Date Prediction
105
 
 
81
  mask_filler(f"1820 [DATE] We received a letter from [MASK] Majesty.")
82
  ```
83
 
84
+ This returns the following as its most likely prediction:
85
+
86
  ```python
87
  {'score': 0.8527863025665283,
88
  'token': 2010,
 
96
  mask_filler(f"1850 [DATE] We received a letter from [MASK] Majesty.")
97
  ```
98
 
99
+ This puts most of the probability mass on the token "her" and only a little on "his".
100
+
101
+ ```python
102
+ {'score': 0.8168327212333679,
103
+ 'token': 2014,
104
+ 'token_str': 'her',
105
+ 'sequence': '1850 we received a letter from her majesty.'}
106
+ ```
107
+
108
+ You can repeat this experiment for yourself using the example sentences in the **Hosted inference API** at the top right.
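The same comparison can also be scripted rather than run by hand. The sketch below is illustrative only: `compare_years` is a hypothetical helper that is not part of this README or of any library, and it assumes that `mask_filler` is the fill-mask pipeline created earlier and that it returns a list of dictionaries with `score` and `token_str` keys, as in the outputs shown here.

```python
from typing import Callable, Dict, List, Tuple


def compare_years(
    mask_filler: Callable[[str], List[dict]],
    template: str,
    years: List[int],
    top_k: int = 3,
) -> Dict[int, List[Tuple[str, float]]]:
    """Fill the same masked sentence once per year and collect the top predictions.

    `template` must contain a `{year}` placeholder, for example
    "{year} [DATE] We received a letter from [MASK] Majesty."
    """
    results: Dict[int, List[Tuple[str, float]]] = {}
    for year in years:
        # Prefix the sentence with the year, exactly as in the manual examples.
        predictions = mask_filler(template.format(year=year))
        # Sort explicitly by score so the ranking does not depend on input order.
        ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
        results[year] = [(p["token_str"], p["score"]) for p in ranked[:top_k]]
    return results


# Example call (assumes `mask_filler` was created with `pipeline("fill-mask", ...)`):
# compare_years(
#     mask_filler,
#     "{year} [DATE] We received a letter from [MASK] Majesty.",
#     years=[1820, 1850],
# )
```

Passing the pipeline in as an argument keeps the helper independent of any particular model, so the same function can be reused with other checkpoints or other masked sentences.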
109
+
110
+ Okay, but why is this interesting?
111
 
112
+ Firstly, eyeballing some toy examples (and also using more rigorous metrics such as perplexity) shows that MLMs can make more accurate predictions when they have access to temporal metadata. In other words, ERWT's predictions reflect historical language use more accurately. Models that are sensitive to historical context could
113
 
114
+ Secondly, we anticipate that MDMA may reduce bias, or at least give us more of a handle on this problem. Admittedly, we still have to prove this more formally, but some experiments at least hint in this direction.
115
 
116
  ### Date Prediction
117