<img src="https://upload.wikimedia.org/wikipedia/commons/5/5b/NCI_peas_in_pod.jpg" alt="erwt" width="200" >

# ERWT-year

\~🌺\~A language model that is (🤭 maybe 🤫) better at history than you...\~🌺\~

ERWT is a fine-tuned [`distilbert-base-cased`](https://huggingface.co/distilbert-base-cased) model trained on historical newspapers from the [Heritage Made Digital collection](https://huggingface.co/datasets/davanstrien/hmd-erwt-training) with temporal metadata.
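
If you are curious already, here is a minimal usage sketch with the 🤗 `transformers` fill-mask pipeline. Both the repository id and the year-prefix input format below are assumptions for illustration, so double-check them against the rest of this model card before copying anything.

```python
# Minimal usage sketch (not an official snippet).
# Assumptions: the model is published under an id like "Livingwithmachines/ERWT-year"
# and expects the year of publication prepended to the input text.
from transformers import pipeline

mask_filler = pipeline(
    "fill-mask",
    model="Livingwithmachines/ERWT-year",  # hypothetical repo id, replace with the real one
)

# The same sentence, conditioned on two different (assumed) publication years.
for year in ("1810", "1880"):
    prompt = f"{year} The [MASK] was delayed by several hours."
    for prediction in mask_filler(prompt, top_k=3):
        print(year, prediction["token_str"], round(prediction["score"], 3))
```

If the temporal prefix does its job, the two years should pull the predictions in noticeably different directions.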

This model is served to you by Kaspar von Beelen and Daniel van Strien.

*Improving AI, one pea at a time.*

## Introductory Note: Repent Now. 😇

This model was trained for **experimental purposes**; please use it with care.

You can find more detailed information below, especially in the "limitations" section.

If you can't get enough, you can still consult our working paper ["Metadata Might Make Language Models Better"](https://drive.google.com/file/d/1Xp21KENzIeEqFpKvO85FkHynC0PNwBn7/view?usp=sharing) for more background and nerdy evaluation stuff (still a work in progress, so handle with care and kindness).

## Background: MDMA to the rescue. 🙂

ERWT was created using a **M**eta**D**ata **M**asking **A**pproach (or **MDMA** 💊), in which we train a Masked Language Model simultaneously on text and metadata. Our intuition was that incorporating information that is not explicitly present in the text—such as the time of publication or the political leaning of the author—may make language models "better" in the sense of being more sensitive to historical and political aspects of language use.
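
To make the recipe a little more concrete, here is a toy sketch of the kind of preprocessing MDMA implies: the year of publication is serialised in front of the text as ordinary tokens, and random masking then falls on metadata and text alike, so the model learns to predict the year from the words and the words from the year. The field names and the template below are illustrative assumptions, not ERWT's actual pipeline.

```python
# Toy sketch of MetaData Masking (MDMA): prepend metadata to the text and let
# standard MLM-style random masking hit metadata and text tokens alike.
# Field names and the serialisation template are assumptions for illustration.
import random

def serialise_with_year(record: dict) -> str:
    """Put the year of publication in front of the newspaper snippet."""
    return f"{record['year']} {record['text']}"

def crude_word_masking(text: str, mask_token: str = "[MASK]", prob: float = 0.15) -> str:
    """Whole-word stand-in for the tokenizer-level masking a data collator
    applies during real MLM training."""
    return " ".join(mask_token if random.random() < prob else word
                    for word in text.split())

example = {"year": 1848, "text": "The Chartist meeting was dispersed by the police."}
print(crude_word_masking(serialise_with_year(example)))
```

In real training, the serialised strings would go through the DistilBERT tokenizer and a collator such as `DataCollatorForLanguageModeling`; the intuition stays the same: sometimes the masked token is the year and the model has to date the passage, sometimes it is a word and the year helps to fill it in.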