Kaspar committed on
Commit b3b9085
1 Parent(s): 90a348e

Update README.md

Files changed (1)
  1. README.md +5 -3
README.md CHANGED
@@ -19,17 +19,19 @@ widget:

# ERWT-year

- A fine-tuned [`distilbert-base-cased`](https://huggingface.co/distilbert-base-cased) model trained on historical newspapers from the [Heritage Made Digital collection](https://huggingface.co/datasets/davanstrien/hmd-erwt-training) with temporal metadata.
+ A Historical Language Model.
+
+ ERWT is a fine-tuned [`distilbert-base-cased`](https://huggingface.co/distilbert-base-cased) model trained on historical newspapers from the [Heritage Made Digital collection](https://huggingface.co/datasets/davanstrien/hmd-erwt-training) with temporal metadata.


**Warning**: This model was trained for **experimental purposes**; please use it with care.


- You find more detailed information below and in our working paper ["Metadata Might Make Language Models Better"](https://drive.google.com/file/d/1Xp21KENzIeEqFpKvO85FkHynC0PNwBn7/view?usp=sharing).
+ You can find more detailed information below, especially in the "Limitations" section (seriously, please read it; we don't write it just to look smart). You can also consult our working paper ["Metadata Might Make Language Models Better"](https://drive.google.com/file/d/1Xp21KENzIeEqFpKvO85FkHynC0PNwBn7/view?usp=sharing) for more background and a detailed evaluation (still a work in progress, so handle it with care and kindness).

## Background

- ERWT was created using a MetaData Masking Approach (or MDMA 💊), in which we train a Masked Language Model simultaneously on text and metadata. Our intuition was that incorporating information that is not explicitly present in the text—such as the time of publication or the political leaning of the author—may make language models "better" in the sense of being more sensitive to historical and political aspects of language use.
+ ERWT was created using a **M**eta**D**ata **M**asking **A**pproach (or **MDMA** 💊), in which we train a Masked Language Model simultaneously on text and metadata. Our intuition was that incorporating information that is not explicitly present in the text—such as the time of publication or the political leaning of the author—may make language models "better" in the sense of being more sensitive to historical and political aspects of language use.

To create this ERWT model, we fine-tuned [`distilbert-base-cased`](https://huggingface.co/distilbert-base-cased) on a random subsample of the Heritage Made Digital newspapers comprising about half a billion words. We slightly adapted the training routine by adding the year of publication and a special token `[DATE]` in front of each text segment (i.e. a chunk of one hundred tokens).
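
To make the Background description more concrete, here is a minimal sketch of the kind of preprocessing it describes: prefixing each roughly 100-token chunk with its year of publication and a `[DATE]` special token before masked-language-model fine-tuning. This is not the authors' released training code; the function names, chunking details, and example text are illustrative assumptions.

```python
# Illustrative sketch only: the real ERWT training pipeline is not shown in this
# commit, so names and chunking details here are assumptions based on the README text.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")

# Register [DATE] as a special token so it is never split into sub-word pieces.
tokenizer.add_special_tokens({"additional_special_tokens": ["[DATE]"]})

def make_training_segments(article_text: str, year: int, chunk_size: int = 100):
    """Yield ~chunk_size-token segments, each prefixed with 'YEAR [DATE]' so the
    masked-language-model objective sees temporal metadata alongside the text."""
    token_ids = tokenizer(article_text, add_special_tokens=False)["input_ids"]
    for start in range(0, len(token_ids), chunk_size):
        chunk = tokenizer.decode(token_ids[start:start + chunk_size])
        yield f"{year} [DATE] {chunk}"

# A segment fed to MLM fine-tuning would then look like:
# "1870 [DATE] the storm caused considerable damage to the harbour ..."
for segment in make_training_segments("The storm caused considerable damage to the harbour.", 1870):
    print(segment)
```

Registering `[DATE]` as an additional special token keeps the tokenizer from splitting it into sub-word pieces; in an actual fine-tuning run the model's embedding matrix would also need to be resized with `model.resize_token_embeddings(len(tokenizer))`.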