We created this model as part of a wider experiment, which attempted to establish best practices for training models with metadata. An overview of all the models is available on our [GitHub](https://github.com/Living-with-machines/ERWT/) page.

To reduce training time, we based our experiments on a random subsample of the HMD corpus, consisting of half a billion tokens.

Furthermore, we only trained the models for one epoch, which implies they are most likely undertrained at the moment.

We were mainly interested in the **relative** performance of the different ERWT models. We did, however, compare ERWT with [`distilbert-base-cased`](https://huggingface.co/distilbert-base-cased) in our evaluation experiments, and of course, our tiny LM peas did much better. 🥳
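
As an illustrative sketch, such a side-by-side comparison can be run with the `transformers` fill-mask pipeline. The ERWT checkpoint id and the example sentence below are assumptions made for illustration; substitute the model you actually want to evaluate.

```python
# Minimal sketch of a side-by-side masked-token comparison.
# Assumption: the ERWT checkpoint id below is illustrative only;
# swap in the ERWT model you want to evaluate.
from transformers import pipeline

models = {
    "ERWT": pipeline("fill-mask", model="Livingwithmachines/erwt-year"),  # assumed id
    "DistilBERT": pipeline("fill-mask", model="distilbert-base-cased"),
}

for name, fill in models.items():
    # Build the prompt with each model's own mask token, since it can differ.
    sentence = f"The {fill.tokenizer.mask_token} opened a new railway station in town."
    print(name)
    for pred in fill(sentence)[:3]:  # top 3 of the default top-5 predictions
        print(f"  {pred['token_str']!r}  score={pred['score']:.3f}")
```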
## Data Description