MacBERTh
MacBERTh is a historical language model for English, developed in the MacBERTh project.
The architecture is based on BERT base uncased from the original BERT pre-training codebase. The training material comes from different sources including:
- EEBO
- ECCO
- COHA
- CLMET3.1
- EVANS
- Hansard Corpus
with a total size of approximately 3.9B tokens.
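Because the architecture follows BERT base uncased, the model can presumably be queried like any other BERT checkpoint. The following is a minimal sketch, not an official usage recipe: it assumes the Hugging Face `transformers` library is installed, that the checkpoint is published on the Hub under the identifier `emanjavacas/MacBERTh` (an assumption, verify it before use), and that it ships masked-language-model head weights. The helper names `mask_word` and `top_predictions` are hypothetical and exist only for this illustration.

```python
# Assumed Hub identifier for the checkpoint; verify before relying on it.
MODEL_ID = "emanjavacas/MacBERTh"


def mask_word(sentence: str, target: str, mask_token: str = "[MASK]") -> str:
    """Replace the first occurrence of `target` in `sentence` with the mask token."""
    if target not in sentence:
        raise ValueError(f"{target!r} does not occur in the sentence")
    return sentence.replace(target, mask_token, 1)


def top_predictions(sentence: str, target: str, k: int = 5):
    """Return the model's top-k fillers for the masked `target` as (token, score) pairs.

    Downloads the checkpoint on first use, so it needs network access.
    """
    # Imported lazily so the pure helper above works without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=MODEL_ID)
    masked = mask_word(sentence, target, fill.tokenizer.mask_token)
    return [(p["token_str"], p["score"]) for p in fill(masked, top_k=k)]


if __name__ == "__main__":
    # Illustrative sentence only; it is not taken from the training corpora.
    for token, score in top_predictions("Thou art a most noble lord.", "noble"):
        print(f"{token}\t{score:.3f}")
```

Since the base architecture is uncased, input is expected to be lowercased by the tokenizer, so capitalization in the query sentence should not matter.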
Details and evaluation can be found in the accompanying publications: