Update README.md
README.md CHANGED
@@ -4,7 +4,7 @@ This is a Latin RoBERTa-based LM model, version 3.

The intention of the Transformer-based LM is twofold: on the one hand, it will be used for the evaluation of HTR results; on the other, it should be used as a decoder for the TrOCR architecture.

-The training data differs from that used for RoBERTa Base Latin Cased V1 and V2, and therefore also from what is used by [Bamman and Burns (2020)](https://arxiv.org/pdf/2009.10053.pdf). We exclusively used the text from the [Corpus Corporum](https://www.mlat.uzh.ch).
+The training data differs from that used for RoBERTa Base Latin Cased V1 and V2, and therefore also from what is used by [Bamman and Burns (2020)](https://arxiv.org/pdf/2009.10053.pdf). We exclusively used the text from the [Corpus Corporum](https://www.mlat.uzh.ch), collected and maintained by the University of Zurich.

The overall corpus contains 1.5 GB of text data (3× as much as was used for V2, and very likely of better quality).
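
As a rough illustration of the TrOCR use case mentioned in the README, such an LM can be wired in as the text decoder of a vision-encoder/text-decoder model via the `transformers` library. This is a minimal sketch, not part of the README itself: both repository ids below are placeholders, and the encoder-decoder cross-attention weights are freshly initialized, so the combined model would still need fine-tuning on HTR line images with Latin transcriptions.

```python
from transformers import RobertaTokenizerFast, VisionEncoderDecoderModel

# Placeholder ids (assumptions, not taken from this README):
ENCODER_ID = "google/vit-base-patch16-224-in21k"  # assumed vision encoder
DECODER_ID = "path/to/this-latin-roberta-lm"      # placeholder for this LM

tokenizer = RobertaTokenizerFast.from_pretrained(DECODER_ID)

# Combine a pretrained image encoder with the Latin LM as text decoder;
# transformers inserts randomly initialized cross-attention layers.
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    ENCODER_ID, DECODER_ID
)

# Generation settings the combined model needs before fine-tuning/inference.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.vocab_size = model.config.decoder.vocab_size
```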