Language
#3
by johnlockejrr - opened
Is the model trained only on Ivrit, or also on Classical Hebrew proper (Biblical, Tannaitic, etc.)?
The training data does include CH, so the model is trained on it, but its portion is probably buried well under the main Modern Hebrew distribution mass. Having said that, I'd expect it to understand CH relatively well, although I doubt it can generate it.
Dicta reports a better model for that, but I couldn't find its repo. The paper is available here:
https://arxiv.org/pdf/2309.14568
Just wanted to be sure, because I work with CH only. Thank you for the clarification!