|
--- |
|
license: agpl-3.0 |
|
--- |
|
|
|
This model was developed in support of the University of Belgrade doctoral dissertation "Composite pseudogrammars based on parallel language models of Serbian" by Mihailo Škorić. |
|
|
|
It generates syntactically masked sentences for Serbian. |
|
|
|
This small GPT-2 model was fine-tuned on several Serbian corpora, augmented using [Serbian Morphological Dictionaries](http://poincare.matf.bg.ac.rs/~cvetana/biblio/22_Vitas_Krstev.pdf). |
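
Since this is a GPT-2 model, it can presumably be loaded with the standard `transformers` text-generation pipeline. The sketch below is a minimal illustration under stated assumptions, not the author's own usage: `MODEL_ID` is a hypothetical placeholder (this card does not state the repository id), the helper names `generation_settings` and `generate` are made up for the example, and the sampling values are arbitrary defaults rather than recommended settings.

```python
# Hypothetical placeholder: replace with this model's actual repository id.
MODEL_ID = "your-namespace/your-model-id"


def generation_settings(max_new_tokens=30, num_samples=3):
    """Return sampling settings for the text-generation pipeline.

    The values are illustrative assumptions, not settings from the model card.
    """
    return {
        "max_new_tokens": max_new_tokens,
        "num_return_sequences": num_samples,
        "do_sample": True,  # sample instead of greedy decoding
        "top_k": 50,
    }


def generate(prompt, model_id=MODEL_ID, **overrides):
    """Generate continuations of `prompt` and return them as a list of strings."""
    # Imported lazily so the settings helper works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    settings = {**generation_settings(), **overrides}
    outputs = generator(prompt, **settings)
    return [item["generated_text"] for item in outputs]
```

Calling `generate(prompt)` with a real repository id downloads the model and returns three sampled continuations. The expected prompt format for syntactically masked sentences is not described in this card, so the prompt is left to the user.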
|
|
|
The corpora include ["The corpus of Contemporary Serbian"](https://drive.google.com/file/d/1wRgoWer6YULGCXR0zWOl1fVA6VIe1DOR), [SrpELTeC](https://drive.google.com/file/d/1RtBXyw5Cdh6y_cqbJoMlYhSwNFydBRUv) and WikiKorpus by [JeRTeh – Society for Language Resources and Technologies](https://jerteh.rs/). |
|
|
|
<b style="color:red">This model is purely experimental! For actual models for Serbian, see <a href="https://huggingface.co/jerteh/gpt2-orao" style="color:blue;font-weight:bold">GPT2-ORAO</a> and <a style="color:blue;font-weight:bold" href="https://huggingface.co/jerteh/gpt2-vrabac">GPT2-VRABAC</a>.</b> |
|
<br/><b>If you use this model for your research, please cite: [https://doi.org/10.3390/math11224660](https://doi.org/10.3390/math11224660)</b> |
|
|