
A Transformer-based Persian Language Model Further Pretrained on Persian Poetry

ALBERT for the Persian language was first introduced by Hooshvare as a lite BERT for self-supervised learning of language representations, with a vocabulary size of 30,000. Here we build on its capabilities by further pretraining it on a large corpus of Persian poetry. The model was post-trained on 80 percent of the verses in Ganjoor, the Persian poetry dataset, and evaluated on the remaining 20 percent.
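
The 80/20 split described above could be sketched roughly as follows. This is only an illustration: the description does not say how the split was made, so the random verse-level shuffle, the fixed seed, and the helper name `split_verses` are all assumptions, and the placeholder strings stand in for actual Ganjoor verses.

```python
import random


def split_verses(verses, train_fraction=0.8, seed=0):
    """Shuffle verses and split them into training and held-out parts.

    Hypothetical helper: the actual split procedure is not documented here.
    """
    shuffled = list(verses)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder data; in practice these would be verses loaded from Ganjoor.
verses = [f"verse {i}" for i in range(10)]
train, held_out = split_verses(verses)
print(len(train), len(held_out))
```

A verse-level shuffle is the simplest choice; splitting by poem or by poet instead would keep closely related verses from appearing on both sides of the split.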