PaloBERT

Model description

A Greek language model based on RoBERTa
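Since this is a RoBERTa-style fill-mask model, it can be loaded with the 🤗 Transformers `pipeline` API. The repository id below is an assumption for illustration; substitute the actual id shown on this model's page. The `mask_sentence` helper simply inserts the model's `<mask>` token into a sentence.

```python
def mask_sentence(sentence: str, word: str, mask_token: str = "<mask>") -> str:
    """Replace `word` in `sentence` with the model's mask token."""
    return sentence.replace(word, mask_token)


if __name__ == "__main__":
    from transformers import pipeline

    # Assumed repository id -- check this model card's page for the exact name.
    MODEL_ID = "gealexandri/palobert-base-greek-uncased-v1"

    fill = pipeline("fill-mask", model=MODEL_ID)

    # "Athens is the <mask> of Greece."
    masked = mask_sentence("Η Αθήνα είναι η πρωτεύουσα της Ελλάδας.", "πρωτεύουσα")
    for prediction in fill(masked):
        print(prediction["token_str"], round(prediction["score"], 3))
```

The same repository can also be used with `AutoTokenizer.from_pretrained(MODEL_ID)` to load the accompanying tokenizer on its own.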

Training data

The model was trained on a corpus of 458,293 documents collected from Greek social media accounts. The repository also includes a GPT-2 tokenizer trained from scratch on the same corpus.

The training corpus was collected and provided by Palo LTD.

Eval results

BibTeX entry and citation info


@Article{info12080331,
AUTHOR = {Alexandridis, Georgios and Varlamis, Iraklis and Korovesis, Konstantinos and Caridakis, George and Tsantilas, Panagiotis},
TITLE = {A Survey on Sentiment Analysis and Opinion Mining in Greek Social Media},
JOURNAL = {Information},
VOLUME = {12},
YEAR = {2021},
NUMBER = {8},
ARTICLE-NUMBER = {331},
URL = {https://www.mdpi.com/2078-2489/12/8/331},
ISSN = {2078-2489},
DOI = {10.3390/info12080331}
}