Dialog-KoELECTRA is a language model specialized for dialogue. It was trained on 22GB of colloquial and written-style Korean text data. Dialog-KoELECTRA is built on the ELECTRA model. ELECTRA is a method for self-supervised language representation learning that can pre-train transformer networks using relatively little compute. ELECTRA models are trained to distinguish "real" input tokens from "fake" input tokens generated by another neural network, similar to the discriminator of a GAN. At small scale, ELECTRA achieves strong results even when trained on a single GPU.
We are initially releasing a small version of the pre-trained model, trained on Korean text. We hope to release other models, such as base and large versions, in the future.
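The model can be loaded with the Hugging Face Transformers library. Below is a minimal sketch; the repository id used here is an assumption for illustration and may differ from the published checkpoint name.

```python
# Minimal sketch of loading the model with Hugging Face Transformers.
# NOTE: the repo id below is an assumption; substitute the actual checkpoint name.
from transformers import ElectraModel, ElectraTokenizer

model_name = "skplanet/dialog-koelectra-small-discriminator"  # hypothetical repo id
tokenizer = ElectraTokenizer.from_pretrained(model_name)
model = ElectraModel.from_pretrained(model_name)

inputs = tokenizer("안녕하세요, 오늘 날씨 어때요?", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```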
| Batch Size | Train Steps |
| --- | --- |
Dialog-KoELECTRA shows strong performance in conversational downstream tasks.
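As an illustration of downstream use, the discriminator can be fine-tuned for a conversational classification task. The sketch below assumes a hypothetical repo id and a two-label task; it is not the exact setup used in our experiments.

```python
# Hedged sketch: fine-tuning setup for a conversational classification task.
# The repo id, labels, and examples are placeholders, not the authors' actual setup.
import torch
from transformers import ElectraForSequenceClassification, ElectraTokenizer

model_name = "skplanet/dialog-koelectra-small-discriminator"  # hypothetical repo id
tokenizer = ElectraTokenizer.from_pretrained(model_name)
model = ElectraForSequenceClassification.from_pretrained(model_name, num_labels=2)

batch = tokenizer(["정말 좋아요!", "별로예요."], padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])
loss = model(**batch, labels=labels).loss  # standard cross-entropy loss
loss.backward()  # plug into your preferred training loop or Trainer
```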
| style | corpus | size |
| --- | --- | --- |
| dialog | Aihub Korean dialog corpus | 7GB |
| | NIKL Spoken corpus | |
| | Korean chatbot data | |
| written | NIKL Newspaper corpus | 15GB |
We applied morpheme analysis using huggingface_konlpy when creating the vocabulary. In our experiments, this vocabulary performed better than one created without morpheme analysis.
| vocabulary size | unused token size | limit alphabet | min frequency |
| --- | --- | --- | --- |
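As a rough sketch of this approach, the following builds a morpheme-aware WordPiece vocabulary by pre-tokenizing the corpus with konlpy's Okt analyzer and then training with the tokenizers library (rather than huggingface_konlpy itself). The file paths and hyperparameter values are placeholders, not the settings used for this model.

```python
# Hedged sketch: build a WordPiece vocabulary from morpheme-analyzed text.
# Paths and hyperparameter values below are placeholders, not this model's settings.
from konlpy.tag import Okt
from tokenizers import BertWordPieceTokenizer

okt = Okt()

# Pre-tokenize the corpus into morphemes so WordPiece learns morpheme-level subwords.
with open("corpus.txt", encoding="utf-8") as src, \
        open("corpus_morphs.txt", "w", encoding="utf-8") as dst:
    for line in src:
        dst.write(" ".join(okt.morphs(line.strip())) + "\n")

tokenizer = BertWordPieceTokenizer(lowercase=False)
tokenizer.train(
    files=["corpus_morphs.txt"],
    vocab_size=40000,      # "vocabulary size" (placeholder value)
    limit_alphabet=6000,   # "limit alphabet" (placeholder value)
    min_frequency=3,       # "min frequency" (placeholder value)
    special_tokens=["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"]
    + [f"[unused{i}]" for i in range(500)],  # "unused token size" (placeholder)
)
tokenizer.save_model(".")  # writes vocab.txt
```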