KoELECTRA (Base Generator)
Pretrained ELECTRA Language Model for Korean (koelectra-base-generator
)
For more detail, please see original repository.
Usage
Load model and tokenizer
>>> from transformers import ElectraModel, ElectraTokenizer
>>> model = ElectraModel.from_pretrained("monologg/koelectra-base-generator")
>>> tokenizer = ElectraTokenizer.from_pretrained("monologg/koelectra-base-generator")
Tokenizer example
>>> from transformers import ElectraTokenizer
>>> tokenizer = ElectraTokenizer.from_pretrained("monologg/koelectra-base-generator")
>>> tokenizer.tokenize("[CLS] 한국어 ELECTRA를 공유합니다. [SEP]")
['[CLS]', '한국어', 'E', '##L', '##EC', '##T', '##RA', '##를', '공유', '##합니다', '.', '[SEP]']
>>> tokenizer.convert_tokens_to_ids(['[CLS]', '한국어', 'E', '##L', '##EC', '##T', '##RA', '##를', '공유', '##합니다', '.', '[SEP]'])
[2, 18429, 41, 6240, 15229, 6204, 20894, 5689, 12622, 10690, 18, 3]
Example using ElectraForMaskedLM
from transformers import pipeline
fill_mask = pipeline(
"fill-mask",
model="monologg/koelectra-base-generator",
tokenizer="monologg/koelectra-base-generator"
)
print(fill_mask("나는 {} 밥을 먹었다.".format(fill_mask.tokenizer.mask_token)))
- Downloads last month
- 355
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.