---
datasets:
- klue
- wikipedia
language:
- ko
metrics:
- accuracy
training_args:
- num_train_epochs=5
- per_device_train_batch_size=16
- per_device_eval_batch_size=16
- prediction_loss_only=False
- learning_rate=5e-5
- logging_strategy="steps"
- logging_steps=100
- save_steps=1000
- eval_steps=1000
- save_strategy="steps"
- evaluation_strategy="steps"
- load_best_model_at_end=True
- metric_for_best_model="masked_accuracy"
- greater_is_better=True
- seed=42
- warmup_steps=5000
info:
- MLM (15% masking) continued from the checkpoint of klue/roberta-large
- LineByLineTextDataset (block_size 384)
- PLM for an ODQA task based on Wikipedia questions
- Accuracy (for [MASK]) = 0.7066 (CE loss 1.388)
- v2 is trained with a smaller learning rate and more epochs
---
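
Below is a minimal sketch of how the training setup listed above could be reproduced with the `transformers` `Trainer`. The corpus file paths, the output directory, and the `masked_accuracy` metric implementation are assumptions for illustration; only the hyperparameters, the base checkpoint, the 15% MLM masking, and the `LineByLineTextDataset` with `block_size` 384 come from the card itself.

```python
# Sketch of the MLM continued-pretraining setup described in this card.
# File paths ("wiki_ko_train.txt", "wiki_ko_eval.txt") and output_dir are
# placeholders, and compute_metrics is an assumed implementation of the
# "masked_accuracy" metric named in training_args.
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("klue/roberta-large")
model = AutoModelForMaskedLM.from_pretrained("klue/roberta-large")

# LineByLineTextDataset with block_size 384, as stated in the card.
train_dataset = LineByLineTextDataset(
    tokenizer=tokenizer, file_path="wiki_ko_train.txt", block_size=384
)
eval_dataset = LineByLineTextDataset(
    tokenizer=tokenizer, file_path="wiki_ko_eval.txt", block_size=384
)

# Standard MLM collator with 15% masking probability.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

def compute_metrics(eval_pred):
    # Accuracy over [MASK] positions only; unmasked labels are -100
    # (an assumed implementation of "masked_accuracy").
    logits, labels = eval_pred
    preds = logits.argmax(-1)
    mask = labels != -100
    return {"masked_accuracy": float((preds[mask] == labels[mask]).mean())}

training_args = TrainingArguments(
    output_dir="./roberta-large-mlm-ko",  # placeholder
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    prediction_loss_only=False,
    learning_rate=5e-5,
    logging_strategy="steps",
    logging_steps=100,
    save_steps=1000,
    eval_steps=1000,
    save_strategy="steps",
    evaluation_strategy="steps",
    load_best_model_at_end=True,
    metric_for_best_model="masked_accuracy",
    greater_is_better=True,
    seed=42,
    warmup_steps=5000,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)
trainer.train()
```

With `load_best_model_at_end=True` and `metric_for_best_model="masked_accuracy"`, the checkpoint with the highest masked-token accuracy on the eval set is restored after training, which matches the reported 0.7066 accuracy being the selection criterion rather than the final-step value.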