Finetuning
- Pretrain Model : klue/roberta-small
- Dataset for fine-tuning : 3i4k
- Train : 46,863
- Validation : 8,271 (15% of Train)
- Test : 6,121
- Label info
- 0: "fragment",
- 1: "statement",
- 2: "question",
- 3: "command",
- 4: "rhetorical question",
- 5: "rhetorical command",
- 6: "intonation-dependent utterance"
- Parameters of Training
{
"epochs": 3 (setting 10 but early stopped),
"batch_size":32,
"optimizer_class": "<keras.optimizer_v2.adam.Adam'>",
"optimizer_params": {
"lr": 5e-05
},
"min_delta": 0.01
}
Usage
from transformers import RobertaTokenizerFast, RobertaForSequenceClassification, TextClassificationPipeline
HUGGINGFACE_MODEL_PATH = "bespin-global/klue-roberta-small-3i4k-intent-classification"
loaded_tokenizer = RobertaTokenizerFast.from_pretrained(HUGGINGFACE_MODEL_PATH )
loaded_model = RobertaForSequenceClassification.from_pretrained(HUGGINGFACE_MODEL_PATH )
text_classifier = TextClassificationPipeline(
tokenizer=loaded_tokenizer,
model=loaded_model,
return_all_scores=True
)
text = "your text"
preds_list = text_classifier(text)
best_pred = preds_list[0]
print(f"Label of Best Intentatioin: {best_pred['label']}")
print(f"Score of Best Intentatioin: {best_pred['score']}")
Evaluation
precision recall f1-score support
command 0.89 0.92 0.90 1296
fragment 0.98 0.96 0.97 600
intonation-depedent utterance 0.71 0.69 0.70 327
question 0.95 0.97 0.96 1786
rhetorical command 0.87 0.64 0.74 108
rhetorical question 0.61 0.63 0.62 174
statement 0.91 0.89 0.90 1830
accuracy 0.90 6121
macro avg 0.85 0.81 0.83 6121
weighted avg 0.90 0.90 0.90 6121
Citing & Authors
Jaehyeong at Bespin Global