|
---
license: mit
base_model: roberta-large
tags:
- generated_from_trainer
datasets:
- launch/open_question_type
metrics:
- f1
model-index:
- name: roberta-large-question-classifier
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: launch/open_question_type
      type: launch/open_question_type
      config: default
      split: validation
      args: default
    metrics:
    - name: F1 (macro avg.)
      type: f1
      value: 0.8123190611646329
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: launch/open_question_type
      type: launch/open_question_type
      config: default
      split: test
      args: default
    metrics:
    - name: F1 (macro avg.)
      type: f1
      value: 0.8
widget:
- text: When two bacteria exchange genetic information, what is the process called?
language:
- en
arxiv: 2107.00152
---
|
|
|
|
|
# roberta-large-question-classifier |
|
|
|
This model classifies questions according to the question-type ontology defined in the paper [Controllable Open-ended Question Generation with A New Question Type Ontology](https://aclanthology.org/2021.acl-long.502/) (Cao & Wang, ACL-IJCNLP 2021).
|
It is [roberta-large](https://huggingface.co/roberta-large) fine-tuned on the [open_question_type](https://huggingface.co/datasets/launch/open_question_type) dataset.
|
It achieves the following results on the test set: |
|
|
|
```
              precision    recall  f1-score   support

       cause       0.91      0.93      0.92        91
  comparison       0.62      0.83      0.71        30
     concept       0.85      0.65      0.74        54
 consequence       0.80      0.73      0.76        11
 disjunction       0.80      0.78      0.79        36
     example       0.83      0.85      0.84       139
      extent       0.82      0.94      0.87        48
  judgmental       0.68      0.56      0.62        94
  procedural       0.86      0.88      0.87        85
verification       0.79      0.86      0.83        72

    accuracy                           0.81       660
   macro avg       0.80      0.80      0.80       660
weighted avg       0.81      0.81      0.81       660
```
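
## Usage

Below is a minimal usage sketch with the `transformers` pipeline. The model id in the example is a placeholder; point it at wherever this checkpoint is hosted.

```python
from transformers import pipeline

# Placeholder model id; replace with the actual repository path of this checkpoint.
classifier = pipeline(
    "text-classification",
    model="roberta-large-question-classifier",
)

question = "When two bacteria exchange genetic information, what is the process called?"
print(classifier(question))
# Output shape: [{'label': '<one of the ten question types>', 'score': <float>}]
```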
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
|
- learning_rate: 2e-05 |
|
- train_batch_size: 16 |
|
- eval_batch_size: 512 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- lr_scheduler_warmup_ratio: 0.1 |
|
- num_epochs: 30 |
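
As a sketch, the list above maps onto Transformers `TrainingArguments` roughly as follows. The `output_dir` is a placeholder, and per-epoch evaluation is inferred from the results table below rather than stated in the original card.

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters; output_dir is a placeholder and
# evaluation_strategy="epoch" is inferred from the per-epoch results table.
training_args = TrainingArguments(
    output_dir="roberta-large-question-classifier",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=512,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=30,
    evaluation_strategy="epoch",
)
```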
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss | F1     |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 1.9467        | 1.0   | 233  | 1.3099          | 0.4050 |
| 0.6381        | 2.0   | 466  | 0.5586          | 0.7785 |
| 0.628         | 3.0   | 699  | 0.6419          | 0.7831 |
| 0.4487        | 4.0   | 932  | 0.5770          | 0.8094 |
| 0.3319        | 5.0   | 1165 | 0.7713          | 0.7953 |
| 0.2095        | 6.0   | 1398 | 0.8799          | 0.8018 |
| 0.1355        | 7.0   | 1631 | 1.0646          | 0.7961 |
| 0.0956        | 8.0   | 1864 | 1.2175          | 0.7999 |
| 0.0687        | 9.0   | 2097 | 1.3647          | 0.7892 |
| 0.0371        | 10.0  | 2330 | 1.3809          | 0.7987 |
| 0.0303        | 11.0  | 2563 | 1.3591          | 0.8123 |
| 0.0263        | 12.0  | 2796 | 1.5317          | 0.8100 |
| 0.0144        | 13.0  | 3029 | 1.5726          | 0.7959 |
| 0.0436        | 14.0  | 3262 | 1.6160          | 0.7988 |
| 0.0048        | 15.0  | 3495 | 1.6826          | 0.7957 |
| 0.0001        | 16.0  | 3728 | 1.6913          | 0.7957 |
| 0.0001        | 17.0  | 3961 | 1.7076          | 0.7995 |
| 0.0034        | 18.0  | 4194 | 1.8018          | 0.7960 |
| 0.0228        | 19.0  | 4427 | 1.7457          | 0.7916 |
| 0.0083        | 20.0  | 4660 | 1.9279          | 0.7869 |
| 0.0001        | 21.0  | 4893 | 1.8367          | 0.7915 |
| 0.0003        | 22.0  | 5126 | 1.8620          | 0.7842 |
| 0.0002        | 23.0  | 5359 | 1.9192          | 0.7828 |
| 0.0           | 24.0  | 5592 | 1.9081          | 0.7927 |
| 0.0003        | 25.0  | 5825 | 1.9822          | 0.7813 |
| 0.0059        | 26.0  | 6058 | 1.8737          | 0.7954 |
| 0.0           | 27.0  | 6291 | 1.8793          | 0.7929 |
| 0.0           | 28.0  | 6524 | 1.8905          | 0.7940 |
| 0.0           | 29.0  | 6757 | 1.8971          | 0.7940 |
| 0.0002        | 30.0  | 6990 | 1.9002          | 0.7954 |
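
The `F1` column is the macro-averaged F1 on the validation split; its best value (0.8123 at epoch 11) matches the validation figure reported in the metadata above, which suggests the best checkpoint was retained. Below is a sketch of how such a metric is typically computed during training, assuming a standard `compute_metrics` hook rather than the author's exact code.

```python
import numpy as np
from sklearn.metrics import f1_score

# Assumed metric implementation: macro-averaged F1 over validation predictions,
# matching the "F1" column in the table above.
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"f1": f1_score(labels, preds, average="macro")}
```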
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.33.2 |
|
- Pytorch 2.1.0+cu118 |
|
- Datasets 2.14.5 |
|
- Tokenizers 0.13.3 |