---
language:
- en
tags:
- intent detection
license: "other"
datasets:
- ibm/vira-intents
metrics:
- accuracy
widget:
- text: "Should I be concerned about side effects of the vaccine if I'm breastfeeding?} & Is breastfeeding safe with the vaccine"
  example_title: "Breastfeeding"
- text: "Does the vaccine prevent transmission?"
  example_title: "Transmission"
- text: "Will the vaccine make me sterile or infertile?	"
  example_title: "Infertility" 
---

## Model Description
This model is based on RoBERTa large (Liu et al., 2019), fine-tuned on a dataset of intent expressions available [here](https://research.ibm.com/haifa/dept/vst/debating_data.shtml) and on the 🤗 Datasets hub [here](https://huggingface.co/datasets/ibm/vira-intents).

The model was created as part of the work described in [Benchmark Data and Evaluation Framework for Intent Discovery Around COVID-19 Vaccine Hesitancy](https://arxiv.org/abs/2205.11966). The model is released under the Community Data License Agreement - Sharing - Version 1.0 ([link](https://cdla.dev/sharing-1-0/)). If you use this model, please cite our paper.

The official GitHub repository is [here](https://github.com/IBM/vira-intent-discovery). The script used for training the model is [trainer.py](https://github.com/IBM/vira-intent-discovery/blob/master/trainer.py).
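
As a usage sketch, the model can be loaded for intent classification with 🤗 Transformers; the model id below is a placeholder for this repository's Hub id, not a value taken from the paper or repo:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder: replace with this model's repository id on the Hub.
model_id = "<this-model-id>"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Score a user question against the trained intent classes.
question = "Does the vaccine prevent transmission?"
inputs = tokenizer(question, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
intent_id = logits.argmax(dim=-1).item()
print(model.config.id2label[intent_id])
```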


## Training parameters
1. base_model = 'roberta-large'
1. learning_rate = 5e-6
1. per_device_train_batch_size = 16
1. per_device_eval_batch_size = 16
1. num_train_epochs = 15
1. load_best_model_at_end = True
1. save_total_limit = 1
1. save_strategy = 'epoch'
1. evaluation_strategy = 'epoch'
1. metric_for_best_model = 'accuracy'
1. seed = 123
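
These settings map directly onto 🤗 Transformers `TrainingArguments`; a minimal sketch (the `output_dir` is an illustrative placeholder, not taken from trainer.py):

```python
from transformers import TrainingArguments

# Sketch of the parameters listed above; output_dir is an assumption.
training_args = TrainingArguments(
    output_dir="vira-intents-model",
    learning_rate=5e-6,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=15,
    load_best_model_at_end=True,
    save_total_limit=1,
    save_strategy="epoch",
    evaluation_strategy="epoch",
    metric_for_best_model="accuracy",
    seed=123,
)
```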

## Data collator
Training used 🤗 Transformers' `DataCollatorWithPadding`, which pads each batch dynamically to the length of its longest sequence.
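
A standalone sketch of what the collator does, using the base tokenizer (the example sentences are taken from the widget above):

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
collator = DataCollatorWithPadding(tokenizer=tokenizer)

# Two examples of different lengths; the collator pads the batch
# to the longest sequence it contains rather than a fixed max length.
features = [
    tokenizer("Does the vaccine prevent transmission?"),
    tokenizer("Will the vaccine make me sterile or infertile?"),
]
batch = collator(features)
print(batch["input_ids"].shape)  # (2, longest_sequence_in_batch)
```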