chkla committed on
Commit 90d2663 • 1 Parent(s): e969c33

Update README.md

Files changed (1)
  1. README.md +27 -4
README.md CHANGED
@@ -1,22 +1,45 @@
  Welcome to **RoBERTArg**!

 - 🤖 **Training**: Model (RoBERTa-base) fine-tuned on a heterogeneous argument corpus (~40K sentences, Stab et al. 2018 📚) of several controversial topics (abortion etc.).

 - 🏷 Using **NON-ARGUMENT** (0) and **ARGUMENT** (1) as labels.

 - **Performance**

  | Model | Accuracy | F1 | Recall ARG | Recall NON-ARG | Precision ARG | Precision NON-ARG |
  |----|----|----|----|----|----|----|
  | RoBERTArg | 0.8193 | 0.8021 | 0.8463 | 0.7986 | 0.7623 | 0.8719 |

 - **Confusion Matrix**

  | | ARGUMENT | NON-ARGUMENT |
  |----|----|----|
  | ARGUMENT | 2213 | 558 |
  | NON-ARGUMENT | 325 | 1790 |

  👉🏾 Check out _chkla/argument-analyzer/_ for more details.

  Enjoy and stay tuned! 🚀
 
  Welcome to **RoBERTArg**!

+ 🤖 **Model description**:
+ This model was trained on ~40k heterogeneous, manually annotated sentences (📚 Stab et al. 2018) covering several controversial topics (abortion etc.) to classify text into one of two labels: 🏷 **NON-ARGUMENT** (0) and **ARGUMENT** (1).
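+ 
+ A minimal inference sketch (assuming the model is published under the Hub id _chkla/roberta-argument_; adjust the id if it differs, and note the labels may surface as LABEL_0/LABEL_1 depending on the saved id2label mapping):
+ 
+ ```
+ # Hypothetical usage example; the repository id below is an assumption, not confirmed by this card.
+ from transformers import pipeline
+ 
+ classifier = pipeline("text-classification", model="chkla/roberta-argument")
+ print(classifier("Abortion should remain legal because it protects women's health."))
+ # -> e.g. [{'label': 'ARGUMENT', 'score': ...}]
+ ```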

+ **Dataset**

+ Please note that the label distribution in the dataset is imbalanced:
+ * NON-ARGUMENTS:
+ * ARGUMENTS:
11
+
12
+ **Model training**
13
+ **RoBERTArg** was fine-tuned on a $RoBERTA_{base}$ pre-trained model using the HuggingFace trainer with the following hyperparameters. The hyperparameters were determined using a hyperparameter search on a 20% validation set.
14
+
15
+ ```
16
+ training_args = TrainingArguments(
17
+ num_train_epochs=2,
18
+ learning_rate=2.3102e-06,
19
+ seed=8,
20
+ per_device_train_batch_size=64,
21
+ per_device_eval_batch_size=64,
22
+ )
23
+ ```
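+ 
+ For context, a sketch of how such arguments are typically passed to the Trainer (the tiny inline dataset and the roberta-base checkpoint below are placeholders, not the actual ~40k-sentence corpus or training script used for this model):
+ 
+ ```
+ from datasets import Dataset
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer
+ 
+ tokenizer = AutoTokenizer.from_pretrained("roberta-base")
+ model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
+ 
+ # Tiny illustrative dataset; label 1 = ARGUMENT, label 0 = NON-ARGUMENT.
+ data = Dataset.from_dict({
+     "text": ["Abortion should remain legal because it protects women's health.",
+              "The debate took place on Tuesday."],
+     "label": [1, 0],
+ })
+ data = data.map(lambda x: tokenizer(x["text"], truncation=True, padding="max_length", max_length=64))
+ 
+ # `training_args` comes from the snippet above.
+ trainer = Trainer(model=model, args=training_args, train_dataset=data, eval_dataset=data)
+ trainer.train()
+ ```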
+ 
+ **Evaluation**
+ The model was evaluated on a held-out 20% of the sentences (80/20 train-test split).

  | Model | Accuracy | F1 | Recall ARG | Recall NON-ARG | Precision ARG | Precision NON-ARG |
  |----|----|----|----|----|----|----|
  | RoBERTArg | 0.8193 | 0.8021 | 0.8463 | 0.7986 | 0.7623 | 0.8719 |

+ The **confusion matrix** on the same 20% evaluation split:

  | | ARGUMENT | NON-ARGUMENT |
  |----|----|----|
  | ARGUMENT | 2213 | 558 |
  | NON-ARGUMENT | 325 | 1790 |
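+ 
+ As a quick sanity check, the reported accuracy can be recomputed from the four cells of this matrix (plain arithmetic, no additional data needed):
+ 
+ ```
+ # Accuracy = correctly classified / all evaluated sentences
+ correct = 2213 + 1790            # diagonal cells
+ total = 2213 + 558 + 325 + 1790  # all cells (4886 sentences)
+ print(round(correct / total, 4)) # 0.8193, matching the table above
+ ```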

+ **Intended Uses & Potential Limitations**
+ The model can serve as a practical starting point for the complex task of **Argument Mining**, which remains challenging not least because of differing conceptions of what constitutes an argument.
+ 
+ This model is part of an open-source project providing several models for detecting arguments in text.
  👉🏾 Check out _chkla/argument-analyzer/_ for more details.

  Enjoy and stay tuned! 🚀