---
widget:
- text: "Pelayanan lama dan tidak ramah."
  example_title: "Sentiment analysis"
---

### Training hyperparameters

* train_batch_size: 32
* eval_batch_size: 32
* learning_rate: 1e-4
* optimizer: AdamW with betas=(0.9, 0.999), eps=1e-8, and weight_decay=0.01
* epochs: 3
* learning_rate_scheduler: StepLR with step_size=592, gamma=0.1

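Under the StepLR schedule listed above, the learning rate is multiplied by `gamma` once every `step_size` optimizer steps. A minimal pure-Python sketch of the resulting schedule (the function name `steplr_lr` is illustrative, not part of the training code):

```python
def steplr_lr(initial_lr: float, step_size: int, gamma: float, step: int) -> float:
    """Learning rate after `step` optimizer steps under a StepLR decay."""
    return initial_lr * gamma ** (step // step_size)

# With the values above, the rate stays at 1e-4 for the first 592 steps,
# then drops by a factor of 10 (gamma=0.1) at each subsequent boundary.
lr_start = steplr_lr(1e-4, 592, 0.1, 0)
lr_after_decay = steplr_lr(1e-4, 592, 0.1, 592)
```

In PyTorch itself this corresponds to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=592, gamma=0.1)`.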
### Training Results

The following table shows the training results for the model:

| Epoch | Loss | Accuracy |
|---|---|---|
| 1 | 0.2936 | 0.9310 |
| 2 | 0.1212 | 0.9526 |
| 3 | 0.0795 | 0.9569 |

### How to Use

You can load the model and perform inference as follows:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("taufiqdp/indonesian-sentiment")
model = AutoModelForSequenceClassification.from_pretrained("taufiqdp/indonesian-sentiment")

class_names = ['negatif', 'netral', 'positif']

text = "Pelayanan lama dan tidak ramah"
tokenized_text = tokenizer(text, return_tensors='pt')

with torch.inference_mode():
    logits = model(**tokenized_text)['logits']

# .item() converts the argmax tensor to a plain int for list indexing
result = class_names[logits.argmax(dim=1).item()]
print(result)
```
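The snippet above prints only the argmax label. If class probabilities are wanted as well, the logits can be passed through a softmax first; a minimal pure-Python sketch (the logit values below are hypothetical, for illustration only):

```python
import math

def softmax(logits):
    """Convert a list of raw logits to probabilities (numerically stable)."""
    m = max(logits)                                # subtract max to avoid overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits ordered as ['negatif', 'netral', 'positif']
probs = softmax([2.1, -0.3, 0.4])
```

With torch available, the equivalent one-liner is `logits.softmax(dim=1)`.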

### Citation

```bibtex
@misc{koto2020indolem,
    title={IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP},
    author={Fajri Koto and Afshin Rahimi and Jey Han Lau and Timothy Baldwin},
    year={2020},
    eprint={2011.00677},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```