tyqiangz/indobert-lite-large-p2-smsa

@@ -12,42 +12,42 @@ datasets:
 # IndoBERT-Lite Large Model (phase2 - uncased) Finetuned on IndoNLU SmSA dataset
-Finetuned the IndoBERT-Lite Large Model (phase2 - uncased) model following the procedures stated in the paper [IndoNLU: Benchmark and Resources for Evaluating Indonesian
 Natural Language Understanding](https://arxiv.org/pdf/2009.05387.pdf).
-Finetuning hyperparameters:
 - learning rate: 2e-5
 - batch size: 16
 - no. of epochs: 5
 - max sequence length: 512
 - random seed: 42
 ## How to use
 ### Load model and tokenizer
 ```python
-from transformers import BertTokenizer, AutoModelForSequenceClassification
-import torch
-import torch.nn.functional as F
-tokenizer = BertTokenizer.from_pretrained("tyqiangz/indobert-lite-large-p2-smsa")
-model = AutoModelForSequenceClassification.from_pretrained("tyqiangz/indobert-lite-large-p2-smsa")
 text = "Penyakit koronavirus 2019"
-index_to_word = {0: 'positive', 1: 'neutral', 2: 'negative'}
-subwords = tokenizer.encode(text, add_special_tokens=True)
-subwords = torch.LongTensor(subwords).view(1, -1).to(model.device)
-logits = model(subwords)[0]
-label = torch.topk(logits, k=1, dim=-1)[1].squeeze().item()
-print(index_to_word[label])
 """
 Output:
-'negative'
 """
 ```

 # IndoBERT-Lite Large Model (phase2 - uncased) Finetuned on IndoNLU SmSA dataset
+Finetuned the IndoBERT-Lite Large Model (phase2 - uncased) model on the IndoNLU SmSA dataset following the procedures stated in the paper [IndoNLU: Benchmark and Resources for Evaluating Indonesian
 Natural Language Understanding](https://arxiv.org/pdf/2009.05387.pdf).
+**Finetuning hyperparameters:**
 - learning rate: 2e-5
 - batch size: 16
 - no. of epochs: 5
 - max sequence length: 512
 - random seed: 42
+**Classes:**
+- 0: positive
+- 1: neutral
+- 2: negative
+Validation accuracy: 0.94
+Validation F1: 0.91
+Validation Recall: 0.91
+Validation Precision: 0.93
 ## How to use
 ### Load model and tokenizer
 ```python
+from transformers import pipeline
+classifier = pipeline("text-classification",
+                      model='tyqiangz/indobert-lite-large-p2-smsa', return_all_scores=True)
 text = "Penyakit koronavirus 2019"
+prediction = classifier(text)
+prediction
 """
 Output:
+[[{'label': 'positive', 'score': 0.0006000096909701824},
+  {'label': 'neutral', 'score': 0.01223431620746851},
+  {'label': 'negative', 'score': 0.987165629863739}]]
 """
 ```
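
The pipeline call above returns a score for every class. If only the single most likely class is needed, the result can be reduced with a small helper. The following is a minimal sketch, not part of the model card: `top_label` is a hypothetical name, and the sample `prediction` is copied from the output shown above so the snippet runs without downloading the model. It assumes the nested list shape produced with `return_all_scores=True`; some `transformers` versions may return a flat list for a single string, so the helper accepts both shapes.

```python
def top_label(prediction):
    """Return (label, score) of the highest-scoring class for one input text.

    `top_label` is a hypothetical helper for illustration only.
    """
    # With return_all_scores=True the pipeline yields one list of class
    # scores per input text, e.g. [[{...}, {...}, {...}]]. Unwrap it if nested.
    scores = prediction[0] if isinstance(prediction[0], list) else prediction
    best = max(scores, key=lambda item: item["score"])
    return best["label"], best["score"]


# Sample output copied from the example above (no model download needed).
prediction = [[{'label': 'positive', 'score': 0.0006000096909701824},
               {'label': 'neutral', 'score': 0.01223431620746851},
               {'label': 'negative', 'score': 0.987165629863739}]]

label, score = top_label(prediction)
print(label, round(score, 3))  # negative 0.987
```

In real use, `prediction` would be the value returned by `classifier(text)`.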