w11wo committed on
Commit 42e5235
1 Parent(s): ca0d6a6

End of training

README.md CHANGED
@@ -1,67 +1,99 @@
  ---
- language: id
- tags:
- - indonesian-roberta-base-posp-tagger
  license: mit
  datasets:
- - indonlu
- widget:
- - text: "Budi sedang pergi ke pasar."
  ---

- ## Indonesian RoBERTa Base POSP Tagger
-
- Indonesian RoBERTa Base POSP Tagger is a part-of-speech token-classification model based on the [RoBERTa](https://arxiv.org/abs/1907.11692) model. The model started from the pre-trained [Indonesian RoBERTa Base](https://hf.co/flax-community/indonesian-roberta-base) model, which was then fine-tuned on [`indonlu`](https://hf.co/datasets/indonlu)'s `POSP` dataset of tag-labelled news.
-
- After training, the model achieved an evaluation F1-macro of 95.34%. On the benchmark test set, it achieved an accuracy of 93.99% and an F1-macro of 88.93%.

- Hugging Face's `Trainer` class from the [Transformers](https://huggingface.co/transformers) library was used to train the model. PyTorch was used as the backend framework during training, though the model remains compatible with other frameworks.

- ## Model

- | Model                                 | #params | Arch.        | Training/Validation data (text) |
- | ------------------------------------- | ------- | ------------ | ------------------------------- |
- | `indonesian-roberta-base-posp-tagger` | 124M    | RoBERTa Base | `POSP`                          |

- ## Evaluation Results

- The model was trained for 10 epochs, and the best model was loaded at the end.

- | Epoch | Training Loss | Validation Loss | Precision | Recall   | F1       | Accuracy |
- | ----- | ------------- | --------------- | --------- | -------- | -------- | -------- |
- | 1     | 0.898400      | 0.343731        | 0.894324  | 0.894324 | 0.894324 | 0.894324 |
- | 2     | 0.294700      | 0.236619        | 0.929620  | 0.929620 | 0.929620 | 0.929620 |
- | 3     | 0.214100      | 0.202723        | 0.938349  | 0.938349 | 0.938349 | 0.938349 |
- | 4     | 0.171100      | 0.183630        | 0.945264  | 0.945264 | 0.945264 | 0.945264 |
- | 5     | 0.143300      | 0.169744        | 0.948469  | 0.948469 | 0.948469 | 0.948469 |
- | 6     | 0.124700      | 0.174946        | 0.947963  | 0.947963 | 0.947963 | 0.947963 |
- | 7     | 0.109800      | 0.167450        | 0.951590  | 0.951590 | 0.951590 | 0.951590 |
- | 8     | 0.101300      | 0.163191        | 0.952475  | 0.952475 | 0.952475 | 0.952475 |
- | 9     | 0.093500      | 0.163255        | 0.953361  | 0.953361 | 0.953361 | 0.953361 |
- | 10    | 0.089000      | 0.164673        | 0.953445  | 0.953445 | 0.953445 | 0.953445 |

- ## How to Use

- ### As Token Classifier

- ```python
- from transformers import pipeline
-
- pretrained_name = "w11wo/indonesian-roberta-base-posp-tagger"
-
- nlp = pipeline(
-     "token-classification",
-     model=pretrained_name,
-     tokenizer=pretrained_name
- )
-
- nlp("Budi sedang pergi ke pasar.")
- ```

- ## Disclaimer

- Do consider the biases that come from both the pre-trained RoBERTa model and the `POSP` dataset, which may be carried over into the results of this model.

- ## Author

- Indonesian RoBERTa Base POSP Tagger was trained and evaluated by [Wilson Wongso](https://w11wo.github.io/). All computation and development were done on Google Colaboratory using their free GPU access.

  ---
  license: mit
+ base_model: flax-community/indonesian-roberta-base
+ tags:
+ - generated_from_trainer
  datasets:
+ - indonlu
+ metrics:
+ - precision
+ - recall
+ - f1
+ - accuracy
+ model-index:
+ - name: indonesian-roberta-base-posp-tagger
+   results:
+   - task:
+       name: Token Classification
+       type: token-classification
+     dataset:
+       name: indonlu
+       type: indonlu
+       config: posp
+       split: validation
+       args: posp
+     metrics:
+     - name: Precision
+       type: precision
+       value: 0.9625100240577386
+     - name: Recall
+       type: recall
+       value: 0.9625100240577386
+     - name: F1
+       type: f1
+       value: 0.9625100240577386
+     - name: Accuracy
+       type: accuracy
+       value: 0.9625100240577386
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ # indonesian-roberta-base-posp-tagger

+ This model is a fine-tuned version of [flax-community/indonesian-roberta-base](https://huggingface.co/flax-community/indonesian-roberta-base) on the indonlu dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.1395
+ - Precision: 0.9625
+ - Recall: 0.9625
+ - F1: 0.9625
+ - Accuracy: 0.9625

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed

+ ## Training procedure

+ ### Training hyperparameters

+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 16
+ - eval_batch_size: 16
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 10
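
These hyperparameters map one-to-one onto `TrainingArguments`. Below is a minimal sketch of an equivalent setup; it is an illustration, not the repo's actual training script, and the 26-label POSP tag set is an assumption:

```python
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "flax-community/indonesian-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(base)
# num_labels=26 assumes the POSP tag set; adjust to the dataset's label list.
model = AutoModelForTokenClassification.from_pretrained(base, num_labels=26)

args = TrainingArguments(
    output_dir="indonesian-roberta-base-posp-tagger",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the default optimizer,
    # matching the values listed above.
)

# Tokenized train/eval splits of indonlu's POSP subset would be passed here.
trainer = Trainer(model=model, args=args)
```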

+ ### Training results

+ | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
+ |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
+ | No log        | 1.0   | 420  | 0.2254          | 0.9313    | 0.9313 | 0.9313 | 0.9313   |
+ | 0.4398        | 2.0   | 840  | 0.1617          | 0.9499    | 0.9499 | 0.9499 | 0.9499   |
+ | 0.1566        | 3.0   | 1260 | 0.1431          | 0.9569    | 0.9569 | 0.9569 | 0.9569   |
+ | 0.103         | 4.0   | 1680 | 0.1412          | 0.9605    | 0.9605 | 0.9605 | 0.9605   |
+ | 0.0723        | 5.0   | 2100 | 0.1408          | 0.9635    | 0.9635 | 0.9635 | 0.9635   |
+ | 0.051         | 6.0   | 2520 | 0.1408          | 0.9642    | 0.9642 | 0.9642 | 0.9642   |
+ | 0.051         | 7.0   | 2940 | 0.1510          | 0.9635    | 0.9635 | 0.9635 | 0.9635   |
+ | 0.0368        | 8.0   | 3360 | 0.1653          | 0.9645    | 0.9645 | 0.9645 | 0.9645   |
+ | 0.0277        | 9.0   | 3780 | 0.1664          | 0.9644    | 0.9644 | 0.9644 | 0.9644   |
+ | 0.0231        | 10.0  | 4200 | 0.1668          | 0.9646    | 0.9646 | 0.9646 | 0.9646   |

+ ### Framework versions

+ - Transformers 4.37.2
+ - Pytorch 2.2.0+cu118
+ - Datasets 2.16.1
+ - Tokenizers 0.15.1
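
The regenerated card drops the old usage example. Inference still works as in the snippet removed above; a minimal sketch reusing it:

```python
from transformers import pipeline

pretrained_name = "w11wo/indonesian-roberta-base-posp-tagger"

# Token-classification pipeline, as in the previous README.
nlp = pipeline(
    "token-classification",
    model=pretrained_name,
    tokenizer=pretrained_name,
)

print(nlp("Budi sedang pergi ke pasar."))  # one dict per tagged token
```
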
config.json CHANGED
@@ -1,10 +1,11 @@
  {
- "_name_or_path": "indonesian-roberta-base-posp-tagger",
+ "_name_or_path": "flax-community/indonesian-roberta-base",
  "architectures": [
  "RobertaForTokenClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
+ "classifier_dropout": null,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
@@ -76,7 +77,7 @@
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
- "transformers_version": "4.8.2",
+ "transformers_version": "4.37.2",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50265
merges.txt CHANGED
@@ -1,4 +1,4 @@
- #version: 0.2 - Trained by `huggingface/tokenizers`
+ #version: 0.2
  a n
  Ġ d
  e r

model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e7e84b30ea165a7c83e532b6b78184ba7bdc6c9d44d606a3d269726f44faac1a
- size 496328272
+ oid sha256:74a172fdb16b3bc119fdf9e657c16e6a31f11ba4e00ea149ae343456ba5da238
+ size 496324072

runs/Feb19_11-00-22_bookbot-h100/events.out.tfevents.1708340422.bookbot-h100.8811.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2a3ee78f3b0ce792d75fc103dabeec566e4a20b04c066bd52dc3f7e431c361b3
+ size 11805

runs/Feb19_11-00-22_bookbot-h100/events.out.tfevents.1708340574.bookbot-h100.8811.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:07f96ad90142a04884a683d82e5468514babcb3d9804c9c733708055997c9516
+ size 560

special_tokens_map.json CHANGED
@@ -1 +1,51 @@
- {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": false}}
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }

tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff
tokenizer_config.json CHANGED
@@ -1 +1,57 @@
- {"unk_token": "<unk>", "bos_token": "<s>", "eos_token": "</s>", "add_prefix_space": true, "errors": "replace", "sep_token": "</s>", "cls_token": "<s>", "pad_token": "<pad>", "mask_token": "<mask>", "special_tokens_map_file": null, "name_or_path": "flax-community/indonesian-roberta-base", "tokenizer_class": "RobertaTokenizer"}
+ {
+   "add_prefix_space": true,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "4": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "errors": "replace",
+   "mask_token": "<mask>",
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "tokenizer_class": "RobertaTokenizer",
+   "trim_offsets": true,
+   "unk_token": "<unk>"
+ }
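
To verify the special-token settings above after loading, assuming the Hub checkpoint matches this commit:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("w11wo/indonesian-roberta-base-posp-tagger")
print(tok.mask_token)                # "<mask>"; lstrip=True lets it absorb a preceding space
print(tok.cls_token, tok.sep_token)  # "<s>" "</s>", matching the map above
print(tok.tokenize("Budi sedang pergi ke pasar."))  # BPE pieces (add_prefix_space=True)
```
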
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3d0253fe1cd338bbbba23d8f23104c4b0a2197245ed788cb131216b57cd655e6
+ size 4728