---
license: mit
tags:
- generated_from_trainer
datasets:
- wikitext
metrics:
- accuracy
model-index:
- name: wikitext_roberta-base
  results:
  - task:
      name: Masked Language Modeling
      type: fill-mask
    dataset:
      name: wikitext
      type: wikitext
      args: wikitext-2-raw-v1
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.7311184760057123
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# wikitext_roberta-base

This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the wikitext dataset (wikitext-2-raw-v1 configuration).
It achieves the following results on the evaluation set:
- Loss: 1.2506
- Accuracy: 0.7311
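
Since the card's task type is fill-mask, the checkpoint can be queried with the standard `transformers` pipeline. A minimal usage sketch, assuming a hypothetical model id `wikitext_roberta-base` — substitute the actual Hub repo id or a local path to the saved checkpoint:

```python
from transformers import pipeline

# Hypothetical model id; replace with the actual Hub repo id or a
# local directory containing the fine-tuned checkpoint.
fill_mask = pipeline("fill-mask", model="wikitext_roberta-base")

# RoBERTa checkpoints use <mask> as the mask token.
for pred in fill_mask("The capital of France is <mask>."):
    print(f"{pred['token_str'].strip()!r}: score {pred['score']:.3f}")
```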

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

The model was fine-tuned and evaluated on the wikitext-2-raw-v1 configuration of the [wikitext](https://huggingface.co/datasets/wikitext) dataset.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of how they map to `TrainingArguments` follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 50
- num_epochs: 20.0
- mixed_precision_training: Native AMP
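
These settings correspond roughly to the following `TrainingArguments` configuration. This is a reconstruction from the list above, not the actual training script, and `output_dir` is illustrative:

```python
from transformers import TrainingArguments

# Reconstruction of the listed hyperparameters; output_dir is illustrative.
training_args = TrainingArguments(
    output_dir="wikitext_roberta-base",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=16,  # 8 x 16 = total train batch of 128
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=20.0,
    adam_beta1=0.9,                  # Adam settings as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                       # "Native AMP" mixed precision
)
```

The total train batch size of 128 follows from the per-device batch size of 8 accumulated over 16 steps on a single device.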

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.4175 | 0.99 | 37 | 1.3355 | 0.7194 |
| 1.438 | 1.99 | 74 | 1.2953 | 0.7249 |
| 1.4363 | 2.99 | 111 | 1.2759 | 0.7276 |
| 1.3391 | 3.99 | 148 | 1.2904 | 0.7252 |
| 1.3741 | 4.99 | 185 | 1.2621 | 0.7290 |
| 1.2771 | 5.99 | 222 | 1.2312 | 0.7353 |
| 1.287 | 6.99 | 259 | 1.2542 | 0.7289 |
| 1.29 | 7.99 | 296 | 1.2290 | 0.7345 |
| 1.2948 | 8.99 | 333 | 1.2537 | 0.7286 |
| 1.2741 | 9.99 | 370 | 1.2199 | 0.7354 |
| 1.2342 | 10.99 | 407 | 1.2520 | 0.7309 |
| 1.2199 | 11.99 | 444 | 1.2738 | 0.7260 |
| 1.206 | 12.99 | 481 | 1.2286 | 0.7335 |
| 1.221 | 13.99 | 518 | 1.2421 | 0.7327 |
| 1.2062 | 14.99 | 555 | 1.2402 | 0.7328 |
| 1.2305 | 15.99 | 592 | 1.2473 | 0.7308 |
| 1.2426 | 16.99 | 629 | 1.2250 | 0.7318 |
| 1.2096 | 17.99 | 666 | 1.2186 | 0.7353 |
| 1.1961 | 18.99 | 703 | 1.2214 | 0.7361 |
| 1.2136 | 19.99 | 740 | 1.2506 | 0.7311 |
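
Validation accuracy peaks at 0.7361 around epoch 19 before the final epoch settles at 0.7311, the value reported in the card header. The card does not document how accuracy was computed; a common choice for MLM evaluation (e.g., in the `run_mlm.py` example script) is token accuracy over the masked positions only, sketched here as an assumption rather than a confirmed excerpt:

```python
import numpy as np

# Assumed metric: token accuracy over masked positions only, where
# unmasked positions carry the ignore label -100.
def compute_metrics(eval_pred):
    predictions, labels = eval_pred   # predicted token ids, gold token ids
    predictions = predictions.reshape(-1)
    labels = labels.reshape(-1)
    mask = labels != -100             # score only the masked tokens
    return {"accuracy": float((predictions[mask] == labels[mask]).mean())}
```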

### Framework versions

- Transformers 4.21.0.dev0
- PyTorch 1.11.0+cu113
- Datasets 2.3.3.dev0
- Tokenizers 0.12.1