metadata
language:
  - en
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - boolq
metrics:
  - accuracy
model_index:
  - name: distilbert-base-uncased-boolq
    results:
      - task:
          name: Question Answering
          type: question-answering
        dataset:
          name: boolq
          type: boolq
          args: default
        metric:
          name: Accuracy
          type: accuracy
          value: 0.7314984709480122
model-index:
  - name: andi611/distilbert-base-uncased-qa-boolq
    results:
      - task:
          type: natural-language-inference
          name: Natural Language Inference
        dataset:
          name: boolq
          type: boolq
          config: default
          split: train
        metrics:
          - type: accuracy
            value: 0.875676249071815
            name: Accuracy
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNDdjNjIwZDRlZDkzZDZmM2JmYzA0ZjIwMjBlZTI3OWQ5ZWNiNWU0OWI2ZWZmMGI2OGZmMDVhYzhjOTE1M2UzNSIsInZlcnNpb24iOjF9.A4-llThkLZ5SdVf6KTc7kWnJlpPna5b7hhzR7DdbFozIvqlFSeXqUhYf9lxn2svdvfiCJSsP3kHzcn46lYybAg
          - type: precision
            value: 0.8591506263366941
            name: Precision
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmRhNDNhYjE3YTY4Mjk2ZThlMzQ5MGZiNGIxNmM4NDBlNzdlODkxYjRmNWM4YzAwZTlkOTFhZmJkMTQzZTYyZiIsInZlcnNpb24iOjF9.wl_bDHN2z0BXD5_IlLY8eQHFeCRkUGSj3NMOchIcbphiqoVoC_eWZNQqpZhM0XgCdoQrRKw4MNjCiwDq3euYCQ
          - type: recall
            value: 0.9574395641811372
            name: Recall
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZGM0ZWJhYjI4YWIwNGQxZGQ3MTA4M2JiOTE5ZDc1ZDk5YjI5N2VjYjQzMTM3ZjM4YjVlNjNhNmU0MTVjZGJkNCIsInZlcnNpb24iOjF9.oC3_3F4164-tAIb0huR5xdzzRLpbxyJ52waXaWjbES8h0YRCrIjzmzgbhx4PPulxm8J59X1RF1wFsVXFFco3Bg
          - type: auc
            value: 0.9423158636459945
            name: AUC
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYmE5YjAxYjVmMjQwMzM0ZjBmNmM2NjFjZTcxMzIzNTk3NTdlNzVlOTM3YTMxMTdlNWMzNmE3YTk5MDQ1Y2VhYSIsInZlcnNpb24iOjF9.96hf0lrJ59bzlDm8lX9fv4WqNTP0mFVtpILWz-L3yBZyb4TIIKUh-JgDRwlLPu-JZlZS-gJSeAxPobrhJY0iCg
          - type: f1
            value: 0.9056360708534621
            name: F1
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMzk0N2EzMWUyZGI4NmE3NjlkZDI3ZjkwMDIwZTdhNzAwZjBjNmYxYjYzYjJkMjFlOWRiNWUxMTFiZmM5ZmJhNyIsInZlcnNpb24iOjF9.W5wBUPEtxI2Movs6_UKrxA5sNNgV7m619TLWfwG5uSA0bgcE9xmH9EnNljsbSnFn2ObxTmrUK-W0OZ3SzL9hCg
          - type: loss
            value: 0.45028823614120483
            name: loss
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZDdkOGYyMTJlYmRlYTRmNGI3YzkyOTRhOGMwZGY2N2MzYjVhZjgwN2U2YjdjMGQzMmYyZjFkMTFlM2Y0NmQ0ZCIsInZlcnNpb24iOjF9.PJWhSy48ZNYnp76dTuvhuvj-EFFWd8hzN5He1nIlHOqiPHglCtnSon161R7Ar4ILWy4LyPM8ByRslhzJfj-WDw

distilbert-base-uncased-boolq

This model is a fine-tuned version of distilbert-base-uncased on the boolq dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2071
  • Accuracy: 0.7315

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 5
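
As a rough illustration of what the scheduler settings above imply, the sketch below computes the learning rate at a given optimizer step under linear warmup followed by linear decay to zero. It assumes the `linear` scheduler type behaves as the usual warmup-then-decay schedule in Transformers, and it takes the total of 2655 steps from the training results table (5 epochs of 531 steps). The function name `linear_warmup_lr` is hypothetical and not part of any library.

```python
def linear_warmup_lr(step, base_lr=5e-5, warmup_steps=1000, total_steps=2655):
    """Learning rate at optimizer step `step`.

    Assumption: linear warmup from 0 to base_lr over `warmup_steps`,
    then linear decay from base_lr to 0 at `total_steps`.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Steps at the end of each epoch, plus the warmup boundary.
    for step in (0, 531, 1000, 1062, 1593, 2124, 2655):
        print(step, linear_warmup_lr(step))
```

Under these assumptions the warmup only finishes partway through the second epoch, so the full learning rate of 5e-05 is reached at step 1000 and decays from there.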

Training results

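| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---------------|-------|------|-----------------|----------|
| 0.6506        | 1.0   | 531  | 0.6075          | 0.6681   |
| 0.575         | 2.0   | 1062 | 0.5816          | 0.6978   |
| 0.4397        | 3.0   | 1593 | 0.6137          | 0.7253   |
| 0.2524        | 4.0   | 2124 | 0.8124          | 0.7466   |
| 0.126         | 5.0   | 2655 | 1.1437          | 0.7370   |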
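
The per-epoch numbers above can be compared programmatically. The following is a minimal sketch that copies the rows into a list and picks the epoch with the highest accuracy and the epoch with the lowest validation loss. The helper names are hypothetical, and which criterion was used to choose the published checkpoint is not stated in this card.

```python
# Rows copied from the training results:
# (epoch, step, training_loss, validation_loss, accuracy)
RESULTS = [
    (1, 531, 0.6506, 0.6075, 0.6681),
    (2, 1062, 0.575, 0.5816, 0.6978),
    (3, 1593, 0.4397, 0.6137, 0.7253),
    (4, 2124, 0.2524, 0.8124, 0.7466),
    (5, 2655, 0.126, 1.1437, 0.7370),
]


def best_by_accuracy(rows):
    """Return the row with the highest validation accuracy."""
    return max(rows, key=lambda row: row[4])


def best_by_validation_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


if __name__ == "__main__":
    print("highest accuracy at epoch", best_by_accuracy(RESULTS)[0])
    print("lowest validation loss at epoch", best_by_validation_loss(RESULTS)[0])
```

The two criteria disagree here: training loss keeps falling while validation loss rises after the second epoch, a pattern often read as overfitting, even though accuracy continues to improve through the fourth epoch.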

Framework versions

  • Transformers 4.8.2
  • PyTorch 1.8.1+cu111
  • Datasets 1.8.0
  • Tokenizers 0.10.3