---
tags:
- generated_from_trainer
datasets:
- squad_v2
model-index:
- name: distilbert-finetuned-uncased-squad_v2
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: SQuAD v2
      type: squad_v2
      split: validation
    metrics:
    - type: exact
      value: 100.0
      name: Exact
    - type: f1
      value: 100.0
      name: F1
    - type: total
      value: 2
      name: Total
    - type: HasAns_exact
      value: 100.0
      name: Hasans_exact
    - type: HasAns_f1
      value: 100.0
      name: Hasans_f1
    - type: HasAns_total
      value: 2
      name: Hasans_total
    - type: best_exact
      value: 100.0
      name: Best_exact
    - type: best_exact_thresh
      value: 0.7474104762077332
      name: Best_exact_thresh
    - type: best_f1
      value: 100.0
      name: Best_f1
    - type: best_f1_thresh
      value: 0.7474104762077332
      name: Best_f1_thresh
    - type: total_time_in_seconds
      value: 0.02269491500192089
      name: Total_time_in_seconds
    - type: samples_per_second
      value: 88.1254677460004
      name: Samples_per_second
    - type: latency_in_seconds
      value: 0.011347457500960445
      name: Latency_in_seconds
---

# distilbert-finetuned-uncased-squad_v2

This model was trained from scratch on the squad_v2 dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3332 (the validation loss logged at step 1600, the lowest value in the training results table)
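
The `exact` and `f1` values in the metadata are SQuAD-style answer-overlap scores. The sketch below illustrates how such scores are commonly computed for a single gold/predicted answer pair. It follows the usual SQuAD normalization (lowercasing, removing punctuation and English articles, collapsing whitespace); the function names are illustrative, not part of this repository, and the actual evaluation may differ in details.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_score(gold: str, pred: str) -> int:
    """1 if the normalized strings match exactly, else 0."""
    return int(normalize_answer(gold) == normalize_answer(pred))


def f1_score(gold: str, pred: str) -> float:
    """Token-level F1 between the normalized gold and predicted answers."""
    gold_tokens = normalize_answer(gold).split()
    pred_tokens = normalize_answer(pred).split()
    if not gold_tokens or not pred_tokens:
        # Unanswerable questions: full credit only if both sides are empty.
        return float(gold_tokens == pred_tokens)
    overlap = sum((Counter(gold_tokens) & Counter(pred_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Dataset-level scores are the mean of these per-example values, expressed as percentages.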

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 512
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
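
As a rough illustration of how these values relate, the sketch below recomputes the total train batch size and evaluates a linear learning-rate decay. It assumes a single device and no warmup steps (neither is stated above), and `total_steps=2000` is a placeholder taken from the last logged step, not a confirmed training length. The helper names are hypothetical.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# train_batch_size=128 with gradient_accumulation_steps=4 on one device (assumed)
print(effective_batch_size(128, 4))          # 512

# Learning rate halfway through a hypothetical 2000-step run
print(linear_lr(1000, total_steps=2000))
```

With these assumptions the learning rate starts at 2e-05 and reaches zero at the final step.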

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.6437        | 0.39  | 100  | 2.1780          |
| 2.1596        | 0.78  | 200  | 1.6557          |
| 1.8138        | 1.18  | 300  | 1.5683          |
| 1.6987        | 1.57  | 400  | 1.5076          |
| 1.6586        | 1.96  | 500  | 1.5350          |
| 1.5957        | 1.18  | 600  | 1.4431          |
| 1.5825        | 1.37  | 700  | 1.4955          |
| 1.5523        | 1.57  | 800  | 1.4444          |
| 1.5346        | 1.76  | 900  | 1.3930          |
| 1.5098        | 1.96  | 1000 | 1.4285          |
| 1.4632        | 2.16  | 1100 | 1.3630          |
| 1.4468        | 2.35  | 1200 | 1.3710          |
| 1.4343        | 2.55  | 1300 | 1.3422          |
| 1.4225        | 2.75  | 1400 | 1.3971          |
| 1.4080        | 2.94  | 1500 | 1.4355          |
| 1.3609        | 3.14  | 1600 | 1.3332          |
| 1.3398        | 3.33  | 1700 | 1.3792          |
| 1.3224        | 3.53  | 1800 | 1.4172          |
| 1.3152        | 3.73  | 1900 | 1.3956          |
| 1.3141        | 3.92  | 2000 | 1.3748          |
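
The validation loss does not decrease monotonically, so the last logged step is not the best one. A minimal sketch for locating the lowest validation loss, with the (step, validation loss) pairs copied from the table above:

```python
# (step, validation_loss) pairs from the training results table
validation_log = [
    (100, 2.1780), (200, 1.6557), (300, 1.5683), (400, 1.5076), (500, 1.5350),
    (600, 1.4431), (700, 1.4955), (800, 1.4444), (900, 1.3930), (1000, 1.4285),
    (1100, 1.3630), (1200, 1.3710), (1300, 1.3422), (1400, 1.3971), (1500, 1.4355),
    (1600, 1.3332), (1700, 1.3792), (1800, 1.4172), (1900, 1.3956), (2000, 1.3748),
]

# Pick the evaluation point with the smallest validation loss
best_step, best_loss = min(validation_log, key=lambda row: row[1])
final_step, final_loss = validation_log[-1]

print(best_step, best_loss)    # 1600 1.3332
print(final_step, final_loss)  # 2000 1.3748
```

The minimum at step 1600 matches the evaluation loss reported at the top of this card.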


### Framework versions

- Transformers 4.34.1
- Pytorch 2.1.0+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1