---
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: distilroberta-base-finetuned-wikitext2
  results: []
---

# distilroberta-base-finetuned-wikitext2

This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0244

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 80.0

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.0983        | 1.0   | 11   | 0.0699          |
| 0.0481        | 2.0   | 22   | 0.0476          |
| 0.0367        | 3.0   | 33   | 0.0445          |
| 0.0264        | 4.0   | 44   | 0.0423          |
| 0.0336        | 5.0   | 55   | 0.0390          |
| 0.025         | 6.0   | 66   | 0.0410          |
| 0.0141        | 7.0   | 77   | 0.0425          |
| 0.0121        | 8.0   | 88   | 0.0371          |
| 0.0075        | 9.0   | 99   | 0.0293          |
| 0.011         | 10.0  | 110  | 0.0344          |
| 0.012         | 11.0  | 121  | 0.0304          |
| 0.0029        | 12.0  | 132  | 0.0313          |
| 0.0026        | 13.0  | 143  | 0.0303          |
| 0.0004        | 14.0  | 154  | 0.0298          |
| 0.0008        | 15.0  | 165  | 0.0288          |
| 0.0002        | 16.0  | 176  | 0.0286          |
| 0.0003        | 17.0  | 187  | 0.0319          |
| 0.0004        | 18.0  | 198  | 0.0332          |
| 0.0003        | 19.0  | 209  | 0.0423          |
| 0.0011        | 20.0  | 220  | 0.0244          |
| 0.0002        | 21.0  | 231  | 0.0255          |
| 0.0003        | 22.0  | 242  | 0.0264          |
| 0.0003        | 23.0  | 253  | 0.0239          |
| 0.0001        | 24.0  | 264  | 0.0216          |
| 0.0001        | 25.0  | 275  | 0.0212          |
| 0.0002        | 26.0  | 286  | 0.0222          |
| 0.0001        | 27.0  | 297  | 0.0226          |
| 0.0004        | 28.0  | 308  | 0.0214          |
| 0.0001        | 29.0  | 319  | 0.0223          |
| 0.0001        | 30.0  | 330  | 0.0228          |
| 0.0001        | 31.0  | 341  | 0.0225          |
| 0.0002        | 32.0  | 352  | 0.0222          |
| 0.0001        | 33.0  | 363  | 0.0233          |
| 0.0003        | 34.0  | 374  | 0.0340          |
| 0.0002        | 35.0  | 385  | 0.0321          |
| 0.0001        | 36.0  | 396  | 0.0203          |
| 0.0004        | 37.0  | 407  | 0.0205          |
| 0.0002        | 38.0  | 418  | 0.0270          |
| 0.0001        | 39.0  | 429  | 0.0204          |
| 0.0001        | 40.0  | 440  | 0.0187          |
| 0.0002        | 41.0  | 451  | 0.0223          |
| 0.0002        | 42.0  | 462  | 0.0279          |
| 0.0001        | 43.0  | 473  | 0.0302          |
| 0.0001        | 44.0  | 484  | 0.0286          |
| 0.0001        | 45.0  | 495  | 0.0202          |
| 0.0001        | 46.0  | 506  | 0.0219          |
| 0.0001        | 47.0  | 517  | 0.0227          |
| 0.0001        | 48.0  | 528  | 0.0234          |
| 0.0001        | 49.0  | 539  | 0.0235          |
| 0.0001        | 50.0  | 550  | 0.0234          |
| 0.0001        | 51.0  | 561  | 0.0234          |
| 0.0001        | 52.0  | 572  | 0.0233          |
| 0.0001        | 53.0  | 583  | 0.0237          |
| 0.0001        | 54.0  | 594  | 0.0238          |
| 0.0001        | 55.0  | 605  | 0.0237          |
| 0.0001        | 56.0  | 616  | 0.0236          |
| 0.0001        | 57.0  | 627  | 0.0238          |
| 0.0           | 58.0  | 638  | 0.0236          |
| 0.0           | 59.0  | 649  | 0.0235          |
| 0.0001        | 60.0  | 660  | 0.0236          |
| 0.0001        | 61.0  | 671  | 0.0236          |
| 0.0           | 62.0  | 682  | 0.0235          |
| 0.0001        | 63.0  | 693  | 0.0302          |
| 0.0001        | 64.0  | 704  | 0.0304          |
| 0.0001        | 65.0  | 715  | 0.0239          |
| 0.0           | 66.0  | 726  | 0.0231          |
| 0.0001        | 67.0  | 737  | 0.0229          |
| 0.0001        | 68.0  | 748  | 0.0228          |
| 0.0           | 69.0  | 759  | 0.0226          |
| 0.0004        | 70.0  | 770  | 0.0226          |
| 0.0001        | 71.0  | 781  | 0.0232          |
| 0.0001        | 72.0  | 792  | 0.0235          |
| 0.0001        | 73.0  | 803  | 0.0236          |
| 0.0002        | 74.0  | 814  | 0.0237          |
| 0.0001        | 75.0  | 825  | 0.0238          |
| 0.0001        | 76.0  | 836  | 0.0241          |
| 0.0           | 77.0  | 847  | 0.0244          |
| 0.0001        | 78.0  | 858  | 0.0244          |
| 0.0001        | 79.0  | 869  | 0.0244          |
| 0.0001        | 80.0  | 880  | 0.0244          |
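The original training script is not included in this card, but as a rough sketch the hyperparameters above map onto the `transformers` `Trainer` API roughly as follows. The dataset is not documented either; wikitext-2 is loaded below purely because of the model name, and the step counts in the table (11 steps per epoch at batch size 8) suggest the real training set was far smaller, so the data loading is only illustrative.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Assumption: the card does not name the dataset; wikitext-2 is used here only
# because the model name suggests it. The actual training set was much smaller.
raw = load_dataset("wikitext", "wikitext-2-raw-v1")

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = AutoModelForMaskedLM.from_pretrained("distilroberta-base")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

# Hyperparameters taken from the list above; Adam betas/epsilon are the optimizer defaults.
args = TrainingArguments(
    output_dir="distilroberta-base-finetuned-wikitext2",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=80.0,
    lr_scheduler_type="linear",
    evaluation_strategy="epoch",  # assumption: the results table reports one evaluation per epoch
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15),
)

trainer.train()
```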
### Framework versions

- Transformers 4.15.0
- Pytorch 1.10.0+cu111
- Datasets 1.17.0
- Tokenizers 0.10.3
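As a minimal usage sketch: since distilroberta-base is a masked language model, the fine-tuned checkpoint can be queried with the fill-mask pipeline. The model id below is assumed from the model name; substitute the actual Hub repository id or a local path to the checkpoint.

```python
from transformers import pipeline

# Assumption: replace with the real Hub id (e.g. "<user>/distilroberta-base-finetuned-wikitext2")
# or a local directory containing the fine-tuned checkpoint.
fill_mask = pipeline("fill-mask", model="distilroberta-base-finetuned-wikitext2")

# RoBERTa-style models use "<mask>" as the mask token.
print(fill_mask("The quick brown <mask> jumps over the lazy dog."))
```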