---
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: distilgpt2-finetuned-wikitext2-agu
  results: []
---

# distilgpt2-finetuned-wikitext2-agu

This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2). The training dataset was not recorded by the auto-generated card; the model name suggests it is WikiText-2.
It achieves the following results on the evaluation set:
- Loss: 3.1869
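
Since the reported loss is the mean per-token cross-entropy, it converts directly to perplexity, which is often easier to interpret for language models. A quick check:

```python
import math

# The evaluation loss is the mean per-token cross-entropy,
# so perplexity is simply its exponential.
eval_loss = 3.1869
perplexity = math.exp(eval_loss)
print(f"Perplexity: {perplexity:.2f}")  # ≈ 24.21
```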

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
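
The hyperparameters above can be expressed as a `transformers` `TrainingArguments` config. This is a reconstruction sketch, not the actual training script: `output_dir` and `evaluation_strategy` are assumptions (the per-epoch validation table suggests epoch-level evaluation), and the Adam betas/epsilon listed above are the library defaults, so they need no explicit arguments.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilgpt2-finetuned-wikitext2-agu",  # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    evaluation_strategy="epoch",  # assumed; matches the per-epoch eval table
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 is the default optimizer.
)
```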

### Training results

| Training Loss | Epoch | Step    | Validation Loss |
|:-------------:|:-----:|:-------:|:---------------:|
| 3.7357        | 1.0   | 13655   | 3.6781          |
| 3.5721        | 2.0   | 27310   | 3.5302          |
| 3.4961        | 3.0   | 40965   | 3.4658          |
| 3.4406        | 4.0   | 54620   | 3.4242          |
| 3.4043        | 5.0   | 68275   | 3.3943          |
| 3.3789        | 6.0   | 81930   | 3.3726          |
| 3.3576        | 7.0   | 95585   | 3.3538          |
| 3.3389        | 8.0   | 109240  | 3.3389          |
| 3.3151        | 9.0   | 122895  | 3.3270          |
| 3.314         | 5.0   | 136545  | 3.3226          |
| 3.3044        | 6.0   | 163854  | 3.3124          |
| 3.2931        | 7.0   | 191163  | 3.3078          |
| 3.2874        | 8.0   | 218472  | 3.3094          |
| 3.2817        | 9.0   | 245781  | 3.2943          |
| 3.269         | 10.0  | 273090  | 3.2785          |
| 3.2423        | 11.0  | 300399  | 3.2651          |
| 3.2253        | 12.0  | 327708  | 3.2530          |
| 3.2096        | 13.0  | 355017  | 3.2435          |
| 3.1939        | 14.0  | 382326  | 3.2326          |
| 3.1786        | 15.0  | 409635  | 3.2225          |
| 3.1625        | 16.0  | 436944  | 3.2198          |
| 3.1619        | 17.0  | 464253  | 3.2180          |
| 3.1521        | 18.0  | 491562  | 3.2164          |
| 3.1555        | 19.0  | 518871  | 3.2152          |
| 3.1523        | 20.0  | 546180  | 3.2164          |
| 3.1639        | 21.0  | 573489  | 3.2133          |
| 3.1483        | 22.0  | 600798  | 3.2113          |
| 3.1497        | 23.0  | 628107  | 3.2077          |
| 3.1468        | 24.0  | 655416  | 3.2066          |
| 3.1461        | 25.0  | 682725  | 3.2052          |
| 3.1391        | 26.0  | 710034  | 3.2039          |
| 3.1384        | 27.0  | 737343  | 3.2031          |
| 3.135         | 28.0  | 764652  | 3.2020          |
| 3.1262        | 29.0  | 791961  | 3.2015          |
| 3.1357        | 30.0  | 819270  | 3.2019          |
| 3.1372        | 31.0  | 846579  | 3.2003          |
| 3.1346        | 32.0  | 873888  | 3.1988          |
| 3.134         | 33.0  | 901197  | 3.1975          |
| 3.1256        | 34.0  | 928506  | 3.1965          |
| 3.1261        | 35.0  | 955815  | 3.1950          |
| 3.1255        | 36.0  | 983124  | 3.1945          |
| 3.1278        | 37.0  | 1010433 | 3.1940          |
| 3.1186        | 38.0  | 1037742 | 3.1934          |
| 3.1136        | 39.0  | 1065051 | 3.1932          |
| 3.12          | 40.0  | 1092360 | 3.1931          |
| 3.12          | 41.0  | 1119669 | 3.1930          |
| 3.1165        | 42.0  | 1146978 | 3.1914          |
| 3.1166        | 43.0  | 1174287 | 3.1900          |
| 3.1139        | 44.0  | 1201596 | 3.1892          |
| 3.1135        | 45.0  | 1228905 | 3.1885          |
| 3.1077        | 46.0  | 1256214 | 3.1881          |
| 3.1097        | 47.0  | 1283523 | 3.1873          |
| 3.1076        | 48.0  | 1310832 | 3.1872          |
| 3.102         | 49.0  | 1338141 | 3.1870          |
| 3.1086        | 50.0  | 1365450 | 3.1869          |


### Framework versions

- Transformers 4.18.0
- Pytorch 1.9.0+cu111
- Datasets 2.4.0
- Tokenizers 0.12.1
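
To reproduce the environment, the pinned versions above can be installed roughly as follows (a sketch; the CUDA 11.1 wheel index for PyTorch 1.9.0 may need adjusting for your platform):

```shell
pip install "transformers==4.18.0" "datasets==2.4.0" "tokenizers==0.12.1"
pip install "torch==1.9.0+cu111" -f https://download.pytorch.org/whl/torch_stable.html
```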