---
license: apache-2.0
base_model: distilbert-base-cased
tags:
- generated_from_keras_callback
model-index:
- name: EricPeter/distilbert-base-cased-270823
  results: []
---


# EricPeter/distilbert-base-cased-270823

This model is a fine-tuned version of [distilbert-base-cased](https://huggingface.co/distilbert-base-cased) on an unknown dataset.
It achieves the following results after training:
- Train Loss: 0.1660
- Epoch: 39
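
Since the dataset and task are not documented, the snippet below is only a minimal sketch of loading the checkpoint generically with `transformers` and TensorFlow; once the task is known, swap `TFAutoModel` for the matching task-specific class (e.g. `TFAutoModelForSequenceClassification`):

```python
from transformers import AutoTokenizer, TFAutoModel

# The task head is unknown, so load the bare DistilBERT encoder.
tokenizer = AutoTokenizer.from_pretrained("EricPeter/distilbert-base-cased-270823")
model = TFAutoModel.from_pretrained("EricPeter/distilbert-base-cased-270823")

inputs = tokenizer("Hello, world!", return_tensors="tf")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```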

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: AdamWeightDecay (beta_1: 0.9, beta_2: 0.999, epsilon: 1e-08, amsgrad: False, weight_decay_rate: 0.06, decay: 0.0), wrapped in a dynamic loss-scale optimizer (initial_scale: 32768.0, dynamic_growth_steps: 2000)
- learning_rate: WarmUp over 4 steps to 2e-05, then PolynomialDecay (power: 1.0, cycle: False) from 2e-05 to 0.0 over 2596 decay steps
- training_precision: mixed_float16
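
The learning-rate schedule above (a 4-step warmup followed by linear decay over 2596 steps, since power is 1.0) can be sketched in plain Python. This is an approximation of the Keras `WarmUp` + `PolynomialDecay` combination for illustration, not the exact library implementation:

```python
def learning_rate(step, init_lr=2e-5, warmup_steps=4,
                  decay_steps=2596, end_lr=0.0, power=1.0):
    """Approximate the WarmUp + PolynomialDecay schedule from this card."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to init_lr (linear, since power = 1.0).
        return init_lr * (step / warmup_steps) ** power
    # Decay phase: polynomial decay (linear here) from init_lr down to end_lr.
    progress = min((step - warmup_steps) / decay_steps, 1.0)
    return (init_lr - end_lr) * (1.0 - progress) ** power + end_lr

print(learning_rate(0))     # 0.0 at the start of warmup
print(learning_rate(4))     # peak of 2e-05 right after warmup
print(learning_rate(2600))  # fully decayed to 0.0
```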

### Training results

| Train Loss | Epoch |
|:----------:|:-----:|
| 4.3370     | 0     |
| 3.0157     | 1     |
| 2.5394     | 2     |
| 2.1440     | 3     |
| 1.6916     | 4     |
| 1.2670     | 5     |
| 0.9564     | 6     |
| 0.7104     | 7     |
| 0.5675     | 8     |
| 0.4794     | 9     |
| 0.4146     | 10    |
| 0.3559     | 11    |
| 0.3041     | 12    |
| 0.3285     | 13    |
| 0.2704     | 14    |
| 0.2513     | 15    |
| 0.2537     | 16    |
| 0.2422     | 17    |
| 0.2162     | 18    |
| 0.2217     | 19    |
| 0.2135     | 20    |
| 0.2084     | 21    |
| 0.2009     | 22    |
| 0.2010     | 23    |
| 0.2020     | 24    |
| 0.1951     | 25    |
| 0.1939     | 26    |
| 0.1914     | 27    |
| 0.1868     | 28    |
| 0.1805     | 29    |
| 0.1877     | 30    |
| 0.1747     | 31    |
| 0.1676     | 32    |
| 0.1793     | 33    |
| 0.1774     | 34    |
| 0.1742     | 35    |
| 0.1690     | 36    |
| 0.1735     | 37    |
| 0.1706     | 38    |
| 0.1660     | 39    |


### Framework versions

- Transformers 4.33.0
- TensorFlow 2.12.0
- Datasets 2.14.4
- Tokenizers 0.13.3