---
license: mit
tags:
- generated_from_keras_callback
model-index:
- name: tf-tpu/roberta-base-epochs-500-no-wd
  results: []
---

<!-- This model card has been generated automatically according to the information Keras had access to. You should
probably proofread and complete it, then remove this comment. -->

# tf-tpu/roberta-base-epochs-500-no-wd

This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on an unknown dataset.
At the last recorded epoch (epoch 32), it reports the following metrics on the training and validation sets:
- Train Loss: 1.7370
- Train Accuracy: 0.0978
- Validation Loss: 1.5869
- Validation Accuracy: 0.1014
- Epoch: 32

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 278825, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 14675, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
- training_precision: mixed_bfloat16
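
To make the learning-rate schedule in the optimizer config easier to read, here is a minimal pure-Python sketch of the linear warmup followed by polynomial (here linear, `power=1.0`) decay. The constants are copied from the config above. The function name `learning_rate_at` is hypothetical, and the sketch assumes the decay schedule is evaluated at `step - warmup_steps`, which is how the `WarmUp` wrapper in Transformers is understood to behave; it is not the code that was used for training.

```python
# Values copied from the optimizer config in this card.
INITIAL_LR = 1e-4
END_LR = 0.0
WARMUP_STEPS = 14675
DECAY_STEPS = 278825
POWER = 1.0


def learning_rate_at(step: int) -> float:
    """Approximate learning rate at a 0-based optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Warmup phase: ramp from 0 up to INITIAL_LR.
        return INITIAL_LR * (step / WARMUP_STEPS) ** POWER
    # Decay phase: assumed to start counting from the end of warmup.
    decay_step = min(step - WARMUP_STEPS, DECAY_STEPS)
    remaining = 1.0 - decay_step / DECAY_STEPS
    return (INITIAL_LR - END_LR) * remaining ** POWER + END_LR


if __name__ == "__main__":
    for step in (0, 5000, WARMUP_STEPS, 150000, WARMUP_STEPS + DECAY_STEPS):
        print(step, learning_rate_at(step))
```

Under these assumptions the schedule spans `14675 + 278825 = 293500` optimizer steps in total, with the peak learning rate reached at the end of warmup.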

### Training results

| Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
|:----------:|:--------------:|:---------------:|:-------------------:|:-----:|
| 8.3284     | 0.0211         | 7.1523          | 0.0266              | 0     |
| 6.3670     | 0.0318         | 5.7812          | 0.0342              | 1     |
| 5.6051     | 0.0380         | 5.4414          | 0.0420              | 2     |
| 5.3602     | 0.0433         | 5.2734          | 0.0432              | 3     |
| 5.2285     | 0.0444         | 5.1562          | 0.0442              | 4     |
| 5.1371     | 0.0446         | 5.1133          | 0.0436              | 5     |
| 5.0673     | 0.0446         | 5.0703          | 0.0442              | 6     |
| 5.0132     | 0.0447         | 4.9883          | 0.0442              | 7     |
| 4.9642     | 0.0448         | 4.9219          | 0.0441              | 8     |
| 4.9217     | 0.0448         | 4.9258          | 0.0440              | 9     |
| 4.8871     | 0.0448         | 4.8867          | 0.0439              | 10    |
| 4.8548     | 0.0449         | 4.8672          | 0.0439              | 11    |
| 4.8277     | 0.0449         | 4.8047          | 0.0445              | 12    |
| 4.8033     | 0.0449         | 4.8477          | 0.0437              | 13    |
| 4.7807     | 0.0449         | 4.7617          | 0.0439              | 14    |
| 4.7592     | 0.0449         | 4.7773          | 0.0437              | 15    |
| 4.7388     | 0.0449         | 4.7539          | 0.0441              | 16    |
| 4.7225     | 0.0449         | 4.7266          | 0.0439              | 17    |
| 4.7052     | 0.0449         | 4.6914          | 0.0450              | 18    |
| 4.6917     | 0.0449         | 4.7188          | 0.0444              | 19    |
| 4.6789     | 0.0449         | 4.6914          | 0.0444              | 20    |
| 4.6689     | 0.0449         | 4.7031          | 0.0439              | 21    |
| 4.6570     | 0.0449         | 4.7031          | 0.0437              | 22    |
| 4.6486     | 0.0450         | 4.6758          | 0.0446              | 23    |
| 4.6393     | 0.0449         | 4.6914          | 0.0441              | 24    |
| 4.5898     | 0.0449         | 4.4688          | 0.0452              | 25    |
| 4.3024     | 0.0472         | 3.8730          | 0.0551              | 26    |
| 3.1689     | 0.0693         | 2.4375          | 0.0835              | 27    |
| 2.3780     | 0.0844         | 2.0498          | 0.0922              | 28    |
| 2.0789     | 0.0907         | 1.8604          | 0.0958              | 29    |
| 1.9204     | 0.0940         | 1.7549          | 0.0982              | 30    |
| 1.8162     | 0.0961         | 1.6836          | 0.0983              | 31    |
| 1.7370     | 0.0978         | 1.5869          | 0.1014              | 32    |
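
The losses in the table can be read as perplexities if they are mean cross-entropy values in nats, which this card does not confirm. The sketch below makes that conversion for the final epoch under that assumption; `perplexity` is a hypothetical helper name.

```python
import math

# Final-epoch losses copied from the table above (epoch 32).
TRAIN_LOSS = 1.7370
VALIDATION_LOSS = 1.5869


def perplexity(loss: float) -> float:
    """Convert a loss to perplexity, assuming mean cross-entropy in nats."""
    return math.exp(loss)


train_ppl = perplexity(TRAIN_LOSS)
val_ppl = perplexity(VALIDATION_LOSS)

if __name__ == "__main__":
    print(f"train perplexity: {train_ppl:.3f}")
    print(f"validation perplexity: {val_ppl:.3f}")
```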


### Framework versions

- Transformers 4.27.0.dev0
- TensorFlow 2.9.1
- Tokenizers 0.13.2