---
license: mit
base_model: roberta-base
tags:
- generated_from_keras_callback
model-index:
- name: Ryukijano/masked-lm-tpu
  results: []
---

<!-- This model card has been generated automatically according to the information Keras had access to. You should
probably proofread and complete it, then remove this comment. -->

# Ryukijano/masked-lm-tpu

This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on an unknown dataset.
It reaches the following results on the training and evaluation sets at the final epoch:
- Train Loss: 6.0936
- Train Accuracy: 0.0329
- Validation Loss: 6.0600
- Validation Accuracy: 0.0324
- Epoch: 40

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 111625, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 5875, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
- training_precision: float32
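
The optimizer entry above describes a linear warmup followed by a linear (power 1.0) polynomial decay to zero. As a rough illustration, the schedule can be sketched in plain Python using only the values from that config. The function name `learning_rate` and its argument names are hypothetical, and this is a reimplementation of the arithmetic under those assumptions, not the code that was used for training.

```python
def learning_rate(step, peak_lr=1e-4, warmup_steps=5875,
                  decay_steps=111625, end_lr=0.0, power=1.0):
    """Approximate learning rate at a given optimizer step.

    Defaults are copied from the hyperparameters listed in this card;
    the function itself is an illustrative sketch.
    """
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to peak_lr.
        return peak_lr * (step / warmup_steps) ** power
    # Decay phase: the decay schedule sees steps counted from the end of warmup.
    decay_step = min(step - warmup_steps, decay_steps)
    return (peak_lr - end_lr) * (1 - decay_step / decay_steps) ** power + end_lr


# Print the rate at the start, end of warmup, and end of decay.
for step in (0, 5875, 5875 + 111625):
    print(step, learning_rate(step))
```

Under these values the schedule spans 5875 warmup steps plus 111625 decay steps; past that point the sketch holds the rate at `end_lr`.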

### Training results

| Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
|:----------:|:--------------:|:---------------:|:-------------------:|:-----:|
| 10.2437    | 0.0000         | 10.1909         | 0.0000              | 0     |
| 10.1151    | 0.0001         | 9.9763          | 0.0016              | 1     |
| 9.8665     | 0.0107         | 9.6535          | 0.0215              | 2     |
| 9.5331     | 0.0230         | 9.2992          | 0.0223              | 3     |
| 9.2000     | 0.0231         | 8.9944          | 0.0222              | 4     |
| 8.9195     | 0.0229         | 8.7450          | 0.0224              | 5     |
| 8.6997     | 0.0231         | 8.6124          | 0.0219              | 6     |
| 8.5689     | 0.0229         | 8.4904          | 0.0222              | 7     |
| 8.4525     | 0.0230         | 8.3865          | 0.0223              | 8     |
| 8.3594     | 0.0230         | 8.3069          | 0.0221              | 9     |
| 8.2662     | 0.0231         | 8.2092          | 0.0224              | 10    |
| 8.1956     | 0.0231         | 8.1208          | 0.0222              | 11    |
| 8.1285     | 0.0229         | 8.0806          | 0.0219              | 12    |
| 8.0345     | 0.0234         | 8.0030          | 0.0220              | 13    |
| 7.9960     | 0.0228         | 7.9144          | 0.0224              | 14    |
| 7.9065     | 0.0231         | 7.8661          | 0.0221              | 15    |
| 7.8449     | 0.0229         | 7.7873          | 0.0219              | 16    |
| 7.7673     | 0.0232         | 7.6903          | 0.0229              | 17    |
| 7.6868     | 0.0242         | 7.6129          | 0.0243              | 18    |
| 7.6206     | 0.0250         | 7.5579          | 0.0246              | 19    |
| 7.5231     | 0.0258         | 7.4564          | 0.0254              | 20    |
| 7.4589     | 0.0262         | 7.4136          | 0.0255              | 21    |
| 7.3658     | 0.0269         | 7.2941          | 0.0265              | 22    |
| 7.2832     | 0.0274         | 7.1998          | 0.0270              | 23    |
| 7.2035     | 0.0275         | 7.1203          | 0.0271              | 24    |
| 7.1116     | 0.0280         | 7.0582          | 0.0269              | 25    |
| 7.0099     | 0.0287         | 6.9567          | 0.0287              | 26    |
| 6.9296     | 0.0294         | 6.8759          | 0.0287              | 27    |
| 6.8524     | 0.0296         | 6.8272          | 0.0285              | 28    |
| 6.7757     | 0.0300         | 6.7311          | 0.0291              | 29    |
| 6.7031     | 0.0304         | 6.6316          | 0.0305              | 30    |
| 6.6361     | 0.0306         | 6.5744          | 0.0307              | 31    |
| 6.5578     | 0.0312         | 6.4946          | 0.0312              | 32    |
| 6.4674     | 0.0319         | 6.4212          | 0.0314              | 33    |
| 6.4096     | 0.0322         | 6.3557          | 0.0320              | 34    |
| 6.3614     | 0.0321         | 6.3093          | 0.0322              | 35    |
| 6.2754     | 0.0329         | 6.2240          | 0.0326              | 36    |
| 6.2609     | 0.0326         | 6.2114          | 0.0321              | 37    |
| 6.1866     | 0.0329         | 6.1645          | 0.0320              | 38    |
| 6.1470     | 0.0330         | 6.1193          | 0.0323              | 39    |
| 6.0936     | 0.0329         | 6.0600          | 0.0324              | 40    |
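
To read the loss column on a more familiar scale, a loss value can be converted to perplexity. The sketch below assumes the reported loss is a mean cross-entropy in nats (natural logarithm) over the predicted tokens; the card does not state this, so treat the result as indicative only. The helper name `perplexity` is hypothetical.

```python
import math


def perplexity(mean_cross_entropy_nats):
    """Perplexity for a mean cross-entropy loss, assuming natural-log units."""
    return math.exp(mean_cross_entropy_nats)


# Final-epoch losses copied from the table above.
final_epoch = {"train": 6.0936, "validation": 6.0600}
for split, loss in final_epoch.items():
    print(f"{split}: loss={loss:.4f}, perplexity={perplexity(loss):.1f}")
```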


### Framework versions

- Transformers 4.32.1
- TensorFlow 2.12.0
- Tokenizers 0.13.3