
# hsohn3/mayo-bert-visit-uncased-wordlevel-block512-batch4-ep100

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on an unknown dataset. It achieves the following results during training:

- Train Loss: 0.9559
- Epoch: 99
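
Since the card gives no usage details, here is a minimal loading sketch (an assumption, not part of the original card): it treats the checkpoint as a TensorFlow masked-language model, which the BERT fine-tuning setup and the framework versions listed below suggest.

```python
# Sketch only: assumes the checkpoint is used for masked-language modeling.
from transformers import AutoTokenizer, TFAutoModelForMaskedLM

model_id = "hsohn3/mayo-bert-visit-uncased-wordlevel-block512-batch4-ep100"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForMaskedLM.from_pretrained(model_id)

# Illustrative input with a masked position; the output logits give a
# distribution over the word-level vocabulary at each token position.
text = f"example input {tokenizer.mask_token}"
inputs = tokenizer(text, return_tensors="tf")
outputs = model(**inputs)
```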

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
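
The optimizer configuration above matches the `AdamWeightDecay` class shipped with the TensorFlow side of `transformers`. A hedged sketch of recreating it (not taken from the original training script):

```python
# Recreating the listed optimizer settings with transformers' TF optimizer.
from transformers import AdamWeightDecay

optimizer = AdamWeightDecay(
    learning_rate=2e-05,
    weight_decay_rate=0.01,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-07,
)
# model.compile(optimizer=optimizer)  # then train with model.fit(...)
```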

### Training results

| Train Loss | Epoch |
|:----------:|:-----:|
| 4.1247 | 0 |
| 3.5129 | 1 |
| 3.4726 | 2 |
| 3.4483 | 3 |
| 3.4395 | 4 |
| 3.4301 | 5 |
| 3.4260 | 6 |
| 3.4131 | 7 |
| 3.3831 | 8 |
| 3.2925 | 9 |
| 3.2454 | 10 |
| 3.2092 | 11 |
| 3.1695 | 12 |
| 3.1346 | 13 |
| 3.0797 | 14 |
| 3.0154 | 15 |
| 2.9557 | 16 |
| 2.8814 | 17 |
| 2.7720 | 18 |
| 2.5472 | 19 |
| 2.3193 | 20 |
| 2.1005 | 21 |
| 1.9331 | 22 |
| 1.7971 | 23 |
| 1.6859 | 24 |
| 1.6062 | 25 |
| 1.5310 | 26 |
| 1.4706 | 27 |
| 1.4203 | 28 |
| 1.3681 | 29 |
| 1.3222 | 30 |
| 1.2939 | 31 |
| 1.2726 | 32 |
| 1.2494 | 33 |
| 1.2330 | 34 |
| 1.2161 | 35 |
| 1.1998 | 36 |
| 1.1874 | 37 |
| 1.1767 | 38 |
| 1.1641 | 39 |
| 1.1550 | 40 |
| 1.1407 | 41 |
| 1.1363 | 42 |
| 1.1272 | 43 |
| 1.1227 | 44 |
| 1.1163 | 45 |
| 1.1065 | 46 |
| 1.1008 | 47 |
| 1.0957 | 48 |
| 1.0837 | 49 |
| 1.0844 | 50 |
| 1.0778 | 51 |
| 1.0741 | 52 |
| 1.0693 | 53 |
| 1.0662 | 54 |
| 1.0608 | 55 |
| 1.0521 | 56 |
| 1.0526 | 57 |
| 1.0476 | 58 |
| 1.0454 | 59 |
| 1.0452 | 60 |
| 1.0348 | 61 |
| 1.0333 | 62 |
| 1.0342 | 63 |
| 1.0293 | 64 |
| 1.0249 | 65 |
| 1.0241 | 66 |
| 1.0194 | 67 |
| 1.0177 | 68 |
| 1.0102 | 69 |
| 1.0055 | 70 |
| 1.0052 | 71 |
| 1.0038 | 72 |
| 1.0005 | 73 |
| 0.9981 | 74 |
| 0.9991 | 75 |
| 0.9950 | 76 |
| 0.9928 | 77 |
| 0.9898 | 78 |
| 0.9906 | 79 |
| 0.9873 | 80 |
| 0.9849 | 81 |
| 0.9808 | 82 |
| 0.9804 | 83 |
| 0.9792 | 84 |
| 0.9789 | 85 |
| 0.9797 | 86 |
| 0.9741 | 87 |
| 0.9781 | 88 |
| 0.9678 | 89 |
| 0.9686 | 90 |
| 0.9651 | 91 |
| 0.9652 | 92 |
| 0.9613 | 93 |
| 0.9599 | 94 |
| 0.9566 | 95 |
| 0.9571 | 96 |
| 0.9577 | 97 |
| 0.9536 | 98 |
| 0.9559 | 99 |

### Framework versions

- Transformers 4.20.1
- TensorFlow 2.8.2
- Datasets 2.3.2
- Tokenizers 0.12.1