---
license: apache-2.0
base_model: distilbert-base-cased
tags:
- generated_from_keras_callback
model-index:
- name: LongRiver/distilbert-base-cased-finetuned
  results: []
---

# LongRiver/distilbert-base-cased-finetuned

This model is a fine-tuned version of [distilbert-base-cased](https://huggingface.co/distilbert-base-cased) on an unknown dataset.
It achieves the following results on the evaluation set:
- Train Loss: 0.0150
- Train End Logits Accuracy: 0.9962
- Train Start Logits Accuracy: 0.9947
- Validation Loss: 4.6938
- Validation End Logits Accuracy: 0.5474
- Validation Start Logits Accuracy: 0.5004
- Epoch: 29

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 67860, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
- training_precision: float32

### Training results

| Train Loss | Train End Logits Accuracy | Train Start Logits Accuracy | Validation Loss | Validation End Logits Accuracy | Validation Start Logits Accuracy | Epoch |
|:----------:|:-------------------------:|:---------------------------:|:---------------:|:------------------------------:|:--------------------------------:|:-----:|
| 2.3620     | 0.5061                    | 0.4972                      | 2.0785          | 0.5155                         | 0.4836                           | 0     |
| 1.7007     | 0.5940                    | 0.5660                      | 2.0332          | 0.5185                         | 0.4999                           | 1     |
| 1.4088     | 0.6542                    | 0.6191                      | 2.0391          | 0.5324                         | 0.5012                           | 2     |
| 1.1407     | 0.7150                    | 0.6757                      | 2.1645          | 0.5172                         | 0.4854                           | 3     |
| 0.9215     | 0.7670                    | 0.7296                      | 2.2074          | 0.5365                         | 0.4995                           | 4     |
| 0.7376     | 0.8083                    | 0.7780                      | 2.4099          | 0.5146                         | 0.4865                           | 5     |
| 0.5780     | 0.8456                    | 0.8186                      | 2.6543          | 0.5231                         | 0.4764                           | 6     |
| 0.4614     | 0.8748                    | 0.8511                      | 2.6688          | 0.5360                         | 0.4944                           | 7     |
| 0.3633     | 0.9015                    | 0.8785                      | 2.9329          | 0.5300                         | 0.4908                           | 8     |
| 0.2981     | 0.9177                    | 0.8983                      | 3.1868          | 0.5270                         | 0.4759                           | 9     |
| 0.2453     | 0.9318                    | 0.9156                      | 3.3015          | 0.5347                         | 0.4951                           | 10    |
| 0.1958     | 0.9440                    | 0.9333                      | 3.5149          | 0.5335                         | 0.4860                           | 11    |
| 0.1649     | 0.9521                    | 0.9433                      | 3.4351          | 0.5424                         | 0.4975                           | 12    |
| 0.1425     | 0.9590                    | 0.9505                      | 3.6372          | 0.5264                         | 0.4800                           | 13    |
| 0.1231     | 0.9644                    | 0.9579                      | 3.7467          | 0.5346                         | 0.4827                           | 14    |
| 0.1024     | 0.9703                    | 0.9636                      | 3.8551          | 0.5400                         | 0.4945                           | 15    |
| 0.0882     | 0.9730                    | 0.9692                      | 3.9909          | 0.5412                         | 0.4880                           | 16    |
| 0.0740     | 0.9785                    | 0.9738                      | 4.0573          | 0.5376                         | 0.4920                           | 17    |
| 0.0691     | 0.9789                    | 0.9760                      | 4.0751          | 0.5292                         | 0.4903                           | 18    |
| 0.0588     | 0.9837                    | 0.9792                      | 4.0823          | 0.5377                         | 0.4967                           | 19    |
| 0.0498     | 0.9849                    | 0.9826                      | 4.2466          | 0.5376                         | 0.4967                           | 20    |
| 0.0464     | 0.9864                    | 0.9848                      | 4.2565          | 0.5446                         | 0.4999                           | 21    |
| 0.0388     | 0.9889                    | 0.9864                      | 4.3063          | 0.5329                         | 0.4941                           | 22    |
| 0.0331     | 0.9900                    | 0.9894                      | 4.4083          | 0.5420                         | 0.4962                           | 23    |
| 0.0274     | 0.9922                    | 0.9914                      | 4.5627          | 0.5455                         | 0.5023                           | 24    |
| 0.0257     | 0.9925                    | 0.9916                      | 4.6541          | 0.5503                         | 0.5122                           | 25    |
| 0.0229     | 0.9935                    | 0.9925                      | 4.4773          | 0.5433                         | 0.4985                           | 26    |
| 0.0181     | 0.9951                    | 0.9943                      | 4.6989          | 0.5480                         | 0.5066                           | 27    |
| 0.0161     | 0.9953                    | 0.9947                      | 4.6873          | 0.5466                         | 0.4995                           | 28    |
| 0.0150     | 0.9962                    | 0.9947                      | 4.6938          | 0.5474                         | 0.5004                           | 29    |

### Framework versions

- Transformers 4.39.3
- TensorFlow 2.15.0
- Datasets 2.18.0
- Tokenizers 0.15.2
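The optimizer config above specifies Adam with a `PolynomialDecay` learning-rate schedule (`power=1.0`, `cycle=False`), i.e. a linear decay from 2e-05 to 0.0 over 67,860 steps. A minimal pure-Python sketch of that schedule, following the formula Keras documents for `PolynomialDecay` (this is an illustration, not the training code itself):

```python
# Linear learning-rate decay matching the optimizer config above:
# initial_learning_rate=2e-05, decay_steps=67860, end_learning_rate=0.0,
# power=1.0, cycle=False.
INITIAL_LR = 2e-5
END_LR = 0.0
DECAY_STEPS = 67_860
POWER = 1.0


def polynomial_decay(step: int) -> float:
    """Learning rate at a given global step (Keras PolynomialDecay, cycle=False)."""
    # With cycle=False the step is clamped to decay_steps, so the rate
    # holds at end_learning_rate once the decay window is exhausted.
    step = min(step, DECAY_STEPS)
    fraction = 1.0 - step / DECAY_STEPS
    return (INITIAL_LR - END_LR) * fraction ** POWER + END_LR


if __name__ == "__main__":
    print(polynomial_decay(0))        # 2e-05 at the start of training
    print(polynomial_decay(33_930))   # 1e-05 halfway through the schedule
    print(polynomial_decay(67_860))   # 0.0 at the end of the schedule
```

With `power=1.0` the polynomial reduces to a straight line, which is why the schedule reaches exactly `end_learning_rate` at the final decay step rather than approaching it asymptotically.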