distilbert_classifier_newsgroups
This model is a fine-tuned version of distilbert-base-uncased on the 20Newsgroups dataset. After three epochs of training, it achieves the following results on the evaluation set:
- Loss: 0.3756
- Accuracy: 0.8993
Model description
We fine-tuned distilbert-base-uncased to classify news posts into 20 topics, using the labeled 20Newsgroups dataset.
Training and evaluation data
The 20 newsgroups dataset comprises around 18000 newsgroup posts on 20 topics, split into two subsets: one for training (or development) and one for testing (or performance evaluation). The split between the train and test sets is based on whether a message was posted before or after a specific date.
These are the 20 topics we fine-tuned the model on:
'alt.atheism', 'comp.graphics', 'comp.os.ms-windows.misc', 'comp.sys.ibm.pc.hardware', 'comp.sys.mac.hardware', 'comp.windows.x', 'misc.forsale', 'rec.autos', 'rec.motorcycles', 'rec.sport.baseball', 'rec.sport.hockey', 'sci.crypt', 'sci.electronics', 'sci.med', 'sci.space', 'soc.religion.christian', 'talk.politics.guns', 'talk.politics.mideast', 'talk.politics.misc', 'talk.religion.misc'
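A classifier with 20 outputs returns one score per topic, so each prediction has to be mapped back to a topic name. The sketch below shows one way to do that. It assumes that label ids follow the alphabetical order of the list above, which this card does not confirm; check the model's own `id2label` config before relying on it. The names `TOPICS`, `ID2LABEL`, `LABEL2ID`, and `predict_topic` are illustrative and not part of the model.

```python
# Topic names copied from the list above.
TOPICS = [
    'alt.atheism', 'comp.graphics', 'comp.os.ms-windows.misc',
    'comp.sys.ibm.pc.hardware', 'comp.sys.mac.hardware', 'comp.windows.x',
    'misc.forsale', 'rec.autos', 'rec.motorcycles', 'rec.sport.baseball',
    'rec.sport.hockey', 'sci.crypt', 'sci.electronics', 'sci.med',
    'sci.space', 'soc.religion.christian', 'talk.politics.guns',
    'talk.politics.mideast', 'talk.politics.misc', 'talk.religion.misc',
]

# ASSUMPTION: label id i corresponds to the i-th topic in the list above.
ID2LABEL = dict(enumerate(TOPICS))
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def predict_topic(logits):
    """Return the topic name whose score is highest."""
    if len(logits) != len(TOPICS):
        raise ValueError(f"expected {len(TOPICS)} scores, got {len(logits)}")
    best = max(range(len(logits)), key=lambda i: logits[i])
    return ID2LABEL[best]


# Hypothetical scores: every topic at 0.0 except index 14.
example_logits = [0.0] * len(TOPICS)
example_logits[14] = 3.2
print(predict_topic(example_logits))  # sci.space under the assumed ordering
```

The same two dictionaries can be passed as `id2label` and `label2id` when building a model config, so that predictions carry readable names instead of `LABEL_0` style ids.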
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 1908, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
- training_precision: float32
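The `PolynomialDecay` entry in the optimizer config, with `power` 1.0 and `cycle` False, describes a linear decay from 2e-05 to 0.0 over 1908 steps. As a rough illustration, the schedule can be reproduced in plain Python. The function name `polynomial_decay_lr` is hypothetical; the formula is the standard non-cycling polynomial decay, with the defaults taken from the config above.

```python
def polynomial_decay_lr(step, initial_lr=2e-05, decay_steps=1908,
                        end_lr=0.0, power=1.0):
    """Learning rate at a given optimizer step (non-cycling polynomial decay)."""
    # Past decay_steps the rate stays at end_lr, because cycle is False.
    step = min(step, decay_steps)
    fraction_left = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * fraction_left ** power + end_lr


# Start, midpoint, end of the decay, and one step beyond it.
for step in (0, 954, 1908, 1911):
    print(step, polynomial_decay_lr(step))
```

The training log reports 637 steps per epoch over 3 epochs, which is 1911 steps in total. Under this formula, the last three steps would run at the end rate of 0.0.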
Training results
Epoch 1/3
637/637 - 110s 131ms/step - loss: 1.3480 - accuracy: 0.6633 - val_loss: 0.6122 - val_accuracy: 0.8304
Epoch 2/3
637/637 - 44s 70ms/step - loss: 0.4498 - accuracy: 0.8812 - val_loss: 0.4342 - val_accuracy: 0.8799
Epoch 3/3
637/637 - 40s 64ms/step - loss: 0.2685 - accuracy: 0.9355 - val_loss: 0.3756 - val_accuracy: 0.8993
CPU times: user 3min 4s, sys: 8.76 s, total: 3min 13s
Wall time: 3min 15s
Framework versions
- Transformers 4.28.0
- TensorFlow 2.12.0
- Datasets 2.12.0
- Tokenizers 0.13.3