---
license: apache-2.0
datasets:
- AyoubChLin/CNN_News_Articles_2011-2022
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
---
# BertForSequenceClassification on CNN News Dataset
This repository contains a BERT base model fine-tuned for sequence classification on the CNN News dataset. The model classifies news articles into one of six categories: business, entertainment, health, news, politics, and sport.
The model was fine-tuned for four epochs, reaching a training loss of 0.077900 and a validation loss of 0.190814, with the following evaluation metrics (a sketch of how such metrics can be computed follows the list):
- accuracy: 0.956690
- f1: 0.956144
- precision: 0.956393
- recall: 0.956690
## Model Description
- **Developed by:** [CHERGUELAINE Ayoub](https://www.linkedin.com/in/ayoub-cherguelaine/)
- **Shared by:** HuggingFace
- **Model type:** Language model
- **Language(s) (NLP):** en
- **Finetuned from model:** [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)
## Usage
You can use this model with the Hugging Face Transformers library to classify English news articles into the six categories listed above.
Here's an example of how to use this model for text classification in Python:
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "AyoubChLin/bert_cnn_news"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Tokenize the input and run a forward pass without tracking gradients.
text = "This is a news article about politics."
inputs = tokenizer(text, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Take the class with the highest logit and map it to its category name.
predicted_class_id = outputs.logits.argmax(dim=-1).item()
labels = ["business", "entertainment", "health", "news", "politics", "sport"]
predicted_label = labels[predicted_class_id]
print(predicted_label)
```
In this example, we load the tokenizer and model with their respective `from_pretrained` methods, tokenize a news article, pass the encoded inputs through the model, take the `argmax` over the logits to get the predicted class id, and finally map that id to its category name using a list of labels.
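For quick experiments you can also load the same checkpoint through the high-level `pipeline` API. This is a minimal sketch; the label names it returns come from the `id2label` mapping stored in the model's config, which may differ from the hand-written list above.

```python
from transformers import pipeline

# Build a text-classification pipeline around the same checkpoint.
classifier = pipeline("text-classification", model="AyoubChLin/bert_cnn_news")
print(classifier("This is a news article about politics."))
# -> [{'label': ..., 'score': ...}]
```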
## Contributors
This model was fine-tuned by [CHERGUELAINE Ayoub](https://www.linkedin.com/in/ayoub-cherguelaine/).