---
license: apache-2.0
datasets:
- AyoubChLin/CNN_News_Articles_2011-2022
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- news classification
widget:
- text: money in the pocket
- text: no one can win this cup in Qatar.
---
# Fine-Tuned BART Model for Text Classification on CNN News Articles
This is a BART (Bidirectional and Auto-Regressive Transformers) model fine-tuned for text classification on CNN news articles. It was fine-tuned on a dataset of CNN news articles labeled by topic (AyoubChLin/CNN_News_Articles_2011-2022), for one epoch with a batch size of 32 and a learning rate of 6e-5.
## How to Use
### Install
```bash
pip install transformers torch
```
### Example Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("IT-community/BART_cnn_news_text_classification")
model = AutoModelForSequenceClassification.from_pretrained("IT-community/BART_cnn_news_text_classification")

# Tokenize input text (truncated to at most 512 tokens)
text = "This is an example CNN news article about politics."
inputs = tokenizer(text, padding=True, truncation=True, max_length=512, return_tensors="pt")

# Make prediction; gradients are not needed for inference
with torch.no_grad():
    outputs = model(inputs["input_ids"], attention_mask=inputs["attention_mask"])

# Index of the highest-scoring class for the single input
predicted_label = torch.argmax(outputs.logits, dim=-1).item()
print(predicted_label)
```
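The example above prints a class index rather than a topic name. As a minimal sketch, the helper below turns a list of logits into ranked `(label, probability)` pairs using a softmax. The helper name `rank_labels`, the example logits, and the label names are hypothetical, used here only for illustration. With the real model you would pass `outputs.logits[0].tolist()` and `model.config.id2label`, assuming the uploaded config carries meaningful label names.

```python
import math

def rank_labels(logits, id2label):
    """Return (label, probability) pairs sorted from most to least likely."""
    # Numerically stable softmax: subtract the largest logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort class indices by probability, highest first.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(id2label[i], p) for i, p in ranked]

# Hypothetical values for illustration only; not the model's real labels or outputs.
example_logits = [0.2, 2.5, -1.0]
example_id2label = {0: "business", 1: "politics", 2: "sport"}

ranking = rank_labels(example_logits, example_id2label)
print(ranking[0][0])  # most likely label
```

Reporting probabilities alongside the top label makes it easier to spot low-confidence predictions than the bare `argmax` index does.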
## Evaluation
The model achieved the following performance metrics on the test set:
- Accuracy: 0.9591836734693877
- F1-score: 0.958301875401112
- Recall: 0.9591836734693877
- Precision: 0.9579673040369542
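The card does not say how the multi-class precision, recall, and F1 were averaged. The reported recall is identical to the accuracy, which is consistent with support-weighted averaging (as in scikit-learn's `average="weighted"`), but that is an inference rather than something stated here. The sketch below, using hypothetical toy labels and a hypothetical helper `weighted_metrics`, shows how such weighted scores are computed and why weighted recall always equals accuracy.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / actual if actual else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = actual / n  # each class counts in proportion to its support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical toy data for illustration only; not the model's test set.
y_true = ["politics", "politics", "sport", "sport", "sport", "business"]
y_pred = ["politics", "sport", "sport", "sport", "sport", "business"]

metrics = weighted_metrics(y_true, y_pred)
print(metrics)
```

Weighting each class's recall by its share of the examples sums the true positives over the total count, which is exactly the accuracy; weighted precision and F1 generally differ from it.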
## About Us
We are IT Community, a scientific club at Saad Dahleb Blida University, created by students in 2016. We are interested in all IT fields.
This work was done by the IT Community club.
### Contributions
[Cherguelaine Ayoub](https://huggingface.co/AyoubChLin):
- Added preprocessing code for CNN news articles
- Improved model performance with additional fine-tuning on a larger dataset