Commit 9012384 by AyoubChLin (1 parent: 56d205b)

Update README.md

Files changed (1)
  1. README.md +52 -0
README.md CHANGED
@@ -1,3 +1,55 @@
  ---
  license: mit
+ datasets:
+ - AyoubChLin/CNN_News_Articles_2011-2022
+ language:
+ - en
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
  ---
+
+ # Model Description
+ This is a DistilBART model fine-tuned for text classification of CNN news articles. It was fine-tuned for 1 epoch with a batch size of 32 and a learning rate of 6e-5.
+
+ ## Dataset
+ The model was fine-tuned on the CNN News dataset, which consists of news articles from categories such as sports, entertainment, and politics.
+
+ ## Performance
+ The fine-tuned model achieved the following metrics:
+
+ - Accuracy: 0.9597114707952147
+ - F1-score: 0.9589247895703302
+ - Recall: 0.9597114707952147
+ - Precision: 0.9589649408501851
+
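The card does not say how F1-score, recall, and precision were averaged across classes. The reported recall is identical to the accuracy, which is what support-weighted averaging produces, so the sketch below assumes weighted averaging. The function name `weighted_metrics` and the toy labels are hypothetical and only illustrate the calculation.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Assumption: per-class scores are averaged with weights proportional to
    the number of true examples of each class (the model card does not say
    which averaging was used).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    total = len(y_true)
    support = Counter(y_true)      # number of true examples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, n_true in support.items():
        tp = correct[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / n_true
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n_true / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(correct.values()) / total
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels, not taken from the CNN dataset
y_true = ["sport", "sport", "politics", "politics", "politics", "business"]
y_pred = ["sport", "politics", "politics", "politics", "business", "business"]
scores = weighted_metrics(y_true, y_pred)
```

With weighted averaging, recall reduces to the fraction of correctly classified examples, so `scores["recall"]` and `scores["accuracy"]` agree up to floating-point rounding.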
+ ## Usage
+ You can use this model to classify CNN news articles into categories such as sports, entertainment, and politics. Load the model with the Hugging Face Transformers library and use it to predict the class of a news article.
+
+ ```python
+ import torch
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ # Load the fine-tuned model and tokenizer
+ model_name = "IT-community/distilBART_cnn_news_text_classification"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForSequenceClassification.from_pretrained(model_name)
+ model.eval()
+
+ # Classify a news article
+ news_article = "A new movie is set to release this weekend"
+ inputs = tokenizer(news_article, padding=True, truncation=True, return_tensors="pt")
+ with torch.no_grad():  # inference only, no gradients needed
+     outputs = model(**inputs)
+ # Index of the highest-scoring class for the single input
+ predicted_class = outputs.logits.argmax(dim=-1).item()
+ ```
+
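The snippet above yields a class index rather than a category name. As a minimal sketch, the helper below turns one row of logits into a label and a probability using a softmax. The name `describe_prediction`, the example logits, and the example label names are hypothetical; the actual category names of this model are not listed in the card.

```python
import math


def describe_prediction(logits, id2label):
    """Convert one row of raw logits into a (label, probability) pair."""
    # Numerically stable softmax: subtract the max before exponentiating
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    # Fall back to the bare index if the mapping has no entry for it
    return id2label.get(best, str(best)), probs[best]


# Hypothetical logits and label names, for illustration only
example_logits = [0.2, 3.1, -1.0]
example_id2label = {0: "sport", 1: "entertainment", 2: "politics"}
label, prob = describe_prediction(example_logits, example_id2label)
```

With the model loaded as in the usage snippet, calling `describe_prediction(outputs.logits[0].tolist(), model.config.id2label)` should return the predicted category, provided the checkpoint's config defines label names; otherwise Transformers falls back to generic names such as `LABEL_0`.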
+ ## About Us
+
+ We are IT Community, a scientific club at Saad Dahleb Blida University, founded by students in 2016. We are interested in all IT fields.
+ This work was done by the IT Community Club.
+
+ ### Contributions
+
+ [Cherguelaine Ayoub](https://huggingface.co/AyoubChLin)
+
+ Added preprocessing code for CNN news articles
+
+ Improved model performance with additional fine-tuning on a larger dataset