---
license: apache-2.0
datasets:
- AyoubChLin/CNN_News_Articles_2011-2022
language:
- en
metrics:
- f1
- accuracy
pipeline_tag: zero-shot-classification
tags:
- zero shot
- text classification
- news classification
---

# Huggingface Model: BART-MNLI-ZeroShot-Text-Classification

This is a Huggingface model fine-tuned on the CNN news dataset for the zero-shot text classification task using BART-MNLI. It achieves an F1 score of 94% and an accuracy of 94% on the CNN test set (see Evaluation Metrics below).

## Authors
This work was done by [CHERGUELAINE Ayoub](https://www.linkedin.com/in/ayoub-cherguelaine/) & [BOUBEKRI Faycal](https://www.linkedin.com/in/faycal-boubekri-832848199/).

## Model Architecture
The model architecture is based on the BART-MNLI transformer model. BART (Bidirectional and Auto-Regressive Transformers) is a denoising autoencoder pre-trained on a large text corpus and fine-tuned on downstream natural language processing tasks; the MNLI variant is additionally trained for natural language inference, which is what makes zero-shot classification possible.
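
NLI-based zero-shot classification works by posing each candidate label as an entailment hypothesis against the input text, then comparing entailment scores across labels. The pair construction can be sketched without loading the model; this is a minimal illustration, and the template string mirrors the default used by the Transformers zero-shot pipeline:

```python
def build_nli_pairs(text, candidate_labels, template="This example is {}."):
    """Turn each candidate label into an entailment hypothesis.

    The NLI model scores every (premise, hypothesis) pair for entailment,
    and the entailment probabilities are then compared across labels.
    """
    return [(text, template.format(label)) for label in candidate_labels]

pairs = build_nli_pairs("Apple unveiled a new chip today.", ["tech", "sports"])
```

The label with the highest entailment probability becomes the prediction, which is why the model needs no training examples for the candidate labels themselves.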

## Dataset
The CNN news dataset was used for fine-tuning. It contains news articles from the CNN website, each labeled with a category: politics, health, entertainment, tech, travel, world, or sports.

## Fine-tuning Parameters
The model was fine-tuned for 1 epoch with a maximum sequence length of 256 tokens. Training took approximately 6 hours to complete.

## Evaluation Metrics
On the CNN test set, evaluated with a maximum sequence length of 128 tokens, the model achieved an F1 score of 94% and an accuracy of 94%.
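
For reference, both reported metrics can be reproduced from lists of gold and predicted labels. The sketch below uses plain Python with hypothetical helper names, and computes F1 macro-averaged across categories (the card does not state which averaging was used, so this is one reasonable reading):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Per-class F1, averaged uniformly over all classes seen in either list."""
    labels = set(y_true) | set(y_pred)
    f1s = []
    for label in labels:
        tp = sum(t == p == label for t, p in zip(y_true, y_pred))
        fp = sum(p == label and t != label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)
```

With perfectly matching lists both functions return 1.0; any label confusion lowers the per-class F1 for both of the confused classes.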

# Usage
The model can be used for zero-shot text classification of news articles via the Huggingface Transformers library:

```python
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned checkpoint and its tokenizer.
tokenizer = AutoTokenizer.from_pretrained("AyoubChLin/DistilBart_cnn_zeroShot")
model = AutoModelForSequenceClassification.from_pretrained("AyoubChLin/DistilBart_cnn_zeroShot")

# Build a zero-shot classification pipeline.
# device=0 selects the first GPU; omit it (or pass device=-1) to run on CPU.
classifier = pipeline(
    "zero-shot-classification",
    model=model,
    tokenizer=tokenizer,
    device=0,
)

# Example call (hypothetical input); candidate labels can be any strings.
result = classifier(
    "The president met with foreign leaders to discuss trade policy.",
    candidate_labels=["politics", "tech", "sports"],
)
print(result["labels"][0])  # highest-scoring label
```

## Acknowledgments
We would like to thank the Huggingface team for their open-source implementation of transformer models, and the creators of the CNN news dataset for the labeled data used in fine-tuning.