JordiAb committed on
Commit 251abff
1 Parent(s): c139f93

Update README.md

Files changed (1)
  1. README.md +37 -6
README.md CHANGED
@@ -3,17 +3,48 @@ language:
  - en
  pipeline_tag: summarization
  ---
- News articles teacher-student abstractive summarizer model fine-tuned from BART-large and which used `StableBeluga-7B` as teacher.
- DataSet consists of 295,174 news articles scrapped from a Mexican Newspaper, along with its summary. For simplicity, the Spanish news articles were translated to English using `Helsinki-NLP/opus-mt-es-en` NLP model.
- Summaries teacher observations were created using `StableBeluga-7B`. The teacher observations are then used for fine tuning a BART lightweight model.
- The objective for this is to have a lightweight model that can perform summarization as good as `StableBeluga-7B`, much faster and with much less computing resources.
- We achieved very similar summary results (.66 ROUGE1 and .90 cosine similarity) on a validation DataSet with the lightweight BART model, 3x faster predictions and considerably less GPU memory usage.
- How to use:
+ # Model Overview
+ The News Articles Teacher-Student Abstractive Summarizer is a fine-tuned model based on BART-large, using StableBeluga-7B as the teacher model. It is designed to produce high-quality abstractive summaries of news articles while running faster and using far fewer computational resources than the teacher.

+ # Model Details
+ - Model Type: Abstractive Summarization
+ - Base Model: BART-large
+ - Teacher Model: StableBeluga-7B
+ - Language: English
+
+ # Dataset
+ - Source: 295,174 news articles scraped from a Mexican newspaper.
+ - Translation: The Spanish articles were translated to English using the Helsinki-NLP/opus-mt-es-en translation model (see the sketch after this list).
+ - Teacher Summaries: Generated by StableBeluga-7B.
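To make the translation step concrete, here is a minimal sketch using the publicly available Helsinki-NLP/opus-mt-es-en checkpoint through the `transformers` translation pipeline. The example sentences and the `max_length` setting are illustrative assumptions; the exact preprocessing used to build this dataset may have differed (for instance, long articles would need to be split into chunks that fit the model's input window).

```python
# Sketch of the Spanish -> English translation step with the public
# Helsinki-NLP/opus-mt-es-en checkpoint. Example text and max_length are
# illustrative assumptions, not the dataset's exact preprocessing.
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-es-en")

spanish_articles = [
    "El gobierno anunció un nuevo programa de infraestructura para 2024.",
    "La selección nacional ganó el partido de clasificación de ayer.",
]

# opus-mt models have a limited input window, so very long articles
# would need to be chunked before translation.
translations = translator(spanish_articles, max_length=512)
english_articles = [t["translation_text"] for t in translations]
print(english_articles[0])
```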
+ # Training
+ The fine-tuning process used the teacher observations (summaries) generated by StableBeluga-7B as training targets for a lightweight BART model. This approach aims to replicate the summarization quality of the teacher while achieving faster inference and lower GPU memory usage; a simplified sketch of the setup is shown below.
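The following is a minimal sketch of that teacher-student setup, assuming the public stabilityai/StableBeluga-7B and facebook/bart-large checkpoints and a simple instruction-style prompt. The prompt wording, generation settings, and single-example training step are illustrative assumptions, not the exact recipe used to train this model.

```python
# Illustrative teacher-student sketch (not the exact training recipe):
# 1) the teacher (StableBeluga-7B) generates a reference summary,
# 2) a lightweight BART-large student is fine-tuned on (article, teacher summary) pairs.
import torch
from transformers import (
    AutoModelForCausalLM, AutoTokenizer,
    BartForConditionalGeneration, BartTokenizerFast,
)

# --- Step 1: generate a teacher summary (prompt format is an assumption) ---
teacher_name = "stabilityai/StableBeluga-7B"  # assumed checkpoint id
teacher_tok = AutoTokenizer.from_pretrained(teacher_name)
teacher = AutoModelForCausalLM.from_pretrained(
    teacher_name, torch_dtype=torch.float16, device_map="auto"
)

article = "The city council approved a new budget on Tuesday ..."
prompt = f"### User:\nSummarize the following news article:\n{article}\n\n### Assistant:\n"
inputs = teacher_tok(prompt, return_tensors="pt").to(teacher.device)
with torch.no_grad():
    out = teacher.generate(**inputs, max_new_tokens=200)
teacher_summary = teacher_tok.decode(
    out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)

# --- Step 2: fine-tune BART-large on the (article, teacher summary) pair ---
student_tok = BartTokenizerFast.from_pretrained("facebook/bart-large")
student = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

enc = student_tok(article, truncation=True, max_length=1024, return_tensors="pt")
labels = student_tok(
    text_target=teacher_summary, truncation=True, max_length=256, return_tensors="pt"
)["input_ids"]
loss = student(
    input_ids=enc["input_ids"], attention_mask=enc["attention_mask"], labels=labels
).loss
loss.backward()  # in practice: an optimizer / Trainer loop over the full dataset
```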
+ # Performance
+ - Evaluation Metrics (measured on a validation set; see the sketch after this list):
+   - ROUGE-1: 0.66
+   - Cosine Similarity: 0.90
+ - Inference Speed: 3x faster than the teacher model (StableBeluga-7B)
+ - Resource Usage: Significantly less GPU memory than StableBeluga-7B
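Metrics like these can be computed with standard tooling; the sketch below uses the `evaluate` library for ROUGE-1 and `sentence-transformers` embeddings for cosine similarity. The choice of embedding model (all-MiniLM-L6-v2) and the comparison of student summaries against teacher summaries are assumptions; the original evaluation setup may have differed.

```python
# Sketch of the evaluation metrics: ROUGE-1 via the `evaluate` library and
# cosine similarity between sentence embeddings via sentence-transformers.
# Embedding model and aggregation choices here are assumptions.
import evaluate
from sentence_transformers import SentenceTransformer, util

student_summaries = ["The council approved the 2024 budget with a focus on transit."]
teacher_summaries = ["The city council passed the 2024 budget, prioritizing public transit."]

# ROUGE-1 between student summaries and teacher (reference) summaries
rouge = evaluate.load("rouge")
rouge_scores = rouge.compute(predictions=student_summaries, references=teacher_summaries)
print("ROUGE-1:", rouge_scores["rouge1"])

# Mean cosine similarity between embeddings of the paired summaries
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
student_emb = embedder.encode(student_summaries, convert_to_tensor=True)
teacher_emb = embedder.encode(teacher_summaries, convert_to_tensor=True)
cos = util.cos_sim(student_emb, teacher_emb).diagonal().mean().item()
print("Mean cosine similarity:", cos)
```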
+
+ # Objective
+ The primary goal of this model is to provide a lightweight summarization solution that maintains output quality similar to the teacher model (StableBeluga-7B) while operating with far greater efficiency, making it suitable for deployment in resource-constrained environments.

+ # Use Cases
+ This model is well suited to applications that require quick, efficient summarization of large volumes of news articles, particularly in settings where computational resources are limited.

+ # Limitations
+ - Language Translation: The initial Spanish-to-English translation may introduce inaccuracies that affect summarization quality.
+ - Domain Specificity: Because the model was fine-tuned specifically on news articles, performance may vary on texts from other domains.
+
+ # Future Work
+ Future improvements could include:

+ - Fine-tuning the model on bilingual data to remove the translation step.
+ - Expanding the dataset to cover a wider variety of news sources and topics.
+ - Exploring further optimizations to reduce inference time and resource usage.
+
+ # Conclusion
+ The News Articles Teacher-Student Abstractive Summarizer demonstrates that high-quality summaries can be delivered efficiently, making it a valuable tool for news content processing and similar applications.
+
+ # How to use:

  ```python