# BART-Base XSum Summarization Model

## Model Description

This model is a sequence-to-sequence transformer based on the BART architecture. It was created by fine-tuning `facebook/bart-base` on the [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum) dataset, which consists of news articles paired with short summaries.

## Model Training Details

### Training Dataset

- **Dataset:** [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum)
- **Splits:**
  - **Train:** 204,045 examples (203,966 after filtering)
  - **Validation:** 11,332 examples (11,326 after filtering)
  - **Test:** 11,334 examples (11,331 after filtering)
- **Preprocessing:**
  - Documents and summaries are tokenized with the `facebook/bart-base` tokenizer.
  - Examples with very short documents or summaries are filtered out.
  - Inputs are truncated to a maximum of 1024 tokens for documents and 512 tokens for summaries (see the sketch below).

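A minimal sketch of this preprocessing, assuming the `datasets` library. The word-count thresholds for the filtering step are illustrative, since the exact cutoffs are not listed above:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")

# Illustrative thresholds; the exact cutoffs used to drop "very short"
# documents and summaries are an assumption.
MIN_DOC_WORDS = 10
MIN_SUMMARY_WORDS = 3

def keep_example(example):
    # Drop examples whose document or summary is very short.
    return (
        len(example["document"].split()) >= MIN_DOC_WORDS
        and len(example["summary"].split()) >= MIN_SUMMARY_WORDS
    )

def preprocess(batch):
    # Tokenize documents as inputs and summaries as labels, with truncation.
    model_inputs = tokenizer(batch["document"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=512, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

dataset = load_dataset("EdinburghNLP/xsum")
dataset = dataset.filter(keep_example)
tokenized = dataset.map(preprocess, batched=True, remove_columns=["document", "summary", "id"])
```
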
### Training Configuration

The model was fine-tuned using the `Seq2SeqTrainer` from the Hugging Face Transformers library with the following training arguments (sketched in code below):

- **Evaluation Strategy:** at the end of each epoch
- **Learning Rate:** 3e-5
- **Batch Size:**
  - **Training:** 16 per device
  - **Evaluation:** 32 per device
- **Gradient Accumulation Steps:** 1
- **Weight Decay:** 0.01
- **Number of Epochs:** 5
- **Warmup Steps:** 1000
- **Learning Rate Scheduler:** cosine
- **Label Smoothing Factor:** 0.1
- **Mixed Precision:** FP16 enabled
- **Prediction:** uses `predict_with_generate` to produce summaries during evaluation
- **Metric for Best Model:** `rougeL`

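The configuration above maps onto `Seq2SeqTrainingArguments` roughly as follows. This is a reconstruction from the listed values rather than the original training script; the output directory and checkpoint-saving settings are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-xsum",    # placeholder output directory
    eval_strategy="epoch",          # `evaluation_strategy` on older transformers versions
    save_strategy="epoch",          # assumed, so the best checkpoint can be retained
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=1,
    weight_decay=0.01,
    num_train_epochs=5,
    warmup_steps=1000,
    lr_scheduler_type="cosine",
    label_smoothing_factor=0.1,
    fp16=True,
    predict_with_generate=True,
    metric_for_best_model="rougeL",
    load_best_model_at_end=True,    # assumed; implied by metric_for_best_model
)
```
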
## Model Results

### Evaluation Metrics

After fine-tuning, the model achieved the following scores (see below for how such ROUGE values are computed):

| Metric     | Validation | Test    |
| ---------- | ---------- | ------- |
| Eval Loss  | 3.0508     | 3.0607  |
| ROUGE-1    | 39.2079    | 39.2149 |
| ROUGE-2    | 17.8686    | 17.7573 |
| ROUGE-L    | 32.4777    | 32.4190 |
| ROUGE-Lsum | 32.4734    | 32.4020 |

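A minimal sketch of the metric computation using the `evaluate` library's `rouge` metric, with a toy prediction/reference pair standing in for actual model outputs:

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy prediction/reference pair; in practice these come from
# `Seq2SeqTrainer.predict` with `predict_with_generate=True`.
predictions = ["Scientists have developed a cheaper, more efficient solar panel."]
references = ["A new solar panel technology promises lower costs and higher efficiency."]

# `compute` returns rouge1/rouge2/rougeL/rougeLsum as F-measures in [0, 1];
# multiplying by 100 gives the scale of the scores reported above.
scores = rouge.compute(predictions=predictions, references=references)
print({name: round(value * 100, 4) for name, value in scores.items()})
```
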
### Final Loss Values

- **Final Training Loss:** 2.9226
- **Final Validation Loss:** 3.0508

## Model Usage

You can use the model for summarization via the Hugging Face `pipeline` API. Below is an example:

```python
from transformers import pipeline

# Load the summarization pipeline with the fine-tuned model
summarizer = pipeline("summarization", model="Prikshit7766/bart-base-xsum")

# Input text to summarize
text = (
    "In a significant breakthrough in renewable energy, scientists have developed "
    "a novel solar panel technology that promises to dramatically reduce costs and "
    "increase efficiency. The new panels are lighter, more durable, and easier to install "
    "than conventional models, marking a major advancement in sustainable energy solutions. "
    "Experts believe this innovation could lead to wider adoption of solar power across residential "
    "and commercial sectors, ultimately reducing global reliance on fossil fuels."
)

# Generate the summary
summary = summarizer(text)[0]["summary_text"]
print("Generated Summary:", summary)
```

**Example Output:**

```
Generated Summary: Scientists at the University of California, Berkeley, have developed a new type of solar panel.
```

As is common for abstractive models trained on XSum, the generated summary can introduce details (here, the institution name) that do not appear in the input text.
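
If you need more control over decoding than the `pipeline` defaults, the model can also be loaded directly. A minimal sketch, where the `num_beams` and `max_length` values are illustrative choices rather than documented settings:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Prikshit7766/bart-base-xsum")
model = AutoModelForSeq2SeqLM.from_pretrained("Prikshit7766/bart-base-xsum")

text = (
    "Scientists have developed a novel solar panel technology that promises "
    "to dramatically reduce costs and increase efficiency."
)

# Truncate to the 1024-token input limit used during training.
inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors="pt")

# Beam search with a length cap; these decoding values are illustrative,
# not the settings used for the evaluation reported above.
summary_ids = model.generate(**inputs, num_beams=4, max_length=64, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```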