KipperDev committed on
Commit 7feee5f
1 Parent(s): 83bfb1b

Update README.md

Files changed (1)
  1. README.md +46 -37

README.md CHANGED
@@ -15,39 +15,71 @@ tags:
 
 [![Generic badge](https://img.shields.io/badge/STATUS-WIP-yellow.svg)](https://shields.io/)
 
- > ⚠️ ATTENTION
- >
- > NOTE THAT FOR THE MODEL TO WORK AS INTENDED, YOU NEED TO APPEND THE 'summarize:' PREFIX BEFORE THE INPUT DATA
 
 # Table of Contents
 
 1. [Model Details](#model-details)
- 2. [Uses](#uses)
 3. [Training Details](#training-details)
- 4. [Evaluation](#evaluation)
- 5. [How To Get Started With the Model](#how-to-get-started-with-the-model)
- 6. [Citation](#citation)
- 7. [Author](#model-card-authors)
 
 # Model Details
 
- This T5 model, named `KipperDev/t5_summarizer_model`, is fine-tuned specifically for the task of document summarization. It's based on the T5 architecture, renowned for its flexibility and efficiency across a wide range of NLP tasks, including summarization. This model aims to generate concise, coherent, and informative summaries from extensive text documents, leveraging the power of the T5's text-to-text approach.
 
- # Uses
 
- This model is intended for use in summarizing long-form documents into concise, informative abstracts. It's particularly useful for professionals and researchers who need to quickly grasp the essence of detailed reports, research papers, or articles without reading the entire text.
 
 # Training Details
 
 ## Training Data
 
- The model was trained using the [Big Patent Dataset](https://huggingface.co/datasets/big_patent), comprising 1.3 million US patent documents and their corresponding human-written summaries. This dataset was chosen for its rich language and complex structure, representative of the challenging nature of document summarization tasks. Training involved multiple subsets of the dataset to ensure broad coverage and robust model performance across varied document types.
 
 ## Training Procedure
 
- Training was conducted over three rounds, with initial settings including a learning rate of 0.00002, batch size of 8, and 4 epochs. Subsequent rounds adjusted these parameters to refine model performance further. A linear decay learning rate schedule was applied to enhance model learning efficiency over time.
 
- # Evaluation
 
 Model performance was evaluated using the ROUGE metric, highlighting its capability to generate summaries closely aligned with human-written abstracts.
 
@@ -64,29 +96,6 @@ Model performance was evaluated using the ROUGE metric, highlighting its capabil
 | Steps per Second | 0.336 |
 
 
- # How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- <details>
- <summary> Click to expand </summary>
-
- ```python
- from transformers import T5Tokenizer, T5ForConditionalGeneration
-
- tokenizer = T5Tokenizer.from_pretrained("KipperDev/t5_summarizer_model")
- model = T5ForConditionalGeneration.from_pretrained("KipperDev/t5_summarizer_model")
-
- # Example usage
- prefix = "summarize: "
- input_text = "Your input text here."
- input_ids = tokenizer.encode(prefix + input_text, return_tensors="pt")
- summary_ids = model.generate(input_ids)
- summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
-
- print(summary)
- ```
-
 # Citation
 
 **BibTeX:**
 
 
 [![Generic badge](https://img.shields.io/badge/STATUS-WIP-yellow.svg)](https://shields.io/)
 
+ [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)]()
 
 # Table of Contents
 
 1. [Model Details](#model-details)
+ 2. [Usage](#usage)
 3. [Training Details](#training-details)
+ 4. [Training Results](#training-results)
+ 5. [Citation](#citation)
+ 6. [Author](#model-card-authors)
 
 # Model Details
 
+ This variant of the [t5-small](https://huggingface.co/google-t5/t5-small) model is fine-tuned specifically for text summarization. It aims to generate concise, coherent, and informative summaries from extensive text documents, leveraging T5's text-to-text approach.
 
+ # Usage
 
+ This model is intended for summarizing long-form texts into concise, informative abstracts. It is particularly useful for professionals and researchers who need to quickly grasp the essence of detailed reports, research papers, or articles without reading the entire text.
+
+ ## Get Started
+
+ Install with `pip`:
+
+ ```bash
+ pip install transformers
+ ```
+
+ Use in Python:
+
+ ```python
+ from transformers import pipeline, AutoTokenizer, AutoModelForSeq2SeqLM
+
+ model_name = "KipperDev/t5_summarizer_model"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
+ summarizer = pipeline("summarization", model=model, tokenizer=tokenizer)
+
+ # Example usage: the "summarize: " prefix is required for the model to behave as intended
+ prefix = "summarize: "
+ input_text = "Your input text here."
+
+ summary = summarizer(prefix + input_text)[0]["summary_text"]
+ print(summary)
+ ```
+
+ **Note: for the model to work as intended, you need to prepend the `summarize: ` prefix to the input data.**
 
 # Training Details
 
 ## Training Data
 
+ The model was trained using the [Big Patent Dataset](https://huggingface.co/datasets/big_patent), comprising 1.3 million US patent documents and their corresponding human-written summaries. This dataset was chosen for its rich language and complex structure, representative of the challenging nature of document summarization tasks.
+
+ Training involved multiple subsets of the dataset to ensure broad coverage and robust model performance across varied document types.
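
A minimal sketch (not part of the original card) of how Big Patent subsets could be loaded with the `datasets` library. The subset codes `"d"` and `"g"` are arbitrary examples, since the card does not state which subsets were actually used, and exact loading behavior depends on the installed `datasets` version.

```python
# Illustrative only: load a couple of Big Patent CPC-section subsets.
# "description" (full patent text) and "abstract" (summary) are the
# dataset's published column names.
from datasets import load_dataset

subset_codes = ["d", "g"]  # arbitrary example subsets, not from the card
for code in subset_codes:
    train_ds = load_dataset("big_patent", code, split="train")
    print(code, len(train_ds), train_ds.column_names)
```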
 
 ## Training Procedure
 
+ Training was conducted over three rounds. The first round used a learning rate of 0.00002, a batch size of 8, and 4 epochs; subsequent rounds adjusted these to a learning rate of 0.0003, a batch size of 8, and 12 epochs to further refine performance. A linear decay learning rate schedule was applied to improve learning efficiency over time.
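
A minimal sketch (not part of the original card) of the first-round hyperparameters above, expressed as Hugging Face `Seq2SeqTrainingArguments`. The output directory name is an assumption, and the surrounding trainer/data wiring is omitted.

```python
# Sketch of the first-round settings described above; later rounds would
# swap in learning_rate=3e-4 and num_train_epochs=12.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5_summarizer_round1",  # hypothetical path
    learning_rate=2e-5,                 # round 1: 0.00002
    per_device_train_batch_size=8,
    num_train_epochs=4,                 # later rounds: 12
    lr_scheduler_type="linear",         # linear decay schedule
)
```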
 
+ # Training Results
 
 Model performance was evaluated using the ROUGE metric, highlighting its capability to generate summaries closely aligned with human-written abstracts.
 
 | Steps per Second | 0.336 |
 
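A minimal sketch (not part of the original card) of how ROUGE scores like those reported above can be computed with the `evaluate` library. The prediction and reference strings are placeholders, and `rouge_score` must be installed alongside `evaluate`.

```python
# Illustrative only: score generated summaries against reference abstracts.
# Requires: pip install evaluate rouge_score
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the model generates a short summary of the patent"]   # placeholder
references = ["a short human-written abstract of the patent"]         # placeholder

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys include rouge1, rouge2, rougeL, rougeLsum
```
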
 # Citation
 
 **BibTeX:**