mrutyunjay-patil committed on
Commit
f39bcc9
1 Parent(s): 9040126

Update README.md

Files changed (1): README.md (+51 -1)
---
license: apache-2.0
pipeline_tag: text2text-generation
language:
- en
library_name: transformers
tags:
- code
- keyword-generation
- english
- t5
---

# Keyword Generator v2

## Model Description

This model, "KeywordGen-v2", is the second version of the "KeywordGen" series. It is fine-tuned from the T5-base model specifically to generate keywords from text inputs, with a particular focus on product reviews.

The model can surface key points or themes from product reviews. Outputs typically contain keywords of 2 to 8 words, and results are best when the input is at least 2-3 sentences long.

## How to use

You can use this model either through a `text2text-generation` pipeline or by loading the tokenizer and model directly. For best results, prefix your input with "Keyword: ".

Here's how to use the model in Python with the Hugging Face Transformers library:

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Initialize the tokenizer and model
tokenizer = T5Tokenizer.from_pretrained("mrutyunjay-patil/keywordGen-v2")
model = T5ForConditionalGeneration.from_pretrained("mrutyunjay-patil/keywordGen-v2")

# Define your input sequence, prefixing it with "Keyword: "
input_sequence = "Keyword: I purchased the new Android smartphone last week and I've been thoroughly impressed. The display is incredibly vibrant and sharp, and the battery life is surprisingly good, easily lasting a full day with heavy usage."

# Encode the input sequence
input_ids = tokenizer.encode(input_sequence, return_tensors="pt")

# Generate keywords and decode the output
outputs = model.generate(input_ids)
output_sequence = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(output_sequence)
```
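If you want individual keywords rather than the raw decoded string, a small post-processing step can split and clean the output. This is a minimal sketch under the assumption that the model joins keywords with commas; `split_keywords` is a hypothetical helper, and the separator may differ in practice:

```python
def split_keywords(output_sequence: str, sep: str = ",") -> list[str]:
    """Split a decoded model output into a deduplicated list of keywords.

    Assumes keywords are joined by `sep`; adjust if the model uses a
    different delimiter.
    """
    seen = set()
    keywords = []
    for part in output_sequence.split(sep):
        keyword = part.strip()
        # Skip empty fragments and case-insensitive duplicates
        if keyword and keyword.lower() not in seen:
            seen.add(keyword.lower())
            keywords.append(keyword)
    return keywords

# Example with a hypothetical decoded output:
print(split_keywords("vibrant display, battery life, battery life, Android smartphone"))
# → ['vibrant display', 'battery life', 'Android smartphone']
```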

## Training

This model was fine-tuned on a custom dataset, starting from the T5-base checkpoint.

## Limitations and Future Work

As with any machine learning model, the outputs of this keyword generator reflect the data it was trained on. The model may generate inappropriate or biased keywords if the input text contains such content. Future iterations will aim to improve robustness and fairness and to minimize potential bias.