---
license: artistic-2.0
language:
- en
library_name: transformers
pipeline_tag: text2text-generation
tags:
- code
- keyword-generation
- t5
- english
---

## KeywordGen-v1 Model

KeywordGen-v1 is a T5-based model fine-tuned to generate keywords from a passage of text. Given an input text, it returns the relevant keywords.

### Model details

This model was fine-tuned from the T5 base model on a custom dataset of texts paired with their corresponding keywords. It generates keywords by predicting the relevant words or phrases present in the input text.

## Important Usage Note

This model works best on longer inputs. For the most accurate results, I recommend inputs of at least 4-5 sentences. Shorter inputs may produce suboptimal keywords.

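One way to act on this recommendation is to check the input length before calling the model. The snippet below is a minimal sketch, not part of the model or the Transformers library: `count_sentences`, `is_long_enough`, and `MIN_SENTENCES` are hypothetical names, and the regex-based split is only a rough approximation of sentence boundaries.

```python
import re

# Hypothetical threshold: the lower bound of the 4-5 sentence recommendation
MIN_SENTENCES = 4


def count_sentences(text: str) -> int:
    """Roughly count sentences by splitting after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([part for part in parts if part])


def is_long_enough(text: str, min_sentences: int = MIN_SENTENCES) -> bool:
    """Return True if the text has at least `min_sentences` sentences."""
    return count_sentences(text) >= min_sentences


if not is_long_enough("I love going to the park."):
    print("Input is short; keyword quality may suffer.")
```

Abbreviations such as "e.g." will be counted as sentence ends, so treat the result as a hint rather than an exact count.
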
### How to use

You can use this model in your application through the Hugging Face Transformers library. Here is an example:

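```python
from transformers import T5TokenizerFast, T5ForConditionalGeneration

# Load the tokenizer and model
tokenizer = T5TokenizerFast.from_pretrained('mrutyunjay-patil/keywordGen-v1')
model = T5ForConditionalGeneration.from_pretrained('mrutyunjay-patil/keywordGen-v1')

# Define the input text (at least 4-5 sentences works best)
input_text = (
    "I love going to the park on weekends. "
    "The walking trails wind through old oak trees and around a small lake. "
    "Families gather near the playground while joggers circle the main path. "
    "In the evening, the park hosts outdoor concerts and food stalls."
)

# Encode the input text, truncating to the tokenizer's maximum length
input_ids = tokenizer.encode(input_text, return_tensors='pt', truncation=True)

# Generate the keywords (max_new_tokens caps the output length; 64 is an arbitrary choice)
outputs = model.generate(input_ids, max_new_tokens=64)

# Decode the outputs, dropping special tokens such as <pad> and </s>
keywords = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(keywords)
```
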
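The decoded `keywords` value is a single string, so an application will usually want to split it into a list. The sketch below assumes the model separates keywords with commas, which this card does not confirm; `parse_keywords` is a hypothetical helper, and the sample string is made up for illustration rather than real model output.

```python
def parse_keywords(decoded: str, separator: str = ",") -> list[str]:
    """Split a decoded string into unique, trimmed keywords, preserving order."""
    # Remove T5 special tokens in case decoding kept them
    for token in ("<pad>", "</s>"):
        decoded = decoded.replace(token, "")

    seen = set()
    result = []
    for part in decoded.split(separator):
        keyword = part.strip()
        key = keyword.lower()  # compare case-insensitively to drop duplicates
        if keyword and key not in seen:
            seen.add(key)
            result.append(keyword)
    return result


# Made-up string standing in for decoded model output (assumed comma-separated)
sample = "<pad> park, walking trails, Park, outdoor concerts</s>"
print(parse_keywords(sample))
```

If the model turns out to use a different delimiter, pass it through the `separator` argument.
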
### Limitations and bias

As this is the first version, the model might perform poorly on texts that differ substantially from its training data. It might also be biased towards the types of text or keywords that are overrepresented in that data.