pipesanma/chasquilla-question-generator

Text2Text Generation

felipesanma committed on Oct 16, 2023

Commit b1863ac • 1 Parent(s): db66fc9

update readme model card

Files changed (1)

README.md +80 -1

README.md CHANGED

@@ -4,4 +4,83 @@ datasets:
 - squad
 language:
 - en
----

+---
+# Question Generator
+This model generates questions from a given context passage and a target answer contained in that passage.
+### Out-of-Scope Use
+The model supports English only; input in other languages is out of scope.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+```python
+import torch
+from transformers import T5ForConditionalGeneration, T5Tokenizer
+
+MODEL_NAME = "pipesanma/chasquilla-question-generator"
+
+
+def question_parser(question: str) -> str:
+    # The decoded output is expected to look like "question: <text>".
+    # Split on the first colon only, so a question that itself contains ":"
+    # stays intact, and fall back to the raw text if there is no colon.
+    _, sep, rest = question.partition(":")
+    return " ".join((rest if sep else question).split())
+
+
+def generate_questions_v2(context: str, answer: str, n_questions: int = 1):
+    # The model and tokenizer are reloaded on every call; load them once
+    # outside this function if you generate many questions.
+    model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)
+    tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+    model = model.to(device)
+    model.eval()
+
+    text = "context: " + context + " " + "answer: " + answer + " </s>"
+    # Truncate inputs longer than 512 tokens instead of passing them through.
+    encoding = tokenizer(
+        text, max_length=512, truncation=True, padding=True, return_tensors="pt"
+    )
+    input_ids = encoding["input_ids"].to(device)
+    attention_mask = encoding["attention_mask"].to(device)
+
+    # Beam search can return at most num_beams sequences.
+    num_beams = max(5, n_questions)
+    beam_outputs = model.generate(
+        input_ids=input_ids,
+        attention_mask=attention_mask,
+        max_length=72,
+        early_stopping=True,
+        num_beams=num_beams,
+        num_return_sequences=n_questions,
+    )
+
+    questions = []
+    for beam_output in beam_outputs:
+        sent = tokenizer.decode(
+            beam_output, skip_special_tokens=True, clean_up_tokenization_spaces=True
+        )
+        questions.append(question_parser(sent))
+    return questions
+
+
+context = "President Donald Trump said and predicted that some states would reopen this month."
+answer = "Donald Trump"
+questions = generate_questions_v2(context, answer, 1)
+print(questions)
+```
+## Training Details
+### Dataset generation
+The model was trained on the `squad` dataset from the Hugging Face `datasets` library.
+See [utils/dataset_gen.py](utils/dataset_gen.py) for the dataset generation code.
+### Training model
+See [utils/t5_train_model.py](utils/t5_train_model.py) for the training process.
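
The README names SQuAD as the training data but defers to `utils/dataset_gen.py`, which is not part of this diff. As a rough sketch of what such a preprocessing step might look like, the snippet below turns one SQuAD-style record into input/target text pairs. The input string mirrors the prompt built in the get-started code; the `question: ...` target format is an assumption inferred from `question_parser`, and the function name `squad_record_to_pairs` and the example question are hypothetical.

```python
from typing import Dict, List, Tuple


def squad_record_to_pairs(record: Dict) -> List[Tuple[str, str]]:
    """Build (input_text, target_text) pairs from one SQuAD-style record.

    Hypothetical helper: the real utils/dataset_gen.py may differ.
    """
    # Collapse runs of whitespace so prompts are single-line strings.
    context = " ".join(record["context"].split())
    question = " ".join(record["question"].split())

    pairs = []
    seen = set()
    # SQuAD stores reference answers under answers["text"]; the validation
    # split can list the same answer several times, so deduplicate.
    for answer in record["answers"]["text"]:
        answer = " ".join(answer.split())
        if not answer or answer in seen:
            continue
        seen.add(answer)
        # Same prompt layout as the get-started snippet in the README.
        input_text = "context: " + context + " " + "answer: " + answer + " </s>"
        # Assumed target layout, inferred from question_parser.
        target_text = "question: " + question
        pairs.append((input_text, target_text))
    return pairs


example = {
    "context": "President Donald Trump said and predicted that some states would reopen this month.",
    "question": "Who predicted that some states would reopen this month?",
    "answers": {"text": ["Donald Trump"], "answer_start": [10]},
}

for input_text, target_text in squad_record_to_pairs(example):
    print(input_text)
    print(target_text)
```

With the `datasets` library installed, the same function could be applied to each record returned by `load_dataset("squad")`, since those records carry the `context`, `question`, and `answers` fields used here.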