Anshoo Mehra committed
Commit f5fae08
1 parent: 7116384

Update README.md

Files changed (1):
  1. README.md +14 -11
README.md CHANGED
@@ -1,22 +1,25 @@
 ---
 tags:
-- generated_from_trainer
+- Question(s) Generation
 metrics:
 - rouge
 model-index:
-- name: anshoomehra/t5-v1-base-s2-auto-qgen
+- name: anshoomehra/question-generation-auto-t5-v1-base-s
 results: []
 ---
 
-# t5-v1-base-s2-auto-qgen
-
-## Model description
-
-This model was fine-tuned from base t5 v1.1 on SQUAD2 for auto-question generation(i.e. without hints).
-
-## Intended uses & limitations
-
-The model is expected to produce one or possibly more than one question from provided context. If you are looking for model which receive hints as input or combination, these will be added soon and the link will be provided here: ##)
+# Auto Question Generation
+The model is intended for the Auto Question Generation task, i.e. no hints are required as input. It is expected to produce one or more questions from the provided context.
+
+[Live Demo: Question Generation](https://huggingface.co/spaces/anshoomehra/question_generation)
+
+Including this one, there are four models trained on different training sets; the demo compares them all in one go. The individual projects are available at the links below:
+
+[Auto Question Generation v2](https://huggingface.co/anshoomehra/question-generation-auto-t5-v1-base-s-q)
+
+[Auto/Hints based Question Generation v1](https://huggingface.co/anshoomehra/question-generation-auto-hints-t5-v1-base-s-q)
+
+[Auto/Hints based Question Generation v1](https://huggingface.co/anshoomehra/question-generation-auto-hints-t5-v1-base-s-q-c)
 
 This model can be used as below:
 
@@ -26,7 +29,7 @@ from transformers import (
 AutoTokenizer
 )
 
-model_checkpoint = "anshoomehra/t5-v1_1-base-squadV2AutoQgen"
+model_checkpoint = "anshoomehra/question-generation-auto-t5-v1-base-s"
 
 model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)
 tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
@@ -35,7 +38,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
 context="question_context: <context>"
 encodings = tokenizer.encode(context, return_tensors='pt', truncation=True, padding='max_length').to(device)
 
-## You can play with many hyperparams to condition the output
+## You can play with many hyperparams to condition the output; see the demo
 output = model.generate(encodings,
 #max_length=300,
 #min_length=20,
@@ -46,7 +49,7 @@ output = model.generate(encodings,
 #temperature=1.1
 )
 
-## Multiple questions are expected to be delimited by </s>
+## Multiple questions are expected to be delimited by '?'; a small wrapper can format them elegantly. See the demo.
 questions = [tokenizer.decode(id, clean_up_tokenization_spaces=False, skip_special_tokens=False) for id in output]
 ```
 
@@ -65,7 +68,7 @@ The following hyperparameters were used during training:
 - num_epochs: 10
 
 ### Training results
-Rouge metrics is heavily penalized because of multiple questions in target sample space.
+Rouge metrics are heavily penalized because of multiple questions in the target sample space.
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
 |:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|
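The updated README notes that multiple questions come back in a single decoded string delimited by '?', and suggests a small wrapper to format them. A minimal sketch of such a wrapper (the helper name `split_questions` is an assumption, not part of the model card):

```python
def split_questions(decoded: str) -> list[str]:
    # The model emits several questions in one string, delimited by '?'.
    # Re-attach the delimiter to each fragment and drop empty pieces.
    return [q.strip() + "?" for q in decoded.split("?") if q.strip()]

# Hypothetical decoded output, for illustration only.
decoded = "Who designed the tower? When was it completed?"
print(split_questions(decoded))
```

Since the snippet decodes with `skip_special_tokens=False`, you may also want to strip tokens such as `<pad>` and `</s>` from the decoded string before splitting.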