	---
	tags:
	- Question(s) Generation
	metrics:
	- rouge
	model-index:
	- name: consciousAI/question-generation-auto-hints-t5-v1-base-s-q
	  results: []
	---

	# Auto Question Generation
	This model is intended for automatic and/or hint-enabled question generation. Given a context passage, it is expected to produce one or more questions.

	[Live Demo: Question Generation](https://huggingface.co/spaces/consciousAI/question_generation)

	Five models, including this one, were trained on different training sets. The demo compares all of them side by side, and the individual models are available at the links below:

	[Auto Question Generation v1](https://huggingface.co/consciousAI/question-generation-auto-t5-v1-base-s)

	[Auto Question Generation v2](https://huggingface.co/consciousAI/question-generation-auto-t5-v1-base-s-q)

	[Auto Question Generation v3](https://huggingface.co/consciousAI/question-generation-auto-t5-v1-base-s-q-c)

	[Auto/Hints based Question Generation v2](https://huggingface.co/consciousAI/question-generation-auto-hints-t5-v1-base-s-q-c)

	The model can be used as follows:

	```python
	import torch
	from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

	model_checkpoint = "consciousAI/question-generation-auto-hints-t5-v1-base-s-q"

	# Use a GPU when one is available, otherwise fall back to CPU
	device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

	model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint).to(device)
	tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

	## Input with prompt: replace <context> with the source passage
	context = "question_context: <context>"
	encodings = tokenizer.encode(context, return_tensors='pt', truncation=True, padding='max_length').to(device)

	## Many generation hyperparameters can be tuned to condition the output; see the demo
	output = model.generate(
	    encodings,
	    # max_length=300,
	    # min_length=20,
	    # length_penalty=2.0,
	    num_beams=4,
	    # early_stopping=True,
	    # do_sample=True,
	    # temperature=1.1
	)

	## Multiple questions are expected to be delimited by '?'. A small wrapper can format them; see the demo.
	## With skip_special_tokens=False, the decoded text keeps special tokens such as <pad> and </s>.
	questions = [tokenizer.decode(ids, clean_up_tokenization_spaces=False, skip_special_tokens=False) for ids in output]
	```
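
The snippet above notes that multiple questions are delimited by `?` and that a small wrapper can format them. A minimal sketch of such a wrapper follows. It assumes the decoded string may still contain the T5 special tokens `<pad>`, `</s>` and `<unk>`. The name `split_questions` is hypothetical and is not taken from the demo.

```python
import re
from typing import List

# Assumption: these are the special tokens left in the text when decoding
# with skip_special_tokens=False.
SPECIAL_TOKENS = ("<pad>", "</s>", "<unk>")


def split_questions(decoded: str) -> List[str]:
    """Split one decoded output string into individual questions."""
    for token in SPECIAL_TOKENS:
        decoded = decoded.replace(token, " ")

    questions: List[str] = []
    # Each match is a run of non-'?' characters followed by one '?'.
    # Trailing text that does not end in '?' is dropped.
    for part in re.findall(r"[^?]+\?", decoded):
        question = " ".join(part.split())  # collapse repeated whitespace
        if question != "?" and question not in questions:
            questions.append(question)
    return questions


print(split_questions("<pad> Who wrote Hamlet? When was it written?</s>"))
```

Each string in `questions` from the snippet above could be passed through this helper. Duplicate questions are removed while the original order is kept.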

	## Training and evaluation data

	A combination of SQuAD and QNLI.

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0003
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 10
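
To illustrate the `lr_scheduler_type: linear` setting listed above, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup, since no warmup value is listed, and takes the total step count (145150) from the last row of the training results table. `linear_lr` is a hypothetical helper, not a library function.

```python
BASE_LR = 0.0003        # learning_rate from the list above
TOTAL_STEPS = 145150    # final step reported after 10 epochs


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear decay to zero, with optional linear warmup.

    warmup_steps=0 is an assumption; the model card does not state a value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, at the end of epoch 5, and at the final step
for step in (0, 72575, 145150):
    print(step, linear_lr(step))
```

Under these assumptions the rate falls to half its initial value by the end of epoch 5 and reaches zero at the final step.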

	### Training results

	| Training Loss | Epoch | Step   | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
	|:-------------:|:-----:|:------:|:---------------:|:------:|:------:|:------:|:---------:|
	| 1.8298        | 1.0   | 14515  | 1.7529          | 0.3535 | 0.1825 | 0.3251 | 0.3294    |
	| 1.4931        | 2.0   | 29030  | 1.7132          | 0.3558 | 0.1881 | 0.3267 | 0.3308    |
	| 1.2756        | 3.0   | 43545  | 1.7579          | 0.3604 | 0.1901 | 0.3307 | 0.3345    |
	| 1.0936        | 4.0   | 58060  | 1.8173          | 0.36   | 0.1901 | 0.3295 | 0.3334    |
	| 0.955         | 5.0   | 72575  | 1.9204          | 0.3611 | 0.1884 | 0.3295 | 0.3336    |
	| 0.8117        | 6.0   | 87090  | 2.0183          | 0.355  | 0.1836 | 0.3241 | 0.3282    |
	| 0.6949        | 7.0   | 101605 | 2.1347          | 0.3556 | 0.1836 | 0.3242 | 0.3282    |
	| 0.636         | 8.0   | 116120 | 2.2567          | 0.3568 | 0.1855 | 0.3248 | 0.3286    |
	| 0.591         | 9.0   | 130635 | 2.3598          | 0.3563 | 0.1844 | 0.3238 | 0.3281    |
	| 0.5417        | 10.0  | 145150 | 2.4725          | 0.3556 | 0.1828 | 0.3229 | 0.3269    |
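
In the table above, training loss falls every epoch while validation loss is lowest early on. The sketch below copies the rows into a list and reports the best epoch for a chosen metric. The names `ROWS`, `COLUMNS` and `best_epoch` are illustrative, and the sketch does not say which checkpoint was actually published.

```python
# (epoch, training_loss, validation_loss, rouge1, rouge2, rougeL, rougeLsum)
ROWS = [
    (1, 1.8298, 1.7529, 0.3535, 0.1825, 0.3251, 0.3294),
    (2, 1.4931, 1.7132, 0.3558, 0.1881, 0.3267, 0.3308),
    (3, 1.2756, 1.7579, 0.3604, 0.1901, 0.3307, 0.3345),
    (4, 1.0936, 1.8173, 0.3600, 0.1901, 0.3295, 0.3334),
    (5, 0.9550, 1.9204, 0.3611, 0.1884, 0.3295, 0.3336),
    (6, 0.8117, 2.0183, 0.3550, 0.1836, 0.3241, 0.3282),
    (7, 0.6949, 2.1347, 0.3556, 0.1836, 0.3242, 0.3282),
    (8, 0.6360, 2.2567, 0.3568, 0.1855, 0.3248, 0.3286),
    (9, 0.5910, 2.3598, 0.3563, 0.1844, 0.3238, 0.3281),
    (10, 0.5417, 2.4725, 0.3556, 0.1828, 0.3229, 0.3269),
]

COLUMNS = {
    "training_loss": 1,
    "validation_loss": 2,
    "rouge1": 3,
    "rouge2": 4,
    "rougeL": 5,
    "rougeLsum": 6,
}


def best_epoch(metric: str, lower_is_better: bool = False) -> int:
    """Return the epoch with the best value; ties go to the earliest epoch."""
    index = COLUMNS[metric]
    pick = min if lower_is_better else max
    return pick(ROWS, key=lambda row: row[index])[0]


print(best_epoch("validation_loss", lower_is_better=True))  # 2
print(best_epoch("rouge1"))                                 # 5
print(best_epoch("rougeL"))                                 # 3
```

Validation loss is lowest at epoch 2, and the ROUGE scores peak between epochs 3 and 5. Both point to overfitting in the later epochs.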


	### Framework versions

	- Transformers 4.23.0.dev0
	- PyTorch 1.12.1+cu113
	- Datasets 2.5.2
	- Tokenizers 0.13.0