	---
	language:
	- es
	tags:
	- generated_from_trainer
	- recipe-generation

	widget:
	- text: "<RECIPE_START> <INPUT_START> salmón <NEXT_INPUT> zumo de naranja <NEXT_INPUT> aceite de oliva <NEXT_INPUT> sal <NEXT_INPUT> pimienta <INPUT_END> <INGR_START>"
	- text: "<RECIPE_START> <INPUT_START> harina <NEXT_INPUT> azúcar <NEXT_INPUT> huevos <NEXT_INPUT> chocolate <NEXT_INPUT> levadura Royal <INPUT_END> <INGR_START>"
	inference:
	  parameters:
	    top_k: 50
	    top_p: 0.92
	    do_sample: True
	    num_return_sequences: 3
	    max_new_tokens: 100

	---

	# Model description

	This model is a fine-tuned version of [flax-community/gpt-2-spanish](https://huggingface.co/flax-community/gpt-2-spanish) on a custom dataset (not publicly available). The dataset was crawled from 3 Spanish cooking websites and contains approximately 50,000 recipes.
	It achieves the following results on the evaluation set:
	- Loss: 0.5796
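
For intuition, the evaluation loss can be converted to a perplexity. This is a minimal sketch, assuming the reported loss is the mean per-token cross-entropy in nats (the usual convention for causal language models, but not stated in this card):

```python
import math

eval_loss = 0.5796  # evaluation loss reported above

# Assumption: the loss is the mean per-token cross-entropy, measured in nats.
perplexity = math.exp(eval_loss)
print(f"perplexity: {perplexity:.3f}")
```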

	## Contributors

	- Julián Cendrero ([jucendrero](https://huggingface.co/jucendrero))
	- Silvia Duque ([silBERTa](https://huggingface.co/silBERTa))

	## How to use it

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM

	model_checkpoint = 'gastronomia-para-to2/gastronomia_para_to2'
	tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
	model = AutoModelForCausalLM.from_pretrained(model_checkpoint)
	```

	The tokenizer makes use of the following special tokens to indicate the structure of the recipe:

	```python
	special_tokens = [
	    '<INPUT_START>',
	    '<NEXT_INPUT>',
	    '<INPUT_END>',
	    '<TITLE_START>',
	    '<TITLE_END>',
	    '<INGR_START>',
	    '<NEXT_INGR>',
	    '<INGR_END>',
	    '<INSTR_START>',
	    '<NEXT_INSTR>',
	    '<INSTR_END>',
	    '<RECIPE_START>',
	    '<RECIPE_END>',
	]
	```

	The input should be of the form:

	```
	<RECIPE_START> <INPUT_START> ingredient_1 <NEXT_INPUT> ingredient_2 <NEXT_INPUT> ... <NEXT_INPUT> ingredient_n <INPUT_END> <INGR_START>
	```
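
The prompt format above can be assembled from a plain list of ingredients. The helper below is a small sketch; the name `build_prompt` is hypothetical and not part of this model's code:

```python
def build_prompt(ingredients):
    """Join ingredient names into the prompt format described above."""
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [item.strip() for item in ingredients if item.strip()]
    if not cleaned:
        raise ValueError("at least one ingredient is required")
    joined = " <NEXT_INPUT> ".join(cleaned)
    return f"<RECIPE_START> <INPUT_START> {joined} <INPUT_END> <INGR_START>"


prompt = build_prompt(["salmón", "zumo de naranja", "aceite de oliva", "sal", "pimienta"])
print(prompt)
```

The resulting string matches the first widget example of this card and can be passed directly to the tokenizer.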

	We are using the following configuration to generate recipes, but feel free to change parameters as needed:

	```python
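	# Prompt in the format shown above (taken from the first widget example)
	input_text = ('<RECIPE_START> <INPUT_START> salmón <NEXT_INPUT> zumo de naranja '
	              '<NEXT_INPUT> aceite de oliva <NEXT_INPUT> sal <NEXT_INPUT> pimienta '
	              '<INPUT_END> <INGR_START>')
	tokenized_input = tokenizer(input_text, return_tensors='pt')
	output = model.generate(**tokenized_input,
	                        max_length=600,
	                        do_sample=True,
	                        top_p=0.92,
	                        top_k=50,
	                        num_return_sequences=3)
	# output holds 3 sequences; decode the first one, keeping the special tokens
	pre_output = tokenizer.decode(output[0], skip_special_tokens=False)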
	```

	The recipe ends at the first occurrence of the `<RECIPE_END>` special token.
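
The truncation rule above, together with the special tokens, is enough to turn a decoded string into structured fields. The following is a sketch under assumptions: `extract_recipe` is a hypothetical helper, the sample string is hand-written rather than real model output, and the order of sections in generated text is not documented here, so the parser does not rely on it.

```python
import re


def extract_recipe(decoded):
    """Cut the text at the first <RECIPE_END> and split it into sections."""
    # Keep only what precedes the first <RECIPE_END>, if the token is present.
    body = decoded.split("<RECIPE_END>", 1)[0]

    def section(start, end, sep=None):
        pattern = re.escape(start) + r"(.*?)" + re.escape(end)
        match = re.search(pattern, body, re.DOTALL)
        if match is None:
            return None if sep is None else []
        text = match.group(1)
        if sep is None:
            return text.strip()
        return [part.strip() for part in text.split(sep) if part.strip()]

    return {
        "title": section("<TITLE_START>", "<TITLE_END>"),
        "ingredients": section("<INGR_START>", "<INGR_END>", "<NEXT_INGR>"),
        "instructions": section("<INSTR_START>", "<INSTR_END>", "<NEXT_INSTR>"),
    }


# Hand-written sample for illustration only, not real model output.
sample = (
    "<RECIPE_START> <INPUT_START> sal <INPUT_END> "
    "<INGR_START> 1 pizca de sal <NEXT_INGR> 2 huevos <INGR_END> "
    "<INSTR_START> Batir los huevos. <NEXT_INSTR> Añadir la sal. <INSTR_END> "
    "<TITLE_START> Huevos batidos <TITLE_END> <RECIPE_END> texto sobrante"
)
recipe = extract_recipe(sample)
print(recipe["title"])
```

Since generation uses `num_return_sequences=3`, the same function could be applied to each decoded sequence in `output`.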

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 1
	- eval_batch_size: 1
	- seed: 42
	- gradient_accumulation_steps: 8
	- total_train_batch_size: 8
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 6
	- mixed_precision_training: Native AMP
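
To illustrate what the `linear` scheduler implies for the learning rate, here is a minimal sketch. It assumes zero warmup steps (none are listed above) and uses the total of 35382 optimizer steps reported in the training results; `linear_lr` is a hypothetical helper, not the Trainer's own implementation.

```python
def linear_lr(step, base_lr=2e-05, total_steps=35382, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, at the end of epoch 3, and at the end of training.
for step in (0, 17691, 35382):
    print(step, linear_lr(step))
```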

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss |
	|:-------------:|:-----:|:-----:|:---------------:|
	| 0.6213        | 1.0   | 5897  | 0.6214          |
	| 0.5905        | 2.0   | 11794 | 0.5995          |
	| 0.5777        | 3.0   | 17691 | 0.5893          |
	| 0.574         | 4.0   | 23588 | 0.5837          |
	| 0.5553        | 5.0   | 29485 | 0.5807          |
	| 0.5647        | 6.0   | 35382 | 0.5796          |
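
The step counts in this table can be cross-checked against the hyperparameters. The arithmetic below is a sketch that assumes single-device training (so the effective batch size is just the per-device batch size times the accumulation steps); the resulting example count is approximate because the last accumulation step of an epoch may be partial.

```python
train_batch_size = 1
gradient_accumulation_steps = 8
steps_per_epoch = 5897  # Step column after epoch 1
num_epochs = 6

# Assumption: a single device, so there is no device-count multiplier.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
approx_examples_per_epoch = steps_per_epoch * total_train_batch_size
total_steps = steps_per_epoch * num_epochs

print(total_train_batch_size, approx_examples_per_epoch, total_steps)
```

Under these assumptions each epoch covers roughly 47,000 training examples, which is in line with a dataset of approximately 50,000 recipes if part of it was held out for evaluation (the split is not stated in this card).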


	### Framework versions

	- Transformers 4.17.0
	- PyTorch 1.11.0+cu102
	- Datasets 2.0.0
	- Tokenizers 0.11.6

	## References
	The list of special tokens used to mark the recipe structure was taken from:
	[RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation](https://www.aclweb.org/anthology/2020.inlg-1.4.pdf).