---
language:
- es
tags:
- generated_from_trainer
- recipe-generation
widget:
- text: "<RECIPE_START> <INPUT_START> salmón <NEXT_INPUT> zumo de naranja <NEXT_INPUT> aceite de oliva <NEXT_INPUT> sal <NEXT_INPUT> pimienta <INPUT_END> <INGR_START>"
- text: "<RECIPE_START> <INPUT_START> harina <NEXT_INPUT> azúcar <NEXT_INPUT> huevos <NEXT_INPUT> chocolate <NEXT_INPUT> levadura Royal <INPUT_END> <INGR_START>"
inference:
  parameters:
    top_k: 50
    top_p: 0.92
    do_sample: true
    num_return_sequences: 3
    max_new_tokens: 100
---
# Model description
This model is a fine-tuned version of [flax-community/gpt-2-spanish](https://huggingface.co/flax-community/gpt-2-spanish) on a custom dataset (not publicly available). The dataset consists of data crawled from three Spanish cooking websites and contains approximately 50,000 recipes.
It achieves the following results on the evaluation set:
- Loss: 0.5796
## Contributors
- Julián Cendrero ([jucendrero](https://huggingface.co/jucendrero))
- Silvia Duque ([silBERTa](https://huggingface.co/silBERTa))
## How to use it
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_checkpoint = 'gastronomia-para-to2/gastronomia_para_to2'
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForCausalLM.from_pretrained(model_checkpoint)
```
The tokenizer makes use of the following special tokens to indicate the structure of the recipe:
```python
special_tokens = [
'<INPUT_START>',
'<NEXT_INPUT>',
'<INPUT_END>',
'<TITLE_START>',
'<TITLE_END>',
'<INGR_START>',
'<NEXT_INGR>',
'<INGR_END>',
'<INSTR_START>',
'<NEXT_INSTR>',
'<INSTR_END>',
'<RECIPE_START>',
'<RECIPE_END>']
```
The input should be of the form:
```
<RECIPE_START> <INPUT_START> ingredient_1 <NEXT_INPUT> ingredient_2 <NEXT_INPUT> ... <NEXT_INPUT> ingredient_n <INPUT_END> <INGR_START>
```
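For example, the salmon prompt from the widget examples above can be assembled from a plain list of ingredients (a minimal sketch; the variable name `input` is reused by the generation snippet below):
```python
# Build the prompt in the expected format from a list of ingredients
# (the ingredients are the ones from the widget example above)
ingredients = ['salmón', 'zumo de naranja', 'aceite de oliva', 'sal', 'pimienta']
input = '<RECIPE_START> <INPUT_START> ' + ' <NEXT_INPUT> '.join(ingredients) + ' <INPUT_END> <INGR_START>'
```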
We are using the following configuration to generate recipes, but feel free to change parameters as needed:
```python
# Tokenize the prompt and sample three candidate recipes
tokenized_input = tokenizer(input, return_tensors='pt')
output = model.generate(**tokenized_input,
                        max_length=600,
                        do_sample=True,
                        top_p=0.92,
                        top_k=50,
                        num_return_sequences=3)
# Keep the special tokens so the recipe structure can be parsed afterwards
pre_output = tokenizer.decode(output[0], skip_special_tokens=False)
```
The generated recipe ends at the first occurrence of the `<RECIPE_END>` special token.
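As a minimal post-processing sketch (not part of the original card), the decoded text can be truncated at that token:
```python
# Keep only the text generated up to the first <RECIPE_END> token
recipe = pre_output.split('<RECIPE_END>')[0]
```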
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6
- mixed_precision_training: Native AMP
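These fields follow the auto-generated `Trainer` summary format. Assuming a standard `transformers` `Trainer` setup (the training script is not included in this card), they map roughly to the following `TrainingArguments`; the `output_dir` is a hypothetical placeholder:
```python
from transformers import TrainingArguments

# Rough, illustrative mapping of the reported hyperparameters to TrainingArguments
training_args = TrainingArguments(
    output_dir='./gastronomia_para_to2',  # hypothetical output directory
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,        # effective train batch size of 8
    seed=42,
    lr_scheduler_type='linear',
    num_train_epochs=6,
    fp16=True,                            # native AMP mixed-precision training
)
```
Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the optimizer default in this setup, so it does not need to be set explicitly.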
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.6213 | 1.0 | 5897 | 0.6214 |
| 0.5905 | 2.0 | 11794 | 0.5995 |
| 0.5777 | 3.0 | 17691 | 0.5893 |
| 0.574 | 4.0 | 23588 | 0.5837 |
| 0.5553 | 5.0 | 29485 | 0.5807 |
| 0.5647 | 6.0 | 35382 | 0.5796 |
### Framework versions
- Transformers 4.17.0
- Pytorch 1.11.0+cu102
- Datasets 2.0.0
- Tokenizers 0.11.6
## References
The list of special tokens used to encode the recipe structure was taken from:
[RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation](https://www.aclweb.org/anthology/2020.inlg-1.4.pdf).