Model description

This model is a fine-tuned version of flax-community/gpt-2-spanish, trained on a custom dataset that is not publicly available. The dataset consists of data crawled from 3 Spanish cooking websites and contains approximately 50,000 recipes. The model achieves the following results on the evaluation set:

  • Loss: 0.5796


How to use it

from transformers import AutoTokenizer, AutoModelForCausalLM

model_checkpoint = 'gastronomia-para-to2/gastronomia_para_to2'
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForCausalLM.from_pretrained(model_checkpoint)

The tokenizer makes use of the following special tokens to indicate the structure of the recipe:

special_tokens = [

The input should be of the form:

<RECIPE_START> <INPUT_START> ingredient_1 <NEXT_INPUT> ingredient_2 <NEXT_INPUT> ... <NEXT_INPUT> ingredient_n <INPUT_END> <INGR_START>
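The input format above can be assembled programmatically from a list of ingredients. The following is a minimal sketch: the helper name `build_prompt` and the sample ingredients are illustrative and not part of the original model card, and the single space between tokens is assumed from the template.

```python
def build_prompt(ingredients):
    """Join ingredients into the prompt layout shown above (hypothetical helper)."""
    # Strip surrounding whitespace and skip empty entries.
    cleaned = [item.strip() for item in ingredients if item.strip()]
    if not cleaned:
        raise ValueError("at least one ingredient is required")
    # Ingredients are separated by <NEXT_INPUT>; the prompt ends with
    # <INGR_START> so that the model continues from the ingredient list.
    body = " <NEXT_INPUT> ".join(cleaned)
    return f"<RECIPE_START> <INPUT_START> {body} <INPUT_END> <INGR_START>"


# Sample ingredients chosen only for illustration.
prompt = build_prompt(["huevos", "patatas", "cebolla"])
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a hand-written prompt.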

We are using the following configuration to generate recipes, but feel free to change parameters as needed:

tokenized_input = tokenizer(input, return_tensors='pt')
output = model.generate(**tokenized_input,
pre_output = tokenizer.decode(output[0], skip_special_tokens=False)

The recipe ends where the <RECIPE_END> special token appears for the first time.
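One way to apply this rule is to cut the decoded string at the first occurrence of the token. This is a sketch only: the helper name `truncate_at_recipe_end` is hypothetical, and the sample string is made up for illustration rather than being real model output.

```python
RECIPE_END = "<RECIPE_END>"


def truncate_at_recipe_end(decoded):
    """Return the text before the first <RECIPE_END> token (hypothetical helper)."""
    # str.partition splits at the first occurrence only; if the token is
    # absent, the whole string is returned unchanged.
    head, _sep, _tail = decoded.partition(RECIPE_END)
    return head.strip()


# Made-up decoded text, used only to show the truncation behaviour.
sample = "<RECIPE_START> texto de la receta <RECIPE_END> texto sobrante <RECIPE_END>"
print(truncate_at_recipe_end(sample))
```

If generation stops before the token is produced, the function returns the full text, so callers may want to check for the token explicitly.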

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 6
  • mixed_precision_training: Native AMP
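Two of these values are derived rather than independent: the total train batch size is the per-device batch size multiplied by the gradient accumulation steps, and a linear scheduler decays the learning rate over the course of training. The sketch below illustrates both. It assumes a decay to zero with no warmup (the card does not list warmup steps) and takes the total step count from the final row of the training results; the function names are illustrative.

```python
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 8
TOTAL_STEPS = 35382  # final optimizer step reported in the training results


def effective_batch_size():
    """Examples contributing to each optimizer update."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def linear_lr(step, warmup_steps=0):
    """Learning rate at a given step under linear decay to zero.

    warmup_steps defaults to 0, which is an assumption, not a value
    stated in the model card.
    """
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)


print(effective_batch_size())      # 1 * 8 = 8
print(linear_lr(0))                # starts at the configured learning rate
print(linear_lr(TOTAL_STEPS // 2)) # halfway through training
print(linear_lr(TOTAL_STEPS))      # reaches zero at the last step
```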

Training results

Training Loss   Epoch   Step    Validation Loss
0.6213          1.0     5897    0.6214
0.5905          2.0     11794   0.5995
0.5777          3.0     17691   0.5893
0.574           4.0     23588   0.5837
0.5553          5.0     29485   0.5807
0.5647          6.0     35382   0.5796
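The validation losses can be read as perplexities, assuming the reported loss is the mean per-token cross-entropy in nats (the usual convention for causal language model training, but not stated explicitly in this card). Under that assumption the final loss of 0.5796 corresponds to a perplexity of about 1.79. The sketch below also multiplies the steps per epoch by the total train batch size of 8 to estimate the number of training examples seen per epoch; this is an approximation.

```python
import math

# (epoch, optimizer step, validation loss) copied from the results table.
results = [
    (1, 5897, 0.6214),
    (2, 11794, 0.5995),
    (3, 17691, 0.5893),
    (4, 23588, 0.5837),
    (5, 29485, 0.5807),
    (6, 35382, 0.5796),
]

TOTAL_TRAIN_BATCH_SIZE = 8  # from the hyperparameter list


def perplexity(loss):
    """exp(loss), assuming loss is mean cross-entropy in nats per token."""
    return math.exp(loss)


for epoch, step, val_loss in results:
    print(f"epoch {epoch}: val loss {val_loss:.4f}, perplexity {perplexity(val_loss):.3f}")

# Rough number of training examples processed in one epoch.
steps_per_epoch = results[0][1]
approx_examples_per_epoch = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE
print(approx_examples_per_epoch)
```

The estimate of roughly 47,000 examples per epoch is in line with the dataset size of approximately 50,000 recipes mentioned in the model description.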

Framework versions

  • Transformers 4.17.0
  • PyTorch 1.11.0+cu102
  • Datasets 2.0.0
  • Tokenizers 0.11.6


The special tokens used to mark the recipe structure are taken from: RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation.
