Example generation is not that great: `The attention mask and the pad token id were not set`

#1 opened by gardner

Using the code from the model card generates the following result:

trainable params: 175,104 || all params: 381,026,304 || trainable%: 0.04595588235294118
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Generated: 

def hello_world
    #   # rubocop:disable Style/NegatedIf
    #   # rubocop:disable Metrics/MethodLength
    #   # rubocop:disable Style/NegatedIf
    #   # rubocop:disable Metrics/CyclomaticComplexity
    #   # rubocop:disable Metrics/CyclomaticComplexity
    #   # rubocop:disable Metrics/

This doesn't look promising for integrating into a workflow.
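For reference, the warning itself can be silenced by passing the tokenizer's `attention_mask` and an explicit `pad_token_id` to `generate()`. A minimal sketch of that setup (the model ID below is a placeholder for the actual checkpoint from the model card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder ID; substitute the checkpoint from the model card.
model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# GPT-2-style models define no pad token; reuse EOS so padding is well-defined.
tokenizer.pad_token = tokenizer.eos_token

inputs = tokenizer("def hello_world", return_tensors="pt")
outputs = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],  # silences the attention-mask warning
    pad_token_id=tokenizer.eos_token_id,      # explicit instead of the implicit 50256
    max_new_tokens=64,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

This removes the warning, though it does not by itself address the repetitive output.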

Ruby's syntax is similar to Python's (the language the base model was originally fine-tuned on), which may be confusing the model. I would also suggest tinkering with the generation configuration for more coherent output.
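For example, enabling sampling with a repetition penalty can break the kind of loop shown above. A rough sketch, reusing `model`, `tokenizer`, and `inputs` from the snippet earlier; the parameter values are illustrative, not tuned:

```python
outputs = model.generate(
    **inputs,
    do_sample=True,           # sample instead of greedy decoding
    temperature=0.7,
    top_p=0.95,
    repetition_penalty=1.2,   # discourages the repeated rubocop comments
    max_new_tokens=64,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```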

Overall, the model is still very limited, but we aim to improve performance by increasing the number of trainable parameters and the amount of training data in future iterations of this work.
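The trainable-parameter count in the log (175,104 of 381M) suggests a PEFT/LoRA setup; if so, one way to raise it is a larger LoRA rank and additional target modules. A speculative sketch assuming the `peft` library and a GPT-2-style base model; the module names and values are assumptions, not the actual training config:

```python
from peft import LoraConfig, get_peft_model

# Assumed config; the real training setup may differ.
lora_config = LoraConfig(
    r=32,                       # larger rank => more trainable parameters
    lora_alpha=64,
    target_modules=["c_attn"],  # GPT-2 attention projection; adjust per model
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()
```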

Thanks for your comment!

gardner changed discussion status to closed
