tokenizer configuration

by slvnwhrl - opened Apr 26, 2023

Apr 26, 2023

Hi,

thank you for providing this great open-source resource! :) Could you add a tokenizer_config.json file? I tried to run the model with transformers but ran into problems: the tokenizer does not handle longer sequences (i.e. >512 tokens), which causes errors.
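For context, here is a minimal sketch of what such a file might contain. It assumes that 512 is the model's maximum sequence length (inferred from the errors on longer inputs, not confirmed by the authors), and the helper name `write_tokenizer_config` is hypothetical. `model_max_length` is the key transformers reads from tokenizer_config.json; any other model-specific settings are left out because they are not known here.

```python
import json
from pathlib import Path

# Assumption: 512 is the model's maximum sequence length.
MAX_LEN = 512


def write_tokenizer_config(model_dir, max_len=MAX_LEN):
    """Write a minimal tokenizer_config.json into model_dir and return its path."""
    path = Path(model_dir) / "tokenizer_config.json"
    # Only the length limit is set; other tokenizer options are unknown here.
    config = {"model_max_length": max_len}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    import tempfile

    # Write the file into a throwaway directory and show its contents.
    with tempfile.TemporaryDirectory() as tmp:
        print(write_tokenizer_config(tmp).read_text(encoding="utf-8"))
```

With a file like this next to the vocabulary, calling the tokenizer with `truncation=True` should cut inputs at `model_max_length`. Until the file exists, passing `truncation=True, max_length=512` on each tokenizer call is a possible workaround.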

scheiblr

Technical University of Munich org Apr 26, 2023 • edited Apr 26, 2023

Thank you for this hint. I don't know when I will find time to do that, as I need to dig into it first. We computed that model back in 2020, and I had never heard of that file. Could you help me out with some resources or information about it?
