Gibberish results with midrange token inputs?

#5
by jsawn - opened

Anyone running into nonsensical results with token inputs between 7k and 10k? Is there something I'm missing, or could it be something with my local build that is causing this? I've got about 20-25 classes defined, but I've made sure to keep my token count well below the 16k limit. Would love some insights, as I think this is an amazing tool. Thanks. EDIT: This is an NER task I'm trying to run.

HiTZ zentroa org

Hi @jsawn !

Unfortunately, the model was not trained or evaluated with such long sequences, so we cannot guarantee that the quality will remain the same.

@OSainz That makes sense. I decided to break up my guidelines and process them in batches of 2, that seems to have resolved my issue.
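For anyone hitting the same issue, the batching workaround can be sketched roughly as follows. This is a minimal illustration, not the model's actual API: `run_ner` is a hypothetical placeholder for whatever inference call you use, and the batch size of 2 matches what worked above.

```python
from typing import Callable

def batch_guidelines(guidelines: list[str], batch_size: int = 2) -> list[list[str]]:
    """Split the class guidelines into small batches so each prompt stays short."""
    return [guidelines[i:i + batch_size] for i in range(0, len(guidelines), batch_size)]

def run_batched_ner(
    text: str,
    guidelines: list[str],
    run_ner: Callable[[str, list[str]], list],  # hypothetical: your own inference call
    batch_size: int = 2,
) -> list:
    """Run inference once per guideline batch and merge the predicted entities."""
    entities = []
    for batch in batch_guidelines(guidelines, batch_size):
        entities.extend(run_ner(text, batch))
    return entities
```

Each call then sees only a couple of guideline definitions, which keeps the total prompt length well within the range the model was trained on.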

jsawn changed discussion status to closed
