filtering certain vocabs?

#1
by Noxilus - opened

Hello, how can I filter certain vocabs of this model? I'm importing it as expected using the from_hub function of doctr.models. Unfortunately, it's not possible to remove certain characters to make the model "German" or "Italian", and VOCABS was of no use either. I've had partial success with string post-processing by mapping certain characters, but it's still hit or miss.

Is it possible for you to change the weights and configuration to get a German-only vocab? Any help is appreciated.
Great work on the model :)
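
For reference, the string post-processing workaround mentioned above could look roughly like the sketch below. It is plain Python and does not touch the model or any docTR API. The character set `GERMAN_CHARS`, the substitution table `CONFUSIONS`, and the function name `restrict_to_vocab` are all made up for illustration, so adjust them to the characters you actually see in your predictions.

```python
import unicodedata

# Assumed German character set for illustration only (not taken from docTR).
GERMAN_CHARS = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "äöüÄÖÜß"
    " .,;:!?-()\"'/%&@€+"
)

# Hypothetical hand-written substitutions for characters that have no
# Unicode decomposition but look like an allowed character.
CONFUSIONS = {"ø": "o", "ł": "l", "đ": "d"}


def restrict_to_vocab(text, allowed=GERMAN_CHARS, confusions=CONFUSIONS):
    """Map or drop every character of `text` that is not in `allowed`."""
    # Compose first so that "u" + combining diaeresis becomes a single "ü".
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        if ch in allowed:
            out.append(ch)
        elif ch in confusions:
            out.append(confusions[ch])
        else:
            # Strip the diacritic: decompose and keep only the base character.
            base = unicodedata.normalize("NFD", ch)[0]
            if base in allowed:
                out.append(base)
            # Anything else is dropped silently.
    return "".join(out)


if __name__ == "__main__":
    print(restrict_to_vocab("Señor Müller aus Grøße"))
```

This only cleans up the predicted strings after recognition; the model can still spend probability on characters outside the target language, which is why the result stays hit or miss.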

Owner

Hey :)

Yeah, this was only a quick draft; I haven't found the time to improve it further atm ^^
Could you move your question to https://github.com/mindee/doctr/discussions please?

We have an open issue for this: https://github.com/mindee/doctr/issues/988
Unfortunately, I think we have to finish the current release before we can work on it (or, if you want, feel free to work on it yourself; we're always happy about new contributors) :D

Thanks for the fast response. Sure, I'm new to the docTR, OCR, and ML world. Right now I want something half-functional to take some pressure off the manual work of some colleagues; once the load gets lighter, it's definitely on my to-do list :)
