Fine-tune

#4
by GehadAbokamar - opened

How can I fine-tune this model on a custom dataset?

Yes, it's definitely possible to fine-tune on (image, text) pairs.

Basically, each item of the dataset should be a pair of (pixel_values, labels), where the labels are the input_ids of the target sequence.
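As a rough illustration of that item format, here is a minimal sketch of a map-style dataset. The class name `ImageCaptionDataset` and the `encode_image` / `encode_text` callables are hypothetical placeholders for whatever image processor and tokenizer you use. Replacing padding positions with -100 is an assumption on my part: it is the value PyTorch's cross-entropy loss ignores by default, but it is not something this thread specifies.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple

IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default


class ImageCaptionDataset:
    """Map-style dataset whose items are {"pixel_values": ..., "labels": ...}.

    `encode_image` and `encode_text` are hypothetical callables supplied by the
    caller; they stand in for an image processor and a tokenizer.
    """

    def __init__(
        self,
        pairs: Sequence[Tuple[Any, str]],
        encode_image: Callable[[Any], Any],
        encode_text: Callable[[str], List[int]],
        pad_token_id: Optional[int] = None,
    ) -> None:
        self.pairs = pairs
        self.encode_image = encode_image
        self.encode_text = encode_text
        self.pad_token_id = pad_token_id

    def __len__(self) -> int:
        return len(self.pairs)

    def __getitem__(self, idx: int) -> Dict[str, Any]:
        image, caption = self.pairs[idx]
        # The labels are the input_ids of the target caption.
        input_ids = self.encode_text(caption)
        if self.pad_token_id is not None:
            # Assumption: mask padding so it does not contribute to the loss.
            # Caution: if the pad token is also the end-of-sequence token,
            # this masks the final EOS as well.
            input_ids = [
                IGNORE_INDEX if tok == self.pad_token_id else tok
                for tok in input_ids
            ]
        return {"pixel_values": self.encode_image(image), "labels": input_ids}


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any model downloads.
    vocab = {"a": 1, "cat": 2}

    def toy_encode_text(text: str, max_len: int = 4) -> List[int]:
        ids = [vocab[word] for word in text.split()]
        return ids + [0] * (max_len - len(ids))  # pad with id 0

    def toy_encode_image(image: List[int]) -> List[float]:
        return [value / 255 for value in image]

    dataset = ImageCaptionDataset(
        [([0, 255], "a cat")], toy_encode_image, toy_encode_text, pad_token_id=0
    )
    print(dataset[0]["labels"])  # [1, 2, -100, -100]
```

With Hugging Face Transformers, `encode_image` could plausibly be something like `lambda img: image_processor(img, return_tensors="pt").pixel_values[0]` and `encode_text` something like `lambda t: tokenizer(t, padding="max_length", truncation=True, max_length=32).input_ids`, with the labels converted to a tensor before batching. Treat those as a starting point to check against your own setup, not as the exact preprocessing used for this model.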

Thank you for helping^^

ankur310794 changed discussion status to closed

I tried to fine-tune but ran into several problems. I believe I need to name and preprocess the dataset fields properly, but I don't know how:

https://stackoverflow.com/questions/75713161/finetuning-vision-encoder-decoder-models-with-huggingface-causes-valueerror-exp?noredirect=1#comment133566069_75713161

Hi @ankur310794, I'm looking to fine-tune this model on a custom dataset, but these two links you provided are no longer valid. Are there any other resources to help with fine-tuning this model in PyTorch?

https://sachinruk.github.io/blog/pytorch/huggingface/2021/12/28/vit-to-gpt2-encoder-decoder-model.html

https://sachinruk.github.io/blog/pytorch/huggingface/2022/01/26/visionencoderdecoder-model-training.html
