Can you provide the training code for this?
We are looking for training code for our use.
You can take a look at this demo notebook, illustrating fine-tuning this model on custom data: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/ViLT/Fine_tuning_ViLT_for_VQA.ipynb
This code throws an error when running the line:
encoding = processor.feature_extractor.pad_and_create_pixel_mask(pixel_values, return_tensors="pt")
'ViltImageProcessor' object has no attribute 'pad_and_create_pixel_mask'.
Hi,
The method is now called pad: https://github.com/huggingface/transformers/blob/35c04596f8938370dd5a2930fb724781f8ea35b0/src/transformers/models/vilt/image_processing_vilt.py#L296. Apologies for this, I will update my notebook.
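For anyone wondering what this padding step does conceptually, here is a rough NumPy sketch. It is not the transformers implementation: the helper name pad_with_pixel_mask is made up for illustration, and it assumes channel-first images that are zero-padded on the bottom and right up to the largest height and width in the batch, with a mask of 1 for real pixels and 0 for padding.

```python
import numpy as np


def pad_with_pixel_mask(images):
    """Hypothetical helper: pad (C, H, W) arrays to a common size and build a mask."""
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)

    padded, masks = [], []
    for img in images:
        _, h, w = img.shape
        # Zero-pad the bottom and right edges only (assumed convention).
        padded.append(np.pad(img, ((0, 0), (0, max_h - h), (0, max_w - w))))
        # Mask is 1 where the original image has pixels, 0 where padding was added.
        mask = np.zeros((max_h, max_w), dtype=np.int64)
        mask[:h, :w] = 1
        masks.append(mask)

    return {"pixel_values": np.stack(padded), "pixel_mask": np.stack(masks)}


# Two images of different sizes end up in one batch of shape (2, 3, 3, 4).
batch = pad_with_pixel_mask([np.ones((3, 2, 4)), np.ones((3, 3, 2))])
print(batch["pixel_values"].shape, batch["pixel_mask"].shape)
```

In the notebook itself you would keep using the processor's own method rather than a helper like this; the sketch is only meant to show why a pixel mask comes back alongside the padded pixel values.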
Hi, you can use .convert("RGB") to make the greyscale image RGB.
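A minimal sketch of that conversion with Pillow, using a synthetic greyscale image in place of a real file (the image size and pixel value here are made up for the example):

```python
from PIL import Image

# Stand-in for a greyscale image loaded from disk (mode "L" = 8-bit, one channel).
gray = Image.new("L", (4, 4), color=128)

# convert("RGB") returns a new 3-channel image; the grey value is copied to R, G and B.
rgb = gray.convert("RGB")

print(gray.mode, rgb.mode)
```

With a real dataset you would typically call this right after opening each image, e.g. Image.open(path).convert("RGB"), where path is whatever your image file is, so that every image has three channels before it goes to the processor.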