Inference Issue

#10
by SRDdev - opened

I am a student learning to use Transformers on Hugging Face.

  • I am facing an issue creating a Hosted Inference API: the pipeline gives an "Unidentified feature_extractor" error while being built. To fix this, I manually edited preprocessor_config.json, which contained "image_processor_type": "ViTImageProcessor".
    I cross-checked with your file, which shows "feature_extractor_type": "ViTFeatureExtractor".

  • Another issue: if I manually change the file and the pipeline is built, then calling the pipeline with captioner("image.jpg") throws an error saying preprocess_fn() got an unexpected keyword argument 'images' (see the minimal sketch after this list).
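For reference, a minimal sketch of the pipeline call described above; the checkpoint name is an assumption based on this thread:

from transformers import pipeline

# Build the image-captioning pipeline and run it on a local image.
# The model id below is assumed; substitute your own checkpoint.
captioner = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")
print(captioner("image.jpg"))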

I am quite new to PyTorch and Hugging Face; it would be a great help if you could help me with this issue.

Thank you

NLP Connect org

@SRDdev

  1. I have checked this, and I am also getting the same error. This is because Hugging Face is replacing feature extractors with image processors; see https://github.com/huggingface/transformers/blob/7032e0203262ebb2ebf55da8d2e01f873973e835/src/transformers/models/vit/feature_extraction_vit.py#L29
     I believe there is a bug in the package, which is why we are getting this. For now, the solution is to change the key to "feature_extractor_type": "ViTFeatureExtractor", as you already did, so the model is able to load (a sketch of that edit follows this list).

  2. For the second issue, I am not getting any error when running inference with the pipeline. Again, it feels like a package version problem.
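A minimal sketch of that config edit, assuming preprocessor_config.json sits in your local copy of the model repo:

import json

config_path = "preprocessor_config.json"  # adjust to your local path

with open(config_path) as f:
    config = json.load(f)

# Swap the new-style key for the old-style one that the installed
# transformers version still expects.
config.pop("image_processor_type", None)
config["feature_extractor_type"] = "ViTFeatureExtractor"

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)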

Can I install a previous version of the transformers package to solve this issue?
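If the error comes from a version mismatch, pinning transformers to a release that predates the image-processor migration is one possibility; the exact version below is an assumption, not a tested fix:

pip install "transformers==4.25.1"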

The pipeline issue is resolved, but the generated sentence is not yet accurate. Is that only because I trained on the demo dataset from ydshieh/coco_dataset_script, or might there be some other issue?

NLP Connect org

Yes, that's right; it is a very small set.
You need to train on a larger set.

Thank you so much!
I had this issue for a week and today I got it solved!
Thanks again πŸ™πŸ»

SRDdev changed discussion status to closed
NLP Connect org

To train on a larger set, you can wrap it in a torch Dataset and pass it to the trainer.

import torch
from PIL import Image
from transformers import Seq2SeqTrainer, default_data_collator

# `tokenizer`, `feature_extractor`, `model`, `training_args`, `compute_metrics`,
# and the dataset `ds` are assumed to be defined elsewhere in the script
# (see the setup sketch after this block).

class ImageCaptioningDataset(torch.utils.data.Dataset):
    def __init__(self, ds, ds_type, max_target_length):
        self.ds = ds
        self.max_target_length = max_target_length
        self.ds_type = ds_type

    def __getitem__(self, idx):
        image_path = self.ds[self.ds_type]['image_path'][idx]
        caption = self.ds[self.ds_type]['caption'][idx]
        model_inputs = dict()
        model_inputs['labels'] = self.tokenization_fn(caption, self.max_target_length)
        model_inputs['pixel_values'] = self.feature_extraction_fn(image_path)
        return model_inputs

    def __len__(self):
        return len(self.ds[self.ds_type])

    # text preprocessing step
    def tokenization_fn(self, caption, max_target_length):
        """Tokenize the caption, padding and truncating to max_target_length."""
        labels = tokenizer(caption,
                           padding="max_length",
                           truncation=True,
                           max_length=max_target_length).input_ids

        return labels

    # image preprocessing step
    def feature_extraction_fn(self, image_path):
        """Run feature extraction on the image and return its pixel values."""
        image = Image.open(image_path)

        encoder_inputs = feature_extractor(images=image, return_tensors="np")

        return encoder_inputs.pixel_values[0]


train_ds = ImageCaptioningDataset(ds, 'train', 64)
eval_ds = ImageCaptioningDataset(ds, 'validation', 64)


# instantiate trainer; the feature extractor is passed as `tokenizer`
# so it is saved alongside the model checkpoints
trainer = Seq2SeqTrainer(
    model=model,
    tokenizer=feature_extractor,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    data_collator=default_data_collator,
)
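For completeness, a minimal sketch of the objects the snippet above assumes; the checkpoint and dataset identifiers are assumptions based on this thread, so adjust them to your own setup:

from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    Seq2SeqTrainingArguments,
    VisionEncoderDecoderModel,
    ViTFeatureExtractor,
)

checkpoint = "nlpconnect/vit-gpt2-image-captioning"  # assumed checkpoint
model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
feature_extractor = ViTFeatureExtractor.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# The COCO loading script expects the raw images to be downloaded locally;
# the data_dir path here is a placeholder.
ds = load_dataset("ydshieh/coco_dataset_script", "2017", data_dir="./coco")

compute_metrics = None  # plug in your own metric function if needed

training_args = Seq2SeqTrainingArguments(
    output_dir="./image-captioning-output",
    per_device_train_batch_size=8,
    num_train_epochs=3,
    evaluation_strategy="epoch",
    predict_with_generate=True,
)

Define these before building the datasets and trainer above, then start training with trainer.train().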
