Need to Create a Custom Processor for vfusion3d

#3
by jadechoghari - opened

Problem Statement

Currently, to preprocess an image and obtain the source_camera, we need to manually run processor.py. However, this should be streamlined by creating a custom processor that can be used directly with the Hugging Face AutoProcessor.

Proposed Solution

Create a custom processor that can be loaded using the AutoProcessor class from Hugging Face. This processor should output both the processed image and the source_camera, which can then be directly fed into the model.

Example Usage

The desired workflow should look like this:

from transformers import AutoProcessor
from PIL import Image

# load the custom processor
processor = AutoProcessor.from_pretrained("jadechoghari/vfusion3d", trust_remote_code=True)

# load an input image (path is illustrative), then preprocess it
# to get both the processed image and the source camera
image = Image.open("input.png")
image, source_camera = processor(image)

# feed the processed image and source camera to the model
# (`model` is the vfusion3d model, loaded separately)
planes = model(image, source_camera)

Current Workflow

At present, we need to execute processor.py directly to get the image and source_camera as outputs - this is not ideal for integration with the HF ecosystem and adds unnecessary complexity to the workflow.

TODO

Implement a custom processor class for vfusion3d that integrates with the HF AutoProcessor. This will simplify the process and make the pipeline more user-friendly and efficient.

Summary: We would need the LRMImageProcessor from processing.py to take an image as input and output both the processed image and the source_camera. Let's register this class as an AutoProcessor to streamline the process!
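A minimal sketch of what such a processor's interface could look like: it accepts an image and returns a batched image tensor together with a source_camera vector. All shapes, sizes, and camera values below are illustrative assumptions, not the actual preprocessing logic in processing.py.

```python
import numpy as np

class LRMImageProcessor:
    """Illustrative sketch of the proposed processor interface."""

    def __init__(self, size=448):
        # target square resolution (assumed default, not from the source)
        self.size = size

    def __call__(self, image):
        # `image` is an HxWx3 uint8 array; scale to [0, 1] floats and add
        # a batch dimension, standing in for the real resize/crop pipeline
        pixel_values = np.asarray(image, dtype=np.float32) / 255.0
        pixel_values = pixel_values[None, ...]  # (1, H, W, 3)

        # placeholder canonical camera: a flattened 4x4 identity extrinsic
        # followed by fx, fy, cx, cy intrinsics (dummy values)
        extrinsic = np.eye(4, dtype=np.float32).reshape(-1)
        intrinsics = np.array([1.0, 1.0, 0.5, 0.5], dtype=np.float32)
        source_camera = np.concatenate([extrinsic, intrinsics])[None, ...]
        return pixel_values, source_camera

processor = LRMImageProcessor()
dummy = np.zeros((448, 448, 3), dtype=np.uint8)
image, source_camera = processor(dummy)
print(image.shape, source_camera.shape)  # (1, 448, 448, 3) (1, 20)
```

The key point is the two-value return, so the outputs can be passed straight to the model as in the example usage above.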

Fixed - added an AutoImageProcessor class with a preprocessor_config.json file.
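This fix relies on Transformers' remote-code loading: a preprocessor_config.json in the repo maps AutoImageProcessor to the custom class via auto_map. A minimal sketch, assuming the class is named LRMImageProcessor and lives in processing.py (the exact module path and class name are assumptions):

```json
{
  "image_processor_type": "LRMImageProcessor",
  "auto_map": {
    "AutoImageProcessor": "processing.LRMImageProcessor"
  }
}
```

With this file in place, AutoProcessor.from_pretrained(..., trust_remote_code=True) resolves and loads the custom class automatically.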

jadechoghari changed discussion status to closed
