Need to Create a Custom Processor for vfusion3d
## Problem Statement

Currently, to preprocess an image and obtain the `source_camera`, we need to run `processor.py` manually. This should be streamlined by creating a custom processor that can be used directly with the Hugging Face `AutoProcessor`.
## Proposed Solution

Create a custom processor that can be loaded using the `AutoProcessor` class from Hugging Face. This processor should output both the processed image and the `source_camera`, which can then be fed directly into the model.
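A minimal sketch of what such a processor could look like, assuming it subclasses `BaseImageProcessor` so the Auto classes can load it. The class name `LRMImageProcessor` comes from the summary below; the default `size` and the placeholder camera tensor are illustrative only, and the real preprocessing and camera construction should be copied from `processor.py`:

```python
import numpy as np
import torch
from transformers.image_processing_utils import BaseImageProcessor


class LRMImageProcessor(BaseImageProcessor):
    """Preprocesses an image and builds the source camera for vfusion3d."""

    def __init__(self, size=448, **kwargs):
        super().__init__(**kwargs)
        self.size = size  # illustrative default, not the confirmed input size

    def preprocess(self, image, **kwargs):
        # Resize and scale the PIL image to a (1, 3, H, W) float tensor in [0, 1].
        image = image.convert("RGB").resize((self.size, self.size))
        pixel_values = torch.from_numpy(np.array(image)).permute(2, 0, 1).float() / 255.0
        pixel_values = pixel_values.unsqueeze(0)

        # Placeholder source camera; the real construction (canonical
        # extrinsics + intrinsics) lives in processor.py.
        source_camera = torch.zeros(1, 16)

        return pixel_values, source_camera
```

Since `BaseImageProcessor.__call__` dispatches to `preprocess`, calling `processor(image)` would return the `(image, source_camera)` pair shown in the example below.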
## Example Usage

The desired workflow should look like this:

```python
from transformers import AutoModel, AutoProcessor
from PIL import Image

# load the model and the custom processor
model = AutoModel.from_pretrained("jadechoghari/vfusion3d", trust_remote_code=True)
processor = AutoProcessor.from_pretrained("jadechoghari/vfusion3d", trust_remote_code=True)

# preprocess the image and get the source camera
image = Image.open("example.png")  # any input image
image, source_camera = processor(image)

# use the processed image and source camera with the model
planes = model(image, source_camera)
```
## Current Workflow

At present, we need to execute `processor.py` directly to get the `image` and `source_camera` outputs. This is not ideal for integration with the HF ecosystem and adds unnecessary complexity to the workflow.
## TODO

Implement a custom processor class for vfusion3d that integrates with the HF `AutoProcessor`. This will simplify the process and make the pipeline more user-friendly and efficient.
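One possible integration path, sketched under the assumption that the class above lives in an `image_processor.py` file inside the model repo (file names and output directory are placeholders, not the final layout): register the class for the Auto machinery and save its config, so `trust_remote_code=True` loading can pick it up.

```python
# Sketch: wire the custom class into the Auto* machinery and write its config.
# Assumes LRMImageProcessor is defined in image_processor.py in the model repo.
from image_processor import LRMImageProcessor

processor = LRMImageProcessor()

# Record which Auto class should load this custom processor; this adds an
# `auto_map` entry when the config is saved.
LRMImageProcessor.register_for_auto_class("AutoImageProcessor")

# Writes preprocessor_config.json; it can then be pushed to the Hub together
# with image_processor.py.
processor.save_pretrained("vfusion3d-processor")
```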
Summary: We need to run the `LRMImageProcessor` from `processing.py`, which takes an image as input and outputs both `image` and `source_camera`. Let's add this class as an `AutoProcessor` to streamline the process!
Fixed - added an `AutoImageProcessor` class with a `preprocessor_config.json` file.
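For reference, the saved `preprocessor_config.json` would need to point the Auto class at the custom module, roughly like this (the module and class names follow the sketches above and are assumptions about the final file layout):

```json
{
  "image_processor_type": "LRMImageProcessor",
  "auto_map": {
    "AutoImageProcessor": "image_processor.LRMImageProcessor"
  }
}
```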