microsoft/Florence-2-large · Which image format is preferred? Error during inference

Jun 26

It seems some image formats (e.g. PNG) are not working well. Inference fails with:

  File "/code/.local/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 320, in preprocess
    input_data_format = infer_channel_dimension_format(images[0])
  File "/code/.local/lib/python3.10/site-packages/transformers/image_utils.py", line 209, in infer_channel_dimension_format
    raise ValueError("Unable to infer channel dimension format")

Is there a spec somewhere which defines which format is preferred by the model?

QiuQiuShouLing

Jun 27

Huh? Is this with the inference code? With the inference code, passing PNG files works fine for me.

haipingwu

Microsoft org Jun 29

edited Jun 29

Hi, can you check whether your image is in a three-channel color mode (BGR/RGB)?
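One way to run that check, as a minimal sketch: Pillow exposes the mode as `image.mode` and the channel names via `image.getbands()`. The `describe_image` helper and the synthetic images below are hypothetical stand-ins for illustration; they are not taken from the Florence-2 inference code.

```python
from PIL import Image


def describe_image(image):
    """Return the Pillow mode string and the tuple of band names."""
    return image.mode, image.getbands()


# Synthetic stand-ins (assumption: your real image comes from Image.open(...)).
rgba = Image.new("RGBA", (4, 4), (255, 0, 0, 128))  # PNG-style image with alpha
gray = Image.new("L", (4, 4), 200)                  # single-channel grayscale

print(describe_image(rgba))
print(describe_image(gray))
print(describe_image(rgba.convert("RGB")))
```

Anything other than a three-band result (for example `RGBA`, `LA`, `L` or `P`) is worth converting before it reaches the processor.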

skye0402

Jun 29

@haipingwu can you check this attached png? It was one that didn't work for me.

kevinraymond

Jul 4

It doesn't like the alpha channel in the PNG. Convert it like this:

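import requests
from PIL import Image

# `url` and `file` are placeholders for your own image source; convert("RGB") drops the alpha channel
image_from_url = Image.open(requests.get(url, stream=True).raw).convert("RGB")
image_from_file = Image.open(file).convert("RGB")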
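A caveat with a plain `convert("RGB")`: it discards the alpha band and keeps whatever color values sit underneath, so fully transparent regions can come out in an unexpected color. If that matters for your images, a possible variant is to composite onto a solid background first. This is only a sketch; `to_rgb` is a hypothetical helper and the white background is an assumption, neither is part of Florence-2 or transformers.

```python
from PIL import Image


def to_rgb(image, background=(255, 255, 255)):
    """Return a 3-channel RGB image, filling transparent areas with `background`."""
    if image.mode == "RGB":
        return image
    rgba = image.convert("RGBA")
    # Opaque canvas in the chosen background color, same size as the input.
    canvas = Image.new("RGBA", rgba.size, background + (255,))
    # Blend the image over the canvas using its alpha, then drop the alpha band.
    return Image.alpha_composite(canvas, rgba).convert("RGB")


# Synthetic example: a fully transparent red image.
transparent_red = Image.new("RGBA", (2, 2), (255, 0, 0, 0))
flattened = to_rgb(transparent_red)
print(flattened.mode, flattened.getpixel((0, 0)))
```

For images without transparency, the result should match a direct `convert("RGB")`.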