Model throws an error when used with Inference Endpoints

#18
by pzmudzinski - opened

I deployed it using Inference Endpoints, but whenever I POST to the deployed endpoint with an image as the body, it throws this error:

{
    "error": "'Image' object is not subscriptable"
}

Should it be used differently there? (Encoded to base64 and sent as JSON would be my guess?)

I'm trying to get the same response as with the Inference API, but in my own deployment so there won't be downtime:

[
    {
        "score": 1.0,
        "label": "Background",
        "mask": ...
    },
    {
        "score": 1.0,
        "label": "Hair",
        "mask": ...
   },
...
]

If you take a look at the "Files and versions" tab of the model, it has a handler.py file, which is what Inference Endpoints uses, I think. It expects a dictionary where the image is under the key "image", so not sending the image in that dictionary format is probably what's causing the issue. From the code, the image should also be base64-encoded. But do look at the handler code https://huggingface.co/mattmdjaga/segformer_b2_clothes/blob/main/handler.py to see what's going on.

Also, feel free to fork the model and change the handler.py file to suit your needs. I initially made it over a year ago for a work project, so it might not be the best fit for all use cases.
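
For reference, the request would look something like this (just a sketch; the endpoint URL, token, and image file are placeholders, and it assumes the handler reads a base64 string from the "image" key of the JSON body, so double-check handler.py for the exact structure):

import base64
import requests

API_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"  # placeholder
headers = {"Authorization": "Bearer <API_TOKEN>"}  # placeholder

# Read the image and base64-encode it, as the handler expects
with open("person.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Send the encoded image as JSON under the "image" key
response = requests.post(API_URL, headers=headers, json={"image": image_b64})
print(response.json())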

Thanks for the quick response. After trying it out, I'm now getting just a huge array of arrays of numbers as the response:

[
    [
        0,
        0,
        0,
        0,
       ...

How can I convert it to the format coming from the prototype API:

    {
        "score": 1.0,
        "label": "Background",
        "mask": ...
    },

So this thread https://huggingface.co/mattmdjaga/segformer_b2_clothes/discussions/17 should help you get everything except the mask. I don't actually know what the mask encoding is. You'll also need to convert the list to a tensor to follow the thread.

This is the response format:

label: The label for the class (model specific) of a segment.
score: A float that represents how likely it is that the segment belongs to the given class.
mask: A str (base64 str of a single-channel black-and-white image) representing the mask of a segment.

So there is no way to somehow use whatever Hugging Face is using to implement that behavior?

Oh, if that's the case, then you could loop over every label present in the prediction and encode array[array==label_int]. Does that make sense? So, something like:

id2label = model.config.id2label  # mapping from label id to label name
pred_ids = pred_seg.unique()      # label ids present in the prediction
output = []
for id in pred_ids:
    mask = pred_seg[pred_seg==id]
    output.append({
        "score": 1,
        "label": id2label[id.item()],
        "mask": encode.base64(mask)  # pseudocode: some base64 encoding of the mask
    })

OK, so you are saying I should:

  1. download this repository,
  2. make those changes in handler.py (I assume those changes should replace line 39),
  3. create my own model on Hugging Face,
  4. push the changed codebase,
  5. re-deploy it using Inference Endpoints.

Is that correct, or am I missing something?

Yes. You can also test the handler locally before deploying to make sure it runs correctly; there's HF documentation on that. Or you could just use the current handler and add those steps as post-processing. They don't require a GPU, so it might be quicker to add them as post-processing in your code instead of forking everything.
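
Testing the handler locally looks roughly like this (a sketch following the HF custom-handler docs, run from the repository directory; the image file is a placeholder):

import base64
from handler import EndpointHandler  # handler.py from the model repository

# Instantiate the handler from the current directory
my_handler = EndpointHandler(path=".")

# Build the same payload the deployed endpoint would receive
with open("person.jpg", "rb") as f:
    payload = {"image": base64.b64encode(f.read()).decode("utf-8")}

pred = my_handler(payload)
print(pred)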

Let me try the second option, I'll let you know how it goes. Thanks for the help!
I also reached out to HF support to ask if there is an option to get the same behavior as on their prototyping API.

One more thing: what did you mean by encoding the mask tensor into base64 as encode.base64(mask)?
What is "encode.base64"?
Shouldn't it look something like this?
https://stackoverflow.com/questions/75244472/how-to-convert-torch-tensor-to-base64-image

Yes, that Stack Overflow answer is what I'm talking about. You should double-check that you can encode and then decode a test image, and that the decoded image is the same as the input image.
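
Something like this round-trip check (a sketch; the test image is a placeholder, and PNG is assumed as the intermediate encoding):

import base64
from io import BytesIO

import numpy as np
from PIL import Image

original = Image.open("test.png").convert("L")  # single-channel test image

# Encode: PIL image -> PNG bytes -> base64 string
buffer = BytesIO()
original.save(buffer, format="PNG")
b64_str = base64.b64encode(buffer.getvalue()).decode("utf-8")

# Decode: base64 string -> PNG bytes -> PIL image
decoded = Image.open(BytesIO(base64.b64decode(b64_str)))

# PNG is lossless, so the round trip should preserve every pixel
assert np.array_equal(np.array(original), np.array(decoded))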

This Stack Overflow example is throwing an error:

    pil_image = transform(mask)
                ^^^^^^^^^^^^^^^
    return F.to_pil_image(pic, self.mode)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    raise ValueError(f"pic should be 2/3 dimensional. Got {pic.ndimension()} dimensions.")

I assume the mask tensor has one dimension. Any idea how to convert it to three dimensions?

You can avoid using torch transforms by turning the tensor into a numpy array with tensor.numpy(), then turning the array into a PIL image with Image.fromarray(np_array). Also, the one-dimensional mask comes from indexing with pred_seg[pred_seg==id], which flattens the tensor; comparing the whole tensor with (pred_seg == id) keeps it 2-D.
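
For example (a sketch, with a dummy tensor standing in for the real prediction):

import numpy as np
import torch
from PIL import Image

pred_seg = torch.zeros(4, 4, dtype=torch.long)  # stand-in for the real segmentation
pred_seg[1:3, 1:3] = 2                          # pretend label id 2 ("Hair") appears here

mask = (pred_seg == 2)                            # 2-D boolean tensor, not flattened
np_array = (mask.numpy() * 255).astype(np.uint8)  # 0/255 single-channel array
pil_image = Image.fromarray(np_array)             # grayscale ("L") PIL image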

OK, I came up with something like this (it would serve as a Flask proxy converting the HF endpoint output into the same format as the prototyping API):

from flask import Flask
from flask import request
from PIL import Image
import requests
import base64
import torch
from io import BytesIO
import numpy as np
import os

app = Flask(__name__)

API_URL = os.environ.get("API_URL")
headers = {
    "Authorization": f"Bearer {os.environ.get('API_TOKEN')}",
}

id2label = {
    "0": "Background",
    "1": "Hat",
    "2": "Hair",
    "3": "Sunglasses",
    "4": "Upper-clothes",
    "5": "Skirt",
    "6": "Pants",
    "7": "Dress",
    "8": "Belt",
    "9": "Left-shoe",
    "10": "Right-shoe",
    "11": "Face",
    "12": "Left-leg",
    "13": "Right-leg",
    "14": "Left-arm",
    "15": "Right-arm",
    "16": "Bag",
    "17": "Scarf"
  }



@app.post("/classify")
def classify():
    payload = request.json  # forward the incoming JSON body to the endpoint unchanged
    response = requests.post(API_URL, headers=headers, json=payload)
    pred_seg = torch.tensor(response.json())  # 2-D array of per-pixel label ids
    pred_ids = pred_seg.unique()  # label ids present in the prediction
    output = []
    for label_id in pred_ids:
        mask = (pred_seg == label_id)  # 2-D boolean mask for this label
        pil_image = Image.fromarray((mask * 255).numpy().astype(np.uint8))
        base64_string = image_to_base_64(pil_image)
        output.append({
            "score": 1.0,
            "label": id2label[str(label_id.item())],
            "mask": base64_string
        })
    return output

def image_to_base_64(image):
    buffered = BytesIO()
    image.save(buffered, format="PNG")
    img_str = base64.b64encode(buffered.getvalue())
    return img_str.decode('utf-8')
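
Calling the proxy would look something like this (a sketch, assuming the same base64 "image" body the endpoint handler expects):

import base64
import requests

with open("person.jpg", "rb") as f:
    body = {"image": base64.b64encode(f.read()).decode("utf-8")}

# The proxy forwards the body to the HF endpoint and reshapes the output
resp = requests.post("http://localhost:5000/classify", json=body)
for segment in resp.json():
    print(segment["label"], segment["score"])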

Does it look good? It still doesn't require a GPU, right?

Yeah, I think that's fine, though I can't tell without running it. And yes, this should be fine without a GPU.

I published a fully working example (deployable on AWS Lambda) here, in case anyone needs it in the future.
