hello

#2
by userbox - opened

How can this OCR be tested?

Techie Traders org

Hey, here is how to run a prediction:

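import numpy as np
import onnxruntime as ort
import torch
import torchvision.transforms as T
from PIL import Image

def transform_image(image):
    transform = T.Compose([T.ToTensor()])  # torchvision.transforms
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    image = image.resize((200, 50))  # PIL takes (width, height), so the tensor is 50 x 200 (H x W)
    image = transform(image)  # shape (1, 50, 200) for a grayscale image
    image = image.to(device)  # torch.autograd.Variable is deprecated; tensors work directly
    image = image.unsqueeze(1)  # shape (1, 1, 50, 200)
    return image

def model(model_path, input):
    session = ort.InferenceSession(model_path)
    input_name = session.get_inputs()[0].name
    output_name = session.get_outputs()[0].name
    # ONNX Runtime expects a NumPy array, so move the tensor to the CPU first
    result = session.run([output_name], {input_name: input.cpu().numpy()})[0]
    return torch.tensor(result)

def get_label(model_prediction):
    # _cls = all classification tokens in the captcha images + '$' ('$' is the padding token)
    # _cls_dim = len(_cls)
    # max_captcha_len: refer to the captcha images in the model card to find the length
    scores = model_prediction.squeeze().cpu().tolist()
    lab = ''
    for idx in range(max_captcha_len):
        get_char = _cls[np.argmax(scores[_cls_dim * idx : _cls_dim * (idx + 1)])]
        lab += get_char
    return lab

image = Image.open(image_path)  # path to the captcha image
image = image.convert('L')  # grayscale
image = transform_image(image)
model_pred = model(model_path, image)  # path to the ONNX model file, not the image path
pred_lab = get_label(model_pred)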
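
To make the slicing in get_label easier to follow, here is a small self-contained sketch with made-up values. The names CLS, CLS_DIM, MAX_LEN and decode are hypothetical stand-ins for _cls, _cls_dim, max_captcha_len and get_label; the real token set and length depend on the model card, and this is only an illustration of the chunk-wise argmax, not the model's actual configuration.

```python
# Toy illustration of the decoding step; all values are made up.
CLS = "abc$"          # hypothetical token set, '$' used as padding
CLS_DIM = len(CLS)    # number of scores per character position
MAX_LEN = 3           # hypothetical maximum captcha length

def decode(flat_scores):
    """Pick the highest-scoring token in each CLS_DIM-sized chunk."""
    label = ""
    for idx in range(MAX_LEN):
        chunk = flat_scores[CLS_DIM * idx : CLS_DIM * (idx + 1)]
        best = max(range(CLS_DIM), key=lambda i: chunk[i])  # argmax over the chunk
        label += CLS[best]
    return label

# Flattened scores for 3 positions x 4 tokens.
scores = [
    0.1, 0.7, 0.1, 0.1,    # position 0 -> 'b'
    0.8, 0.1, 0.05, 0.05,  # position 1 -> 'a'
    0.0, 0.1, 0.1, 0.8,    # position 2 -> '$' (padding)
]
label = decode(scores)
print(label)              # ba$
print(label.rstrip("$"))  # ba
```

Stripping the trailing '$' is an extra step that the code above does not do; it may be useful when captchas are shorter than the maximum length.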

Can you share a notebook or the full code to test the model?
I don't think this code works properly.
Also, the image resizing looks reversed: it should be [50, 200]. I got the model to run somehow, but I'm not getting the exact results, even on the example given in the README file.
So what exactly can be done in this case? Your guidance would be helpful.
Thanks
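
On the resize order, one quick way to check is to compare Pillow's size tuple with the resulting array shape. This is a minimal sketch assuming Pillow and NumPy are installed; the blank image is a hypothetical stand-in for a real captcha.

```python
import numpy as np
from PIL import Image

# Blank grayscale stand-in for a captcha image (the starting size is arbitrary).
image = Image.new("L", (320, 120))

# Pillow's resize takes (width, height).
resized = image.resize((200, 50))
print(resized.size)   # (200, 50) -> (width, height)

# Array (and tensor) layouts put height first.
array = np.asarray(resized)
print(array.shape)    # (50, 200) -> (height, width)
```

If that is the convention in question, resize((200, 50)) and a [50, 200] array describe the same image, so a mismatch in results may come from somewhere else.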

Techie Traders org
edited Jul 24
