Improve performance?
Hello there, I wanted to know whether there are any parameters or configs we could change to possibly improve the performance of the model.
Thank you for the suggestion. To change the parameters you specified, I simply changed the values of min_num and max_num in the load_image function. The defaults are 4 and 12, and I changed them to 9 and 24. As I am only passing a single image, I used only the load_image function and not the load_image2 function. Changing the values of min_num and max_num makes a pretty big difference in the output of the Mini-Monkey model.
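For anyone wondering what these parameters actually control: in InternVL-style loaders (which Mini-Monkey builds on), min_num and max_num bound how many tiles the image is split into. Here is a minimal sketch of the grid enumeration under that assumption; candidate_grids is my own name for illustration, not the repo's exact code:

```python
# Sketch (my reading, not the repo's exact code) of InternVL-style dynamic
# preprocessing: every (cols, rows) grid whose tile count falls within
# [min_num, max_num] is a candidate, and the grid closest to the input
# image's aspect ratio is the one actually used for cropping.
def candidate_grids(min_num, max_num):
    return {
        (cols, rows)
        for n in range(min_num, max_num + 1)
        for cols in range(1, n + 1)
        for rows in range(1, n + 1)
        if min_num <= cols * rows <= max_num
    }

print(sorted(candidate_grids(9, 24)))   # (1, 9), (2, 5), (3, 3), (4, 6), ...
```

So raising min_num and max_num forces the image to be cut into more, finer tiles, which is why it helps on images with small or dense text.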
Below is the code I used, which is basically the given inference code.

```python
import torch

# Assumes `model` and `tokenizer` are already loaded as in the Mini-Monkey setup.
pixel_values, target_aspect_ratio = load_image('xxx.jpg', min_num=9, max_num=24)
pixel_values = pixel_values.to(torch.bfloat16).cuda()
generation_config = dict(do_sample=False, max_new_tokens=512)
question = "Read all the text in the image."
response, history = model.chat(tokenizer, pixel_values, target_aspect_ratio, question,
                               generation_config, history=None, return_history=True)
print(f'User: {question} Assistant: {response}')
```
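If you want to find good values for your own images, a quick sweep is an easy way to compare. This reuses the names from the snippet above; the (min_num, max_num) pairs are just illustrative:

```python
# Hypothetical sweep over tile bounds. More tiles generally help with dense
# text, but cost more vision tokens (slower inference, more GPU memory).
for lo, hi in [(4, 12), (9, 24), (12, 32)]:
    pv, tar = load_image('xxx.jpg', min_num=lo, max_num=hi)
    pv = pv.to(torch.bfloat16).cuda()
    response, _ = model.chat(tokenizer, pv, tar, question, generation_config,
                             history=None, return_history=True)
    print(f'[min_num={lo}, max_num={hi}] {response[:120]}')
```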
EDIT *
I initially neglected the pixel_values2 variable, thinking it unnecessary, but it turns out that adding pixel_values2 also improves performance. Here is my updated inference code.
```python
import torch

# Assumes `model` and `tokenizer` are already loaded as in the Mini-Monkey setup.
pixel_values, target_aspect_ratio = load_image('xxx.jpg', min_num=9, max_num=24)
pixel_values = pixel_values.to(torch.bfloat16).cuda()
pixel_values2 = load_image2('xxx.jpg', min_num=5, max_num=8, target_aspect_ratio=target_aspect_ratio)
pixel_values2 = pixel_values2.to(torch.bfloat16).cuda()
# Merge both crop sets, dropping each set's final tile and re-appending one of
# them, following the repo's inference pattern (the final tile is typically the
# global thumbnail view of the whole image).
pixel_values = torch.cat([pixel_values2[:-1], pixel_values[:-1], pixel_values2[-1:]], 0)
generation_config = dict(do_sample=False, max_new_tokens=512)
question = "Read all the text in the image."
response, history = model.chat(tokenizer, pixel_values, target_aspect_ratio, question,
                               generation_config, history=None, return_history=True)
print(f'User: {question} Assistant: {response}')
```
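Two things worth noting: with do_sample=False decoding is greedy, so the outputs are deterministic and comparisons between settings are fair; and combining a coarse and a fine crop set like this appears to match the multi-scale cropping idea Mini-Monkey was designed around, which would explain why adding pixel_values2 helps. The trade-off is more image tokens overall, so expect higher memory use and slower generation.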