Improve performance?

#3
by prashjeev - opened

Hello there, I wanted to know whether there are any parameters or configs we could change to possibly improve the performance of the model.

Hi~, you can try adjusting the resolution of the detailed group and the adaptive group to improve performance. Here is an example.

Thank you for the suggestion. To change the parameters you specified, I simply changed the values of min_num and max_num in the load_image function. The default values are 4 and 12, and I changed them to 9 and 24.
As I am only passing a single image, I used only the load_image function and not the load_image2 function.
Changing the values of min_num and max_num makes a pretty big difference in the output of the Mini-Monkey model.
Below is the code I used, which is basically the given inference code.

import torch  # needed for the bfloat16 cast below

# load_image comes from the Mini-Monkey inference script
pixel_values, target_aspect_ratio = load_image('xxx.jpg', min_num=9, max_num=24)
pixel_values = pixel_values.to(torch.bfloat16).cuda()

generation_config = dict(do_sample=False, max_new_tokens=512)

question = "Read all the text in the image."
response, history = model.chat(tokenizer, pixel_values, target_aspect_ratio, question, generation_config, history=None, return_history=True)
print(f'User: {question} Assistant: {response}')
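For intuition, min_num and max_num bound how many tiles the image gets split into before encoding, so raising them lets the model see the image at a finer resolution. Below is a minimal sketch of my own (the function name candidate_grids is mine, and I am assuming an InternVL-style dynamic preprocessing; the actual load_image may differ in details) showing which tile grids become eligible for a given range:

```python
def candidate_grids(min_num, max_num):
    """All (cols, rows) grids whose tile count lies in [min_num, max_num],
    sorted by total tile count. Sketch only -- the real load_image also
    picks the grid closest to the image's aspect ratio."""
    return sorted(
        {(cols, rows)
         for cols in range(1, max_num + 1)
         for rows in range(1, max_num + 1)
         if min_num <= cols * rows <= max_num},
        key=lambda g: g[0] * g[1],
    )

# The default range (4, 12) admits fewer and smaller grids than (9, 24),
# so the wider range can tile a large document image more finely.
print(len(candidate_grids(4, 12)))
print(len(candidate_grids(9, 24)))
```

The trade-off is that more tiles mean more vision tokens, so memory use and latency grow with max_num.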

EDIT *
I initially neglected the pixel_values2 variable, thinking it unnecessary, but it turns out that adding pixel_values2 also improves performance. Here is my new inference code.

import torch  # for the bfloat16 cast and tensor concatenation

# load_image / load_image2 come from the Mini-Monkey inference script
pixel_values, target_aspect_ratio = load_image('xxx.jpg', min_num=9, max_num=24)
pixel_values = pixel_values.to(torch.bfloat16).cuda()
pixel_values2 = load_image2('xxx.jpg', min_num=5, max_num=8, target_aspect_ratio=target_aspect_ratio)
pixel_values2 = pixel_values2.to(torch.bfloat16).cuda()
# merge the two groups: second-group tiles, first-group tiles, second-group thumbnail last
pixel_values = torch.cat([pixel_values2[:-1], pixel_values[:-1], pixel_values2[-1:]], 0)

generation_config = dict(do_sample=False, max_new_tokens=512)

question = "Read all the text in the image."
response, history = model.chat(tokenizer, pixel_values, target_aspect_ratio, question, generation_config, history=None, return_history=True)
print(f'User: {question} Assistant: {response}')
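For the record, the torch.cat line keeps a specific order: the load_image2 tiles first, then the load_image tiles, with the last tile of the load_image2 batch moved to the very end. A tiny sketch with lists standing in for the tensors' first dimension (the names and the assumption that the global thumbnail sits last in each batch are mine, matching common InternVL-style preprocessing):

```python
# Stand-ins for the first dimension of the two tile tensors
detailed = ["d0", "d1", "d2", "d_thumb"]   # pixel_values
adaptive = ["a0", "a1", "a_thumb"]         # pixel_values2

# Same slicing as:
# torch.cat([pixel_values2[:-1], pixel_values[:-1], pixel_values2[-1:]], 0)
merged = adaptive[:-1] + detailed[:-1] + adaptive[-1:]
print(merged)  # ['a0', 'a1', 'd0', 'd1', 'd2', 'a_thumb']
```

Note that the detailed group's own thumbnail (d_thumb here) is dropped, so exactly one global thumbnail ends up last in the merged batch.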

prashjeev changed discussion status to closed
