Model performs poorly and does not work as shown?

#1
by IUYG - opened

Thanks for the work. Is there an example script to reproduce the samples you added on HF? I tested the burger example using your script at https://github.com/qnguyen3/hermes-llava/blob/main/llava/eval/run_llava.py, and the output I received was mostly wrong and hallucinated. The output is:

{
  "food_list": [
    "Double Decker Burger",
    "Cheeseburger",
    "Chicken Burger",
    "French Fries",
    "Shrimp Fries",
    "French Fries",
    "Shrimp Fries",
    "Cheeseburger Fries",
    "Chicken Fries",
    "French Fries",
    "Shrimp Fries",
    "Cheeseburger Fries",
    "Chicken Fries",
    "French Fries",
    "Shrimp Fries",

It seems that the model is hallucinating, and the output is not correctly formatted JSON. I may be using the prompt incorrectly: I copied the prompt from HF and added it to run_llava.py. Is this the correct way to run it? Also, do you have any scores compared to LLaVA 1.5?
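A quick way to confirm both symptoms (invalid JSON and repeated items) might be a small check like the one below. This is only a minimal sketch: `check_model_output` is a hypothetical helper name, not part of hermes-llava, and `sample` is a shortened excerpt of the output above rather than the full text.

```python
import json
from collections import Counter


def check_model_output(text: str) -> dict:
    """Report whether `text` parses as JSON and which list items repeat.

    Hypothetical helper for illustration; not part of hermes-llava.
    """
    try:
        json.loads(text)
        valid = True
    except json.JSONDecodeError:
        valid = False

    # Scan line by line so that truncated (unparseable) output can still be
    # inspected. Lines containing ':' are treated as keys and skipped.
    items = [
        line.strip().strip(",").strip('"')
        for line in text.splitlines()
        if line.strip().startswith('"') and ":" not in line
    ]
    repeated = {item: n for item, n in Counter(items).items() if n > 1}
    return {"valid_json": valid, "repeated": repeated}


# Shortened excerpt of the output pasted above (assumption: one item per line).
sample = """{
  "food_list": [
    "Double Decker Burger",
    "French Fries",
    "Shrimp Fries",
    "French Fries",
    "Shrimp Fries",
"""

report = check_model_output(sample)
print(report)
```

If `valid_json` is false and `repeated` is non-empty, the generation most likely fell into a repetition loop and was cut off at the token limit before the list was closed.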

Excited about the work and hoping to test this again!

NousResearch org
teknium changed discussion status to closed
