NousResearch/Nous-Hermes-2-Vision-Alpha · Model somehow has bad performance and is not working as shown?

IUYG

Dec 3, 2023

Thanks for the work. Is there an example script for reproducing the samples you added on HF? I tested the burger example using your script from https://github.com/qnguyen3/hermes-llava/blob/main/llava/eval/run_llava.py, and the output I received was mostly wrong and hallucinated. The output is:

{
  "food_list": [
    "Double Decker Burger",
    "Cheeseburger",
    "Chicken Burger",
    "French Fries",
    "Shrimp Fries",
    "French Fries",
    "Shrimp Fries",
    "Cheeseburger Fries",
    "Chicken Fries",
    "French Fries",
    "Shrimp Fries",
    "Cheeseburger Fries",
    "Chicken Fries",
    "French Fries",
    "Shrimp Fries",

It seems that the model is hallucinating, and the output is not valid JSON: the list keeps repeating the same items and is never closed. I might be using the prompt incorrectly, as I copied the prompt from HF and added it to run_llava.py. Is this the correct way to run it? Also, do you have any scores compared to LLaVA 1.5?
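
The two problems described above (output that does not parse as JSON, and repeated list entries) can be checked with a short script. This is only a sketch: the helper names `is_valid_json` and `repeated_items` are made up for illustration and are not part of the hermes-llava repo, and `raw_output` is an abbreviated copy of the output pasted above.

```python
import json
from collections import Counter

# Abbreviated copy of the truncated model output (illustrative only).
raw_output = '''{
  "food_list": [
    "Double Decker Burger",
    "Cheeseburger",
    "French Fries",
    "Shrimp Fries",
    "French Fries",
    "Shrimp Fries",
'''


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def repeated_items(text):
    """Count quoted list entries that occur more than once.

    Line-based heuristic: it works on unparseable output, but it assumes
    one quoted entry per line and skips lines containing a colon (keys).
    """
    items = [
        line.strip().strip(",").strip('"')
        for line in text.splitlines()
        if line.strip().startswith('"') and ":" not in line
    ]
    return {item: n for item, n in Counter(items).items() if n > 1}


print(is_valid_json(raw_output))
print(repeated_items(raw_output))
```

A check like this only separates formatting failures from content failures. It says nothing about whether the listed items are actually in the image.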

Excited about the work and hoping to test this again!

euclaise

NousResearch org Dec 3, 2023

https://twitter.com/Teknium1/status/1731409499595194679

teknium changed discussion status to closed Dec 4, 2023