TIGER-Lab/Mantis

DongfuJiang committed on May 4, 2024

Commit 75c15ae

1 parent: 3f7f343

update

Files changed (2)

app.py CHANGED

@@ -109,7 +109,7 @@ def build_demo():
         gr.Markdown(""" # Mantis
 Mantis is a multimodal conversational AI model that can chat with users about images and text. It's optimized for multi-image reasoning, where inverleaved text and images can be used to generate responses.
-### [Paper](https://arxiv.org/abs/2405.01483) | [Github](https://github.com/TIGER-AI-Lab/Mantis) | [Models](https://huggingface.co/collections/TIGER-Lab/mantis-6619b0834594c878cdb1d6e4) | [Dataset](https://huggingface.co/datasets/TIGER-Lab/Mantis-Instruct)
+### [Paper](https://arxiv.org/abs/2405.01483) | [Github](https://github.com/TIGER-AI-Lab/Mantis) | [Models](https://huggingface.co/collections/TIGER-Lab/mantis-6619b0834594c878cdb1d6e4) | [Dataset](https://huggingface.co/datasets/TIGER-Lab/Mantis-Instruct) | [Website](https://tiger-ai-lab.github.io/Mantis/)
         """)
         gr.Markdown("""## Chat with Mantis

models/mllava/utils.py CHANGED

@@ -55,7 +55,7 @@ def chat_mllava(
     if images:
         for i in range(len(images)):
             if isinstance(images[i], str):
-                images[i] = PIL.Image.open(images[i])
+                images[i] = PIL.Image.open(images[i]).convert("RGB")
     inputs = processor(images=images, text=prompt, return_tensors="pt", truncation=True, max_length=max_input_length)
     for k, v in inputs.items():
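
The `.convert("RGB")` change above presumably guards against input files that are not already three-channel RGB (for example PNGs with an alpha channel, grayscale, or palette images), which image processors often do not expect. The sketch below illustrates the effect with Pillow alone. The helper name `load_rgb` is hypothetical and not part of the repository, and unlike the commit it also converts images that are already PIL objects.

```python
from PIL import Image


def load_rgb(image):
    """Hypothetical helper: return a 3-channel RGB PIL image.

    Accepts either a file path (str) or an already opened PIL image.
    """
    if isinstance(image, str):
        image = Image.open(image)
    # convert("RGB") drops alpha, expands grayscale, and resolves palettes.
    return image.convert("RGB")


# In-memory images in several modes stand in for files on disk.
samples = [Image.new(mode, (4, 4)) for mode in ("RGBA", "L", "P", "RGB")]
converted = [load_rgb(img) for img in samples]

modes = [img.mode for img in converted]
bands = [len(img.getbands()) for img in converted]
print(modes, bands)
```

After conversion every sample reports mode `RGB` with three bands, so downstream code sees a uniform channel layout regardless of how the source file was encoded.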