InternRobotics/RoboInter-VLM

JeasLee committed 1 day ago

Commit 7296bce · verified · 1 parent: cc5adc9

Update README.md

Files changed (1)

README.md +28 -0

@@ -1,3 +1,31 @@
 # RoboInterVLM: Vision-Language Model Checkpoints for RoboInter Manipulation Suite
 Model checkpoints of **RoboInterVLM**, developed as part of the [RoboInter](https://github.com/InternRobotics/RoboInter) project. These models are fine-tuned on the [RoboInter-VQA](https://huggingface.co/datasets/InternRobotics/RoboInter-VQA) dataset for intermediate representation understanding and generation in robotic manipulation.

+---
+license: apache-2.0
+base_model:
+- Qwen/Qwen2.5-VL-3B-Instruct
+- Qwen/Qwen2.5-VL-7B-Instruct
+- lmms-lab/llava-onevision-qwen2-7b-ov
+tags:
+- robotics
+- vision-language-action-model
+- vision-language-model
+library_name: transformers
+# Collection Metadata (Referencing InternRobotics/VLN-PE style)
+repo: InternRobotics/RoboInter-VLM
+type: "checkpoint-collection"
+description: "Collection of RoboInterVLM checkpoints and configs fine-tuned on RoboInter-VQA."
+checkpoints:
+  - name: RoboInterVLM_qwenvl25_3b
+    path: RoboInterVLM_qwenvl25_3b/
+    notes: "Lightweight Qwen2.5-VL model"
+  - name: RoboInterVLM_qwenvl25_7b
+    path: RoboInterVLM_qwenvl25_7b/
+    notes: "Stronger performance Qwen2.5-VL backbone"
+  - name: RoboInterVLM_llava_one_vision_7B
+    path: RoboInterVLM_llava_one_vision_7B/
+    notes: "LLaVA-OneVision (SigLIP + Qwen2) backbone"
+---
 # RoboInterVLM: Vision-Language Model Checkpoints for RoboInter Manipulation Suite
 Model checkpoints of **RoboInterVLM**, developed as part of the [RoboInter](https://github.com/InternRobotics/RoboInter) project. These models are fine-tuned on the [RoboInter-VQA](https://huggingface.co/datasets/InternRobotics/RoboInter-VQA) dataset for intermediate representation understanding and generation in robotic manipulation.
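
The `checkpoints` list in the front matter gives each checkpoint its own subfolder, so one checkpoint can in principle be fetched without downloading the whole repository. The following is a minimal sketch of that idea, not the project's documented workflow. The helper names `allow_patterns_for` and `download_checkpoint` are hypothetical, the folder names are copied from the front matter, and the download step assumes the `huggingface_hub` package is installed and that each folder is laid out as a self-contained checkpoint.

```python
from typing import Dict, List, Optional

REPO_ID = "InternRobotics/RoboInter-VLM"

# Checkpoint name -> subfolder, copied from the `checkpoints` list in the front matter.
CHECKPOINT_PATHS: Dict[str, str] = {
    "RoboInterVLM_qwenvl25_3b": "RoboInterVLM_qwenvl25_3b/",
    "RoboInterVLM_qwenvl25_7b": "RoboInterVLM_qwenvl25_7b/",
    "RoboInterVLM_llava_one_vision_7B": "RoboInterVLM_llava_one_vision_7B/",
}


def allow_patterns_for(name: str) -> List[str]:
    """Return glob patterns matching only the files under one checkpoint folder."""
    if name not in CHECKPOINT_PATHS:
        raise ValueError(f"unknown checkpoint: {name!r}")
    # Normalize the trailing slash, then match everything below the folder.
    return [CHECKPOINT_PATHS[name].rstrip("/") + "/*"]


def download_checkpoint(name: str, local_dir: Optional[str] = None) -> str:
    """Download one checkpoint folder and return the local snapshot path.

    Hypothetical helper: requires network access and `huggingface_hub`.
    """
    # Imported lazily so the pure helper above works without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=allow_patterns_for(name),
        local_dir=local_dir,
    )
```

Calling `download_checkpoint("RoboInterVLM_qwenvl25_3b")` would then fetch only that folder. How the downloaded weights are loaded depends on the backbone and is not specified in this commit.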