---
license: cc-by-sa-4.0
language:
- en
tags:
- Multimodal
- StableLM
datasets:
- LDJnr/LessWrong-Amplify-Instruct
- LDJnr/Pure-Dove
- LDJnr/Verified-Camel
---

# Obsidian: World's smallest multi-modal LLM. First multi-modal model at the 3B size

## Model Name: Obsidian-3B-V0.5

Obsidian is a brand new series of Multimodal Language Models. This first project is led by Quan N. and Luigi D. (LDJ).

Obsidian-3B-V0.5 is a multi-modal AI model that has vision! Its smarts are built on [Capybara-3B-V1.9](https://huggingface.co/NousResearch/Capybara-3B-V1.9), which was built on top of [StableLM-3B-4e1t](https://huggingface.co/stabilityai/stablelm-3b-4e1t). Capybara-3B-V1.9 achieves state-of-the-art performance compared to models of similar size, and even beats some 7B models.

Current finetuning and inference code is available on our GitHub repo: [Here](https://github.com/NousResearch/Obsidian)

## Acknowledgement

Obsidian-3B-V0.5 was developed and finetuned by [Nous Research](https://huggingface.co/NousResearch), in collaboration with [Virtual Interactive](https://huggingface.co/vilm). Special thanks to **LDJ** for the wonderful Capybara dataset, and **qnguyen3** for the model training procedure.

## Model Training

Obsidian-3B-V0.5 followed the same training procedure as LLaVA 1.5.

## Prompt Format

The model follows the ChatML format, but with `###` as the separator:

```
<|im_start|>user
What is this sign about?\n
###
<|im_start|>assistant
The sign is about bullying, and it is placed on a black background with a red background.
###
```

## Benchmarks

Coming Soon!
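
## Prompt Format Example

As a rough illustration of the prompt format described in the Prompt Format section, the sketch below assembles conversation turns into a ChatML-style string with `###` as the separator. The helper name `build_prompt`, its parameters, and the exact newline placement are assumptions made for illustration, not part of the Obsidian codebase. How an image is attached to the prompt is not covered here; the inference code in the GitHub repo is the authoritative reference.

```python
# Hypothetical helper, not taken from the Obsidian repository.
SEPARATOR = "###"  # separator named in the Prompt Format section


def build_prompt(turns, add_generation_prompt=True):
    """Join (role, text) turns into a ChatML-style prompt with ### separators.

    Newline placement is an assumption based on the example in this card.
    """
    parts = []
    for role, text in turns:
        # Each turn opens with <|im_start|>role and closes with the separator.
        parts.append(f"<|im_start|>{role}\n{text}\n{SEPARATOR}\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([("user", "What is this sign about?")]))
```

Under these assumptions, generation would be stopped when the model emits `###`, since that marks the end of the assistant turn.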